Re: [R] recoding data with loops

From: Erik Iverson <iverson_at_biostat.wisc.edu>
Date: Mon, 19 May 2008 18:49:12 -0400

Got it, I did not know of the 'recode' function in car.

So you would like to recode those specific columns then? Once again, we can do it without a loop, this time with the help of a function called lapply, which applies a function to each item in a list in turn.

Try:

reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_varnames <- paste("R", reverse_me_varnames, sep = "")

## See ?paste

mdf[reversed_varnames] <-
   lapply(mdf[reverse_me_varnames],
          function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;",
                             as.factor.result = FALSE))

Now what does this actually mean? To the left of '<-' are simply the new columns of our data.frame. On the right, we use lapply to apply a function to each element of a list. The first argument to lapply is that list. In this case, it is simply the columns of the data.frame you want reversed; a data.frame is a list in R, with one element per column. See ?list and ?data.frame.

The next argument to lapply is the function that we want to apply to each element of the list. So, we create a function that accepts as input a variable I simply call 'x'. This 'x' is going to be one item from the list we passed to lapply, that is, one of the columns of mdf named in 'reverse_me_varnames'.

We then use the recode function in the car package to recode x, in a similar way to what you tried before. The function of x we define gets called three times in the above example, once for each name in reverse_me_varnames. lapply returns the three recoded columns as a list, and the assignment stores them in the three newly named columns on the left-hand side of the <- operator.
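To make the mechanics concrete, here is a minimal sketch of the same pattern on a toy data.frame. The object 'toy', its column names, and the doubling function are made up for illustration and simply stand in for mdf and the recode call.

```r
## Hypothetical toy data; the names "a", "b", "Da" and "Db" are made up.
toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

is.list(toy)    ## a data.frame is a list with one element per column

## lapply calls the function once per selected column and returns a list
doubled <- lapply(toy[c("a", "b")], function(x) x * 2)

## assigning that list to new column names adds the columns
toy[c("Da", "Db")] <- doubled
names(toy)
```

The only part that changes in the real case is the function body, which calls recode instead of multiplying by 2.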

To see why your earlier attempt with the for loop did not work, try:

mdf$HEQUAL

contrasted with

t1 <- c("HEQUAL")
mdf$t1

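From the help for ?Extract, $ does not allow 'computed' indices: mdf$t1 looks for a column literally named "t1", not for the column whose name is stored in t1. To index by a name held in a variable, use mdf[[t1]] or mdf[t1] instead.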
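If you do prefer a loop, a rough sketch along these lines should work, because [[ ]] accepts a computed name. Note the assumptions: the small data.frame is generated only for the example, and 'reverse4' is a hypothetical base-R stand-in for the car recode call (1 to 4 reversed, 5 to 7 set to NA) so that the example runs without loading car.

```r
## Small example data in the shape of mdf (values 1 to 7).
set.seed(1)
mdf <- data.frame(HEQUAL   = sample(7, 10, replace = TRUE),
                  HREVDIS1 = sample(7, 10, replace = TRUE))

t1 <- "HEQUAL"
is.null(mdf$t1)                    ## TRUE: $ looks for a column named "t1"
identical(mdf[[t1]], mdf$HEQUAL)   ## TRUE: [[ ]] uses the value stored in t1

## reverse4 is a hypothetical stand-in for the recode call:
## 1..4 become 4..1, and 5..7 become NA.
reverse4 <- function(x) ifelse(x >= 5, NA, 5 - x)

old_names <- c("HEQUAL", "HREVDIS1")
new_names <- paste("R", old_names, sep = "")
for (i in seq_along(old_names)) {
  mdf[[new_names[i]]] <- reverse4(mdf[[old_names[i]]])
}
names(mdf)
```

The lapply version above does the same thing in one step, but the loop shows where the original attempt went wrong: every mdf$something[i] has to become mdf[[something[i]]].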

I hope this helps!

Erik

Donald Braman wrote:
> Erik,
>
> Your example was just what I needed to generate the data -- many, many
> thanks! The names() function was something I had not grasped fully. I
> now have this and it works very nicely:
>
> var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM",
> "HREVDIS2")
> mdf <- data.frame(replicate(length(var_list), sample(7,100, replace =
> TRUE))) ## generate random data
> names(mdf) ## default names
> names(mdf) <- var_list ## use our names
> mdf
>
> I'm still trying to figure out how to recode (using the car package)
> data into new variables using a similar loop. Basically, I'm not sure
> how to call the variable name and append it to the dataframe name in a
> loop. In Stata I'd do this using single quotes, but clearly that's not
> how R works. I tried several variations on this:
>
> reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
> reversed_varnames <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
> for(i in 1:length(reverse_me_varnames))
> {mdf$reversed_varnames[i] <- recode(mdf$reverse_me_varnames[i],
> '5:7=NA; 1=4; 2=3; 3=2; 4=1;', as.factor.result=FALSE)
>
> While I don't get an error message, the data don't change. Any advice
> on reverse coding non-contiguous variables?
>
>
>
> On Mon, May 19, 2008 at 4:12 PM, Donald Braman <donald.braman_at_gmail.com
> <mailto:donald.braman_at_gmail.com>> wrote:
>
> Many thanks --
>
> You are right; I had rnorm() and sample() mixed up in my code. I'll
> work on generating a normal ordinal sample next.
>
> Cheers, Don
>
>
> On Mon, May 19, 2008 at 4:07 PM, Erik Iverson
> <iverson_at_biostat.wisc.edu <mailto:iverson_at_biostat.wisc.edu>> wrote:
>
> Hello -
>
>
> Donald Braman wrote:
>
> # I'm new to R and am trying to get the hang of how it handles
> # dataframes & loops. If anyone can help me with some simple
> tasks,
> # I'd be much obliged.
>
> # First, i'd like to generate some random data in a dataframe
> # to efficiently illustrate what I'm up to.
> # let's say I have six variables as listed below (I really
> # have hundreds, but a few will illustrate the point).
> # I want to generate my dataframe (mdf)
> # with the 6 variables X 100 values with rnorm(7).
> # How do I do this? I tried many variations on the following:
>
> var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
> "EDISCRIM",
> "HREVDIS2")
> for(i in 1:length(var_list)) {var_list[1] <- rnorm(100)}
> mdf <- data.frame(cbind(varlist[1:length(var_list)])
> mdf
>
> There are many ways to do this. Do you mean that you want 6
> columns, 100 observations in each column, each a sample from a
> normal distribution with mean = 7 and sd = 1? You can do this
> without looping in one of several ways. If you are coming from
> a SAS environment (my guess since you talk of looping over
> data.frames), you may be used to looping through a data object.
> In R, you can usually avoid this since many functions are
> vectorized, or take a 'whole object' approach.
>
>
> var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
> "EDISCRIM", "HREVDIS2")
>
> mdf <- data.frame(replicate(6, rnorm(100, 7))) ## generate
> random data
> names(mdf) ## default names
> names(mdf) <- var_list ## use our names
>
>
>
> # Then, I'd like to recode the variables that begin with the
> letter "H".
> # I've tried many variations of the following, but to no avail:
>
> reverse_list <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
> reversed_list <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
> for(i in 1:length(reverse_list))
> {mdf[ ,e_reversed_list][[i]] <- recode(mdf[
> ,e_reverse_list][[i]],
> '5:99=NA; 1=4; 2=3; 3=2; 4=1; ', as.factor.result=FALSE)
>
>
> I'm not quite sure what you are after here. What do you mean by
> recode? What package is your 'recode' function located in?
>
> It appears that you may be under the impression that the
> data.frame contains integers, but it certainly will not, since it
> was generated with rnorm(). sample() can generate samples of the
> type you may be after, for example,
>
> > sample(7, 100, replace = TRUE)
>
> Best,
> Erik Iverson
>
>
>
>
> --
> Donald Braman
> http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
> http://research.yale.edu/culturalcognition
> http://ssrn.com/author=286206
>
>
>
>
> --
> Donald Braman
> http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
> http://research.yale.edu/culturalcognition
> http://ssrn.com/author=286206



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 19 May 2008 - 23:49:14 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 20 May 2008 - 05:30:40 GMT.