Re: [R] Keep value labels with data frame manipulation

From: Marc Schwartz (via MN) <mschwartz_at_mn.rr.com>
Date: Thu 13 Jul 2006 - 04:14:22 EST

On Wed, 2006-07-12 at 17:41 +0100, Jol, Arne wrote:
> Dear R,
>
> I import data from SPSS into an R data.frame. On this rawdata I do some
> data processing (selection of observations, normalization, recoding of
> variables etc..). The result is stored in a new data.frame, however, in
> this new data.frame the value labels are lost.
>
> Example of what I do in code:
>
> # read raw data from spss
> rawdata <- read.spss("./data/T50937.SAV",
> use.value.labels=FALSE,to.data.frame=TRUE)
>
> # select the observations that we need
> diarydata <- rawdata[rawdata$D22==2 | rawdata$D22==3 | rawdata$D22==17 |
> rawdata$D22==18 | rawdata$D22==20 | rawdata$D22==22 |
> rawdata$D22==24 | rawdata$D22==33,]
>
> The result is that rawdata$D22 has value labels and that diarydata$D22
> is numeric without value labels.
>
> Question: How can I prevent this from happening?
>
> Thanks in advance!
> Groeten,
> Arne

Two things:

1. With respect to your subsetting, your lengthy code can be replaced with the following:

  diarydata <- subset(rawdata, D22 %in% c(2, 3, 17, 18, 20, 22, 24, 33))

See ?subset and ?"%in%" for more information.
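
A quick way to convince yourself that the two forms pick the same rows is to try them on a toy data frame. This is only a sketch: the 'toy' data frame and its values are made up for illustration, and only the column name D22 comes from your post.

```r
# Made-up stand-in for the imported data (hypothetical values)
toy <- data.frame(id = 1:6, D22 = c(2, 5, 17, 33, 40, 3))

# Chained comparisons, as in the original code (shortened to four values)
long_form <- toy[toy$D22 == 2 | toy$D22 == 3 | toy$D22 == 17 | toy$D22 == 33, ]

# The same selection written with subset() and %in%
short_form <- subset(toy, D22 %in% c(2, 3, 17, 33))

# Compare the two results
all.equal(long_form, short_form)
```

One difference worth knowing: if D22 contains NA values, the "==" form returns rows filled with NA for those cases, whereas subset() leaves them out.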

2. With respect to keeping the label-related attributes, the 'value.labels' attribute and the 'variable.labels' attribute will not, by default, survive the use of "[.data.frame" in R (see ?Extract and ?"[.data.frame").
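
The loss can be reproduced without an SPSS file. The following is a minimal sketch on made-up data: the 'value.labels' attribute is attached by hand to mimic what read.spss() would produce, and the label names are hypothetical. The last line is just a crude manual workaround for illustration, copying the attribute back from the original column.

```r
# Made-up data; the attribute is set by hand to imitate an imported SPSS column
toy <- data.frame(id = 1:4, D22 = c(2, 3, 5, 17))
attr(toy$D22, "value.labels") <- c(Monday = 2, Tuesday = 3, Friday = 5, Other = 17)

# Row subsetting goes through "[.data.frame", which subsets each column vector
picked <- toy[toy$D22 %in% c(2, 3, 17), ]

attr(toy$D22, "value.labels")     # labels are still on the original column
attr(picked$D22, "value.labels")  # NULL: the attribute did not survive

# Manual repair (an assumption about what you might want, not a general fix)
attr(picked$D22, "value.labels") <- attr(toy$D22, "value.labels")
```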

On the other hand, based upon my review of ?read.spss, the SPSS value labels should be converted to the factor levels of the respective columns when 'use.value.labels = TRUE', and these would survive subsetting. Note that your call sets 'use.value.labels = FALSE'.
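
To illustrate the factor route without the SPSS file, here is a small sketch in which the factor is built by hand. The data and the level labels are made up; a column imported with 'use.value.labels = TRUE' is assumed to look roughly like this.

```r
# Made-up data: D22 as a factor whose levels play the role of SPSS value labels
toy <- data.frame(
  id  = 1:4,
  D22 = factor(c(2, 3, 5, 17),
               levels = c(2, 3, 5, 17),
               labels = c("Monday", "Tuesday", "Friday", "Other"))
)

# With a factor, the selection is made on the labels rather than the codes
picked <- subset(toy, D22 %in% c("Monday", "Tuesday", "Other"))

levels(picked$D22)        # the levels are carried along, unused ones included
as.character(picked$D22)  # the labels of the rows that were kept
```

Since the levels travel with the column, no extra bookkeeping is needed after subsetting. If the unused levels get in the way, droplevels() removes them.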

If you want to consider a solution to the attribute subsetting issue, you might want to review the following post by Gabor Grothendieck in May, which provides a possible solution:

  https://stat.ethz.ch/pipermail/r-help/2006-May/106308.html

and this post by me, for an explanation of what is happening in Gabor's solution:

  https://stat.ethz.ch/pipermail/r-help/2006-May/106351.html

HTH,

Marc Schwartz



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Thu Jul 13 04:22:53 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 13 Jul 2006 - 20:13:10 EST.
