Re: [R] unequal number of observations for longitudinal data

From: gallon li <gallon.li_at_gmail.com>
Date: Sat 27 Jan 2007 - 11:54:13 GMT

What if I want to order the observations within each id by their time? Is there such an option? (Right now some observations at time 100 are placed before time 50 because R compares the leading "1" first.)
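
Placing 100 before 50 is what a character (or factor) sort produces, and sorting on row names is always a character sort. A minimal sketch of sorting by id and then numerically by time follows; the data frame `df` and its values are made up for illustration, and it assumes the time column can be converted to numbers.

```r
# Hypothetical data: time is stored as character, so "100" sorts before "50"
df <- data.frame(id   = c(1, 1, 1, 2, 2),
                 time = c("50", "100", "5", "100", "50"),
                 x    = c(0.1, 0.2, 0.3, 0.4, 0.5),
                 stringsAsFactors = FALSE)

# Convert time to numeric before sorting; as.character() guards against factors
df$time <- as.numeric(as.character(df$time))

# Sort by id first, then by time within each id
df.sorted <- df[order(df$id, df$time), ]
df.sorted
```

order() accepts several keys, so later keys only break ties in the earlier ones.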

On 1/27/07, Chuck Cleland <ccleland@optonline.net> wrote:
>
> gallon li wrote:
> > Two questions:
> >
> > 1. How do I replace "NA" with 0?
>
> df.long2$x <- replace(df.long2$x, is.na(df.long2$x), 0)
>
> ?replace
>
> > 2. How can I sort the observations by their id instead of by time? (actually
> > i can see what you produced is automatically sorted by id; but in my case,
> > the output data is sorted by time)
>
> df.long2 <- df.long2[order(df.long2$id),]
>
> or better ...
>
> df.long2 <- df.long2[order(row.names(df.long2)),]
>
> df.long2
> id time x
> 1.1 1 1 0.6375135
> 1.2 1 2 0.1651258
> 1.3 1 3 0.0000000
> 1.4 1 4 0.0000000
> 1.5 1 5 0.3210223
> 2.1 2 1 0.9878134
> 2.2 2 2 0.8909020
> 2.3 2 3 0.7747615
> 2.4 2 4 0.3834130
> 2.5 2 5 0.9853269
> 3.1 3 1 0.0000000
> 3.2 3 2 0.3586109
> 3.3 3 3 0.0000000
> 3.4 3 4 0.8310539
> 3.5 3 5 0.0000000
>
> R-FAQ 7.23 How can I sort the rows of a data frame?
>
> http://finzi.psych.upenn.edu/R/doc/manual/R-FAQ.html
>
> > On 1/27/07, Chuck Cleland <ccleland@optonline.net> wrote:
> >> gallon li wrote:
> >>> i have a large longitudinal data set. The number of observations for each
> >>> subject is not the same across the sample. The largest number of a subject
> >>> is 5 and the smallest number is 1.
> >>>
> >>> now i want to make each subject to have the same number of observations by
> >>> filling zero, e.g., my original sample is
> >>>
> >>> id x
> >>> 001 10
> >>> 001 30
> >>> 001 20
> >>> 002 10
> >>> 002 20
> >>> 002 40
> >>> 002 80
> >>> 002 70
> >>> 003 20
> >>> 003 40
> >>> 004 ......
> >>>
> >>> now i wish to make the data like
> >>>
> >>> id x
> >>> 001 10
> >>> 001 30
> >>> 001 20
> >>> 001 0
> >>> 001 0
> >>> 002 10
> >>> 002 20
> >>> 002 40
> >>> 002 80
> >>> 002 70
> >>> 003 20
> >>> 003 40
> >>> 003 0
> >>> 003 0
> >>> 003 0
> >>> 004 ......
> >>>
> >>> so that each id has exactly 5 observations. is there a function which can
> >>> allow me do this quickly?
> >> Filling in with zeros seems like a bad idea, but here is an approach
> >> to filling in with NAs. I will leave replacing the NAs with zeros to you.
> >>
> >> df.long <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3), x = runif(10),
> >> time = c(1,2,5,1,2,3,4,5,2,4))
> >>
> >> df.long
> >> id x time
> >> 1 1 0.72888215 1
> >> 2 1 0.60893548 2
> >> 3 1 0.41347690 5
> >> 4 2 0.79388248 1
> >> 5 2 0.05810054 2
> >> 6 2 0.02451654 3
> >> 7 2 0.85464775 4
> >> 8 2 0.15970365 5
> >> 9 3 0.22856183 2
> >> 10 3 0.38291471 4
> >>
> >> df.wide <- reshape(df.long, idvar = "id", timevar = "time", v.names = "x", direction="wide")
> >>
> >> df.wide
> >> id x.1 x.2 x.5 x.3 x.4
> >> 1 1 0.6375135 0.1651258 0.3210223 NA NA
> >> 4 2 0.9878134 0.8909020 0.9853269 0.7747615 0.3834130
> >> 9 3 NA 0.3586109 NA NA 0.8310539
> >>
> >> df.long2 <- reshape(df.wide, direction="long")
> >>
> >> df.long2
> >> id time x
> >> 1.1 1 1 0.6375135
> >> 2.1 2 1 0.9878134
> >> 3.1 3 1 NA
> >> 1.2 1 2 0.1651258
> >> 2.2 2 2 0.8909020
> >> 3.2 3 2 0.3586109
> >> 1.5 1 5 0.3210223
> >> 2.5 2 5 0.9853269
> >> 3.5 3 5 NA
> >> 1.3 1 3 NA
> >> 2.3 2 3 0.7747615
> >> 3.3 3 3 NA
> >> 1.4 1 4 NA
> >> 2.4 2 4 0.3834130
> >> 3.4 3 4 0.8310539
> >>
> >> This assumes that your data in the "long" format has a time variable.
> >> See the help page for reshape() for more details.
> >>
> >>> ______________________________________________
> >>> R-help@stat.math.ethz.ch mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >> --
> >> Chuck Cleland, Ph.D.
> >> NDRI, Inc.
> >> 71 West 23rd Street, 8th floor
> >> New York, NY 10010
> >> tel: (212) 845-4495 (Tu, Th)
> >> tel: (732) 512-0171 (M, W, F)
> >> fax: (917) 438-0894
> >>

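The quoted reshape() approach assumes the long data already has a time variable, which the sample in the original question does not show. As a rough sketch under that assumption (the data frame `df`, the generated `time` index, and the use of expand.grid() with merge() in place of reshape() are illustrative choices, not the method from the thread), each id can be padded to the same number of rows like this:

```r
# Hypothetical data in the layout of the original question: id and x only
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
                 x  = c(10, 30, 20, 10, 20, 40, 80, 70, 20, 40))

# Number the observations within each id: 1, 2, 3, ...
df$time <- ave(seq_along(df$id), df$id, FUN = seq_along)

# Build every id/time combination and merge; missing combinations get NA
grid <- expand.grid(id = unique(df$id), time = 1:max(df$time))
df.full <- merge(grid, df, all.x = TRUE)

# Replace NA with 0 (only if zero is really the intended filler)
df.full$x[is.na(df.full$x)] <- 0

# Sort by id, then by time within id
df.full <- df.full[order(df.full$id, df.full$time), ]
df.full
```

Keeping the NAs instead of the zeros, as suggested in the quoted reply, only requires dropping the replacement step.
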
Received on Sat Jan 27 22:58:35 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 27 Jan 2007 - 13:30:29 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.