[R] Subset data in long format

From: Doran, Harold <HDoran_at_air.org>
Date: Wed 07 Jun 2006 - 07:07:57 EST

I have data in a "long" format where each row is one observation for a student, so each student occupies multiple rows. I need to subset these data based on a condition that I am having difficulty defining.

The dataset I am working with is large, but here is a simple data structure to illustrate the issue:

tmp <- data.frame(id = 1:3, matrix(rnorm(30), ncol=10) )
long <- reshape(tmp, idvar='id', varying=list(names(tmp)[2:11]), v.names=('item'), timevar='position', direction='long')
long <- long[order(long$id) , ]
long <- long[c(-2,-13),]

What I need to do is subset these data so that I have the first 6 rows for each unique ID. The problem is that the data are unbalanced, in that each ID has a different number of observations (which is why I removed obs 2 and 13).

If the data were balanced, the subset would be trivial and I could just do

long <- subset(long, position < 7)

However, the data are not balanced. Consequently, if I were to do this for the unbalanced data, I would not get the first 6 obs for the first ID; I would get only the first 5. Conceptually, what I want for id 1 (and for each unique id) is the first 6 rows of this:

ID1 <- subset(long, id==1)

However, the goal is to subset the entire dataframe at once such that the subset returns a new dataframe with the first 6 rows for each unique id. Is there a feasible method for doing this subset that anyone can suggest? My actual dataset has more than 24,000 unique ids, so I am hoping to avoid looping through this if possible.
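
One possible approach, offered only as a sketch: number the rows within each id in their current order, then keep the rows numbered 6 or lower. The example below rebuilds the toy data from above (with set.seed() added so it is reproducible) and uses ave() with seq_along() for the within-id numbering. The names row_in_id and first6 are made up for illustration, and the sketch assumes that "first 6 rows" means the first six rows per id in the existing sort order.

```r
# Rebuild the toy data from the post; set.seed() is added here for reproducibility
set.seed(1)
tmp <- data.frame(id = 1:3, matrix(rnorm(30), ncol = 10))
long <- reshape(tmp, idvar = "id", varying = list(names(tmp)[2:11]),
                v.names = "item", timevar = "position", direction = "long")
long <- long[order(long$id), ]
long <- long[c(-2, -13), ]   # drop two rows to make the data unbalanced

# Number the rows within each id in their current order: 1, 2, 3, ...
# (row_in_id is a hypothetical helper name, not part of the original post)
row_in_id <- ave(seq_along(long$id), long$id, FUN = seq_along)

# Keep the rows whose within-id sequence number is 6 or less
first6 <- long[row_in_id <= 6, ]

# Check how many rows were kept per id
table(first6$id)
```

Because ave() works on the whole vector at once, no explicit loop over ids is needed. The result depends on the rows already being sorted in the desired order within each id, and an id with fewer than 6 rows simply keeps all of its rows.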



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jun 07 07:15:15 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 07 Jun 2006 - 08:10:53 EST.
