[R] Filtering out a data.frame

From: Jeff08 <jefferyding_at_gmail.com>
Date: Mon, 07 Jun 2010 20:07:31 -0700 (PDT)

Sample data.frame format (the data.frame is named Returns.nodup):

            X       id ticker      date_ adjClose totret RankStk
427225 427225 00174410    AHS 2001-11-13    21.66    100    1235

"id" uniquely defines a row

What I am trying to do is filter out ids that have fewer than 1500 data points (by date), i.e. keep only the ids with at least 1500 rows.

First, I used

total<-by(Returns.nodup, Returns.nodup$id,nrow)

which splits the data.frame by id and counts the number of data points (rows) for each id.
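
To illustrate the counting step, here is a minimal sketch on a made-up toy data.frame (the name `toy` and its values are assumptions for illustration, not the real Returns.nodup). It runs the same by() call and, for comparison, table(), which returns the per-id counts as a plain named table that may be easier to index later.

```r
# Toy stand-in for Returns.nodup (hypothetical data, for illustration only)
toy <- data.frame(
  id       = c("A", "A", "A", "B", "B", "C"),
  adjClose = c(1.0, 1.1, 1.2, 5.0, 5.1, 9.0),
  stringsAsFactors = FALSE
)

# Same idea as the by() call above: number of rows per id
total_by <- by(toy, toy$id, nrow)

# table() yields the same per-id counts as a named table
total_tab <- table(toy$id)
```

Both objects hold one count per id; names(total_tab) gives the ids themselves.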

Now I am trying to figure out a way to use these counts to filter the original data.frame (Returns.nodup)

I have tried using the following, but it is VERY slow:

z<-unlist(lapply(1:length(y), function(i) which(a$id==y[i]) ))
Returns.filtered<-Returns.nodup[z,]

Is there a faster way to do this?
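
One possible vectorized approach, sketched on made-up toy data (the names `toy`, `min_points`, `keep_ids`, and `n_per_row` are assumptions for illustration; `min_points` stands in for the 1500 threshold). It avoids calling which() once per id, either by matching all ids at once with %in% or by attaching each row's group size with ave(). This has not been timed on the real data.

```r
# Toy stand-in for Returns.nodup (hypothetical data, for illustration only)
toy <- data.frame(
  id       = c("A", "A", "A", "B", "B", "C"),
  adjClose = c(1.0, 1.1, 1.2, 5.0, 5.1, 9.0),
  stringsAsFactors = FALSE
)
min_points <- 3  # stands in for 1500 in the real problem

# Option 1: count rows per id, then keep rows whose id has enough of them
counts    <- table(toy$id)
keep_ids  <- names(counts)[counts >= min_points]
filtered1 <- toy[toy$id %in% keep_ids, ]

# Option 2: ave() returns, for every row, the size of that row's id group
n_per_row <- ave(seq_len(nrow(toy)), toy$id, FUN = length)
filtered2 <- toy[n_per_row >= min_points, ]
```

On the real data, the same pattern would presumably be applied to Returns.nodup with the threshold set to 1500.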

Received on Tue 08 Jun 2010 - 04:16:14 GMT
