Re: [R] subset data frame with condition

From: Petr Savicky <savicky_at_praha1.ff.cuni.cz>
Date: Fri, 18 Mar 2011 19:19:58 +0100

On Fri, Mar 18, 2011 at 10:48:44AM -0700, Nicolas Gutierrez wrote:
> Hello,
>
> One more question.. I have the data.frame "pop":
>
> xloc yloc gonad ind Ene W Area
> 1 23 20 516.74 1 0.02 20.21 1
> 2 23 20 1143.20 1 0.02 20.21 1
> 3 23 20 250.00 1 0.02 20.21 1
> 4 22 15 251.98 1 0.02 18.69 2
> 5 22 15 598.08 1 0.02 18.69 2
> 6 21 19 250.00 1 0.02 20.21 3
> 7 22 20 251.98 1 0.02 18.69 4
> 8 22 20 598.08 1 0.02 18.69 4
>
> and I need to extract 50% (or rounded) of the rows for each Area (from
> Area 1 to 3 only):
>
> xloc yloc gonad ind Ene W Area
> 1 23 20 516.74 1 0.02 20.21 1
> 2 23 20 1143.20 1 0.02 20.21 1
> 4 22 15 251.98 1 0.02 18.69 2
> 6 21 19 250.00 1 0.02 20.21 3
>
> I did this within a loop, but considering my data.frame has more than
> 10,000 rows and within other loops it makes my code run forever! Any
> hints? Thanks!!

Hello.

Let me use a data frame with one column only, but more rows. The following code contains a loop over 1:3, but is otherwise vectorized.

  pop <- data.frame(Area=c(1,1,1,1,1,2,2,2,2,3,3,3,4,4))
  final <- rep(FALSE, times=nrow(pop))  # rows selected so far
  for (k in 1:3) {
      is.k <- pop$Area == k  # rows belonging to Area k
      # cumsum(is.k) numbers the rows of Area k; keep the first half, rounded up
      accept <- is.k & (cumsum(is.k) <= ceiling(sum(is.k)/2))
      final <- final | accept
  }
  pop[final, , drop=FALSE] # "drop=" not needed, if there are more columns
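
If the number of areas is large, the loop over the areas can probably be avoided as well. The following is only a sketch of one possible variant, not tested on the real data: it uses ave() to compute, for each row, its position within its Area and the size of that Area. The data frame is retyped from the question, and the names idx, rank.in.area, size.of.area, keep and result are chosen here for illustration only. Rounding up with ceiling() is the same choice as in the loop above.

```r
# Example data retyped from the question
pop <- data.frame(
  xloc  = c(23, 23, 23, 22, 22, 21, 22, 22),
  yloc  = c(20, 20, 20, 15, 15, 19, 20, 20),
  gonad = c(516.74, 1143.20, 250.00, 251.98, 598.08, 250.00, 251.98, 598.08),
  ind   = 1,
  Ene   = 0.02,
  W     = c(20.21, 20.21, 20.21, 18.69, 18.69, 20.21, 18.69, 18.69),
  Area  = c(1, 1, 1, 2, 2, 3, 4, 4)
)

idx <- seq_len(nrow(pop))
# position of each row within its Area (1, 2, 3, ...)
rank.in.area <- ave(idx, pop$Area, FUN = seq_along)
# number of rows in the Area of each row
size.of.area <- ave(idx, pop$Area, FUN = length)

# keep the first half (rounded up) of the rows of Areas 1 to 3
keep <- pop$Area %in% 1:3 & rank.in.area <= ceiling(size.of.area / 2)
result <- pop[keep, , drop = FALSE]
print(result)
```

On the example from the question, this selects rows 1, 2, 4 and 6, which is the requested output. Whether it is faster than the loop on the real data depends on the number of areas, so it is worth timing both.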

Hope this helps.

Petr Savicky.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 18 Mar 2011 - 18:26:10 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 18 Mar 2011 - 19:20:22 GMT.
