# Re: [R] extracting a percentage of data by random

From: Chang Liu <changisme_at_hotmail.com>
Date: Wed, 05 Mar 2008 17:37:39 -0800

Thank you! That was very helpful indeed!

Karen

> Subject: RE: [R] extracting a percentage of data by random
> Date: Thu, 6 Mar 2008 11:20:16 +1000
> From: Bill.Venables_at_csiro.au
> To: changisme_at_hotmail.com; r-help_at_r-project.org
>
> You don't need any explicit loops at all. Here is a demo of one way to
> do it:
>
> > set.seed(23) # on Windows
> > dat <- data.frame(age = factor(sample(1:4, 200, rep = T)), y =
> runif(200))
> > head(dat) # ages are in random order
>   age          y
> 1   3 0.64275524
> 2   1 0.56125314
> 3   2 0.82418228
> 4   3 0.97050933
> 5   4 0.02827508
> 6   2 0.72291636
>
> > with(dat, table(age)) # how many in each age group
> age
>  1  2  3  4
> 37 55 44 64
>
> > ind <- lapply(split(1:nrow(dat), dat$age),
> function(x) sample(x, round(length(x)/10))) # the trick
>
> > ind
> $`1`
> [1] 135   2 188 133
>
> $`2`
> [1] 124  33 140 162  25  13
>
> $`3`
> [1] 115  79  27  44
>
> $`4`
> [1]  58 129  84 198  72 109
>
>
> > sample_dat <- dat[sort(unlist(ind)), ] # with indices, select data
> > sample_dat
>     age         y
> 2     1 0.5612531
> 13    2 0.7339141
> 25    2 0.9548750
> 27    3 0.7419931
> 33    2 0.6965722
> 44    3 0.5363812
> 58    4 0.5464051
> 72    4 0.2785669
> 79    3 0.6453164
> 84    4 0.1203811
> 109   4 0.9154706
> 115   3 0.2118767
> 124   2 0.3056171
> 129   4 0.7635097
> 133   1 0.6474702
> 135   1 0.2466226
> 140   2 0.6292326
> 162   2 0.5338671
> 188   1 0.9882631
> 198   4 0.1983350
>
>
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary): +61 7 3826 7304
> Mobile: +61 4 8819 4402
> Home Phone: +61 7 3286 7700
> mailto:Bill.Venables@csiro.au
> http://www.cmis.csiro.au/bill.venables/
>
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Chang Liu
> Sent: Thursday, 6 March 2008 10:50 AM
> To: r-help@r-project.org
> Subject: [R] extracting a percentage of data by random
>
>
> Hello Gurus:
>
> If I have a dataframe with one of the variables called "age" for
> example, and I want to extract a random 10% of the observations from
> each "age" group of the entire data frame. Do I have to double loop to
> split the data and then loop again to assign random numbers? Or is there
> a better way to do this?
>
> Thanks!
> Karen
>
> _________________________________________________________________
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

_________________________________________________________________

[[alternative HTML version deleted]]

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 06 Mar 2008 - 01:40:49 GMT
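
For readers who want to reuse the split-and-sample idea from the quoted reply, here is a minimal sketch that wraps it in a function. The name `sample_by_group` and its arguments `group` and `frac` are hypothetical and do not come from the thread. The one deliberate change is that it indexes into each group with `sample.int()` instead of calling `sample(x, n)` directly, because `sample()` treats a single number `x` as the range `1:x`, which could pick rows from the wrong group when a group has only one row. No output is shown: R 3.6.0 changed the default sampling algorithm, so the same seed will most likely not reproduce the numbers in the reply above.

```r
# Hypothetical helper (not from the thread): draw a fixed fraction of rows
# from each level of a grouping variable, without explicit loops.
sample_by_group <- function(dat, group, frac = 0.1) {
  # Row indices of dat, split into one integer vector per group
  idx_by_group <- split(seq_len(nrow(dat)), dat[[group]])
  picked <- lapply(idx_by_group, function(idx) {
    n <- round(length(idx) * frac)
    # Index into idx rather than calling sample(idx, n): when idx has
    # length one, sample() would sample from 1:idx instead.
    idx[sample.int(length(idx), n)]
  })
  # Keep the selected rows in their original order
  dat[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

set.seed(23)
dat <- data.frame(age = factor(sample(1:4, 200, replace = TRUE)),
                  y = runif(200))
sample_dat <- sample_by_group(dat, "age", frac = 0.1)
table(sample_dat$age)  # rows drawn per age group
```

Note that `round()` decides the per-group count, so a group with fewer than five rows contributes nothing at `frac = 0.1`; swapping in `ceiling()` would be one way to guarantee at least one row per non-empty group.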

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 06 Mar 2008 - 05:30:19 GMT.
