Date: Fri 13 Oct 2006 - 14:35:37 GMT

Thank you, Alex! That's exactly what I was looking to do. I'm going to remove the loops and use your apply function approach. Best regards and much thanks, brian

On 10/13/06, Alex Brown <alex@transitive.com> wrote:

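I thought at first that you could use a weighted sample (the prob
argument of the sample function), but you can't, since it doesn't take
proper account of replacement: with replacement the weights never
change as candies are drawn, and without replacement each candy type
can be drawn only once.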
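
A small sketch of why the weighted route falls short. The counts vectors here are made up for illustration, and the calls are plain base R:

```r
# Hypothetical counts for a single sample (one column of a count matrix).
counts <- c(red.candy = 400, green.candy = 100, black.candy = 300)

# Weighted sampling WITHOUT replacement treats each name as one item,
# so no more than length(counts) draws are possible and this call fails.
res <- try(sample(names(counts), 100, prob = counts), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Weighted sampling WITH replacement runs, but the weights stay fixed,
# so a type can come up more often than it actually occurs in the sample.
small <- c(red = 2, blue = 3)
draws <- sample(names(small), 5, replace = TRUE, prob = small)
table(draws)                 # "red" may appear more than twice
```

Neither call behaves like drawing individual candies from the bag without putting them back, which is what the question asks for.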

You can use the list approach, but through the power of R, you don't
need a lot of loops to do it...

I can't speak for the efficiency of this approach in terms of CPU cycles.

In short:

apply(z2, 2, function(x) sample(rep(names(x), x), 100))

In long:

# let's load the data (paste the lines below, then enter a blank line
# to end the input to scan):

z = scan(,"",sep="\n")
sample.1 sample.2 sample.3
red.candy 400 300 2500
green.candy 100 0 200
black.candy 300 1000 500

# and turn it into a table

z2 = read.table(textConnection(z), header=TRUE, row.names=1)

# let's create a function to expand a sample column into individuals:

expand <- function(x) rep(names(x), x)

# test it on a smaller set:

ex <- expand( c( red = 2, blue = 3) )

ex
[1] "red"  "red"  "blue" "blue" "blue"

# and sample 2 things from that:

sample( ex, 2 )

# combine the two

samplex <- function( x, size ) sample(expand(x), size )

samplex( c( red = 2, blue = 3), size = 2 )

# ok, now we use the apply function to apply this to each column

apply(z2, 2, samplex, size = 2 )

# you wanted 100?

apply(z2, 2, samplex, size = 100 )

# all done.
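
If the goal is a rarefied matrix of counts rather than the sampled individuals, the draws from each column can be tabulated again. This is only a sketch: `rarefy_col` is a hypothetical helper name, and the matrix simply re-enters the candy counts from this thread.

```r
# Assumed data: the candy counts from this thread, rebuilt as a matrix.
z2 <- matrix(c(400, 100, 300,
               300,   0, 1000,
               2500, 200, 500),
             nrow = 3,
             dimnames = list(c("red.candy", "green.candy", "black.candy"),
                             c("sample.1", "sample.2", "sample.3")))

# rarefy_col is a hypothetical helper, not a base R function.
# It expands one column into individuals, samples without replacement,
# and tabulates against all row names so that zero counts are kept.
rarefy_col <- function(x, size) {
  drawn <- sample(rep(names(x), x), size)
  as.vector(table(factor(drawn, levels = names(x))))
}

rarefied <- apply(z2, 2, rarefy_col, size = 100)
rownames(rarefied) <- rownames(z2)
rarefied
colSums(rarefied)   # every column sums to 100

# Rarefying several times: replicate() stacks the results into an array
# of dimension objects x samples x repetitions.
reps <- replicate(5, apply(z2, 2, rarefy_col, size = 100))
dim(reps)           # 3 3 5
```

Using `factor(..., levels = names(x))` keeps a row for types that happen not to be drawn, so every rarefied matrix has the same shape as the input.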

# You should note that if there are fewer candies in any given sample
# than the requested sample size (e.g. fewer than 100), this function
# will fail.
# e.g.:

apply(z2, 2, samplex, size = 2000 )

Error in sample(length(x), size, replace, prob) :
  cannot take a sample larger than the population
  when 'replace = FALSE'
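
One way to avoid the error is to check the column totals first. The sketch below is an assumption about how one might handle it, not part of the approach above: `samplex_safe` is a hypothetical variant of `samplex` that returns `NA` for columns that are too small.

```r
# Assumed data: the candy counts from this thread, rebuilt as a matrix.
z2 <- matrix(c(400, 100, 300,
               300,   0, 1000,
               2500, 200, 500),
             nrow = 3,
             dimnames = list(c("red.candy", "green.candy", "black.candy"),
                             c("sample.1", "sample.2", "sample.3")))

expand <- function(x) rep(names(x), x)

# samplex_safe is a hypothetical variant of samplex: for a column holding
# fewer than `size` individuals it returns NA values instead of stopping.
samplex_safe <- function(x, size) {
  if (sum(x) < size) return(rep(NA_character_, size))
  sample(expand(x), size)
}

colSums(z2) >= 2000              # which samples are large enough
out <- apply(z2, 2, samplex_safe, size = 2000)
colSums(is.na(out))              # all-NA columns were too small
```

Dropping the small columns beforehand with `z2[, colSums(z2) >= size, drop = FALSE]` would be an equally reasonable choice.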

-Alex

On 11 Oct 2006, at 15:10, Brian Frappier wrote:

> Hi Petr,
>
> Thanks for your response. I have data that looks like the following:
>
>               sample 1   sample 2   sample 3 ....
> red candy          400        300       2500
> green candy        100          0        200
> black candy        300       1000        500
>
> I don't want to randomly select either the samples (columns) or the
> "candy" types (rows), which is what sample, as you state, would let me
> do. Instead, I want to randomly sample 100 candies from each sample and
> retain info on their associated type. I could make a list of all the
> candies in each sample:
>
> sample 1
> red
> red
> red
> red
> green
> green
> black
> red
> black
> ...
>
> and then randomly sample those rows. Repeat for each sample. But I am
> not sure how to do that without a lot of loops, and am wondering if
> there is an easier way in R. Thanks! I should have laid this out in the
> first email... sorry.
>
>
> On 10/11/06, Petr Pikal <petr.pikal@precheza.cz> wrote:
>
> Hi
>
> I am not experienced in Matlab and from your explanation I do not
> understand exactly what you want. It seems that you want to randomly
> choose a sample of 100 rows from your matrix, which can be achieved
> with sample.
>
> DF<-data.frame(rnorm(100), 1:100, 101:200, 201:300)
> DF[sample(1:100, 10),]
>
> If you want to do this several times, you need to save your result,
> and then it depends on what you want to do next. One suitable form is
> a list of matrices, another is an array, and you can use a for loop to
> fill it.
>
> HTH
> Petr
>
>
> On 10 Oct 2006 at 17:40, Brian Frappier wrote:
>
> Date sent: Tue, 10 Oct 2006 17:40:47 -0400
> From: "Brian Frappier" <brian.frappier@gmail.com>
> To: r-help@stat.math.ethz.ch
> Subject: [R] rarefy a matrix of counts
>
>> Hi all,
>>
>> I have a matrix of counts for objects (rows) by samples (columns). I
>> aimed for about 500 counts in each sample (I have about 80 samples)
>> and would now like to rarefy these down to 100 counts in each sample
>> using simple random sampling without replacement. I plan on rarefying
>> several times for each sample. I could do the tedious looping task of
>> making a list of all objects (with its associated identifier) in each
>> sample and then use the wonderful "sampling" package to select a
>> sub-sample of 100 for each sample and thereby get a logical vector of
>> inclusions. I would then regroup the resulting logical vector into a
>> vector of counts by object, rinse and repeat several times for each
>> sample.
>>
>> Alternately, using the same list, I could create a random index of
>> integers between 1 and the number of objects for a sample (without
>> repeats) and then select those objects from the list. Again, rinse
>> and repeat several times for each sample.
>>
>> Is there a way to directly rarefy a matrix of counts without having to
>> create a list of objects first? I am trying to switch to R from
>> Matlab and am trying to pick up good programming habits from the
>> start.
>>
>> Much appreciation!
>>
>>
>
> Petr Pikal
> petr.pikal@precheza.cz
