Re: [R] Complex sampling?

From: <rex.dwyer_at_syngenta.com>
Date: Wed, 09 Mar 2011 13:28:11 -0500

It sounds like you want a bunch of random permutations of 1:7. Try
order(runif(7))
If you need, say, 10 of them:
as.vector(sapply(1:10, function(i) order(runif(7))))
Is it more complicated than that?
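
For comparison, base R's sample() returns a random permutation directly, so the same idea can probably be written without runif(). This is only a sketch of an equivalent form; the variable names and the seed are illustrative and not from the original posts.

```r
set.seed(42)                       # only to make the sketch reproducible

perm  <- sample(7)                 # one random permutation of 1:7
perms <- replicate(10, sample(7))  # 7 x 10 matrix, one permutation per column
flat  <- as.vector(perms)          # flattened column by column, like the sapply() call above

days  <- sample(3:7)               # a permutation of the weekday codes 3:7
```

The last line assumes the weekday codes 3:7 used in the original question rather than 1:7.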

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Hosack, Michael
Sent: Wednesday, March 09, 2011 1:02 PM
To: r-help_at_R-project.org
Subject: [R] Complex sampling?

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Hosack, Michael
> Sent: Wednesday, March 09, 2011 7:34 AM
> To: r-help at R-project.org
> Subject: [R] Complex sampling?
>
> R users,
>
> I am trying to generate a randomized weekday survey schedule that ensures
> even coverage of weekdays in
> the sample, where the distribution of variable DOW is random with respect
> to WEEK. To accomplish this I need
> to randomly sample without replacement two weekdays per week for each of
> 27 weeks (only 5 are shown).

This seems simple enough, sampling without replacement.

> However,
> I need to sample from a sequence (3:7) that needs to be completely
> depleted and replenished until the
> final selection is made. Here is an example of what I want to do,
> beginning at WEEK 1. I would prefer to do
> this without using a loop, if possible.
>
> sample frame: [3,4,5,6,7] --> [4,5,6] --> [4],[1,2,3,(4),5,6] -->
> [1,2,4,5,6] --> for each WEEK in dataframe

OK, now you have me completely lost. Sorry, but I have no clue as to what you just did here. It looks like you are trying to describe some transformation/algorithm, but I don't follow it.

I could not reply to this email because it had not been delivered to my inbox, so I had to copy it from the forum. I apologize for the confusion; this would take less than a minute to explain in conversation but an hour to explain well in print.

Two DOW_NUMs will be selected randomly without replacement from the vector 3:7 for each WEEK. When this vector is reduced to a single integer, that integer will be selected and the vector will be restored. A single integer that differs from the previously selected integer will then be selected from the restored vector (i.e. the same day cannot be sampled twice in the same week). This process will be repeated until two DOW_NUMs have been assigned for each WEEK. That process is what I attempted to illustrate in my original message. This is beyond my current coding capabilities.
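
One possible reading of the procedure described above, as a minimal sketch rather than a tested solution: draw from a pool of weekday codes, refill the pool whenever it runs empty, and never pick the same day twice within a week. The function name sample_weekdays and its arguments are made up for illustration, and it uses a plain loop for clarity even though a loop-free version was requested.

```r
# Hypothetical helper: deplete-and-replenish sampling of weekdays.
# n_weeks  : number of weeks to schedule
# days     : pool of weekday codes (3:7 = Mon..Fri in the posted data)
# per_week : days to pick per week (the sketch assumes 2, as in the question)
sample_weekdays <- function(n_weeks, days = 3:7, per_week = 2) {
  pool <- days[0]                                  # start with an empty pool
  out  <- matrix(days[NA_integer_], nrow = n_weeks, ncol = per_week)
  for (w in seq_len(n_weeks)) {
    picked <- days[0]
    while (length(picked) < per_week) {
      if (length(pool) == 0) pool <- days          # pool depleted: replenish
      avail  <- setdiff(pool, picked)              # no repeated day within a week
      pick   <- avail[sample.int(length(avail), 1)]
      picked <- c(picked, pick)
      pool   <- pool[pool != pick]                 # remove the pick from the pool
    }
    out[w, ] <- picked
  }
  out                                              # one row per WEEK
}

set.seed(2011)                                     # arbitrary seed, for reproducibility only
sched <- sample_weekdays(27)
```

Indexing with sample.int() instead of calling sample(avail, 1) is deliberate: when avail holds a single number such as 4, sample(4, 1) would draw from 1:4 rather than return 4.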

>
> Randomly sample 2 DOW_NUM without replacement from each WEEK ( () = no two
> identical DOW_NUM can be sampled
> in the same WEEK)
>
> sample = {3,7}, {5,6}, {4,3}, {1,5}, --> for each WEEK in dataframe
>

So, are you sampling from [3,4,5,6,7], or [1,2,4,5,6], or ...? Can you show an 'example' of what you would like to end up given your data below?

>
> Thank you,
>
> Mike
>
>
>          DATE DOW DOW_NUM WEEK
> 2  2011-05-02 Mon       3    1
> 3  2011-05-03 Tue       4    1
> 4  2011-05-04 Wed       5    1
> 5  2011-05-05 Thu       6    1
> 6  2011-05-06 Fri       7    1
> 9  2011-05-09 Mon       3    2
> 10 2011-05-10 Tue       4    2
> 11 2011-05-11 Wed       5    2
> 12 2011-05-12 Thu       6    2
> 13 2011-05-13 Fri       7    2
> 16 2011-05-16 Mon       3    3
> 17 2011-05-17 Tue       4    3
> 18 2011-05-18 Wed       5    3
> 19 2011-05-19 Thu       6    3
> 20 2011-05-20 Fri       7    3
> 23 2011-05-23 Mon       3    4
> 24 2011-05-24 Tue       4    4
> 25 2011-05-25 Wed       5    4
> 26 2011-05-26 Thu       6    4
> 27 2011-05-27 Fri       7    4
> 30 2011-05-30 Mon       3    5
> 31 2011-05-31 Tue       4    5
> 32 2011-06-01 Wed       5    5
> 33 2011-06-02 Thu       6    5
> 34 2011-06-03 Fri       7    5
>
> DF <-
> structure(list(DATE = structure(c(15096, 15097, 15098, 15099,
> 15100, 15103, 15104, 15105, 15106, 15107, 15110, 15111, 15112,
> 15113, 15114, 15117, 15118, 15119, 15120, 15121, 15124, 15125,
> 15126, 15127, 15128), class = "Date"), DOW = c("Mon", "Tue",
> "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon",
> "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri",
> "Mon", "Tue", "Wed", "Thu", "Fri"), DOW_NUM = c(3, 4, 5, 6, 7,
> 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7),
> WEEK = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
> 4, 4, 4, 4, 5, 5, 5, 5, 5)), .Names = c("DATE", "DOW", "DOW_NUM",
> "WEEK"), row.names = c(2L, 3L, 4L, 5L, 6L, 9L, 10L, 11L, 12L,
> 13L, 16L, 17L, 18L, 19L, 20L, 23L, 24L, 25L, 26L, 27L, 30L, 31L,
> 32L, 33L, 34L), class = "data.frame")
>

Dan

Daniel Nordlund
Bothell, WA USA



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 09 Mar 2011 - 18:34:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 09 Mar 2011 - 19:20:19 GMT.
