Re: [R] set.seed ( ) function

From: Niels Richard Hansen <Niels.R.Hansen+lists_at_math.ku.dk>
Date: Fri, 22 Apr 2011 10:43:59 -0400

Tal

Let me express some concern about using words like "true" or "real" in relation to random number generation - for exactly the same reasons as mentioned here:

http://xianblog.wordpress.com/2010/09/07/truly-random/

Device random number generators (whether provided via web-services or not) should be regarded with as much skepticism as algorithmic generators, and they typically don't have a set.seed() function for reproducibility -- you would have to store the entire sequence.
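
As a rough sketch of what "storing the entire sequence" could look like in R: here runif() merely stands in for numbers obtained from a device or web service, and the temporary file is a hypothetical storage location. Both are assumptions for illustration, not part of any particular service's interface.

```r
# runif() is only a stand-in for values fetched from a hardware device or
# web service; such a source offers nothing like set.seed().
device_draws <- runif(5)

# Without a seed, the only way to reproduce an analysis is to keep the values.
path <- tempfile(fileext = ".rds")  # hypothetical storage location
saveRDS(device_draws, path)

# A later run reads the stored sequence back instead of drawing again.
replayed <- readRDS(path)
identical(device_draws, replayed)
```

With an algorithmic generator the stored file is unnecessary, since the seed alone is enough to regenerate the draws.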

On 22/04/11 04.28, Tal Galili wrote:
> BTW, Ken Kleinman recently wrote a post on how to get "real" random
> numbers (into R) from a web-service:
> http://www.r-bloggers.com/example-8-35-grab-true-not-pseudo-random-numbers-passing-api-urls-to-functions-or-macros/
>
> Cheers,
> Tal
>
> ----------------Contact
> Details:-------------------------------------------------------
> Contact me: Tal.Galili_at_gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ----------------------------------------------------------------------------------------------
>
>
>
>
> On Fri, Apr 22, 2011 at 6:47 AM, Joshua Wiley<jwiley.psych_at_gmail.com>wrote:
>
>> On Thu, Apr 21, 2011 at 8:34 PM, Penny Bilton<pennybilton_at_xnet.co.nz>
>> wrote:
>>> Hi Josh,
>>>
>>> Thanks for your reply.
>>>
>>> The problem I have is in trying to retain the proportions of 2 groups in
>>> my data while sampling into training and test sets. I find that different
>>> arguments for set.seed give very different proportions of my 2 groups in
>>> the training and test sets.
>>
>> Sure, just because numbers are random does not guarantee that equal
>> numbers from both groups will be sampled. Perhaps you are looking for
>> some sort of constrained random sampling like sampling x from group 1
>> and x from group 2? If so, try calling sample() separately on each
>> group (for help applying the same function to different groups, take a
>> look at ?by or ?tapply for example).
>>
>> Josh
>>
>> PS cced back to list
>>
>>>
>>>
>>> Penny.
>>>
>>>
>>>
>>> On 22/04/2011 3:27 p.m., Joshua Wiley wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Thu, Apr 21, 2011 at 8:18 PM, Penny Bilton<pennybilton_at_xnet.co.nz>
>>>> wrote:
>>>>>
>>>>> I am using /set.seed()/ before the /sample/ function.
>>>>>
>>>>> How does the length of the argument of /set.seed()/ and order of the
>>>>> digits affect how the sampling is carried out?
>>>>
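>>>> You can use set.seed() to specify a particular seed so that, although
>>>> the numbers drawn are pseudo-random, you can repeat the same draw.
>>>> For example:
>>>>
>>>> set.seed(10)  # fix the generator state
>>>> rnorm(10)
>>>> set.seed(10)  # reset to the same state
>>>> rnorm(10)     # same 10 values as the first call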
>>>>
>>>>> Specifically, I have used set.seed(123456789). Will this configuration
>>>>> give me a genuinely random sampling??
>>>>
>>>> You will never get truly random sampling from a computer algorithm,
>>>> but it is darn close and more than adequate in the majority of cases.
>>>> 123456789 is just a length 1 vector containing the number 123456789,
>>>> not 9 separate numbers.
>>>>
>>>> Google will be able to give you a lot of information on pseudo-random
>>>> number algorithms as well as the concept of "seeds". Also see
>>>> ?set.seed
>>>>
>>>> Cheers,
>>>>
>>>> Josh
>>>>
>>>>>
>>>>> Thank you in anticipation.
>>>>>
>>>>> Penny.
>>>>>
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help_at_r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> University of California, Los Angeles
>> http://www.joshuawiley.com/
>>
>
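
Josh's suggestion in the quoted thread, sampling separately within each group, can be sketched roughly as follows. The data frame, the group labels and the 80/20 split are hypothetical, and sample.int() on positions is used in place of a bare sample() call so that a group with a single row is handled safely.

```r
set.seed(123456789)  # a single integer; it only makes the draw repeatable

# Hypothetical data: 70 rows in group "a", 30 rows in group "b".
dat <- data.frame(id = 1:100, group = rep(c("a", "b"), times = c(70, 30)))

# Row indices split by group, then 80% of each group drawn separately.
rows_by_group <- split(seq_len(nrow(dat)), dat$group)
train_idx <- unlist(lapply(rows_by_group, function(idx) {
  idx[sample.int(length(idx), size = round(0.8 * length(idx)))]
}), use.names = FALSE)

train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

table(train$group)  # 56 "a" and 24 "b", whatever the seed
table(test$group)   # 14 "a" and 6 "b"
```

Changing the seed changes which rows land in each set, but not the group proportions.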

-- 
Niels Richard Hansen                     Web:   www.math.ku.dk/~richard	
Associate Professor                      Email: Niels.R.Hansen_at_math.ku.dk
Department of Mathematical Sciences             nielsrichardhansen_at_gmail.com
University of Copenhagen                 Skype: nielsrichardhansen.dk	
Universitetsparken 5                     Phone: +1 510 502 8161	
2100 Copenhagen Ø
Denmark

Received on Fri 22 Apr 2011 - 14:48:26 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 22 Apr 2011 - 22:10:33 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
