Re: [Rd] dict package: dictionary data structure for R

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Tue, 24 Jul 2007 19:32:47 +0200

>>>>> "HenrikB" == Henrik Bengtsson <hb_at_stat.berkeley.edu>
>>>>>     on Tue, 24 Jul 2007 18:58:04 +0200 writes:

    HenrikB> On 7/23/07, Seth Falcon <sfalcon_at_fhcrc.org> wrote:
>> Bill Dunlap <bill_at_insightful.com> writes:
>> > With environments, if you use a prime number for the size
>> > you get considerably better results. E.g.,
>>
>> > Perhaps new.env() should push the requested size up
>> > to the next prime by default.
>>
>> Perhaps. I think we should also investigate other hashing functions
>> since computing the next prime and doing so for resizes will take
>> longer than not having to do it and it will add complexity to the
>> code.

    HenrikB> An alternative is to hard-wire primes within a reasonable range:

    HenrikB> http://primes.utm.edu/lists/small/millions/
    HenrikB> http://www.math.utah.edu/~pa/math/p10000.html

    HenrikB> Maybe primes close to 2^n are good enough for this problem:

    HenrikB> http://primes.utm.edu/lists/2small/

Yes, I had a similar thought....

Note that you don't need web sites for prime numbers:

my R factorization utilities I had mentioned a few times, e.g., here

      http://tolstoy.newcastle.edu.au/R/help/05/01/10007.html

can give the first few hundred thousand primes quickly enough:

  > source("ftp://stat.ethz.ch/U/maechler/R/prime-numbers-fn.R")

  > system.time(PS3 <- prime.sieve(prime.sieve(prime.sieve())))

     user system elapsed
    0.446 0.006 0.452

  > head(PS3, 20)
   [1]  2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71

  > tail(PS3, 20)
   [1] 273233 273253 273269 273271 273281 273283 273289 273311 273313 273323
  [11] 273349 273359 273367 273433 273457 273473 273503 273517 273521 273527
  >

There are more prime / factorization utilities in that simple R source file, but
as I say there, one should really use C code to do this; but then R has become so fast ...
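
For readers who cannot fetch that file, here is a minimal sketch of the idea in plain R. It is not the code from prime-numbers-fn.R: the names primes_upto() and next_prime() are hypothetical, and whether a prime size actually helps depends on the hash function that environments use.

```r
## Hypothetical helper (NOT prime.sieve() from prime-numbers-fn.R):
## sieve of Eratosthenes returning all primes <= n.
primes_upto <- function(n) {
    if (n < 2) return(integer(0))
    is_prime <- rep(TRUE, n)
    is_prime[1] <- FALSE
    i <- 2L
    while (i * i <= n) {
        ## cross out multiples of i, starting at i^2
        if (is_prime[i])
            is_prime[seq.int(i * i, n, by = i)] <- FALSE
        i <- i + 1L
    }
    which(is_prime)
}

## Hypothetical helper: smallest prime >= size, by trial division
## against the primes up to sqrt(candidate).
next_prime <- function(size) {
    cand <- max(2, ceiling(size))
    repeat {
        small <- primes_upto(floor(sqrt(cand)))
        if (all(cand %% small != 0)) return(cand)
        cand <- cand + 1
    }
}

## Round a requested environment size up to a prime before creating it.
e <- new.env(hash = TRUE, size = as.integer(next_prime(1000)))
```

Recomputing the small primes for every candidate is wasteful, but for sizes in the range discussed here the search stops after a few candidates, so a table of hard-wired primes would mainly save code, not time.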

Martin Maechler, ETH Zurich

    HenrikB> Just my $.02

    HenrikB> /Henrik



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Tue 24 Jul 2007 - 17:36:46 GMT

