Re: [Rd] efficiency of sample() with prob.

From: Martin Maechler <>
Date: Fri 24 Jun 2005 - 17:02:58 GMT

>>>>> "Bo" == Bo Peng <>
>>>>>     on Fri, 24 Jun 2005 10:32:45 -0500 writes:

    Bo> On 6/24/05, Prof Brian Ripley <> wrote:
>> `Research' involves looking at all the competitor methods, devising a
>> near-optimal strategy and selecting amongst methods according to that
>> strategy. It is not a quick fix we are looking for but something that
>> will be good for the long term.

    Bo> I am sorry but I am afraid that I do not have enough time and
    Bo> background knowledge to do thorough research in this area.

which I think is well understandable.

    Bo> I have tried bisection search method and the alias
    Bo> method, the latter has greatly improved the performance
    Bo> of my bioinformatics application. Since this method is
    Bo> the only one mentioned in Knuth's book, I have no idea
    Bo> about other alternatives.
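
For readers unfamiliar with the first of the two methods mentioned above, here is a minimal sketch of weighted sampling with replacement by bisection search on the cumulative probabilities. It is written in Python purely for illustration. It is not the code Bo attached and not the code in R's sample(); the names make_cdf and sample_bisect are hypothetical.

```python
import bisect
import itertools
import random


def make_cdf(prob):
    """Return the cumulative sums of prob, normalised to end at 1.0."""
    total = float(sum(prob))
    cdf = list(itertools.accumulate(p / total for p in prob))
    cdf[-1] = 1.0  # guard against rounding in the last partial sum
    return cdf


def sample_bisect(cdf, n, rng=random):
    """Draw n 0-based indices with replacement.

    Each draw costs O(log k) for k categories: a uniform number u in
    [0, 1) is located in the cumulative table by binary search.
    """
    return [bisect.bisect_right(cdf, rng.random()) for _ in range(n)]


if __name__ == "__main__":
    cdf = make_cdf([1, 2, 7])          # weights need not sum to 1
    print(sample_bisect(cdf, 10, random.Random(42)))
```

Building the table costs O(k) once. Each draw then needs a binary search, which is the per-draw cost the alias method avoids.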

I think you've also explored the space of possible inputs a bit and have suggested that the alias method was "uniformly" better than the current one, i.e., always better, sometimes only slightly but sometimes considerably (and never worse).

If this (uniform improvement) can be ``proven'' in some way {and that may be a considerable "if"; I haven't started to go in there}, and because the algorithm is relatively simple {i.e., there's not much code added to the current one}, I'd think that we (R-core) should incorporate the algorithm for the time being, until someone has time for the ``real research'' and provides even better algorithm(s).
I don't see why the phrase

   "the good is the enemy of the better"

should apply in this situation.
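
To make concrete how little code the alias method involves, here is a minimal sketch in Python. It is an illustration under stated assumptions: the table construction follows the commonly described Vose variant of Walker's alias method, which may differ from the version Bo attached, and the names build_alias and sample_alias are hypothetical. It is not the code in R's sample().

```python
import random


def build_alias(prob):
    """Build the acceptance and alias tables in O(k) time."""
    k = len(prob)
    total = float(sum(prob))
    scaled = [p * k / total for p in prob]   # mean of scaled is 1
    accept = [1.0] * k                       # P(keep column i)
    alias = list(range(k))                   # fallback index per column
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        g = large.pop()
        accept[s] = scaled[s]                # column s keeps its own mass
        alias[s] = g                         # the rest is donated by g
        scaled[g] -= 1.0 - scaled[s]
        (small if scaled[g] < 1.0 else large).append(g)
    # Anything left over has scaled mass 1 up to rounding: accept stays 1.
    return accept, alias


def sample_alias(accept, alias, n, rng=random):
    """Draw n 0-based indices with replacement, O(1) per draw."""
    k = len(accept)
    out = []
    for _ in range(n):
        i = rng.randrange(k)                 # pick a column uniformly
        out.append(i if rng.random() < accept[i] else alias[i])
    return out


if __name__ == "__main__":
    accept, alias = build_alias([0.1, 0.2, 0.3, 0.4])
    print(sample_alias(accept, alias, 10, random.Random(42)))
```

Setup is O(k) and each draw uses one uniform integer, one uniform real, and one comparison, independent of k. Whether that translates into a uniform improvement over the existing method for small k or small n would still need to be checked by timing.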

Martin Maechler, ETH Zurich

    Bo> Attached is a slightly improved version of the alias method.

(deleted for this reply).

    Bo> It may be helpful to people having similar problems.

    Bo> Thanks.

    Bo> --
    Bo> Bo Peng
    Bo> Department of Statistics
    Bo> Rice University.

Received on Sat Jun 25 03:12:17 2005
