From: Bo Peng <ben.bob_at_gmail.com>

Date: Thu 23 Jun 2005 - 21:37:35 GMT

int i, j, k;

double rU; /* U[0,1)*n */

}

}

}

}

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri Jun 24 07:42:08 2005

Date: Thu 23 Jun 2005 - 21:37:35 GMT

> We suggest that you take up your own suggestion, research this area and

*> offer the R project the results of your research for consideration as your
**> contribution.
*

I implemented Walker's alias method and re-compiled R. Here is what I did:

- replace function ProcSampleReplace in R-2.1.0/src/main/random.c with the following one

static void ProbSampleReplace(int n, double *p, int *perm, int nans, int *ans) {

/* allocate memory for a, p and HL */ double * q = Calloc(n, double);

int * a = Calloc(n, int); int * HL = Calloc(n, int); int * H = HL; int * L = HL+n-1;

int i, j, k;

double rU; /* U[0,1)*n */

/* set up alias table */

/* initialize q with n*p0,...n*p_n-1 */
for(i=0; i<n; ++i)

q[i] = p[i]*n;

/* initialize a with indices */

for(i=0; i<n; ++i)

a[i] = i;

/* set up H and L */

for(i=0; i<n; ++i) {

if( q[i] >= 1.) *H++ = i; else *L-- = i;

}

while( H != HL && L != HL+n-1) {

j = *(L+1); k = *(H-1); a[j] = k; q[k] += q[j] - 1; L++; /* remove j from L */ if( q[k] < 1. ) { *L-- = k; /* add k to L */ --H; /* remove k */ }

}

/* generate sample */

for (i = 0; i < nans; ++i) {

rU = unif_rand() * n;

k = (int)(rU); rU -= k; /* rU becomes rU-[rU] */ if( rU < q[k] ) ans[i] = k+1; else ans[i] = a[k]+1;

}

Free(HL); Free(a); Free(q);

}

2. make and make install

3. test the new sample function by code like

4. run the following code in current R 2.1.0 and updated R.

size\p 10 10^2 10^3 10^4 10 2.4/1.6 0.32/0.22 0.20/0.08 0.24/0.06 10^2 3.1/2.6 0.48/0.28 0.28/0.06 0.30/0.06 10^3 11.8/11.1 1.84/1.14 0.94/0.18 0.96/0.08 10^4 96.8/96.6 15.34/9.68 7.54/1.06 7.48/0.16 run: 10000 1000 100 10 times

We can see that the alias method is quicker than the linear search method in all cases. The performance difference is greatest (>50 times) when the original algorithm need to search in a long prob vector.

I have not thoroughly tested the new function. I will do so if you (the developers) think that this has the potential to be incorporated into R.

Thanks.

Bo Peng

Department of Statistics

Rice University

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri Jun 24 07:42:08 2005

*
This archive was generated by hypermail 2.1.8
: Mon 20 Feb 2006 - 03:21:09 GMT
*