From: Thomas Lumley <tlumley_at_u.washington.edu>

Date: Thu 17 Mar 2005 - 02:16:49 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Mar 17 02:22:40 2005

On Tue, 15 Mar 2005, Liaw, Andy wrote:

>> From: Adaikalavan Ramasamy

>>
>> You will need to _apply_ the t-test row by row.
>>
>> apply( genes, 1, function(x) t.test( x[1:2], x[3:4] )$p.value )
>>
>> apply() is a C optimised version of for. Running the above code on a
>> dataset with 56000 rows and 4 columns took about 63 seconds on my 1.6
>> GHz Pentium machine with 512 Mb RAM. See help("apply") for
>> more details.
>
> That's not true. In R, there's a for loop hidden inside apply() (just look
> at the source). In S-PLUS, C level looping is done in some situations, and
> for others lapply() is used.
>

It's slightly more complicated than this. lapply() really is a C-level loop and apply() eventually calls it.

Now, whatever happens inside apply(), it is still true that t.test() has to be called 56,000 times, which puts a lower bound on the time apply() can take. In this case I would be very surprised if apply() saved any time. What would save time is writing a stripped-down t-test function, especially as only the p-value is being used.
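As a rough sketch of what a stripped-down version might look like: the function below computes Welch two-sample p-values for all rows at once, using the same formulas as the default t.test() (unequal variances, two-sided). The name row_welch_p and its arguments are made up for illustration, and the genes matrix is simulated stand-in data, not the original dataset.

```r
# Hypothetical helper (not part of base R): Welch two-sample t-test
# p-values for every row of a numeric matrix, without calling t.test().
# g1 and g2 are the column indices of the two groups.
row_welch_p <- function(m, g1, g2) {
  n1 <- length(g1)
  n2 <- length(g2)
  x1 <- m[, g1, drop = FALSE]
  x2 <- m[, g2, drop = FALSE]
  m1 <- rowMeans(x1)
  m2 <- rowMeans(x2)
  # Per-row sample variances (denominator n - 1); the row means are
  # recycled down the columns, so each row is centred on its own mean.
  v1 <- rowSums((x1 - m1)^2) / (n1 - 1)
  v2 <- rowSums((x2 - m2)^2) / (n2 - 1)
  se2 <- v1 / n1 + v2 / n2
  tstat <- (m1 - m2) / sqrt(se2)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- se2^2 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))
  2 * pt(-abs(tstat), df)
}

set.seed(1)
genes <- matrix(rnorm(1000 * 4), ncol = 4)  # simulated stand-in data

p_fast <- row_welch_p(genes, 1:2, 3:4)
p_slow <- apply(genes, 1, function(x) t.test(x[1:2], x[3:4])$p.value)
all.equal(p_fast, p_slow)
```

Wrapping each of the two computations in system.time() gives a feel for the difference on a particular machine; the size of the gain will depend on the data and the R version.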

The real problem with apply is that when the objects involved are large, apply() can be substantially slower because of greater memory use. As a concrete example, an apply() on a 10000x757 set of replicate weights in the survey package used half as much memory when turned into a for() loop. As a result it ran several times faster on my laptop (where it was paging heavily) and slightly faster on my desktop (which has rather more memory).
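For illustration only, here is the general shape of that kind of rewrite: the same per-row computation done once with apply() and once with an explicit for() loop that fills a preallocated result. The matrix w and the per-row function are invented placeholders, far smaller than the replicate-weights case, and this is not the survey package code. Whether the loop uses noticeably less memory will depend on the R version and the size of the objects.

```r
# Simulated stand-in for a large matrix; dimensions are illustrative only.
set.seed(2)
w <- matrix(runif(2000 * 50), nrow = 2000)

# apply() version: a placeholder per-row summary
res_apply <- apply(w, 1, function(x) sum(x^2))

# Explicit loop: preallocate the result and touch one row at a time
res_loop <- numeric(nrow(w))
for (i in seq_len(nrow(w))) {
  res_loop[i] <- sum(w[i, ]^2)
}

all.equal(res_apply, res_loop)
```

Running gc() before and after each version gives a rough idea of the memory each one needs.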

-thomas

