[Rd] data frame subset patch

From: Vladimir Dergachev <vdergachev_at_rcgardis.com>
Date: Tue 28 Nov 2006 - 20:54:54 GMT

Hi all,

   Here is a patch that significantly speeds up the `[.data.frame` operator. It applies cleanly to both 2.4.0 and svn trunk. make check passed for 2.4.0 (for svn trunk it fails even without this patch).

   What it does: it removes the class and attr statements that modify the incoming data frame and uses explicit calls to .subset and .subset2 instead.
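For readers unfamiliar with these functions, here is a minimal sketch of the idea, not the patch itself (the patch text is not reproduced in this message). It assumes a small made-up data frame `df` and only shows that .subset2 and .subset reach the same data as `[.data.frame` without going through method dispatch.

```r
# Small data frame for illustration (df and its columns are made up).
df <- data.frame(a = 1:5, b = c(0.5, 1.5, 2.5, 3.5, 4.5))

# Regular indexing dispatches to `[.data.frame`.
x1 <- df[3, 2]

# .subset2 extracts column 2 as a plain vector without method dispatch;
# ordinary vector indexing then picks the row.
x2 <- .subset2(df, 2)[3]

# .subset is the non-dispatching counterpart of `[`; applied to a data
# frame it treats the object as a plain list of columns.
cols <- .subset(df, 2)

stopifnot(identical(x1, x2))
stopifnot(is.list(cols), identical(cols[[1]], df$b))
```

Both access paths return the same value; the difference is only in how much work is done on the way to it.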

Test case:

N<-100000
T<-data.frame(a=1:N, b=rnorm(N), c=as.character(round(runif(N)*10)))
system.time({X<-0 ; for(i in 1:1000)X<-X+T[i,2]})

Without the patch, the output on my system is:
[1] 8.488 2.436 10.926 0.000 0.000

With this patch the output is:
[1] 1.084 0.624 1.707 0.000 0.000
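To put the two timing vectors side by side, the short sketch below computes the ratios. It assumes the first three fields are user, system and elapsed seconds, which is how system.time reports them; the names `before`, `after` and `speedup` are introduced here only for illustration.

```r
# Timings quoted above; the first three fields are taken to be
# user, system and elapsed seconds (an assumption of this sketch).
before <- c(user = 8.488, system = 2.436, elapsed = 10.926)
after  <- c(user = 1.084, system = 0.624, elapsed = 1.707)

# Ratio of unpatched to patched time, rounded to one decimal place.
speedup <- round(before / after, 1)
print(speedup)
```

On these numbers the ratio comes to roughly 7.8 for user time, 3.9 for system time and 6.4 for elapsed time.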

                    thank you !

                             Vladimir Dergachev

______________________________________________

R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed Nov 29 08:03:17 2006
