Re: [R] Applying user function over a large matrix

From: Ray Brownrigg <Ray.Brownrigg_at_mcs.vuw.ac.nz>
Date: Wed, 30 Apr 2008 10:18:50 +1200

In addition to Tony's suggestion, have a look at the following timings, where the for loop is consistently faster than apply(). I suspect the difference arises because the call to apply() will duplicate your 1.5GB matrix, whereas the for loop doesn't [I stand to be corrected here].

> x <- matrix(runif(210000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.079   0.000   0.079
> unix.time(apply(x, 2, sum))
   user  system elapsed
   0.10    0.01    0.11

> x <- matrix(runif(2100000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.791   0.010   0.801
> unix.time(apply(x, 2, sum))
   user  system elapsed
  1.096   0.011   1.107

> x <- matrix(runif(21000000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  7.825   0.011   7.840
> unix.time(apply(x, 2, sum))
   user  system elapsed
 15.431   0.142  15.592

Also, preliminary checking using the top utility shows the for loop requires just over half the memory of the apply() call. This is on a NetBSD system with 2GB memory.
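For anyone wanting to repeat the comparison on their own machine, here is a minimal sketch of the same test. The helper name col_loop and the matrix size are my own choices for illustration, and I use system.time() (of which unix.time() was an alias). Timings will of course vary by machine and R version, so none are shown.

```r
# Illustrative size only: 21 rows, 1000 columns, so the sketch runs quickly.
set.seed(1)
x <- matrix(runif(21 * 1000), nrow = 21)

# Hypothetical helper: preallocate the result, then fill it column by column.
col_loop <- function(m, f) {
  res <- numeric(ncol(m))
  for (i in seq_len(ncol(m))) res[i] <- f(m[, i])
  res
}

res_loop  <- col_loop(x, sum)
res_apply <- apply(x, 2, sum)

# Both approaches should give the same column sums.
stopifnot(isTRUE(all.equal(res_loop, res_apply)))

# Compare the timings; increase the number of columns to see the gap grow.
print(system.time(col_loop(x, sum)))
print(system.time(apply(x, 2, sum)))
```

Checking that the two results agree before timing them guards against comparing the speed of two different computations.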

HTH,
Ray Brownrigg

On Wed, 30 Apr 2008, Tony Plate wrote:
> It's quite possible that much of the time spent in loess() is setting up
> the data (i.e., the formula, terms, model.frame, etc.), and that much of
> that is repeated identically for each call to loess(). I would suggest
> looking at the code of loess() and working out what arguments it is calling
> simpleLoess() with, and then try calling stats:::simpleLoess() directly.
> (Of course you have to be careful with this because this is not using the
> published API).
>
> -- Tony Plate
>
> Sudipta Sarkar wrote:
> > Respected R experts,
> > I am trying to apply a user function that calls the R loess
> > function from the stats package on each time series. I have a
> > large matrix of size 21 x 9000000 and need to apply loess to
> > each column, so I have written a user function foo that does
> > this for a single column, and I call it as follows:
> > xc<-apply(t,2,foo) where t is my 21 x 9000000 matrix.
> > This is turning out to be a very slow process and I
> > need to repeat this step for 25-30 such large matrix chunks.
> > Is there any trick I can use to make this work faster?
> > Any help will be deeply appreciated.
> > Regards
> >
> >
> > Sudipta Sarkar PhD
> > Senior Analyst/Scientist
> > Lanworth Inc. (Formerly Forest One Inc.)
> > 300 Park Blvd., Ste 425
> > Itasca, IL
> > Ph: 630-250-0468
> >
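
For the question quoted above, the preallocated-loop pattern might be applied to loess along the following lines. This is only a sketch under assumptions: the body of foo was never posted, so the foo below (fit loess against the time index and return the fitted values) is a guess at its shape, and the data, the span value, and the names tm and tt are made up for illustration. It stays within the published loess() interface and does not attempt the simpleLoess() route.

```r
# Made-up data: 21 time points by 50 series (the real matrix has 9000000 columns).
set.seed(42)
n_time   <- 21
n_series <- 50
tt <- seq_len(n_time)
tm <- matrix(sin(tt / 3) + rnorm(n_time * n_series, sd = 0.2), nrow = n_time)

# Hypothetical stand-in for the poster's foo: smooth one column with loess
# and return the fitted values at the observed time points.
foo <- function(y, tt, span = 0.75) {
  fit <- loess(y ~ tt, span = span)
  as.numeric(predict(fit))
}

# Preallocate the output matrix, then fill it one column at a time.
xc <- matrix(NA_real_, nrow = n_time, ncol = n_series)
for (i in seq_len(n_series)) {
  xc[, i] <- foo(tm[, i], tt)
}

# The apply() form gives the same result.
xc_apply <- apply(tm, 2, foo, tt = tt)
stopifnot(isTRUE(all.equal(xc, xc_apply)))
```

The matrix is called tm here rather than t, since t is also the name of the base R transpose function.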



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 29 Apr 2008 - 22:23:19 GMT
