From: Ray Brownrigg <Ray.Brownrigg_at_mcs.vuw.ac.nz>

Date: Wed, 30 Apr 2008 10:18:50 +1200

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue 29 Apr 2008 - 22:23:19 GMT

In addition to Tony's suggestion, have a look at the following timings. I suspect the difference arises because the call to apply() duplicates your 1.5GB matrix, whereas the for loop doesn't [I stand to be corrected here].

> x <- matrix(runif(210000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.079   0.000   0.079
> unix.time(apply(x, 2, sum))
   user  system elapsed
   0.10    0.01    0.11

> x <- matrix(runif(2100000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.791   0.010   0.801
> unix.time(apply(x, 2, sum))
   user  system elapsed
  1.096   0.011   1.107

> x <- matrix(runif(21000000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  7.825   0.011   7.840
> unix.time(apply(x, 2, sum))
   user  system elapsed
 15.431   0.142  15.592

Also, preliminary checking using the top utility shows the for loop requires just over half the memory of the apply() call. This is on a NetBSD system with 2GB memory.
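
For anyone who wants to rerun the comparison, here is a minimal sketch of the same loop-versus-apply() test wrapped in a function. The names col_loop and col_fun are hypothetical and only for illustration; sum stands in for the per-column function so the two results are easy to compare, and system.time() is used for the timing. It assumes the per-column function returns a single number; for something like loess fits you would preallocate a matrix instead of a vector.

```r
# Sketch: column-wise work with a preallocated result instead of apply().
# col_fun is a hypothetical stand-in for the per-column function
# (foo in the original question); it must return one number per column.
col_loop <- function(x, col_fun = sum) {
  res <- numeric(ncol(x))          # preallocate one slot per column
  for (i in seq_len(ncol(x))) {    # seq_len() is safe when ncol(x) == 0
    res[i] <- col_fun(x[, i])
  }
  res
}

set.seed(1)
x <- matrix(runif(210000), 21)     # 21 x 10000, same shape as the first timing

t_loop  <- system.time(r_loop  <- col_loop(x))
t_apply <- system.time(r_apply <- apply(x, 2, sum))

# Both approaches should give the same answer; only time and memory differ.
stopifnot(isTRUE(all.equal(r_loop, r_apply)))

print(t_loop)
print(t_apply)
```

The timings will of course depend on the machine and R version, so treat the figures above as indicative only.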

HTH,

Ray Brownrigg

On Wed, 30 Apr 2008, Tony Plate wrote:

> It's quite possible that much of the time spent in loess() is setting up
> the data (i.e., the formula, terms, model.frame, etc.), and that much of
> that is repeated identically for each call to loess(). I would suggest
> looking at the code of loess() and work out what arguments it is calling
> simpleLoess() with, and then try calling stats:::simpleLoess() directly.
> (Of course you have to be careful with this because this is not using the
> published API).
>
> -- Tony Plate
>
> Sudipta Sarkar wrote:
> > Respected R experts,
> > I am trying to apply a user function that basically calls and
> > applies the R loess function from stat package over each time
> > series. I have a large matrix of size 21 X 9000000 and I need
> > to apply the loess for each column and hence I have
> > implemented this separate user function that applies loess
> > over each column and I am calling this function foo as follows:
> > xc<-apply(t,2,foo) where t is my 21 X 9000000 matrix and
> > loess. This is turning out to be a very slow process and I
> > need to repeat this step for 25-30 such large matrix chunks.
> > Is there any trick I can use to make this work faster?
> > Any help will be deeply appreciated.
> > Regards
> >
> >
> > Sudipta Sarkar PhD
> > Senior Analyst/Scientist
> > Lanworth Inc. (Formerly Forest One Inc.)
> > 300 Park Blvd., Ste 425
> > Itasca, IL
> > Ph: 630-250-0468

