From: Douglas Bates <bates_at_stat.wisc.edu>

Date: Fri 09 Jun 2006 - 17:41:57 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Jun 09 17:53:07 2006

Date: Fri 09 Jun 2006 - 17:41:57 EST

On 6/9/06, Rafael A. Irizarry <ririzarr@jhsph.edu> wrote:

*> Hi!
**>
*

> I have used the Matrix package (Version: 0.995-10) successfully

*> to obtain the OLS solution for a problem where the design matrix X is
**> 44000x6000. X is very sparse (about 80000 non-zeros elements).
**>
**> Now I want to do WLS: (X'WX)^-1X'Wy
**>
**> I tried W=Diagonal(length(w),w) and
**> wX=solve(X,W)
**>
**> but after various minutes R gives a not enough
**> memory error (Im using a 64bit machine with 16Gigs of RAM).
*

That's an interesting question, Rafael, and very timely. I happen to be visiting Martin Maechler this week and he mentioned to me just a few minutes ago that we should add the capability for sparse least squares calculations to the Matrix package.

Just for clarification, did you really mean wX = solve(X, W) above or were you thinking of wX = crossprod(X, W)?

I would suggest storing the square root of the diagonal matrix W as a sparse matrix

*> w <- abs(rnorm(88000))
**> ind <- seq(along = w) - 1:1
**> W <- as(new("dgTMatrix", i = ind, j = ind, x = sqrt(w), Dim = c(length(w), length(w))), "dgCMatrix")
**> str(W)
*

Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

..@ i : int [1:88000] 0 1 2 3 4 5 6 7 8 9 ... ..@ p : int [1:88001] 0 1 2 3 4 5 6 7 8 9 ... ..@ Dim : int [1:2] 88000 88000 ..@ Dimnames:List of 2 .. ..$ : NULL .. ..$ : NULL ..@ x : num [1:88000] 0.712 0.989 1.032 0.348 0.573 ... ..@ factors : list()

(The 1:1 expression in the calculation of ind is a cheap trick to get an integer value of 1 so that ind stays integer). Now you should be able to create wX <- W %*% X and wy <- W %*% y very quickly.

> I ended up doing this:

*> wX=Matrix(as.matrix(X)*sqrt(w),sparse=TRUE)
**> coefs1=as.vector(solve(crossprod(wX),crossprod(X,w*y)))
**>
**> which takes about 1-2 minutes, but it seems a better way, using the
**> diagonal matrix, should exist. If there is I'd appreciate hearing it. If
**> not, Im happy waiting 1-2 minute x #of iters.
**>
**>
**> Thanks,
**> Rafael
**>
**> ______________________________________________
**> R-help@stat.math.ethz.ch mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
**>
*

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Jun 09 17:53:07 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Sun 11 Jun 2006 - 05:34:52 EST.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*