Re: [R] X'W in Matrix

From: Douglas Bates <bates_at_stat.wisc.edu>
Date: Fri 09 Jun 2006 - 17:41:57 EST

On 6/9/06, Rafael A. Irizarry <ririzarr@jhsph.edu> wrote:
> Hi!
>
> I have used the Matrix package (Version: 0.995-10) successfully
> to obtain the OLS solution for a problem where the design matrix X is
> 44000x6000. X is very sparse (about 80000 non-zeros elements).
>
> Now I want to do WLS: (X'WX)^-1X'Wy
>
> I tried W=Diagonal(length(w),w) and
> wX=solve(X,W)
>
> but after various minutes R gives a not enough
> memory error (Im using a 64bit machine with 16Gigs of RAM).

That's an interesting question, Rafael, and very timely. I happen to be visiting Martin Maechler this week and he mentioned to me just a few minutes ago that we should add the capability for sparse least squares calculations to the Matrix package.

Just for clarification, did you really mean wX = solve(X, W) above or were you thinking of wX = crossprod(X, W)?

I would suggest storing the square root of the diagonal matrix W as a sparse matrix

> w <- abs(rnorm(88000))
> ind <- seq(along = w) - 1:1
> W <- as(new("dgTMatrix", i = ind, j = ind, x = sqrt(w), Dim = c(length(w), length(w))), "dgCMatrix")
> str(W)

Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

  ..@ i       : int [1:88000] 0 1 2 3 4 5 6 7 8 9 ...
  ..@ p       : int [1:88001] 0 1 2 3 4 5 6 7 8 9 ...
  ..@ Dim     : int [1:2] 88000 88000
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:88000] 0.712 0.989 1.032 0.348 0.573 ...
  ..@ factors : list()

(The 1:1 expression in the calculation of ind is a cheap trick to get an integer value of 1 so that ind stays integer). Now you should be able to create wX <- W %*% X and wy <- W %*% y very quickly.

> I ended up doing this:
> wX=Matrix(as.matrix(X)*sqrt(w),sparse=TRUE)
> coefs1=as.vector(solve(crossprod(wX),crossprod(X,w*y)))
>
> which takes about 1-2 minutes, but it seems a better way, using the
> diagonal matrix, should exist. If there is I'd appreciate hearing it. If
> not, Im happy waiting 1-2 minute x #of iters.
>
>
> Thanks,
> Rafael
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Jun 09 17:53:07 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 11 Jun 2006 - 05:34:52 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.