[R] Quickly reading data into the Matrix package's sparse formats

From: Paul Bailey <pdbailey_at_umd.edu>
Date: Mon, 16 Jun 2008 21:31:23 -0400

I have a data set that I wish to solve with the Matrix package's sparse matrix functionality. The speed improvements that it has achieved are amazing, with my dense matrix solutions never really taking long enough to time in the cases I've been able to try so far. However, before I can solve my full linear model, I need to be able to read in all the data, and therein lies the rub. There are two ways that I see to read it in:

(1) generate a dense X matrix and then convert it to a sparse matrix
using, e.g.,

R> require(Matrix)
R> Xsparse <- as(X,"dgCMatrix")

(2) make a new sparse X matrix and then populate it.

R> require(Matrix)
R> Xsparse <- Matrix(0,nrow=n,ncol=m,sparse=TRUE)

then for relevant cells:
R> Xsparse[i,j] <- v

But both of these methods are painfully slow. Method 1 takes many times as long as the actual solving and, what's worse, ends up being only about 1/2 as time consuming as sparse solvers, all told. It also requires that a dense version of X approximately fit in memory. Method 2 is significantly slower still, taking more than a factor of 10 longer than the dense solver. For method 2 I tried dgCMatrix and dgTMatrix with little difference. I've searched through the documentation on the Matrix package, and there is no mention of this problem or its potential cure.
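
For anyone who wants to reproduce the comparison, here is a minimal sketch of the two approaches on made-up data. The sizes n, m and nnz, the random entries, and the use of system.time() are assumptions for illustration only, not my real problem. The conversion is written with the virtual class "CsparseMatrix", which should yield a dgCMatrix for a plain numeric matrix like this one.

```r
library(Matrix)

set.seed(1)
n <- 200; m <- 50; nnz <- 300            # hypothetical sizes, not the real data

# Pick nnz distinct cells so both methods fill exactly the same positions.
idx <- sample.int(n * m, nnz)
i <- (idx - 1L) %% n + 1L                # row index of each cell
j <- (idx - 1L) %/% n + 1L               # column index of each cell
v <- rnorm(nnz)                          # made-up nonzero values

# Method 1: build the dense matrix first, then convert it.
X <- matrix(0, nrow = n, ncol = m)
X[cbind(i, j)] <- v
t1 <- system.time(Xs1 <- as(X, "CsparseMatrix"))

# Method 2: start from an empty sparse matrix and assign one cell at a time.
Xs2 <- Matrix(0, nrow = n, ncol = m, sparse = TRUE)
t2 <- system.time(
  for (k in seq_len(nnz)) Xs2[i[k], j[k]] <- v[k]
)

# Elapsed seconds for each method; the values depend on machine and version.
print(c(method1 = t1[["elapsed"]], method2 = t2[["elapsed"]]))
```

Both objects should hold the same matrix at the end, so any difference in the two timings is purely the cost of how the entries were put in.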

Is there some way that I can format the data that will allow for rapid read in, or is there some other possible cure?
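
To make the question concrete, one candidate format would be plain (row, column, value) triplets handed to a constructor in a single call. The sketch below is an untested idea rather than a known cure: it uses sparseMatrix() as found in current Matrix releases (older versions may not have it), and the temporary file, its three-column layout, and the dimensions n and m are all hypothetical.

```r
library(Matrix)

# Hypothetical input: a text file with one "row col value" triplet per line.
path <- tempfile(fileext = ".txt")
trip <- data.frame(i = c(1L, 3L, 2L, 4L),
                   j = c(1L, 1L, 2L, 3L),
                   x = c(2.5, -1, 4, 0.5))
write.table(trip, path, row.names = FALSE, col.names = FALSE)

# Read the three columns in one pass as typed vectors.
dat <- scan(path,
            what = list(i = integer(), j = integer(), x = numeric()),
            quiet = TRUE)

# Build the sparse matrix from all triplets at once (1-based indices).
n <- 4; m <- 3                           # assumed known dimensions
Xsparse <- sparseMatrix(i = dat$i, j = dat$j, x = dat$x, dims = c(n, m))
print(Xsparse)
```

One design point to be aware of: sparseMatrix() adds up the values of repeated (i, j) pairs, whereas cell-by-cell assignment overwrites them, so the triplet file would need unique positions to match method 2 exactly.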

Paul Bailey

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 17 Jun 2008 - 03:18:07 GMT