[R] Working with massive matrices in R

From: svrieze <vrie0006_at_umn.edu>
Date: Mon, 18 Apr 2011 13:10:19 -0700 (PDT)


Hello,

I'm (eventually) attempting a singular value decomposition of a 3200 x 527829 matrix in R version 2.10.1. The script is as follows:

###---------Begin Script here-------###
library(Matrix)

snps <- 527829                   ## Number of SNPs
N <- 3200                        ## Sample size
y <- rnorm(N, 100,1)               ## simulated phenotype
## read in matrix 3200 x 527829
system.time(
  x <- scan("gedi7.raw", what=rep(0,snps), nmax=N*snps, skip=1)
)
## reshape the scanned vector into an N x snps matrix, filling by row
system.time(
  x <- matrix(x, nrow=N, ncol=snps, byrow=TRUE)
)
print(object.size(x), units="Mb")
###--------End Script----------------####

The scan function finishes without a problem. "x" is in double precision floating point format and takes up 12886.5Mb of memory at the first object.size() statement.
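
As a quick sanity check on that figure, the size follows from the dimensions alone. This is only a sketch of the arithmetic, assuming 8 bytes per double and ignoring the small per-object header that object.size() also counts; the variable names are made up for illustration.

```r
snps <- 527829                     # number of SNPs (columns)
N <- 3200                          # sample size (rows)

n_values <- N * snps               # total number of doubles to store
size_bytes <- n_values * 8         # assumption: 8 bytes per double
size_mb <- size_bytes / 1024^2     # same unit as object.size(units="Mb")
size_gb <- size_bytes / 1024^3

print(round(size_mb, 1))
print(round(size_gb, 1))
```

The element count also stays below 2^31 - 1, so the data still fits in a single R vector in this version of R.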

When I convert it to a matrix I get an error stating that R cannot allocate a vector of size 12.6Gb. I have requested 31Gb of memory on the server, and the scanned vector plus the new matrix should only need about 12.6 + 12.8 = 25.4Gb. Is R using considerable memory for operations not directly related to storing the matrix objects here? Or is this perhaps a problem of contiguous memory?
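
A scaled-down sketch of the conversion step may make the question more concrete. The toy dimensions and the `_small` names below are hypothetical stand-ins for the real data. The sketch shows that matrix(..., byrow=TRUE) produces a second, reordered object as large as the scanned vector, and, as one possible alternative that has not been tried at full scale, that setting dim() on the vector itself gives the transposed (snps x N) layout with the same values.

```r
# Toy dimensions (hypothetical), standing in for N = 3200 and snps = 527829
N_small <- 4
snps_small <- 6

# Stand-in for the vector returned by scan(): values in file (row-wise) order
x_small <- as.numeric(seq_len(N_small * snps_small))

# Same conversion as in the script: a new N x snps matrix, filled by row.
# The result is a separate object of the same length as x_small.
m_small <- matrix(x_small, nrow = N_small, ncol = snps_small, byrow = TRUE)

# Possible alternative (assumption, not tested on the real file):
# set the dimensions directly, which keeps the values in their existing
# order and yields the transposed snps x N matrix.
xt_small <- x_small
dim(xt_small) <- c(snps_small, N_small)

# Check that the two routes hold the same data, up to transposition
same_data <- identical(xt_small, t(m_small))
print(same_data)
```

If the transposed layout is acceptable, the singular values are unchanged by transposition; only the roles of the left and right singular vectors swap.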

Any help is greatly appreciated.

-Scott

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Mon 18 Apr 2011 - 20:26:10 GMT
