From: Zoltan Kmetty <zkmetty_at_gmail.com>

Date: Tue 09 Jan 2007 - 13:24:09 GMT

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed Jan 10 09:33:48 2007


Unfortunately, I have to fill all the cells with numbers, so I either need a better machine or have to split the data into smaller parts. Splitting is much slower, but I see no other alternative.
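For what it is worth, here is a rough sketch of the arithmetic behind the memory problem and of the "split into smaller parts" idea. It assumes the cells are doubles (8 bytes each). The function `chunked_colsums` and its `read_chunk` argument are hypothetical names made up for this illustration; `read_chunk` stands in for whatever actually reads or generates a block of rows.

```r
## Size of a dense 1e8 x 10 matrix of doubles (8 bytes per cell)
nr <- 1e8
nc <- 10
bytes <- nr * nc * 8
bytes / 2^30   # size in GiB, before any copies R makes while working

## Hypothetical chunked pass: column sums without holding all rows at once.
## 'read_chunk(start, end)' is an assumed stand-in that returns the rows
## start..end as an ordinary (end - start + 1) x nc matrix.
chunked_colsums <- function(nr, nc, chunk_rows, read_chunk) {
    total <- numeric(nc)
    start <- 1
    while (start <= nr) {
        end <- min(start + chunk_rows - 1, nr)
        block <- read_chunk(start, end)
        total <- total + colSums(block)
        start <- end + 1
    }
    total
}

## Toy demonstration: 10 rows, 3 identical columns holding the row number
toy <- function(start, end) matrix(rep(start:end, times = 3), ncol = 3)
res <- chunked_colsums(nr = 10, nc = 3, chunk_rows = 4, read_chunk = toy)
```

This only works for computations that can be accumulated block by block (sums, counts, cross-products and the like); anything that needs all rows at once still needs the RAM.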

But thanks for your help: I also work with big networks (10000 vertices), and this package should speed up my work. I have to simulate "small-world" networks. Does anybody know of a package that can generate networks of that type?
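To make the question concrete, here is a minimal base-R sketch of what I mean by a small-world network, assuming the usual Watts-Strogatz construction (a ring lattice whose edges are rewired at random with probability p). The function name `small_world_edges` and its arguments are hypothetical, invented for this example; it is not taken from any package. I believe the igraph package offers a ready-made generator (`sample_smallworld()` in recent versions), which is probably the better choice for real work.

```r
## Hypothetical helper, not from any package: Watts-Strogatz style edge list.
##   n : number of vertices
##   k : each vertex is linked to its k nearest neighbours on each side
##   p : probability of rewiring the far end of an edge
small_world_edges <- function(n, k, p) {
    stopifnot(n > 2 * k, k >= 1, p >= 0, p <= 1)
    ## ring lattice: vertex v is joined to v+1, ..., v+k (modulo n)
    from <- rep(seq_len(n), times = k)
    to   <- (from + rep(seq_len(k), each = n) - 1L) %% n + 1L
    ## rewire selected edges, avoiding self-loops and duplicate edges
    rewire <- which(runif(length(to)) < p)
    for (e in rewire) {
        repeat {
            cand <- sample.int(n, 1L)
            dup <- any((from == from[e] & to == cand) |
                       (from == cand & to == from[e]))
            if (cand != from[e] && !dup) break
        }
        to[e] <- cand
    }
    cbind(from = from, to = to)
}

set.seed(1)
e <- small_world_edges(n = 20, k = 2, p = 0.1)
```

The duplicate check scans all edges for every rewired edge, so this is only meant to show the idea; for 10000 vertices a proper implementation would be much faster. The resulting (from, to) pairs could serve as the i and j indices of a sparse adjacency matrix.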

2007/1/8, Martin Maechler <maechler@stat.math.ethz.ch>:

> >>>>> "UweL" == Uwe Ligges <ligges@statistik.uni-dortmund.de>
> >>>>>     on Sun, 07 Jan 2007 09:42:08 +0100 writes:
>
>     UweL> Zoltan Kmetty wrote:
>     >> Hi!
>     >>
>     >> I have a memory problem with R and hope somebody can
>     >> tell me a solution.
>     >>
>     >> I work with very large datasets, but R cannot allocate
>     >> enough memory to handle these datasets.
>     >>
>     >> I want to work with a matrix with 100 000 000 rows and 10 columns.
>     >>
>     >> I know this is 1 milliard (10^9) cells, but I thought R could
>     >> handle it (commercial software like SPSS can),
>     >> but R reports every time: not enough memory.
>     >>
>     >> Any good idea?
>
>     UweL> Buy a machine that has at least 8 GB (better 16 GB) of
>     UweL> RAM and proceed ...
>
> Well, I doubt that Zoltan wants to *fill* his matrix with all
> non-zeros. If he does, Uwe and Roger are right.
>
> Otherwise, working with a *sparse* matrix, using the 'Matrix'
> (my recommendation, but I am biased) or 'SparseM' package, might
> well be feasible:
>
> install.packages("Matrix") # if needed; only once for your R
>
> library(Matrix) # each time you need it
>
>
> TsparseMatrix <- function(nrow, ncol, i, j, x)
> {
>     ## Purpose: User friendly construction of sparse "Matrix" from triple
>     ## ----------------------------------------------------------------------
>     ## Arguments: (i,j,x): 2 integer and 1 numeric vector of the same length:
>     ##
>     ##   The matrix M will have
>     ##        M[i[k], j[k]] == x[k] , for k = 1,2,..., length(i)
>     ##   and  M[ i', j' ]   == 0      for all other pairs (i',j')
>     ## ----------------------------------------------------------------------
>     ## Author: Martin Maechler, Date: 8 Jan 2007, 18:46
>     nnz <- length(i)
>     stopifnot(length(j) == nnz, length(x) == nnz,
>               is.numeric(x), is.numeric(i), is.numeric(j))
>     dim <- c(as.integer(nrow), as.integer(ncol))
>     ## The conformability of (i,j) with 'dim' will be checked automatically
>     ## by an internal "validObject()" that is part of new(.):
>     new("dgTMatrix", x = x, Dim = dim,
>         ## our "Tsparse" Matrices use 0-based indices :
>         i = as.integer(i - 1L),
>         j = as.integer(j - 1L))
> }
>
> For example :
>
> > TsparseMatrix(10,20, c(1,3:8), c(2,9,6:10), 7 * (1:7))
> 10 x 20 sparse Matrix of class "dgTMatrix"
>
>  [1,] . 7 . . .  .  .  .  .  . . . . . . . . . . .
>  [2,] . . . . .  .  .  .  .  . . . . . . . . . . .
>  [3,] . . . . .  .  .  . 14  . . . . . . . . . . .
>  [4,] . . . . . 21  .  .  .  . . . . . . . . . . .
>  [5,] . . . . .  . 28  .  .  . . . . . . . . . . .
>  [6,] . . . . .  .  . 35  .  . . . . . . . . . . .
>  [7,] . . . . .  .  .  . 42  . . . . . . . . . . .
>  [8,] . . . . .  .  .  .  . 49 . . . . . . . . . .
>  [9,] . . . . .  .  .  .  .  . . . . . . . . . . .
> [10,] . . . . .  .  .  .  .  . . . . . . . . . . .
>
> But
>
> nr <- 1e8
> nc <- 10
> set.seed(1)
> i <- sample(nr, 10000)
> j <- sample(nc, 10000)
> x <- round(rnorm(10000), 2)
>
> M <- TsparseMatrix(nr, nc, i=i, j=j, x=x)
>
> works,
> e.g. you can
>
> x <- 1:10
> system.time(y <- M %*% x) # needs around 4 sec on one of our better machines
> y <- as.vector(y)
>
> ## but you can become even more efficient, translating from the
> ## so-called "triplet" to the (recommended) "Csparse"
> ## representation:
> M. <- as(M, "CsparseMatrix")
>
> object.size(M) / object.size(M.)
> ## 1.328921; i.e. M. needs about 25% less memory than M
>
> ## and
>
> system.time(y. <- M. %*% x) # much faster (1 sec)
>
> identical(as.vector(y.), y)
>
>
> --- --- ---
>
> I hope this is useful to you.
>
> Martin Maechler,
> ETH Zurich
>
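As a side note to the quoted code: as far as I know, current versions of the Matrix package export a constructor, `sparseMatrix()`, that builds a sparse matrix from the same kind of (i, j, x) triplet with 1-based indices, so a hand-written helper may not be needed. A minimal sketch, assuming a recent Matrix installation and reusing the small 10 x 20 example from the quoted message:

```r
library(Matrix)

## Same triplet as in the quoted 10 x 20 example (1-based indices)
i <- c(1, 3:8)
j <- c(2, 9, 6:10)
x <- 7 * (1:7)

## sparseMatrix() returns a column-compressed ("Csparse") matrix by default
M2 <- sparseMatrix(i = i, j = j, x = x, dims = c(10, 20))

M2[3, 9]                      # the entry given by the second triplet
y2 <- as.vector(M2 %*% 1:20)  # matrix-vector product, as in the quoted code
```

Note that `sparseMatrix()` adds up the x values of repeated (i, j) pairs by default, which matters when the indices are sampled at random.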



Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Tue 09 Jan 2007 - 23:30:26 GMT.
