Re: [R] clara - memory limit

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Thu 04 Aug 2005 - 03:23:45 EST

>>>>> "Nestor" == Nestor Fernandez <nestor.fernandez@ufz.de>
>>>>> on Wed, 03 Aug 2005 18:44:38 +0200 writes:

    Nestor> I'm trying to estimate clusters from a
    Nestor> very large dataset using clara but the program stops
    Nestor> with a memory error. The (very simple) code and the
    Nestor> error:

    Nestor> mydata<-read.dbf(file="fnorsel_4px.dbf")
    Nestor> my.clara.7k<-clara(mydata,k=7)

    >> Error: cannot allocate vector of size 465108 Kb

    Nestor> The dataset contains >3,000,000 rows and 15
    Nestor> columns. I'm using a Windows computer with 1.5G RAM;
    Nestor> I also tried changing the memory limit to the
    Nestor> maximum possible (4000M). Is there a way to calculate
    Nestor> clara clusters from such large datasets?

One way to start is to read the help page, ?clara, more carefully and hence use

    clara(mydata, k=7, keep.data = FALSE)
		     ^^^^^^^^^^^^^^^^^^^
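
For concreteness, a minimal sketch of the full call (assuming read.dbf()
comes from the 'foreign' package, and noting that the 465108 Kb in the
error message works out to roughly one double-precision copy of a
~4,000,000 x 15 matrix):

    library(foreign)   # provides read.dbf()
    library(cluster)   # provides clara()

    mydata <- read.dbf(file = "fnorsel_4px.dbf")

    ## 465108 Kb ~ 3.97e6 rows * 15 cols * 8 bytes / 1024, i.e. about one
    ## full copy of the data; keep.data = FALSE stops clara() from
    ## storing such a copy inside the returned object.
    my.clara.7k <- clara(mydata, k = 7, keep.data = FALSE)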

But that might not be enough:
You may need a 64-bit CPU and an operating system (with system libraries and an R version) that uses 64-bit addressing, i.e., not any current version of M$ Windows.
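
If staying on 32-bit Windows, one possible workaround (a sketch, not part
of the original advice; the subsample size and seed here are arbitrary) is
to cluster a random subsample that fits in memory and check that the
medoids are stable across a few draws:

    ## cluster 500,000 randomly chosen rows; rerun with other seeds and
    ## compare the resulting medoids for stability
    set.seed(1)
    idx <- sample(nrow(mydata), 5e5)
    cl.sub <- clara(mydata[idx, ], k = 7, keep.data = FALSE)
    cl.sub$medoids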

   Nestor> Thanks a lot.

you're welcome.

Martin Maechler, ETH Zurich


