Re: [R] clara - memory limit

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu 04 Aug 2005 - 03:26:54 EST

On Wed, 3 Aug 2005, Prof Brian Ripley wrote:

> From the help page:
>
> 'clara' is fully described in chapter 3 of Kaufman and Rousseeuw
> (1990). Compared to other partitioning methods such as 'pam', it
> can deal with much larger datasets. Internally, this is achieved
> by considering sub-datasets of fixed size ('sampsize') such that
> the time and storage requirements become linear in n rather than
> quadratic.
>
> and the default for 'sampsize' is apparently at least nrow(x).

Correction, sorry, in your case 40 + 2*k = 54.
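
A minimal sketch of that arithmetic, assuming the documented default sampsize = min(n, 40 + 2*k) and, for the size estimate, that all 15 columns are held as 8-byte doubles (an assumption; the real column types are not stated). The helper name default_sampsize is made up for illustration.

```r
# Default sub-sample size as documented for clara(): min(n, 40 + 2 * k).
# 'default_sampsize' is a hypothetical helper, not a function in 'cluster'.
default_sampsize <- function(n, k) min(n, 40 + 2 * k)

n <- 3000000   # approximate number of rows reported in the question
k <- 7         # number of clusters requested

s <- default_sampsize(n, k)
s                      # 54

# Pairwise dissimilarities needed for one sub-sample of size s
n_dissim <- s * (s - 1) / 2
n_dissim               # 1431

# By contrast, ONE copy of the full data as a double matrix
# (assumption: 15 numeric columns, 8 bytes per value), in megabytes
full_copy_mb <- n * 15 * 8 / 2^20
round(full_copy_mb)    # about 343
```

The sub-sample itself is tiny, so whatever is failing is more likely an operation on the whole data matrix than the clustering of a sub-sample.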

> So you need to set 'sampsize' (and perhaps 'samples') appropriately,

That might be it, but a traceback() showing where the error is occurring would help. Another possible place is in the initial manipulations scaling the data matrix.
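
In an interactive session, calling traceback() immediately after the error prints the call stack. In a script, something like the following sketch can record the same information; capture_stack, step_one and step_two are hypothetical names used only for illustration, and the failing call is a stand-in for clara(mydata, k = 7).

```r
# Hypothetical helper (not part of R or 'cluster'): evaluate an expression
# and, if it signals an error, keep the call stack active at that moment.
capture_stack <- function(expr) {
  stack <- NULL
  value <- tryCatch(
    withCallingHandlers(
      expr,  # the promise is forced here, inside the handlers
      error = function(e) stack <<- sys.calls()
    ),
    error = function(e) e  # return the condition instead of stopping
  )
  list(value = value, stack = stack)
}

# Stand-ins for the real computation; replace step_one() with the
# failing call, e.g. clara(mydata, k = 7).
step_two <- function() stop("cannot allocate vector")
step_one <- function() step_two()

out <- capture_stack(step_one())
conditionMessage(out$value)  # the error message
print(out$stack)             # the calls leading to the error
```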

Since sub-sampling is used, you can start with a much smaller subset of the data.
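
A sketch of that approach, using synthetic data in place of the real file (the matrix below, the subset size of 5000, and the values chosen for 'samples' and 'sampsize' are all assumptions for illustration, not recommendations):

```r
library(cluster)

set.seed(1)

# Synthetic stand-in for the real data (assumption: 15 numeric columns)
x <- matrix(rnorm(20000 * 15), ncol = 15)

# Start with a random subset of the rows
idx <- sample(nrow(x), 5000)
x_small <- x[idx, , drop = FALSE]

# Set 'sampsize' and 'samples' explicitly rather than relying on defaults
fit <- clara(x_small, k = 7, samples = 10, sampsize = 200)

dim(fit$medoids)        # one row per cluster, one column per variable
table(fit$clustering)   # cluster sizes within the subset
```

If this runs comfortably, the subset size can be increased step by step to see where memory use becomes a problem.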

>
>
> On Wed, 3 Aug 2005, Nestor Fernandez wrote:
>
>> Dear all,
>>
>> I'm trying to estimate clusters from a very large dataset using clara, but
>> the program stops with a memory error. The (very simple) code and the error:
>>
>> mydata<-read.dbf(file="fnorsel_4px.dbf")
>> my.clara.7k<-clara(mydata,k=7)
>>
>>> Error: cannot allocate vector of size 465108 Kb
>>
>> The dataset contains >3,000,000 rows and 15 columns. I'm using a Windows
>> computer with 1.5G RAM; I also tried changing the memory limit to the
>> maximum possible (4000M).
>
> Actually, the limit is probably 2048M: see the rw-FAQ Q on memory limits.
>
>> Is there a way to calculate clara clusters from such large datasets?
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Thu Aug 04 03:38:25 2005