Re: [R] data set size question

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed 14 Jun 2006 - 14:01:18 EST

The restriction is that R keeps objects in memory, so if you have sufficient memory and your OS lets you access it, you should be OK. S-Plus is a commercial package similar to R, but it stores its objects in files and can handle larger data sets if you run into trouble.

Given that R is free and, once downloaded, can be installed on Windows in a minute or so (I assume it's just as easy on other OSes), just install it, generate some test data, and see if you have any problems. For example, I had no trouble running the following on my PC:

n <- 100000
p <- 20
x <- matrix(rnorm(n * p), n)

colnames(x) <- letters[1:p]
# regress column a against the rest
x.lm <- lm(a ~., as.data.frame(x))
plot(x.lm) # click mouse to advance to successive plots
summary(x.lm)
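
Since the limit is memory rather than row count, a rough size estimate can help before trying a larger data set. The sketch below is only a back-of-the-envelope check, assuming all columns are stored as doubles (8 bytes each); the 5 million row figure is a hypothetical chosen to match "several million" in the question, and lm() keeps additional copies (model frame, QR decomposition, residuals), so peak usage during a fit will likely be several times the size of the raw matrix.

```r
# Rough memory estimate for an n-by-p numeric matrix (8 bytes per double).
n <- 100000
p <- 20
est_bytes <- n * p * 8                      # 16,000,000 bytes

# Compare with what R actually reports for the same-sized test matrix.
x <- matrix(rnorm(n * p), n)
actual_bytes <- as.numeric(object.size(x))  # estimate plus a small header

cat("estimated:", round(est_bytes / 2^20, 1), "MiB\n")
cat("reported: ", format(object.size(x), units = "Mb"), "\n")

# Hypothetical larger case: 5 million rows, same 20 columns.
n_big <- 5e6
big_bytes <- n_big * p * 8                  # 800,000,000 bytes
cat("5e6 rows: ", round(big_bytes / 2^20), "MiB for the raw data alone\n")
```

If the estimate for the raw data is already close to the available RAM, that is a sign the fit itself is unlikely to go through without subsetting or sampling the data first.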

On 6/13/06, Carl Hauser <Carl.Hauser@nwea.org> wrote:
> Hi there,
>
> I'm very new to R and am only in the beginning stages of investigating
> it for possible use. A document by John Maindonald at the r-project
> website entitled "Using R for Data Analysis and Graphics: Introduction,
> Code and Commentary" contains the following paragraph, "The R system may
> struggle to handle very large data sets. Depending on available computer
> memory, the processing of a data set containing one hundred thousand
> observations and perhaps twenty variables may press the limits of what R
> can easily handle". This document was written in 2004.
>
> My questions are:
>
> Is this still the case? If so, has anyone come up with creative
> solutions to mitigate these limitations? If you work with large data
> sets in R, what have your experiences been?
>
> From what I've seen so far, R seems to have enormous potential and
> capabilities. I routinely work with data sets of several hundred
> thousand to several million. It would be unfortunate if such potential
> and capabilities were not realized because of (effective) data set size
> limitations.
>
> Please tell me it ain't so.
>
> Thanks for any help or suggestions.
>
> Carl
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



Received on Wed Jun 14 14:06:15 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 14 Jun 2006 - 16:11:49 EST.
