Re: [R] Reading data into R

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Thu, 3 Jan 2008 09:10:03 -0500

On Jan 3, 2008 9:00 AM, BEP <perronbe_at_gmail.com> wrote:
> Hello all,
>
> I am trying to read a very large data set into R, and I have no interest in
> reviving my SAS skills. To do this, I will need to drop unwanted variables
> given the size of the data file. The most common strategy seems to be
> subsetting the data after it is read into R. Unfortunately, given the size
> of the data set, I can't get the file read and then subsequently do the
> subset procedure. I would be appreciative of help on the following:
>
> 1. What are the possibilities of reading in just a small set of variables
> during the <read.table> statement (or another 'read' statement)? That is,
> is it possible to specify just the variables that I want to keep?

read.table can skip columns. Set the relevant component of colClasses to
"NULL" (the character string, in quotes) and that column is not read.
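
For instance, here is a minimal sketch of the colClasses approach. The file and its column names (id, name, score, flag) are made up for illustration; substitute your own file and give one colClasses entry per column, in file order.

```r
# Hypothetical example file with four columns; only 'id' and 'score' are wanted.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id    = 1:5,
                     name  = letters[1:5],
                     score = c(2.5, 3.1, 4.0, 1.2, 5.6),
                     flag  = c(TRUE, FALSE, TRUE, TRUE, FALSE)),
          tmp, row.names = FALSE)

# One colClasses entry per column, in file order.
# The string "NULL" tells read.table to skip that column entirely.
keep <- read.table(tmp, header = TRUE, sep = ",",
                   colClasses = c("integer", "NULL", "numeric", "NULL"))

str(keep)   # a data frame holding only 'id' and 'score'
```

Giving explicit classes for the columns you do keep also saves read.table from having to guess their types, which tends to help with large files.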

>
> 2. Can I randomly select a set of observations during the 'read' statement?
>
>
> I have searched various R resources for this information, so if I am simply
> overlooking a key resource on this issue, pointing that out to me would be
> greatly appreciated.
>

The development version of sqldf can do all of the above (i.e. read in a subset of
columns, a subset of rows or a random subset of rows) subject to certain limitations on the input format. See Example 6 on the home page:

   http://sqldf.googlecode.com
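
As a rough sketch of what that looks like: in released versions of sqldf this functionality is exposed as read.csv.sql, which is what the code below assumes (the interface in the development version may differ). The file and column names are invented for the example, and the sqldf package must be installed for it to run.

```r
# Assumes the sqldf package (with its RSQLite dependency) is installed;
# the block is skipped otherwise.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)

  # Hypothetical input file with three columns.
  tmp <- tempfile(fileext = ".csv")
  write.csv(data.frame(id = 1:100, x = rnorm(100), y = runif(100)),
            tmp, row.names = FALSE, quote = FALSE)

  # Within the SQL statement the input file is referred to as "file".
  # Select two of the three columns and a random sample of 10 rows.
  samp <- read.csv.sql(tmp,
            sql = "select id, x from file order by random() limit 10")

  str(samp)
}
```

The file is loaded into a temporary SQLite database rather than into R, so only the selected columns and rows ever become an R data frame.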

readTable in the R.utils package can also read in a subset of rows and columns.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 03 Jan 2008 - 14:13:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 Jan 2008 - 15:30:05 GMT.