Re: [R] Treatment for Unequal Column Lengths?

From: Greg Snow <Greg.Snow_at_intermountainmail.org>
Date: Thu 11 Jan 2007 - 18:50:27 GMT


One of the ways that R (and S-PLUS) differs from most other stats packages (all that I can think of) is that it forces you to think about your data up front. This is a good thing. It sounds like you really have multiple datasets in one file. It is best to read them into R as separate datasets rather than trying to force them into one dataset (as other packages do). If you need to keep the datasets grouped together, you can combine them in a list.

To read one dataset out of the file, you can use read.table (or read.csv) with the colClasses argument (set the columns you do not want to "NULL") and the nrows argument to grab just the columns and rows that belong to a given dataset. I would recommend writing a script that reads each of the separate pieces.
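
As a rough sketch of the idea, assuming a made-up file in which columns 1-2 (date, value) form one dataset of 5 rows and column 3 (score) forms another of 3 rows, something like the following might work. The file contents, column names, and object names are hypothetical and only there for illustration. Your real file will need its own colClasses and nrows values.

```r
# Hypothetical example file: two datasets stored side by side in one CSV.
# Columns 1-2 (date, value) have 5 rows; column 3 (score) has only 3 rows.
csv_text <- c(
  "date,value,score",
  "2007-01-01,1.5,10",
  "2007-01-02,2.5,20",
  "2007-01-03,3.5,30",
  "2007-01-04,4.5,",
  "2007-01-05,5.5,"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)

# Dataset A: keep columns 1-2 and skip column 3 by giving it class "NULL".
a <- read.csv(path, colClasses = c("character", "numeric", "NULL"))
a$date <- as.Date(a$date)  # convert the date strings after reading

# Dataset B: skip columns 1-2, keep column 3, and stop after its 3 rows.
b <- read.csv(path, colClasses = c("NULL", "NULL", "numeric"), nrows = 3)

# Keep the separate pieces together in a list; each keeps its own length.
datasets <- list(a = a, b = b)
sapply(datasets, nrow)
```

Because each piece is read on its own, no NAs or blank cells are padded onto the shorter columns, so there is nothing to strip out with na.omit afterwards.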

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow@intermountainmail.org
(801) 408-8111
 
 


> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Gerald Gamoric
> Sent: Thursday, January 11, 2007 9:52 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Treatment for Unequal Column Lengths?
>
> Fellow R Users:
>
> I have a .csv dataset that I have brought into R via
> read.table (and also via read.csv). The dataset has columns
> that are not equal in length.
> Essentially, this data file has vectors/columns in which I
> plan to use different analyses on, hence they are unequal in
> length. Also, the columns are either numeric or calendar
> dates. Is there a way to prevent R from appending "NA"s to
> the numeric columns that are not the longest? Is there a way
> to prevent R from appending blank cells to the columns of
> dates in the dataset that are not the longest? In other
> words, I'd like to have R maintain each column's length.
>
> I am aware that I can use "na.omit" before calling each
> numeric column in my analysis in order to work with the
> subset of that column that does not contain the "NA" values.
> However, the na.omit command does not work when R appends
> blank cells to my date column lengths. Is there something
> analogous to "na.omit" that I might be able to use when I am
> working with a column of dates to ignore the blank cells?
>
> Further, I am curious as to whether there is an option that
> one might use when the dataset is read in to R in order to
> keep all the column lengths as they are. Any ideas/hints
> would be very much appreciated.
>
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    3.1
> year     2006
> month    06
> day      01
> svn rev  38247
> language R
>
> Thank you,
>
> Dave H
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Received on Fri Jan 12 05:57:16 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 11 Jan 2007 - 20:30:30 GMT.
