Re: [R] R and Data Storage

From: Frank E Harrell Jr
Date: Sat 01 Oct 2005 - 22:35:00 EST

Rick B. wrote:
> Where I work a lot of people end up using Excel spreadsheets for storing
> data. This has limitations and maybe some less than obvious problems. I'd
> like to recommend a uniform way for storing and archiving data collected
> in the department. Most of the data could be stored in simple csv type
> files but it would be nice to have something that stores more information
> about the variables and units. netcdf seems like overkill (and not easy
> for casual users). Same for postgres and mysql databases. Could someone
> recommend some system for storing relatively small data sets (50-100
> variables, <1000 records) that would be reliable, safe, and easy for
> people to view and edit their data that works nicely with R and is open
> source? Am I asking for the moon?
> Rick B.

What I use are the facilities in the Hmisc package, which handles variable labels and units of measurement. It has functions for importing data (saving labels in the appropriate place) and for making use of these attributes (e.g., combining the label and units in an axis label, with the units in a smaller font). When such an annotated data frame is saved using save(...., compress=TRUE), load()'ing it back quickly restores the annotated data frame. The contents() function shows the attributes, and we use html(contents()) to put up a web page with hyperlinks to the value labels (the levels attribute of factor variables).
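
A minimal sketch of that workflow, assuming Hmisc is installed from CRAN. The data frame, its variable names, and the file location are made up for illustration; only label(), units(), save(), load(), and contents() come from the description above.

```r
library(Hmisc)  # assumption: Hmisc is installed

# Hypothetical data set; names and values are invented for this example
d <- data.frame(
  age = c(34, 51, 47),
  sbp = c(118, 135, 127),
  sex = factor(c("female", "male", "female"))
)

# Attach variable labels and units of measurement as attributes
label(d$age) <- "Age"
units(d$age) <- "years"
label(d$sbp) <- "Systolic blood pressure"
units(d$sbp) <- "mmHg"
label(d$sex) <- "Sex"

# Save the annotated data frame to a compressed binary file
# (a temporary file here; a shared directory would be the real target)
f <- tempfile(fileext = ".rda")
save(d, file = f, compress = TRUE)

# Remove the object, then restore it: labels and units come back with it
rm(d)
load(f)

# Summarize variables together with their labels, units, and factor levels
contents(d)
```

Because the labels and units travel as attributes of each column, anyone who load()s the file gets the annotations without a separate codebook.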

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

Received on Sat Oct 01 22:37:13 2005
