Re: [R] Q: Suggestions for long-term data/program storage policy?

From: Sean Davis <sdavis2_at_mail.nih.gov>
Date: Tue 11 Oct 2005 - 21:11:14 EST


On 10/11/05 6:54 AM, "Duncan Murdoch" <murdoch@stats.uwo.ca> wrote:

> Alexander Ploner wrote:

>> Dear list,
>>
>> we are a statistical/epidemiological department that - after a few
>> years of rapid growth - is finally getting around to formulating a
>> general data storage and retention policy - mainly to ensure that we
>> can reproduce results from published papers/theses more easily in the
>> future, but also with the hope that we get more synergy between
>> related projects.
>> I would also be very grateful for any other suggestions, comments or
>> links for setting up and implementing such a storage policy (R-
>> specific or otherwise).

I would also consider a relational database (such as MySQL or Postgres) for your data warehousing. These products (particularly Postgres) are designed with data integrity first and foremost. File formats can change over time, but data held in a database can always be extracted in whatever form matches the needs at hand. Data generated at different times can be mined and combined as needed, and the backup process is fairly straightforward. R already integrates with several relational database systems, so an integrated solution can be defined if one so desires. Look at RMySQL, Rdbi, and RdbiPgSQL for how to integrate R with MySQL and Postgres.
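As a rough illustration of the idea, here is a minimal sketch in Python. It uses SQLite from the standard library purely as a stand-in for MySQL or Postgres, so it runs without a server. The table names, column names, and example rows are all made up for illustration. The sketch shows constraints rejecting bad rows at load time and data from different studies being combined with a single query. The same pattern should carry over to R through one of the database interface packages.

```python
import sqlite3


def build_warehouse(conn):
    """Create two hypothetical tables with basic integrity constraints."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.execute("""
        CREATE TABLE study (
            study_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL UNIQUE
        )""")
    conn.execute("""
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            study_id       INTEGER NOT NULL REFERENCES study(study_id),
            subject        TEXT NOT NULL,
            collected_on   TEXT NOT NULL,               -- ISO date string
            value          REAL NOT NULL CHECK (value >= 0)
        )""")


def load_example_rows(conn):
    """Insert invented rows from two studies collected years apart."""
    conn.executemany(
        "INSERT INTO study (study_id, name) VALUES (?, ?)",
        [(1, "cohort_2003"), (2, "cohort_2005")],
    )
    conn.executemany(
        "INSERT INTO measurement (study_id, subject, collected_on, value) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, "A01", "2003-04-12", 4.0),
            (1, "A02", "2003-04-13", 6.0),
            (2, "B01", "2005-09-30", 10.0),
        ],
    )
    conn.commit()


def summary_by_study(conn):
    """Combine data across studies: row count and mean value per study."""
    return conn.execute("""
        SELECT s.name, COUNT(*), AVG(m.value)
        FROM measurement AS m
        JOIN study AS s USING (study_id)
        GROUP BY s.name
        ORDER BY s.name""").fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_warehouse(conn)
    load_example_rows(conn)
    for name, n, mean in summary_by_study(conn):
        print(name, n, mean)
    try:
        # A negative value violates the CHECK constraint and is rejected.
        conn.execute(
            "INSERT INTO measurement (study_id, subject, collected_on, value) "
            "VALUES (1, 'A03', '2003-04-14', -1.0)"
        )
    except sqlite3.IntegrityError as err:
        print("rejected:", err)
```

The point of the constraints is that a malformed record fails when it is loaded rather than surfacing years later when someone tries to reproduce a result.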

Sean



Received on Tue Oct 11 21:15:46 2005
