Re: [R] Q: Suggestions for long-term data/program storage policy?

From: sosman <>
Date: Tue 11 Oct 2005 - 19:02:49 EST

Alexander Ploner wrote:
> Dear list,
> we are a statistical/epidemiological department that - after a few
> years of rapid growth - finally is getting around to formulating a
> general data storage and retention policy - mainly to ensure that we
> can reproduce results from published papers/theses more easily in the
> future, but also with the hope that we get more synergy between
> related projects.
> We have formulated what we feel is a reasonable draft, requiring
> basically that the raw data, all programs to create derived data
> sets, and the analysis programs are stored and documented in a
> uniform manner, regardless of the analysis software used. The minimum
> data retention we are aiming for is 10 years, and the format for the
> raw data is quite sane (either flat ASCII or real
> Given the rapid development cycle of R, this suggests that at the very
> least all non-base packages used in the analysis are stored together
> with each project. I have basically two questions:
> 1) Are old R versions (binaries/sources) going to be available on
> CRAN indefinitely?
> 2) Is .RData a reasonable file format for long term storage?

> I would also be very grateful for any other suggestions, comments or
> links for setting up and implementing such a storage policy (R-
> specific or otherwise).

I am coming at this more from a software development angle, but you might want to take a look at Subversion for versioning your projects. For non-geeky types, TortoiseSVN has a point-and-click interface.
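
As a rough sketch of what that workflow could look like when scripted, the Python below only builds the command lines for the standard svnadmin and svn clients and runs them if both are installed. The function names and the "repo", "proj" and "wc" directories are made up for illustration, and the /trunk layout is a common convention rather than a requirement.

```python
from pathlib import Path
import shutil
import subprocess


def svn_setup_commands(repo_dir, project_dir, work_dir):
    """Build commands to create a repository, import a project and check it out."""
    # A local repository is addressed with a file:// URL (assumed setup).
    trunk = Path(repo_dir).resolve().as_uri() + "/trunk"
    return [
        ["svnadmin", "create", str(repo_dir)],
        ["svn", "import", str(project_dir), trunk, "-m", "Initial import of project"],
        ["svn", "checkout", trunk, str(work_dir)],
    ]


def svn_checkout_revision(repo_dir, revision, target_dir):
    """Build the command that retrieves the project as it was at a given revision."""
    trunk = Path(repo_dir).resolve().as_uri() + "/trunk"
    return ["svn", "checkout", "-r", str(revision), trunk, str(target_dir)]


def run_all(commands):
    """Run each command in order; return False if Subversion is not installed."""
    if shutil.which("svn") is None or shutil.which("svnadmin") is None:
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return True
```

Calling run_all(svn_setup_commands("repo", "proj", "wc")) would set everything up once; after that, day-to-day work is "svn commit" in the working copy, and svn_checkout_revision("repo", 3, "old") gives the command to pull out the state of the project at revision 3.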

Subversion handles binary files efficiently, and you can easily go back and get earlier versions of your projects.

Received on Tue Oct 11 19:19:45 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 18:39:33 EST