[R] Q: Suggestions for long-term data/program storage policy?

From: Alexander Ploner <Alexander.Ploner_at_meb.ki.se>
Date: Tue 11 Oct 2005 - 18:04:54 EST


Dear list,

We are a statistical/epidemiological department that, after a few years of rapid growth, is finally getting around to formulating a general data storage and retention policy, mainly to ensure that we can more easily reproduce results from published papers/theses in the future, but also in the hope of getting more synergy between related projects.

We have formulated what we feel is a reasonable draft, requiring basically that the raw data, all programs to create derived data sets, and the analysis programs are stored and documented in a uniform manner, regardless of the analysis software used. The minimum data retention we are aiming for is 10 years, and the format for the raw data is quite sane (either flat ASCII or real

Given the rapid development cycle of R, this suggests that at the very least all non-base packages used in the analysis should be stored together with each project. I have basically two questions:

  1. Are old R versions (binaries/sources) going to be available on CRAN indefinitely?
  2. Is .RData a reasonable file format for long term storage?

I would also be very grateful for any other suggestions, comments or links for setting up and implementing such a storage policy (R-specific or otherwise).

Thank you for your time,

alexander

Alexander.Ploner@meb.ki.se
Medical Epidemiology & Biostatistics
Karolinska Institutet, Stockholm
Tel: ++46-8-524-82329
Fax: ++46-8-31 49 75

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Tue Oct 11 18:11:52 2005
