Re: [R] Reading through a group of .RData files

From: Henrik Bengtsson <hb_at_stat.berkeley.edu>
Date: Tue, 11 Dec 2007 12:51:09 -0800

Hi,

Depending on what you do, and on how (and why) you save objects in RData files in the first place, you might be interested in the loadObject()/saveObject() methods of R.utils, as well as loadCache()/saveCache() in R.cache.

The R.utils methods are basically "clever" wrappers around load()/save() in the 'base' package that do not rely on saving and loading the variable name but rather the object itself. To save multiple objects you have to wrap them up in a list structure or in an environment. Example:

x <- 1:100
saveObject(x, file="foo.RData")
y <- loadObject("foo.RData")
stopifnot(identical(x,y))

u <- list(x=x, y=y)
saveObject(u, file="bar.RData")
v <- loadObject("bar.RData")
stopifnot(identical(u,v))
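
For the loop in the original question, the same idea can be sketched with base R only: load each file into a private environment, so that nothing in the workspace is overwritten, and hand over whatever objects it contains regardless of their names. Note that loadToList() is a made-up helper for illustration, not part of R.utils, and the file names and the str() call (a stand-in for myFunction()) are assumptions:

```r
# Hypothetical helper (not part of R.utils): load an .RData file into a
# private environment and return its objects as a named list.
loadToList <- function(pathname) {
  env <- new.env()
  load(pathname, envir=env)   # nothing is written to the caller's workspace
  as.list(env, all.names=TRUE)
}

# Demo with two throwaway files that use different variable names
demoDir <- tempfile("rdata")
dir.create(demoDir)
first <- 1:3
save(first, file=file.path(demoDir, "run1.RData"))
second <- letters
save(second, file=file.path(demoDir, "run2.RData"))
rm(first, second)

pathnames <- list.files(demoDir, pattern="^run.*\\.RData$", full.names=TRUE)
for (pathname in pathnames) {
  for (obj in loadToList(pathname))
    str(obj)                  # stand-in for myFunction(obj)
}

# The workspace was not touched by the loads
stopifnot(!exists("first", inherits=FALSE), !exists("second", inherits=FALSE))
```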

The R.cache methods let you store objects/results to a file cache without having to worry about filenames. Instead the objects are identified by lookup keys generated from other R objects. This is useful for temporary/semi-temporary storing of results, especially computationally expensive results. The file cache is persistent between sessions. Example:

x <- 1:100
key <- list("x")
saveCache(x, key=key)
y <- loadCache(key)
stopifnot(identical(x,y))

u <- list(x=x, y=y)
key <- list("u")
saveCache(u, key=key)
v <- loadCache(key)
stopifnot(identical(u,v))

Although not of immediate interest, the pathname of the above cache files can be found by findCache(key), e.g. "~/.Rcache/78488a47006df5d333db9e200fc539c5.Rcache". There are methods for specifying the root of the file cache, and having different subdirectories for different projects.

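As a rough sketch of the cache root and per-project subdirectories mentioned above, assuming the setCacheRootPath()/getCacheRootPath() functions and the 'dirs' argument of R.cache behave as documented; the temporary root and the names "projectA" and "projectB" are made up for the demo:

```r
library(R.cache)

# Assumption: use a throwaway cache root so the demo does not touch the default one
setCacheRootPath(file.path(tempdir(), "Rcache-demo"))
print(getCacheRootPath())

x <- 1:100
key <- list("x")

# 'dirs' names a subdirectory under the cache root; "projectA" is a made-up name
saveCache(x, key=key, dirs="projectA")
y <- loadCache(key, dirs="projectA")
stopifnot(identical(x, y))

# The same key under another subdirectory is a separate (here: empty) cache entry
z <- loadCache(key, dirs="projectB")
stopifnot(is.null(z))
```
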
The above example does not show the full power of R.cache. Instead, consider this example:

slowFcn <- function(x, y, force=FALSE) {
  # Cached results?
  key <- list(x=x, y=y)
  if (!force) {
    res <- loadCache(key=key)
    if (!is.null(res))
      return(res);
  }

  # Emulate a computationally expensive calculation
  Sys.sleep(10)

  res <- list(x=x, y=y, xy=x*y)

  # Save to cache
  saveCache(res, key=key)

  res
}

# First call takes time
> system.time(res1 <- slowFcn(x=1, y=2))
   user  system elapsed
      0       0      10

# All successive calls with the same arguments are instant
> system.time(res2 <- slowFcn(x=1, y=2))
   user  system elapsed
   0.02    0.00    0.01

> stopifnot(identical(res1, res2))
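
The same pattern can be pulled out into a small wrapper so that it does not have to be repeated in every slow function. This is only a sketch: withCache() and its 'name' argument are hypothetical and not part of R.cache, and the temporary cache root is an assumption to keep the demo self-contained. As in slowFcn(), a function whose result is legitimately NULL would be recomputed on every call:

```r
library(R.cache)

# Assumption: throwaway cache root so the demo starts from an empty cache
setCacheRootPath(file.path(tempdir(), "Rcache-wrapper-demo"))

# Hypothetical helper: wrap any function so its results go through the file cache.
# 'name' is part of the key so that two wrapped functions called with the
# same arguments do not collide.
withCache <- function(fcn, name) {
  function(..., force=FALSE) {
    key <- list(name=name, args=list(...))
    if (!force) {
      res <- loadCache(key=key)
      if (!is.null(res))
        return(res)
    }
    res <- fcn(...)
    saveCache(res, key=key)
    res
  }
}

slowProduct <- withCache(function(x, y) { Sys.sleep(2); x*y }, name="slowProduct")
a <- slowProduct(3, 4)   # computed, then stored in the cache
b <- slowProduct(3, 4)   # read back from the cache
stopifnot(identical(a, b))
```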

Cheers

Henrik

On 10/12/2007, Talbot Katz <topkatz_at_msn.com> wrote:
>
> Hi.
>
> I have a procedure that reads a directory, loops through a set of particular .RData files, loading each one, and feeding its object(s) into a function, as follows:
>
> cvListFiles<-list.files(fnDir);
> for(i in grep(paste("^",pfnStub,".*\\.RData$",sep=""),cvListFiles)){
> load(paste(fnDir,cvListFiles[i],sep="/"));
> myFunction(rliObject);
> rm(rliObject);
> };
>
> where fnDir is the directory I'm reading, and pfnStub is a string that begins the name of each of the files I want to load. As you can see, I'm assuming that each of the selected .RData files contains an object named "rliObject" and I'm hoping that nothing in any of the files I'm loading overwrites an object in my environment. I'd like to clean this up so that I can extract the object(s) from each data file, and feed them to my function, whatever their names are, without corrupting my environment. I'd appreciate any assistance. Thanks!
>
> -- TMK --
> 212-460-5430 home
> 917-656-5351 cell
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Tue 11 Dec 2007 - 20:55:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 11 Dec 2007 - 21:30:19 GMT.
