R-alpha: Re: Memory Requirements for Large Datasets

Kurt Hornik (hornik@ci.tuwien.ac.at)
Thu, 6 Mar 1997 09:31:20 +0100

Date: Thu, 6 Mar 1997 09:31:20 +0100
Message-Id: <199703060831.JAA11250@aragorn.ci.tuwien.ac.at>
From: Kurt Hornik <hornik@ci.tuwien.ac.at>
To: r-testers@stat.math.ethz.ch
Subject: R-alpha: Re:  Memory Requirements for Large Datasets

I've added the following to the FAQ:

  6.  R Miscellanea

  6.1.  How Can I Read a Large Data Set into R?

  R (currently) uses a static memory model.  This means that when it
  starts up, it asks the operating system to reserve a fixed amount of
  memory for it.  The size of this chunk cannot be changed
  subsequently, so it can happen that the reserved memory is not
  enough for a given task.

  In these cases, you should restart R with more memory available,
  using the command line options -n and -v.  To understand these
  options, one needs to know that R maintains separate areas for
  fixed-size and variable-size objects.  The first area is allocated
  as an array of SEXPRECs assembled into a list of ``cons cells''
  (ordered pairs, each containing an element of the list and a pointer
  to the next cell); the second is allocated as an array of VECRECs.
  The -n option specifies the number of cons cells (each occupying 16
  bytes) which R is to use (the default is 200000), and the -v option
  specifies the size of the vector heap in megabytes (the default is
  2).  Both options accept only integer values.
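Using the sizes quoted above, the defaults can be turned into a quick
back-of-envelope figure.  The following sketch (in Python, purely
illustrative and not part of R) just multiplies out the stated numbers:

```python
# Back-of-envelope figures derived from the numbers quoted above;
# this is an illustration, not code from R itself.
cons_cells = 200000        # default for -n
cons_cell_bytes = 16       # size of one cons cell (SEXPREC), as stated
vector_heap_mb = 2         # default for -v, in megabytes

fixed_region_bytes = cons_cells * cons_cell_bytes
print(fixed_region_bytes)             # total bytes for the cons-cell area
print(fixed_region_bytes / 2**20)     # the same, expressed in megabytes
print(vector_heap_mb)                 # plus the vector heap, in megabytes
```

So the defaults correspond to roughly 3 MB for cons cells plus 2 MB of
vector heap.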

  E.g., to read in a table of 5000 observations on 40 numeric variables,
  R -v 6 should do.

It would be nice if someone could provide me with a formula for the
required memory size, or at least a decent estimate thereof.
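As a crude starting point (my own guess, not an official formula), one
could count the raw data for the 5000 x 40 example as 8-byte doubles
and allow a multiplier for intermediate copies made while reading; the
multiplier of 3 below is an assumption, not a documented figure:

```python
# A crude, unofficial estimate of the vector heap needed; the
# copies multiplier is a guess, not a documented figure.
def vector_heap_mb(observations, variables, copies=3, bytes_per_double=8):
    raw = observations * variables * bytes_per_double
    return raw * copies / 2**20

raw_bytes = 5000 * 40 * 8
print(raw_bytes)                  # raw data alone, in bytes
print(vector_heap_mb(5000, 40))   # with copies, in megabytes
```

For the example this gives about 1.6 MB of raw data and under 5 MB
with copies, which is at least consistent with "R -v 6 should do".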

r-testers mailing list -- For info or help, send "info" or "help";
to [un]subscribe, send "[un]subscribe"
(in the "body", not the subject!) to: r-testers-request@stat.math.ethz.ch