Re: [R] rbind wastes memory

From: Roger D. Peng <rpeng_at_jhsph.edu>
Date: Tue 31 May 2005 - 01:00:12 EST

Rather than calling 'rbind' in a loop, try putting your data frames in a list and then doing something like 'do.call("rbind", list.of.data.frames)'.
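
A minimal sketch of that approach, assuming each file holds a single data frame named 'ds' as in the question. The temporary files and the helper 'load_ds' are made up for illustration; only 'do.call("rbind", ...)' is the actual suggestion.

```r
# Illustration only: create three small .RData files, each holding a
# data frame named 'ds' (stand-ins for the large files in the question).
files <- character(3)
for (i in 1:3) {
  ds <- data.frame(id = rep(i, 2), value = c(1, 2) * i)
  files[i] <- tempfile(fileext = ".RData")
  save(ds, file = files[i])
}

# Hypothetical helper: load 'ds' from one file into a private environment
# so it does not overwrite objects in the workspace, then return it.
load_ds <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env$ds
}

# Collect the data frames in a list instead of growing one with rbind().
list.of.data.frames <- lapply(files, load_ds)

# Bind them all in a single call.
combined <- do.call("rbind", list.of.data.frames)

dim(combined)  # 6 rows, 2 columns

# Remove the temporary files.
unlink(files)
```

This way 'rbind' is called once on the whole list, so the growing intermediate result is not copied on every pass through a loop.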

-roger

lutz.thieme@amd.com wrote:
> Hello everybody,
>
> if I try to (r)bind a number of large dataframes I run out of memory because R
> wastes memory and seems to "forget" to release memory.
>
> For example I have 10 files. Each file contains a large dataframe "ds" (3500 cols
> by 800 rows) which needs ~20 MB RAM if it is loaded as the only object.
> Now I try to bind all data frames to a large one and need more than 1165MB (!)
> RAM (To simplify the R code, I use the same file ten times):
>
> ________ start example 1 __________
> load(myFile)
> ds.tmp <- ds
> for (Cycle in 1:10) {
> ds.tmp <- rbind(ds.tmp, ds)
> }
> ________ end example 1 __________
>
>
>
> Stepping into details I found the following (comment shows RAM usage after this line
> was executed):
> load(myFile) # 40MB (19MB for R itself)
> ds.tmp <- ds # 40MB; => only a pointer seems to be copied
> x<-rbind(ds.tmp, ds) # 198MB
> x<-rbind(ds.tmp, ds) # 233MB; the same instruction a second time leads to
> # 35MB more RAM usage - why?
>
>
> Now I played around, but I couldn't find a solution. For example I bound each dataframe
> step by step and removed the variables and cleared memory, but I still need 1140MB(!)
> RAM:
>
> ________ start example 2 __________
> tmpFile<- paste(myFile,'.tmp',sep="")
> load(myFile)
> ds.tmp <- ds
> save(ds.tmp, file=tmpFile, compress=T)
>
> for (Cycle in 1:10) {
> ds <- NULL
> ds.tmp <- NULL
> rm(ds, ds.tmp)
> gc()
> load(tmpFile)
> load(myFile)
> ds.tmp <- rbind(ds.tmp, ds)
> save(ds.tmp,file=tmpFile, compress=T)
> cat(Cycle,': ',object.size(ds),object.size(ds.tmp),'\n')
> }
> ________ end example 2 __________
>
>
> platform i386-pc-solaris2.8
> arch i386
> os solaris2.8
> system i386, solaris2.8
> status
> major 1
> minor 9.1
> year 2004
> month 06
> day 21
> language R
>
>
>
>
> How can I avoid running into this memory problem? Any ideas are much appreciated.
> Thank you in advance & kind regards,
>
>
>
> Lutz Thieme
> AMD Saxony/ Product Engineering AMD Saxony Limited Liability Company & Co. KG
> phone: + 49-351-277-4269 M/S E22-PE, Wilschdorfer Landstr. 101
> fax: + 49-351-277-9-4269 D-01109 Dresden, Germany
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

Received on Tue May 31 01:07:56 2005
