[R] data.frame() size

From: Matthew Dowle <mdowle_at_concordiafunds.com>
Date: Fri 09 Dec 2005 - 05:18:16 EST

Hi,

In the example below, why is d 10 times bigger than m according to object.size()? It also takes around 10 times as long to create, which suggests object.size() is telling the truth. gcinfo(TRUE) also shows far more garbage collector activity from data.frame() than from matrix().

$ R --vanilla
....
> nr = 1000000
> system.time(m<<-matrix(integer(1), nrow=nr, ncol=2))
[1] 0.22 0.01 0.23 0.00 0.00
> system.time(d<<-data.frame(a=integer(nr), b=integer(nr)))
[1] 2.81 0.20 3.01 0.00 0.00 # 10 times longer

> dim(m)
[1] 1000000 2
> dim(d)
[1] 1000000 2 # same dimensions

> storage.mode(m)
[1] "integer"
> sapply(d, storage.mode)
        a         b 
"integer" "integer" # same storage.mode

> object.size(m)/1024^2
[1] 7.629616
> object.size(d)/1024^2
[1] 76.29482 # but 10 times bigger

> sum(sapply(d, object.size))/1024^2
[1] 7.629501 # or is it? If it's not really 10 times bigger, why 10 times longer above?

> version
platform x86_64-unknown-linux-gnu
arch     x86_64                  
os       linux-gnu               
system   x86_64, linux-gnu       
status                           
major    2                       
minor    1.1                     
year     2005                    
month    06                      
day      20                      
language R                       
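
One way to narrow this down might be to measure the columns and the attributes of d separately, so that anything beyond the column data shows up on its own line. The sketch below is only illustrative: nr is reduced to keep it quick, the names col_bytes, attr_bytes and total_bytes are made up for this example, and the numbers will depend on the R version. Note also that attributes() returns the attribute values as R presents them, which may not match how they are stored internally.

```r
# Illustrative sketch only: nr is smaller than the 1000000 used above.
nr <- 100000L

m <- matrix(integer(1), nrow = nr, ncol = 2)
d <- data.frame(a = integer(nr), b = integer(nr))

# Size of the column data alone, as in sum(sapply(d, object.size)) above.
col_bytes <- sum(sapply(d, object.size))

# Size of each attribute of d (names, row.names, class), measured separately
# on the values that attributes() returns.
attr_bytes <- sapply(attributes(d), object.size)

# Size reported for the whole data frame.
total_bytes <- as.numeric(object.size(d))

print(attr_bytes)
cat("columns: ", col_bytes, "bytes\n")
cat("whole d: ", total_bytes, "bytes\n")
cat("matrix m:", as.numeric(object.size(m)), "bytes\n")
```

If one attribute is much larger than the rest, that would be a candidate for the extra size (and possibly the extra time) reported by object.size(d).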


Many thanks in advance!
Matthew


R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Dec 09 05:50:18 2005
