Re: [Rd] S3/S4 classes performance comparison

From: Torsten Hothorn <Torsten.Hothorn_at_rzmail.uni-erlangen.de>
Date: Sat 15 Jan 2005 - 02:11:34 EST

On Fri, 14 Jan 2005, Eric Lecoutre wrote:

>
> Hi R-devel,
>
> If you read my survey on R-help about reporting, you may have seen that
> I am implementing a way to handle outputs for R (the main target output
> destinations are xHTML and TeX).
> In fact, I have something that works for basic objects, entirely done
> with S4 classes, with the results visible at:
> http://www.stat.ucl.ac.be/ROMA/sample.htm
> http://www.stat.ucl.ac.be/ROMA/sample.pdf
>
> To achieve this goal, I use intermediary objects that represent
> the structure of the output. Thus I defined classes for Vectors, Tables,
> Rows, Cells, Sections, and so on. Most of those structures are recursive.
> So, as a first attempt, a matrix would be represented as a Table
> containing Rows containing Cells containing Vectors, which in the end is easy to
> export and makes customisation easy (if you need to insert a
> footnote within a cell, for example).
> I know that this intermediary layout would be far easier to handle at
> C level, but I don't have the C skills for that...
>
> One of my problems is that this consumes a lot of memory and computation time.
> Far too much, in fact...
> 20 sec. to export data(iris) on my P4 3.2 GHz with 1 GB RAM, which is not
> acceptable.
>
> Since I was writing new code from scratch, I wanted to start properly, so I
> wrote everything using S4 classes.
> A simple test reveals a crucial efficiency difference between S3 and
> S4 classes.
>
> Here is the test:
>
> ---
>
> ### S3 CLASSES
>
> S3content <- function(obj=NULL,add1=NULL,add2=NULL,type="",...){
> out <- list(content=obj,add1=add1,add2=add2,type=type)
> class(out) <- "S3Content"
> return(out)
> }
>
> S3vector <- function(vec,...){
> out <- S3content(obj=vec,type="Vector",...)
> class(out) <- "S3Vector"
> return(out)
> }
>
>
> ### S4 classes
>
> setClass("S4content",representation(content="ANY",add1="ANY",add2="ANY",type="character"))
>
> S4content <- function(obj=NULL,add1=NULL,add2=NULL,type="",...){
> new("S4content",content=obj,add1=add1,add2=add2,type=type)
> }
>
> S4vector <- function(vec,...){
> new("S4content",type="vector",content=vec,...)
> }
>
> ### Now the test
> > test <- rnorm(10000)
> > gc()
>          used (Mb) gc trigger (Mb)
> Ncells 169135  4.6     531268 14.2
> Vcells  75260  0.6     786432  6.0
> > (system.time(lapply(test,S3vector)))
> [1] 0.17 0.00 0.19 NA NA
> > gc()
>          used (Mb) gc trigger (Mb)
> Ncells 169136  4.6     531268 14.2
> Vcells  75266  0.6     786432  6.0
> > (system.time(lapply(test,S4vector)))
> [1] 15.08 0.00 15.13 NA NA
> -----
>
> That is a factor of more than 80!
>
> Is there something trivial I overlooked?
> Is this factor of 80 normal?
>

My experience is that calling the constructor _with_ data (that is, passing the slot values to new()) is slow. Creating an empty object first and assigning the slots afterwards performs somewhat better:

R> S3content <- function(obj=NULL,add1=NULL,add2=NULL,type="",...){
+          out <- list(content=obj,add1=add1,add2=add2,type=type)
+          class(out) <- "S3Content"
+          return(out)
+ }
R>
R> S3vector <- function(vec,...){
+    out <- S3content(obj=vec,type="Vector",...)
+    class(out) <- "S3Vector"
+    return(out)
+ }
R>
R>
R> ### S4 classes
R>
R> setClass("S4content",representation(content="ANY",add1="ANY",add2="ANY",type="character"))
[1] "S4content"
R>
R> S4vector <- function(vec,...){
+    RET <- new("S4content")
+    RET@type <- "vector"
+    RET@content <- vec
+    RET
+ }
R>

R> test <- rnorm(10000)
R> gc()
         used (Mb) gc trigger (Mb)
Ncells 156181  4.2     350000  9.4
Vcells  67973  0.6     786432  6.0

R> system.time(lapply(test,S3vector))
[1] 0.23 0.00 0.23 0.00 0.00
R> gc()
         used (Mb) gc trigger (Mb)
Ncells 156314  4.2     350000  9.4
Vcells  68005  0.6     786432  6.0

R> system.time(lapply(test,S4vector))
[1] 6.04 0.00 6.04 0.00 0.00
R>
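
For anyone who wants to repeat the comparison, here is a minimal sketch that puts the two S4 construction styles from this thread side by side. The names S4vector_args, S4vector_slots and elapsed are hypothetical and introduced only for illustration; the class definition is the one from the original post. Absolute timings will differ between R versions and machines, so the numbers printed should not be expected to match the ones quoted above.

```r
library(methods)  # normally attached already; loaded explicitly to be safe

setClass("S4content",
         representation(content = "ANY", add1 = "ANY", add2 = "ANY",
                        type = "character"))

# Style 1: pass the slot values directly to new()
S4vector_args <- function(vec, ...) {
  new("S4content", type = "vector", content = vec, ...)
}

# Style 2: create the object from its prototype, then assign the slots
S4vector_slots <- function(vec, ...) {
  RET <- new("S4content")
  RET@type <- "vector"
  RET@content <- vec
  RET
}

# Hypothetical helper: elapsed seconds for applying f to each element of x
elapsed <- function(f, x) {
  system.time(lapply(x, f))[["elapsed"]]
}

test <- rnorm(1000)
a <- S4vector_args(test[1])
b <- S4vector_slots(test[1])

# Both styles should end up with the same slot contents
stopifnot(identical(a@content, b@content), identical(a@type, b@type))

timings <- c(args  = elapsed(S4vector_args, test),
             slots = elapsed(S4vector_slots, test))
print(timings)
```

The stopifnot() line is there only to check that the cheaper construction does not change the resulting object, so whatever difference shows up in timings comes from how the object is built and not from what it contains.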

Torsten

> Is it still advisable to use S4 classes
> in light of that?
>
>
>
> Eric
>
> Eric Lecoutre
> UCL / Institut de Statistique
> Voie du Roman Pays, 20
> 1348 Louvain-la-Neuve
> Belgium
>
> tel: (+32)(0)10473050
> lecoutre@stat.ucl.ac.be
> http://www.stat.ucl.ac.be/ISpersonnel/lecoutre
>
> If the statistics are boring, then you've got the wrong numbers. -Edward
> Tufte
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Sat Jan 15 01:23:22 2005
