Re: [R] R versus SAS: lm performance

From: Peter Dalgaard (p.dalgaard@biostat.ku.dk)
Date: Tue 11 May 2004 - 22:37:51 EST


Message-id: <x2ekpr3zs0.fsf@biostat.ku.dk>


"Liaw, Andy" <andy_liaw@merck.com> writes:

> I tried the following on an Opteron 248, R-1.9.0 w/Goto's BLAS:
>
> > y <- matrix(rnorm(14000*1344), 1344)
> > x <- matrix(runif(1344*503),1344)
> > system.time(fit <- lm(y~x))
> [1] 106.00 55.60 265.32 0.00 0.00
>
> The resulting fit object is over 600MB. (The coefficient component is a 504
> x 14000 matrix.)
>
> If I'm not mistaken, SAS sweeps on the extended cross product matrix to fit
> regression models. That, I believe, is usually faster than doing QR
> decomposition on the model matrix itself, but there are trade-offs. You
> could try what Prof. Bates suggested.

Hmm. Shouldn't be all that much faster, but it will produce the Type I
SS as you go along, whereas R probably wants to fit the 15 different
models.
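
To make the comparison concrete, here is a small sketch of the two routes
discussed above: solving the normal equations from the cross-product
matrix versus a QR decomposition of the model matrix. This is not SAS's
sweep operator, only a plain solve() on X'X, and the data and the names
(X, y, b_xp, b_qr) are made up for illustration.

```r
set.seed(1)
n <- 200
p <- 5

# Toy model matrix with an intercept column (assumed data, not from the thread)
X <- cbind(1, matrix(runif(n * p), n))
y <- rnorm(n)

# Cross-product route: form X'X and X'y, then solve the normal equations
XtX <- crossprod(X)
Xty <- crossprod(X, y)
b_xp <- solve(XtX, Xty)

# QR route on the model matrix itself, which is what lm() relies on
b_qr <- qr.coef(qr(X), y)

# The two sets of coefficients should agree up to numerical tolerance
all.equal(as.vector(b_xp), as.vector(b_qr))
```

On a well-conditioned problem like this one the two agree; the trade-off
is that forming X'X squares the condition number, so the cross-product
route is less forgiving when the model matrix is close to singular.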

I'm still surprised that R/S-PLUS manages to use a full 15 minutes on
a single response variable. It might be due to the singularities --
the SAS code indicated that there was a nesting issue with the "A"
factor in the last 4-factor interaction. If so, a reformulation of the
model might help.
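
One way to check for this kind of singularity is to compare the rank of
the model matrix with its number of columns. The sketch below uses a
hypothetical toy data set (factors A and B, with B nested in A), not the
poster's actual design, so treat it only as an illustration of the check.

```r
# Hypothetical toy data: each level of B occurs within a single level of A
d <- data.frame(
  A = factor(rep(c("a1", "a2"), each = 4)),
  B = factor(rep(c("b1", "b2", "b3", "b4"), each = 2))
)
set.seed(2)
d$y <- rnorm(nrow(d))

# Model matrix for the crossed formula, although the design is nested
X <- model.matrix(~ A * B, data = d)
rank_X <- qr(X)$rank

# Fewer independent columns than columns means the design is singular
c(columns = ncol(X), rank = rank_X)

# lm() reports the aliased coefficients as NA
fit <- lm(y ~ A * B, data = d)
n_aliased <- sum(is.na(coef(fit)))
n_aliased
```

If the rank comes out lower than the column count, rewriting the formula
with explicit nesting (for example y ~ A / B in this toy case) removes the
redundant columns before the fit.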

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

______________________________________________ R-help@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

