Re: [R] Antwort: Re: Antwort: Buying more computer for GLM

From: Prof Brian Ripley <>
Date: Fri 01 Sep 2006 - 09:12:48 GMT

On Fri, 1 Sep 2006, wrote:

> Peter Dalgaard wrote
> > Is this floating point bound? (When you say 30 factors does that mean
> > 30 parameters or factors representing a much larger number of groups).
> > If it is integer bound, I don't think you can do much better than
> > increase CPU speed and - note - memory bandwidth (look for large-cache
> > systems and fast front-side bus). To increase floating point
> > performance, you might consider the option of using optimized BLAS
> > (see the Windows FAQ 8.2 and/or the "R Installation and
> > Administration" manual) like ATLAS; this in turn may be multithreaded
> > and make use of multiple CPUs or multi-core CPUs.
> By "factors" I mean "parameters". I apologise for the confusion.
> This is floating point bound, so ATLAS might be a good idea.
> Before I put a lot of work into investigating multiple processors, I
> need to know, is the bottleneck with GLM going to be BLAS?

Probably not, but you have the ability to profile in R and find out.
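As a rough sketch of what such profiling could look like: the data below are simulated stand-ins (20,000 observations, 30 parameters, a binomial family), not the poster's actual problem, and the names dat, prof_file and prof are made up for illustration. Rprof() and summaryRprof() are the standard sampling profiler tools in R.

```r
# Hypothetical simulated data standing in for the real problem:
# 20,000 observations, 30 parameters (intercept + 29 covariates).
set.seed(1)
n <- 20000
p <- 29
X <- matrix(rnorm(n * p), n, p)
beta <- rnorm(p, sd = 0.1)
y <- rbinom(n, 1, plogis(drop(X %*% beta)))
dat <- data.frame(y = y, X)

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                 # start the sampling profiler
for (i in 1:10) {                # repeat the fit so enough samples are taken
  fit <- glm(y ~ ., family = binomial, data = dat)
}
Rprof(NULL)                      # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self, 10)           # functions ranked by time spent in their own code
```

If the top entries are the QR/least-squares routines, the fit is dominated by linear algebra; if they are things like model.frame or model.matrix, a faster BLAS would not help much in any case.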

Some more comments:

  1. The Fortran code that underlies glm only makes use of level-1 BLAS and so is not going to be helped greatly by an optimized BLAS.
  2. As far as I know, no one has succeeded in making a multithreaded Rblas.dll for Windows. And on systems using pthreads, success with multithreaded BLAS is very mixed: it results in a dramatic slowdown on some problems.
  3. As I recall, you were doing model selection via AIC on 20,000 observations. You might want to think hard about that, since AIC is designed for good prediction. I would do model exploration on a much smaller representative subset, and if I had 20,000 observations and 30 parameters and was interested in prediction, not do subset selection at all.
  4. glm() allows you to specify starting parameters, which you could find from a subsample. Very likely only 1 or 2 iterations would be needed.
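
A minimal sketch of the starting-values idea in point 4, under stated assumptions: the data are simulated (not the poster's), the subsample size of 2,000 is an arbitrary choice, and the names sub, fit_sub, fit_warm and fit_cold are illustrative. The start argument of glm() and the iter component of the fitted object are part of base R.

```r
# Hypothetical simulated data: 20,000 observations, 30 parameters.
set.seed(2)
n <- 20000
p <- 29
X <- matrix(rnorm(n * p), n, p)
beta <- c(-0.5, rnorm(p, sd = 0.2))          # intercept + 29 slopes
y <- rbinom(n, 1, plogis(drop(cbind(1, X) %*% beta)))
dat <- data.frame(y = y, X)

# Fit on a random subsample (2,000 rows is an arbitrary choice).
sub <- dat[sample(n, 2000), ]
fit_sub <- glm(y ~ ., family = binomial, data = sub)

# Full-data fit started from the subsample coefficients ...
fit_warm <- glm(y ~ ., family = binomial, data = dat, start = coef(fit_sub))
# ... and the default fit, for comparison.
fit_cold <- glm(y ~ ., family = binomial, data = dat)

c(warm = fit_warm$iter, cold = fit_cold$iter)   # IRLS iterations used by each fit
all.equal(coef(fit_warm), coef(fit_cold), tolerance = 1e-6)
```

How many iterations are saved will depend on the model and on how representative the subsample is, so it is worth comparing the iteration counts on the real data.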

Brian D. Ripley,
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Sat Sep 02 04:22:54 2006