Re: [R] Antwort: Re: Antwort: Buying more computer for GLM

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Fri 01 Sep 2006 - 13:50:52 GMT

On Fri, 1 Sep 2006, g.russell@eos-finance.com wrote:

> Prof Brian Ripley wrote:
> > I would not have expected glm to be more than say 5x slower than lm if
> > CPU cycles and not memory were the limiting factor. In that case more
> > RAM might be all you need.

>
> The ratio between glm and lm might well be about 5x, but that's still a
> big difference for us.

You said lm was 'very fast', so I did not expect 5x 'very fast' to be 'too slow'.

> I am pretty sure that RAM is not the main problem; according to the
> Windows Task Manager the computer is at close to 100% CPU usage, and
> swapping is not going on. Of course L1/L2 caches may still be something
> one can work on, but I'm not sure whether glm has enough repeated access
> to the same data for that to help. (I don't know how glm works, but I
> guess it does a lot of scans through the whole data set, and that the
> amount of working memory it needs during these scans is basically a
> function of the number of parameters, not the number of observations, is
> that right?)

Not so. Because glm does weighted fits, it needs to access the whole data matrix at each iteration (to re-weight).
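To make the re-weighting concrete, here is a minimal sketch of iteratively reweighted least squares for a logistic model. It is an illustration only, not the code in glm.fit: the function name irls_logistic, the simulated data and the tolerance are all assumptions made for this example. The point to notice is that every iteration recomputes a weight and a working response for each of the n observations and then refits on the full n x p matrix.

```r
# Illustrative IRLS for logistic regression (hypothetical helper, not glm.fit).
irls_logistic <- function(X, y, max_iter = 25, tol = 1e-8) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)      # linear predictor: touches all n rows
    mu  <- plogis(eta)           # fitted probabilities
    w   <- mu * (1 - mu)         # working weights, one per observation
    z   <- eta + (y - mu) / w    # working response (assumes no w is exactly 0)
    beta_new <- lm.wfit(X, z, w)$coefficients  # weighted fit on the full matrix
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  list(coefficients = beta, iterations = iter)
}

# Simulated data; the sizes and true coefficients are arbitrary choices.
set.seed(1)
n <- 5000
X <- cbind(1, rnorm(n), rnorm(n))
y <- rbinom(n, 1, plogis(X %*% c(-0.5, 1, 0.3)))

fit_irls <- irls_logistic(X, y)
fit_glm  <- glm.fit(X, y, family = binomial())
```

Comparing fit_irls$coefficients with fit_glm$coefficients should show close agreement, and fit_irls$iterations gives the number of full weighted fits that were needed, which is roughly why a glm costs a small multiple of one lm fit.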

> Many thanks for your observations about subset selection by the way, they
> are a lot of help. Would a good approach be, say, to use some stricter
> criteria like BIC for choosing a model, and then use non-statistical
> methods to improve the plausibility of the chosen parameters?

The latter entirely I would say. All statistics can say is that a variable improves the fit measurably more than one that is unrelated to the response: whether it improves it enough to be worthwhile in your application is non-statistical. The point here is that all but the most useless variables will measurably improve the fit in large problems with few variables.
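A small simulation may help illustrate the last point. The sample size (one million) and the slope (0.01) are assumptions picked for this sketch, not figures from the thread. With that many observations even a predictor explaining a tiny share of the variance would be expected to pass a significance test and to lower BIC, which says nothing about whether it is worth having in the application.

```r
# Simulated example: a very weak predictor in a very large sample.
set.seed(42)
n <- 1e6
x <- rnorm(n)
y <- 0.01 * x + rnorm(n)   # assumed true slope of 0.01, unit noise

fit0 <- lm(y ~ 1)          # intercept-only model
fit1 <- lm(y ~ x)          # model with the weak predictor

s <- summary(fit1)
p_value   <- coef(s)["x", "Pr(>|t|)"]  # significance of the slope
r_squared <- s$r.squared               # share of variance explained
bic_gain  <- BIC(fit0) - BIC(fit1)     # positive means BIC prefers fit1
```

Inspecting p_value, r_squared and bic_gain side by side shows the gap between "measurably improves the fit" and "improves it enough to matter"; the second judgement has to come from the application.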

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri Sep 01 23:57:42 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 07 Sep 2006 - 07:51:17 GMT.
