Re: [Rd] proposal: allowing alternative variance estimators in glm/lm

From: Thomas Lumley <>
Date: Tue 02 Jan 2007 - 18:03:02 GMT

On Tue, 2 Jan 2007, ivo welch wrote:
> As a non-statistician user of R, maybe a hook functionality at strategic
> places could provide some flexibility without too much pain. I think
> replacing the standard output from summary.lm would be a bad idea (it
> could easily create errors downstream, when idiots like myself ask "why
> don't I get the s.e. that stata produces? duh---you loaded
> heteroskedasticity adjustment, but forgot about it). But I think some
> flexibility to add more information would be a very good thing.
> Hooks that can be set by functions (perhaps cascades) would allow third
> parties to create additional statistics, that could survive future
> changes to the functions themselves, without requiring a full object
> paradigm. For example, summary.lm could provide two hooks that allow
> programmers to chain my own objects to either the ans$coefficients and
> the ans object. (I guess even one hook would do.) Well-thought-out
> hooks could also add to print methods, etc., without requiring complete
> function rewrites, and would survive future changes in the real R code
> itself.

In some sense this is what I was proposing -- I think that summary.lm and predict.lm should get coefficients from coef() and variance matrices from vcov() to provide hooks.

This should ideally happen automatically by inheritance, but for historical reasons all the basic code was written for the most specialized case (classical linear model) rather than the most general case.

>> From the perspective of a first-time amateurish end-user, an invokation
> of "library(lm.addnormalized)" could then magically always add a
> normalized coefficient to the coefficient output. An invokation of
> "library(lm.addheteroskedasticity)" could magically always add
> heteroskedasticity se's and T-stats. And so on.


I really think you want these properties to be part of the object, not global options (and particularly not global options set by the library() function). It should be possible to simultaneously have objects that use the fisher-information standard errors and other objects that use sandwich estimators.

I would say that "using sandwich standard errors" is a property that should be specified when the object is created (as it is for survival::coxph() and gee::gee()).

"Displaying normalized coefficients" is a bit different. I don't think you want to override what the coef() method produces, since that will mess up fitted values and predictions. You would also need to override the vcov() method, and that would cause other problems.

So in order to do what you want, the objects would need a printedcoef() and printedSE() method in addition to the actual coef() and vcov(). These would have to be called by a summary-like function. But even this isn't the level of generality that you would want, since if you are doing this work it would be nice to allow transformed coefficients (eg odds ratios rather than log odds ratios for logistic regression), and you probably then need a printedCI() method for confidence intervals. This would give quite a useful modelsummary() function (provided it did something sensible when the additional methods weren't present). The Design package does things along these lines already, but its display functions don't work on vanilla model objects.

In any case, I don't think this function should be called summary.lm(). My proposal was just to allow predict.lm and summary.lm to be used in cases where they already give the right sort of results and in addition to tidy up the duplication of variance-matrix code.

         -thomas mailing list Received on Wed Jan 03 09:38:18 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 02 Jan 2007 - 23:31:08 GMT.

Mailing list information is available at Please read the posting guide before posting to the list.