From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>

Date: Fri 06 Jan 2006 - 04:42:59 EST

> The reference, I believe, is to the linear.hypothesis() function, which has
> methods for lm and glm objects. [To see what kinds of objects
> linear.hypothesis() is suitable for, use the command
> methods(linear.hypothesis).] For lm objects, you get an F-test by default.
> Note that the Anova() function, also in car, can more conveniently compute
> Wald tests for certain kinds of hypotheses. More generally, however, I'd be
> interested in your suggestions for an alternative method of specifying
> linear hypotheses. There is currently no method for mle objects, but adding
> one is a good idea, and I'll do that when I have a chance. (In the meantime,
> it's very easy to compute Wald tests from the coefficients and the
> hypothesis and coefficient-covariance matrices. Writing a small function to
> do so, without the bells and whistles of something like linear.hypothesis(),
> should not be hard. Indeed, the ability to do this kind of thing easily is
> what I see as the primary advantage of working in a statistical computing
> environment like R -- or Stata.)
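[Editor's note: the small function John describes can be sketched in a few
lines of base R. This is only an illustration of the general Wald computation
W = (Lb - b0)' (L V L')^{-1} (Lb - b0); the names wald.test, L, and b0 are
illustrative, not part of car or any package mentioned in the thread.]

```r
## Hedged sketch: a Wald test of the linear hypothesis L b = b0,
## built only from coef() and vcov() of a fitted model.
## wald.test, L, b0 are illustrative names, not an existing API.
wald.test <- function(model, L, b0 = rep(0, nrow(L)),
                      vcov. = vcov(model)) {
  b <- coef(model)
  d <- L %*% b - b0                              # departure from H0: L b = b0
  W <- as.vector(t(d) %*% solve(L %*% vcov. %*% t(L)) %*% d)
  df <- nrow(L)                                  # one df per hypothesis row
  c(W = W, df = df, p = pchisq(W, df, lower.tail = FALSE))
}

## Example: test that the slope is zero in a simple regression
set.seed(1)
x <- rnorm(50); y <- 2 * x + rnorm(50)
fit <- lm(y ~ x)
L <- matrix(c(0, 1), nrow = 1)                   # picks out the slope
wald.test(fit, L)
```

For a single coefficient, W is just the squared t-ratio (coef/se)^2, so this
reproduces the usual one-degree-of-freedom test, referred to a chi-square
rather than a t distribution.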

> I think that we have a philosophical difference here: I don't like giving
> advice in documentation. An egregious extended example of this, in my
> opinion, is the SPSS documentation. The hccm() function uses hc3 as the
> default, which is an implicit recommendation, but more usefully, in my view,
> points to Long and Ervin's American Statistician paper on the subject, which
> does give advice and which is quite accessible. As well, and more generally,
> the car package is associated with a book (my R and S-PLUS Companion to
> Applied Regression), which gives advice, though, admittedly, tersely in this
> case.
>
> The Anova() function with argument white=TRUE will give you F-tests
> corresponding to the t-tests to which you refer (though it will combine df
> for multiple-df terms in the model). To get the kind of summary you
> describe, you could use something like
>     mysummary <- function(model) {
>         coef <- coef(model)
>         se <- sqrt(diag(hccm(model)))
>         t <- coef/se
>         p <- 2*pt(abs(t), df=model$df.residual, lower.tail=FALSE)
>         table <- cbind(coef, se, t, p)
>         rownames(table) <- names(coef)
>         colnames(table) <- c("Estimate", "Std. Error", "t value", "Pr(>|t|)")
>         table
>     }
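[Editor's note: since hccm() lives in the car package, readers without it may
find it useful to see what the default hc3 estimator computes. The following
is a hedged base-R sketch of the standard HC3 formula, not car's actual code;
the function name hc3 is illustrative.]

```r
## Hedged sketch of the HC3 heteroskedasticity-consistent covariance
## estimator: (X'X)^{-1} X' diag(e_i^2/(1-h_i)^2) X (X'X)^{-1},
## using only base R. Illustrative only; not car::hccm's source.
hc3 <- function(model) {
  X <- model.matrix(model)
  e <- residuals(model)
  h <- hatvalues(model)                     # leverages h_i
  XtXinv <- solve(crossprod(X))
  ## "meat": X' diag(e^2/(1-h)^2) X, via row-wise scaling of X
  meat <- crossprod(X * (e / (1 - h))^2, X)
  XtXinv %*% meat %*% XtXinv
}

## Usage: HC3 standard errors for a fit
set.seed(2)
x <- rnorm(30); y <- 1 + x + rnorm(30) * (1 + abs(x))
fit <- lm(y ~ x)
sqrt(diag(hc3(fit)))
```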

> Again, it's not time-consuming to write simple functions like this for
> one's own use, and the ability to do so is a strength of R, in my view.
>
> I'm not sure what you mean about handling missing data: functions like
> hccm(), linear.hypothesis(), and Anova() start with a model object for
> which missing data have already been handled.
>
> Regards,
> John
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


John Fox wrote:

> Dear Peter,

>> -----Original Message-----
>> From: r-help-bounces@stat.math.ethz.ch
>> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Peter Muhlberger
>> Sent: Wednesday, January 04, 2006 2:43 PM
>> To: rhelp
>> Subject: [R] A comment about R:

>> Ex. 1) Wald tests of linear hypotheses after max. likelihood or even
>> after a regression. "Wald" does not even appear in my standard R package
>> on a search. There's no comment in the lm help or optim help about what
>> function to use for hypothesis tests. I know that statisticians prefer
>> likelihood ratio tests, but Wald tests are still useful and indeed
>> crucial for first-pass analysis. After searching with Google for some
>> time, I found several Wald functions in various contributed R packages I
>> did not have installed. One confusion was which one would be relevant to
>> my needs. This took some time to resolve. I concluded, perhaps on
>> insufficient evidence, that package car's Wald test would be most
>> helpful. To use it, however, one has to put together a matrix for the
>> hypotheses, which can be arduous for a many-term regression or a complex
>> hypothesis. In comparison, in Stata one simply states the hypothesis in
>> symbolic terms.
>>
>> I also don't know for certain that this function in car will work or
>> work properly w/ various kinds of output, say from lm or from optim. To
>> be sure, I'd need to run time-consuming tests comparing it with Stata
>> output or examine the function's code. In Stata the test is easy to
>> find, and there's no uncertainty about where it can be run or its
>> accuracy. Simply having a comment or "see also" in lm help or mle or
>> optim help pointing the user to the right Wald function would be of
>> enormous help.

The Design package's anova.Design and contrast.Design make many Wald tests very easy. contrast() will let you test all kinds of hypotheses by stating which differences in predicted values you are interested in.

Frank Harrell


> The reference, I believe, is to the linear.hypothesis() function, which has ...

>> Ex. 2) Getting neat output of a regression with Huberized variance
>> matrix. I frequently have to run regressions w/ robust variances. In
>> Stata, one simply adds the word "robust" to the end of the command, or
>> "cluster(cluster.variable)" for a cluster-robust error. In R, there are
>> two functions, robcov and hccm. I had to run tests to figure out what
>> the relationship is between them, and between them and Stata (robcov w/o
>> cluster gives hccm's hc0; hccm's hc1 is equivalent to Stata's 'robust'
>> w/o cluster; etc.). A single sentence in hccm's help saying something to
>> the effect that statisticians prefer hc3 for most types of data might
>> save me from having to scramble through the statistical literature to
>> try to figure out which of these I should be using. A few sentences on
>> what the differences are between these methods would be even better.
>>
>> Then, there's the problem of output. Given that hc1 or hc3 are preferred
>> for non-clustered data, I'd need to be able to get regression output of
>> the form summary(lm) out of hccm for any practical use. Getting this,
>> however, would require programming my own function. Huberized t-stats
>> for regressions are a commonplace need; an R oriented a little more
>> toward everyday needs would not require users to program them. Also, I'm
>> not sure yet how well any of the existing functions handle missing data.
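[Editor's note: the cluster-robust variance Peter mentions can also be
computed in a few lines of base R. The sketch below is the plain CR0
sandwich, with no small-sample correction; clrobust is an illustrative name
and this is not the source of robcov or of Stata's cluster option.]

```r
## Hedged sketch of the cluster-robust (CR0) sandwich estimator:
## (X'X)^{-1} (sum_g X_g' e_g e_g' X_g) (X'X)^{-1}.
## Illustrative only; no small-sample degrees-of-freedom correction.
clrobust <- function(model, cluster) {
  X <- model.matrix(model)
  e <- residuals(model)
  XtXinv <- solve(crossprod(X))
  ## sum score contributions x_i * e_i within each cluster,
  ## then form the "meat" from the cluster totals
  S <- rowsum(X * e, group = cluster)     # one row of X'e per cluster
  XtXinv %*% crossprod(S) %*% XtXinv
}

## Usage: clustered standard errors for a fit
set.seed(3)
x <- rnorm(40); y <- x + rnorm(40)
fit <- lm(y ~ x)
g <- rep(1:8, each = 5)                   # 8 clusters of 5 observations
sqrt(diag(clrobust(fit, g)))
```

A useful sanity check: when every observation is its own cluster, CR0
collapses to the HC0 heteroskedasticity-consistent estimator.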

> I think that we have a philosophical difference here: I don't like giving ...

--
Frank E Harrell Jr
Professor and Chair
Department of Biostatistics
School of Medicine
Vanderbilt University

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Fri Jan 06 04:50:57 2006

This archive was generated by hypermail 2.1.8 : Fri 06 Jan 2006 - 06:04:56 EST