From: Paul Johnson <pauljohn32_at_gmail.com>

Date: Wed, 14 Dec 2011 00:30:33 -0600

I'm making some functions to illustrate regressions and I have been
staring at termplot and predict.lm and residuals.lm to see how this is
done. I've wondered who wrote predict.lm originally, because I think
it is very clever.

I got interested because termplot doesn't work with models that include interactions:

> m1 <- lm(y ~ x1*x2)

> termplot(m1)

Error in `[.data.frame`(mf, , i) : undefined columns selected

Digging into that, I realized some surprising implications of nonlinear formulas.

This issue arises when there are math functions in the regression formula. The question focuses on what we mean by the mean of "x" when we are discussing predictions and deviations.

Suppose one fits:

m1 <- lm(y ~ x1 + log(x2), data = dat)

I had thought the partial residual was calculated with reference to the log of the mean of x2, that is, log(mean(x2)). But that's not right. It is calculated with reference to mean(log(x2)). That seems misleading, because termplot draws its graph with x2 (not "log(x2)") on the horizontal axis. I should not say misleading. Rather, it is unexpected. I think users who want the reference value in the plot of x2 to be the mean of x2 have a legitimate concern here.
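To make the difference concrete, here is a minimal sketch on simulated data. The data frame dat, the seed, and the coefficients used to generate y are my own assumptions for illustration; only lm(), residuals(), and coef() from base R are used. It rebuilds the partial residual for log(x2) by hand under both centering rules and checks which one matches residuals(m1, type = "partial").

```r
# Simulated data (an assumption for illustration, not the original dat)
set.seed(1)
dat <- data.frame(x1 = rnorm(100), x2 = runif(100, 1, 10))
dat$y <- 1 + 0.5 * dat$x1 + 2 * log(dat$x2) + rnorm(100)

m1 <- lm(y ~ x1 + log(x2), data = dat)

# Partial residuals as R computes them: one column per model term
pr <- residuals(m1, type = "partial")

# Hand-built versions under the two candidate centering rules
b <- coef(m1)["log(x2)"]
by_mean_of_log <- residuals(m1) + b * (log(dat$x2) - mean(log(dat$x2)))
by_log_of_mean <- residuals(m1) + b * (log(dat$x2) - log(mean(dat$x2)))

# Compare each hand-built version with R's column for log(x2)
matches_mean_of_log <- isTRUE(all.equal(unname(pr[, "log(x2)"]),
                                        unname(by_mean_of_log)))
matches_log_of_mean <- isTRUE(all.equal(unname(pr[, "log(x2)"]),
                                        unname(by_log_of_mean)))
```

In this sketch the two hand-built versions differ only by the constant b * (log(mean(x2)) - mean(log(x2))), so the choice shifts the whole partial-residual column up or down without changing its shape.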

With a more elaborate formula, the mismatch gets more confusing. Suppose the regression formula is

m2 <- lm(y ~ x1 + poly(x2, 3), data = dat)

The model frame has these variables:

y x1 poly(x2, 3).1 poly(x2, 3).2 poly(x2, 3).3

and the partial residual calculation for variable x1, which I had expected to be based on a polynomial transformation of mean(x2), instead uses the coefficient-weighted sum of the means of the 3 poly columns.
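The same check can be sketched for the poly() case. Again the simulated dat, the seed, and the helper names (X, cols, centered, tt) are assumptions for illustration. The sketch centers each poly column of the model matrix on its own column mean, multiplies by the fitted coefficients, and compares the result with the term that predict(m2, type = "terms") reports.

```r
# Simulated data (an assumption for illustration, not the original dat)
set.seed(2)
dat <- data.frame(x1 = rnorm(100), x2 = runif(100, 1, 10))
dat$y <- 1 + 0.5 * dat$x1 + 0.3 * dat$x2^2 + rnorm(100)

m2 <- lm(y ~ x1 + poly(x2, 3), data = dat)

X <- model.matrix(m2)
cols <- grep("poly(", colnames(X), fixed = TRUE)  # the three poly columns
b <- coef(m2)[cols]

# Center each poly column on its own mean, then weight by the coefficients
centered <- sweep(X[, cols], 2, colMeans(X[, cols])) %*% b

# Centered term contributions as computed by predict.lm
tt <- predict(m2, type = "terms")
same_term <- isTRUE(all.equal(unname(tt[, "poly(x2, 3)"]),
                              as.vector(centered)))

# The "constant" attribute is the coefficient-weighted sum of all column means
same_const <- isTRUE(all.equal(attr(tt, "constant"),
                               sum(colMeans(X) * coef(m2))))

# With the default orthogonal polynomials the poly column means are
# zero up to rounding error, so the centering step changes almost nothing here
max_poly_mean <- max(abs(colMeans(X[, cols])))
```

Nothing in this calculation evaluates the polynomial basis at mean(x2); the reference point comes entirely from the column means of the model matrix.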

Can you help me see this more clearly? (Or less wrongly?)

Perhaps you think I don't understand partial residuals in termplot, but I am pretty sure I do. I made notes about it. See slides 54 and 55 in here: http://pj.freefaculty.org/guides/Rcourse/regression-tableAndPlot-1/regression-tableAndPlot.pdf

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 14 Dec 2011 - 06:32:56 GMT
