Re: [Rd] variable scope in update(): bug or feature?

From: NL <wuolong_at_gmail.com>
Date: Fri 22 Dec 2006 - 17:48:54 GMT

Here is an example:

> rm (list = ls())
> x <- 1:10
> mdata <- data.frame (z = rnorm (10), y = x + 3)
> m1 <- lm (y ~ x + z, data = mdata)
> summary (m1)

Call:
lm(formula = y ~ x + z, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-4.950e-16 -8.107e-17  2.085e-17  9.043e-17  3.787e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  3.000e+00  1.923e-16  1.560e+16   <2e-16 ***
x            1.000e+00  2.881e-17  3.472e+16   <2e-16 ***
z           -8.717e-17  1.149e-16 -7.590e-01    0.473
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.6e-16 on 7 degrees of freedom
Multiple R-Squared:     1,	Adjusted R-squared:     1
F-statistic: 6.103e+32 on 2 and 7 DF,  p-value: < 2.2e-16


> x <- rep (1:2, each = 5)
> m2 <- update (m1, ~ . - z)
> summary (m2)

Call:
lm(formula = y ~ x, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-2.000e+00 -1.000e+00  2.086e-16  1.000e+00  2.000e+00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.581   0.632  0.54474
x              5.000      1.000   5.000  0.00105 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.581 on 8 degrees of freedom
Multiple R-Squared: 0.7576,	Adjusted R-squared: 0.7273
F-statistic: 25 on 1 and 8 DF,  p-value: 0.001053

This is R 2.4.1 on Mac OS X 10.4.8. I think this could be a bug (at least it
is not doing what I expected) so I emailed R-devel.

Michael

On 12/22/06, Martin Maechler <maechler@stat.math.ethz.ch> wrote:
> Hi Michael,
> can you please
>
> - use a simple reproducible example --
> just for the convenience of your readers
>
> - use R-help. This is really a question about R.
>
>
>
> >>>>> "Michael" == Michael <wuolong@gmail.com>
> >>>>> on Thu, 21 Dec 2006 11:08:15 -0600 writes:
>
> Michael> I stumbled upon this when using update()
> Michael> (specifically update.lm()). If in the original
> Michael> call to lm(), say
>
> Michael> a <- lm (y ~ x + z, data = mydata)
>
> Michael> where y and z are in data frame mydata but x is in
> Michael> the global environment.
>
> Michael> Then if later I run,
>
> Michael> a0 <- update (a, ~ . - z)
>
> Michael> a0$model will contain values of x in the global
> Michael> environment which may well be different, even
> Michael> different length from mydata$y. Somehow, update()
> Michael> pads a0$model to have the same number of rows as
> Michael> the length of x.
>
> Michael> I would think that it would be desirable to use x as
> Michael> in a$model rather than the global one.
>
> Michael> Is this a bug or a deliberate feature?
>
> Michael> Thanks,
>
> Michael> Michael
>
> Michael> ______________________________________________
> Michael> R-devel@r-project.org mailing list
> Michael> https://stat.ethz.ch/mailman/listinfo/r-devel
>
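
For what it is worth, here is a minimal sketch of what seems to be going on, assuming update() simply modifies the stored call and re-evaluates it, so that any variable not found in `data` is looked up again in the formula's environment (here the global environment). The seed and the object names m2 and m3 are only for illustration. Passing the stored model frame as `data` is one possible way to refit with the x that m1 was originally fitted on:

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
x <- 1:10
mdata <- data.frame(z = rnorm(10), y = x + 3)
m1 <- lm(y ~ x + z, data = mdata)    # x is not in mdata, so the global x is used

x <- rep(1:2, each = 5)              # overwrite the global x after fitting m1

# Plain update(): the call is re-evaluated, so the *new* global x is picked up.
m2 <- update(m1, ~ . - z)
coef(m2)                             # intercept 1, slope 5 (not the original fit)

# Refit against the model frame stored in m1, which still holds the old x.
m3 <- update(m1, ~ . - z, data = model.frame(m1))
coef(m3)                             # intercept 3, slope 1, as with x = 1:10
```

Putting x into the data frame in the first place avoids the problem altogether, since lm() looks in `data` before the formula's environment.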
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sun Dec 24 01:05:53 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 23 Dec 2006 - 14:31:03 GMT.
