Re: [R] Bug in stepAIC?

From: Martin C. Martin <martin_at_martincmartin.com>
Date: Thu 12 Oct 2006 - 13:47:10 GMT

Prof Brian Ripley wrote:
> You sent this earlier to R-devel. Please do see the posting guide!
> Since you (incorrectly) thought this was a bug in MASS, you should have
> contacted the maintainer.

Thanks, but I did try emailing both you and Prof. Venables directly a month ago. After not receiving a response, I emailed R-devel last week. When there was no response there either, I thought perhaps the code was correct after all and I had misunderstood how to call it - a perfect question for R-help.

There can be a fine line between R-help and R-devel, which is even harder to find when you're new to R and you don't really know where the problem is.

> On Wed, 11 Oct 2006, Martin C. Martin wrote:
>

>> Hi,
>>
>> First of all, thanks for the great work on R in general, and MASS in
>> particular.  It's been a life saver for me many times.
>>
>> However, I think I've discovered a bug.  It seems that, when I use
>> weights during an initial least-squares regression fit, and later try to
>> add terms using stepAIC(), it uses the weights when looking to remove
>> terms, but not when looking to add them:
>>
>> hills.lm <- lm(time ~ dist + climb, data = hills, weights = 1/dist2)

>
> Presumably dist^2?

Yes, sorry, a problem with Thunderbird being a little too smart for its own good. :)

>> small.hills.lm <- stepAIC(hills.lm)
>> stepAIC(small.hills.lm, time ~ dist + climb)
>>
>> In the first stepAIC(), it says that the AIC for the full "time ~ dist +
>> climb" is 94.41.  Yet, during the second stepAIC, it says adding climb
>> would produce an AIC of 212.1 (and an RSS of 12633.3).  Is this a bug?

>
> Yes, but not in stepAIC. Consider
>
>> drop1(hills.lm)

> Single term deletions
>
> Model:
> time ~ dist + climb
>        Df Sum of Sq    RSS    AIC
> <none>              437.64  94.41
> dist    1    164.05 601.68 103.55
> climb   1      8.66 446.29  93.10
>> add1(small.hills.lm, time ~ dist + climb)

> Single term additions
>
> Model:
> time ~ dist
>        Df Sum of Sq     RSS   AIC
> <none>              15787.2 217.9
> climb   1    3153.8 12633.3 212.1
>> stats:::add1.default(small.hills.lm, time ~ dist + climb)

> Single term additions
>
> Model:
> time ~ dist
>        Df    AIC
> <none>    93.097
> climb   1 94.411
>
> so the bug is in add1.lm, part of R itself. Other code has been altered
> which then broke add1.lm and 'z' needs to be given class "lm". Now
> fixed in r-devel and r-patched.

Great; thanks!

Best,
Martin
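
For anyone checking the numbers in this thread: the AIC values in the correct tables above are consistent with the quantity extractAIC() computes for lm fits, n * log(RSS/n) + 2 * edf, where RSS is the weighted residual sum of squares when weights are supplied. What follows is only a minimal sketch of that check, assuming the hills data from MASS and the weights written out as 1/dist^2. The variable names (fit_full, rss_w, aic_hand) are illustrative and do not come from the original posts.

```r
library(MASS)  # provides the hills data set and stepAIC()

# Weighted fit from the thread, with the weights written out as 1/dist^2.
fit_full <- lm(time ~ dist + climb, data = hills, weights = 1/dist^2)

n   <- nrow(hills)                 # number of observations
edf <- n - df.residual(fit_full)   # number of estimated coefficients

# Weighted residual sum of squares: sum of w_i * e_i^2,
# where residuals() returns the raw (unweighted) residuals e_i.
rss_w <- sum(weights(fit_full) * residuals(fit_full)^2)

# AIC (up to an additive constant) on the scale extractAIC() uses for lm fits.
aic_hand <- n * log(rss_w / n) + 2 * edf

# The two values should agree if the weights are being honoured.
print(c(by_hand = aic_hand, extractAIC = extractAIC(fit_full)[2]))
```

The thread reports an RSS of 437.64 and an AIC of 94.41 for this model. A value on a very different scale, such as the 212.1 shown by the broken add1(), suggests the weights were not applied as intended.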



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu Oct 12 23:50:37 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 12 Oct 2006 - 14:30:09 GMT.
