[R] Possible improvement in lm

From: Vivek Satsangi <vivek.satsangi_at_gmail.com>
Date: Wed 18 Jan 2006 - 23:08:00 EST


I run a series of regressions (one for each quarter in the dataset) and then extract the residuals from each stored lm object, as follows:

vResiduals <- as.vector(unlist(resid(lQuarterlyRegressions[[i]])));

Here lQuarterlyRegressions is a list of the objects returned by lm(), one per quarter.

Next, I may look for outliers using identify() on a plot, or do some other analysis that tells me which row of the quarterly data I need to examine more closely.

However, if I try to match a point in one of the quarters with its residual, I first have to drop from my "current data" every row that has an NA in either the explanatory variables or the explained variable, so that the vector of residuals and the data have the same indexes.
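
To make the mismatch concrete, here is a minimal sketch on made-up data. The data frame dQuarter and the names fit and vAligned are hypothetical and used only for illustration; na.action = na.omit is written out explicitly, on the assumption that it is the usual default. The residuals keep the row names of the rows that were used, so they can be matched by name rather than by position.

```r
# Hypothetical toy data standing in for one quarter; rows 2 and 3 contain an NA.
dQuarter <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                       y = c(1.1, NA, 3.2, 3.9, 5.1, 6.2))

# na.omit silently drops the incomplete rows before fitting.
fit <- lm(y ~ x, data = dQuarter, na.action = na.omit)
vResiduals <- resid(fit)

length(vResiduals)   # 4, although dQuarter has 6 rows
names(vResiduals)    # "1" "4" "5" "6": row names of the rows that were kept

# Align by row name instead of by position: start from an all-NA vector
# with one slot per data row, then fill in the residuals that exist.
vAligned <- rep(NA_real_, nrow(dQuarter))
names(vAligned) <- rownames(dQuarter)
vAligned[names(vResiduals)] <- vResiduals
```

Note that wrapping resid() in as.vector(unlist(...)) strips these names, which is what makes positional matching the only option left.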

This led to some serious confusion and bugs for me, and I am wondering whether it might be better for lm to put an NA into those rows where the point was dropped because of NAs in the explanatory or explained variables (currently the residual vector simply has no entry for those rows). Of course, there might be some arguments against this idea, and I would be interested to hear them.
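
As a sketch of the padded output I have in mind: passing na.exclude as the na.action argument of lm() appears to give this behavior, although it is not the default. The data frame dQuarter and the names fitPadded and vPadded are hypothetical, used only for illustration.

```r
# Hypothetical toy data standing in for one quarter; rows 2 and 3 contain an NA.
dQuarter <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                       y = c(1.1, NA, 3.2, 3.9, 5.1, 6.2))

# na.exclude still fits on the complete rows only, but records which rows
# were dropped so that resid() can pad the result with NA.
fitPadded <- lm(y ~ x, data = dQuarter, na.action = na.exclude)
vPadded <- resid(fitPadded)

length(vPadded) == nrow(dQuarter)   # TRUE: one entry per data row
which(is.na(vPadded))               # positions 2 and 3, the dropped rows
```

With this, vPadded[i] lines up with dQuarter[i, ] directly, so a row picked out with identify() can be looked up without first filtering the data.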

Thank you for your time and attention,

