Re: [R] [Fwd: Re: [Fwd: failure delivery]]

From: Uwe Ligges
Date: Fri 27 May 2005 - 00:59:25 EST

Can you please specify a small reproducible example?

Uwe Ligges

Prof J C Nash wrote:

> I appear to have hit one of the "drop" issues raised in some discussions
> a couple of years ago by Frank Harrell. They don't seem to have been
> fixed, and I'm under some pressure to get a quick solution for a
> forecasting task I'm doing.
> I have been modelling some retail sales data, and the days just after
> Thanksgiving (US version!) are important. So I created some dummy
> variables by a factor called "events" and (really ugly!!) have TG, TG+1,
> TG+2, etc. Now I also have DEC1, and the calendar and data are such
> that in the period I'm forecasting I have TG+3 but this is
> NOT in the estimation data. There are also weekday factors (wdf) and some
> cross factors (Saturday + some special days is highly significant).
> The model is Sales ~ daynumber + wdf*events + wdf*specialevents
> where daynumber is the day sequence in the year and specialevents is a
> set of factors to tell when the business has promotional activities.
> The entire model has about 330 coefficients (it seriously needs some
> economizing), but only about 140 of these are estimated.
> I'm using lm() to do the estimation. I plan to change the model and
> possibly the method once I've seen if forecasting works. The current
> model "works" moderately well for in-sample fits, though I suspect
> there is too much variability generally.
> I want to advance 1 week at a time, reestimate, and iterate. This is
> a test case where we know the "future". I can get this to work for a few
> weeks starting at 20041101, but then get an error msg
> "new factor levels in 'events' ...".
> I have tried putting drop.factor.levels = TRUE in predict(), but this
> didn't seem to register. Also tried suggestion from web to use
>    ifac <- sapply(estndta, is.factor)
>    fcstdta[ifac] <- lapply(fcstdta[ifac], factor)
> Still get same error.
> I've tried a couple of dozen variants on this with no joy.
> Finally have tried using the full data set in lm() but set weights for
> the estimation period to 1, and those for the forecast period to 0. This
> "computes", but the results include NAs at a point where there seems no
> reason for them.
> I'm starting to suspect that there's some sort of bug somewhere in the R
> internals.
> Any advice welcome.
Received on Fri May 27 01:15:45 2005
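
For readers of the archive, here is a minimal sketch of the situation described in the quoted message. It is not the poster's data or model: the data frames reuse the names estndta and fcstdta from the message, but every value, and the reduced formula Sales ~ daynumber + events, is made up for illustration. It shows why predict() stops on a level such as TG+3 that the fit has never seen, and one possible workaround (an assumption, not a recommendation from the thread): re-code the forecast factor on the levels stored in the fit, so that unseen levels become NA and the corresponding predictions come back as NA instead of raising an error. Note that calling factor() on the forecast columns only drops levels that are unused within the forecast data; it cannot remove a level that actually occurs there.

```r
# Toy data: all names and values are hypothetical, for illustration only.
set.seed(1)
estndta <- data.frame(
  daynumber = 1:12,
  events    = factor(rep(c("none", "TG", "TG+1", "TG+2"), times = 3)),
  Sales     = 100 + rnorm(12)
)
fcstdta <- data.frame(
  daynumber = 13:15,
  events    = factor(c("none", "TG+3", "DEC1"))  # two levels unseen in the fit
)

fit <- lm(Sales ~ daynumber + events, data = estndta)

# predict() refuses factor levels that were absent from the estimation data.
err <- tryCatch(predict(fit, newdata = fcstdta), error = function(e) e)

# Possible workaround: re-code the forecast factor on the levels the fit
# knows about; any level not in that set becomes NA.
known <- fit$xlevels$events
fcst2 <- fcstdta
fcst2$events <- factor(as.character(fcst2$events), levels = known)

# Rows with an NA factor value get an NA prediction; the rest are predicted.
pred <- predict(fit, newdata = fcst2)
```

Whether an NA forecast is acceptable for those days, or whether the unseen events should instead be mapped to an existing level such as a baseline, is a modelling decision that the sketch does not settle.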
