Re: [R] predict nbinomial glm

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed 17 Aug 2005 - 00:13:50 EST

This seems to be an unstated repeat of much of an earlier and unanswered post

         https://stat.ethz.ch/pipermail/r-help/2005-August/075914.html

entitled

         [R] error in predict glm (new levels cause problems)

It is nothing to do with `nbinomial glm' (sic): all model fitting functions including lm and glm do this. The reason you did not get at least one reply from your first post is that you seemed not to have done your homework. (One thing the posting guide does ask is for you to try the current version of R, and yours is three versions old.)
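That the check is not specific to glm.nb can be seen with plain lm. The following is only a minimal sketch on made-up data; the names dat, fit, g and res are illustrative and do not come from the original post.

```r
# Illustrative data: the factor g has four levels.
set.seed(1)
dat <- data.frame(y = rnorm(40),
                  x = 1:40,
                  g = factor(rep(c("q", "r", "s", "t"), each = 10)))

# Fit on rows that contain no observations of level "q".
fit <- lm(y ~ x + g, data = subset(dat, g != "q"))

# The levels the fit actually saw are recorded in fit$xlevels.
fit$xlevels$g

# Asking for a prediction at the unseen level "q" is refused;
# tryCatch() captures the error message instead of stopping.
res <- tryCatch(predict(fit, newdata = dat[1, ]),
                error = function(e) conditionMessage(e))
res
```

Here res should hold an error message about a new factor level rather than a fitted value, since no coefficient was estimated for "q".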

The code is protecting you from an attempt at statistical nonsense. (Indeed, the check was added to catch such misuses.) Your email address seems to be that of a student, so please seek the help of your advisor. You seem surprised that you are not allowed to make predictions about levels for which you have supplied no relevant data.
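One way to stay within what the model can support is to predict only for rows whose factor level was present when the model was fitted. The sketch below assumes data shaped like that in the quoted post (with a simplified subset condition) and uses illustrative names (dat, sub, fit, known, pred); it is one possible approach, not the only one.

```r
library(MASS)  # provides glm.nb

set.seed(42)
dat <- data.frame(a = rnbinom(200, 1, 0.5),
                  b = 1:200,
                  d = factor(rep(c("q", "r", "s", "t"), each = 50)))

# Subset with no "q" rows; droplevels() removes the now-empty level.
sub <- droplevels(subset(dat, b > 80))

fit <- glm.nb(a ~ b + d, data = sub)

# Keep only rows whose level of d was seen during fitting,
# and drop the unused level from the new data as well.
known <- dat$d %in% fit$xlevels$d
newdat <- droplevels(dat[known, ])

pred <- predict(fit, newdata = newdat, type = "response")
head(pred)
```

Rows with d equal to "q" are deliberately left out: the fit contains no information about that level, so any number reported for it would be an extrapolation with no data behind it.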

On Tue, 16 Aug 2005, K. Steinmann wrote:

> Dear R-helpers,
>
> let us assume that I have the following dataset:
>
> a <- rnbinom(200, 1, 0.5)
> b <- (1:200)
> c <- (30:229)
> d <- rep(c("q", "r", "s", "t"), rep(50,4))
> data_frame <- data.frame(a,b,c,d)
>
> In a first step I run a glm.nb (full code is given at the end of this mail) and
> want to predict my response variable a.
> In a second step, I would like to run a glm.nb based on a subset of the
> data_frame. As soon as I want to predict the response variable a, I get the
> following error message:
> "Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> object$xlevels) :
> factor d has new level(s) q"
>
> Does anybody have a solution to this problem?
>
> Thank you in advance,
> K. Steinmann (working with R 2.0.0)
>
>
> Code:
>
> library(MASS)
>
> a <- rnbinom(200, 1, 0.5)
> b <- (1:200)
> c <- (30:229)
> d <- rep(c("q", "r", "s", "t"), rep(50,4))
>
> data_frame <- data.frame(a,b,c,d)
>
> model_1 = glm.nb(a ~ b + d , data = data_frame)
>
> pred_model_1 = predict(model_1, newdata = data_frame, type = "response", se.fit
> = FALSE, dispersion = NULL, terms = NULL)
>
> subset_of_dataframe = subset(data_frame, (b > 80 & c < 190 ))
>
> model_2 = glm.nb(a ~ b + d , data = subset_of_dataframe)
> pred_model_2 = predict(model_2, newdata = subset_of_dataframe, type =
> "response", se.fit = FALSE, dispersion = NULL, terms = NULL)

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed Aug 17 00:19:35 2005