From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>

Date: Wed 17 Aug 2005 - 00:13:50 EST

This seems to be an unstated repeat of much of an earlier, unanswered post

https://stat.ethz.ch/pipermail/r-help/2005-August/075914.html

entitled

[R] error in predict glm (new levels cause problems)

It is nothing to do with `nbinomial glm' (sic): all model fitting functions including lm and glm do this. The reason you did not get at least one reply from your first post is that you seemed not to have done your homework. (One thing the posting guide does ask is for you to try the current version of R, and yours is three versions old.)
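A minimal sketch of that point, not taken from the original post: the data, the names `train`, `fit` and `new_obs`, and the use of `lm` are illustrative assumptions, and the exact wording of the error message may differ between R versions. It fits an ordinary linear model on levels r, s and t, then asks for a prediction at a level q that the fit never saw.

```r
# Illustrative data only: a response and a factor with levels r, s, t.
set.seed(1)
train <- data.frame(
  y = rnorm(30),
  d = factor(rep(c("r", "s", "t"), each = 10))
)

# An ordinary lm fit; no negative binomial model is involved.
fit <- lm(y ~ d, data = train)

# New data containing a level ("q") for which no data were supplied.
new_obs <- data.frame(d = factor("q"))

# predict() refuses; capture the error message instead of stopping.
msg <- tryCatch(
  predict(fit, newdata = new_obs),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The same refusal should be expected from `glm` and `glm.nb`, since the check happens when the model frame for `newdata` is built rather than in any one fitting function.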

The code is protecting you from an attempt at statistical nonsense. (Indeed, the check was added to catch such misuses.) Your email address seems to be that of a student, so please seek the help of your advisor. You seem surprised that you are not allowed to make predictions about levels for which you have supplied no relevant data.
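To see which levels actually have data in the quoted example, something like the following sketch may help. It rebuilds only the predictors from the post, makes `d` a factor explicitly (an assumption, since whether `data.frame()` converts strings to factors depends on the R version), and uses `droplevels()`, which is available in later versions of R than the one the poster was running.

```r
# Predictors as in the quoted post; d is made a factor explicitly.
b <- 1:200
c <- 30:229
d <- factor(rep(c("q", "r", "s", "t"), rep(50, 4)))
data_frame <- data.frame(b, c, d)

# The subset keeps rows with b in 81..160, so no row has d == "q".
subset_of_dataframe <- subset(data_frame, b > 80 & c < 190)

# subset() keeps "q" as a declared level even though it has no rows.
print(levels(subset_of_dataframe$d))
print(table(subset_of_dataframe$d))

# Dropping unused levels leaves only the levels backed by data.
dropped <- droplevels(subset_of_dataframe)
print(levels(dropped$d))
```

A count of zero in the table marks a level about which the subset says nothing, so no coefficient for it can be estimated from that subset.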

On Tue, 16 Aug 2005, K. Steinmann wrote:

> Dear R-helpers,
>
> let us assume, that I have the following dataset:
>
> a <- rnbinom(200, 1, 0.5)
> b <- (1:200)
> c <- (30:229)
> d <- rep(c("q", "r", "s", "t"), rep(50,4))
> data_frame <- data.frame(a,b,c,d)
>
> In a first step I run a glm.nb (full code is given at the end of this mail) and
> want to predict my response variable a.
> In a second step, I would like to run a glm.nb based on a subset of the
> data_frame. As soon as I want to predict the response variable a, I get the
> following error message:
> "Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> object$xlevels) :
> factor d has new level(s) q"
>
> Does anybody have a solution to this problem?
>
> Thank you in advance,
> K. Steinmann (working with R 2.0.0)
>
>
> Code:
>
> library(MASS)
>
> a <- rnbinom(200, 1, 0.5)
> b <- (1:200)
> c <- (30:229)
> d <- rep(c("q", "r", "s", "t"), rep(50,4))
>
> data_frame <- data.frame(a,b,c,d)
>
> model_1 = glm.nb(a ~ b + d , data = data_frame)
>
> pred_model_1 = predict(model_1, newdata = data_frame, type = "response", se.fit
> = FALSE, dispersion = NULL, terms = NULL)
>
> subset_of_dataframe = subset(data_frame, (b > 80 & c < 190 ))
>
> model_2 = glm.nb(a ~ b + d , data = subset_of_dataframe)
> pred_model_2 = predict(model_2, newdata = subset_of_dataframe, type =
> "response", se.fit = FALSE, dispersion = NULL, terms = NULL)

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Wed Aug 17 00:19:35 2005

This archive was generated by hypermail 2.1.8: Fri 03 Mar 2006 - 03:39:49 EST