Re: [R] "Error in contrasts" in step wise regression

From: Prof Brian D Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue 28 Jun 2005 - 04:01:37 EST

On Mon, 27 Jun 2005, Young Cho wrote:

> Thanks for the reply. I created a new data frame and ran step() on it, but it still does not work.
>
> > detach(dat)
> > attach(ds)
> > dat <- ds[,sapply(ds,nlevels)>=2]
> > dat$Y <- Response
> > detach(ds)
> > attach(dat)
> > fmla <- as.formula(paste(" ~ ",paste(collist1[sapply(ds,nlevels)>=2],collapse="+")))
> > fit.s <- step(fit.1, direction="forward",scope=list(upper= fmla,lower= ~1))
> Start: AIC= -1651.18
> Y ~ 1
> Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
> contrasts can be applied only to factors with 2 or more levels
> >

R does have debugging tools: please use them.
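
A minimal sketch of one way to track this down, assuming a made-up data frame `d` (the names `d`, `f1`, `f2`, `used` and `failed` are hypothetical and not from the original session). It compares declared levels with levels actually in use, then fits each one-term model directly to see which term raises the error. One plausible cause it illustrates is a factor that declares two levels but has only one observed.

```r
# Hypothetical data: f1 is declared with two levels, but only "a" is observed.
set.seed(1)
d <- data.frame(
  Y  = rnorm(6),
  f1 = factor(c("a", "a", NA, "a", "a", "a"), levels = c("a", "b")),
  f2 = factor(c("u", "v", "u", "v", "u", "v"))
)

# Declared levels: both factors pass an nlevels(x) >= 2 filter.
declared <- sapply(d[c("f1", "f2")], nlevels)

# Levels actually in use: factor(x) drops the unused ones.
used <- sapply(d[c("f1", "f2")], function(x) nlevels(factor(x)))

# Fit each single-term model directly and record which ones fail.
failed <- sapply(c("f1", "f2"), function(v) {
  fit <- try(lm(reformulate(v, response = "Y"), data = d), silent = TRUE)
  inherits(fit, "try-error")
})

print(rbind(declared, used, failed))
```

In an interactive session, calling traceback() immediately after the original error shows which function raised it.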

> Also, I was wondering if you know why the following behave differently
> from the above:

Yes, as I have read the help page for step(). Have you? It is discussed there.

> > fit.s <- step(lm(Y~1),scope=list(upper=~.,lower=~1),)
> Start: AIC= -1651.18
> Y ~ 1
> > fit.s <- step(fit.1,scope=list(upper=~.,lower=~1),)
> Start: AIC= -1651.18
> Y ~ 1
>
> I thought "~." uses "all other variables in the data frame" according to
> "Introduction to R."

In contexts where there is a data frame and there is no more specific documentation, it means `all remaining variables separated by +'.
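
A small sketch of the difference, assuming a made-up data frame `toy` (the names `toy`, `x1`, `x2`, `fit0`, `upper` and the other objects are hypothetical). It relies on the scope formulas in step() being applied as updates to the current model, so that `.` there stands for what is already in the model rather than for the remaining columns.

```r
# Hypothetical toy data; Y depends strongly on x1.
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$Y <- 2 * toy$x1 + rnorm(50)

# In lm() with data=, "." expands to all remaining variables.
fit.all <- lm(Y ~ ., data = toy)
attr(terms(fit.all), "term.labels")

# When a formula is updated, "." is the existing right-hand side,
# so an upper scope of ~ . applied to Y ~ 1 adds nothing.
fit0 <- lm(Y ~ 1, data = toy)
scope_seen <- update.formula(formula(fit0), ~ .)
scope_seen

# Spelling the upper scope out explicitly gives step() terms to add.
upper <- reformulate(setdiff(names(toy), "Y"))
fit.s <- step(fit0, scope = list(upper = upper, lower = ~ 1),
              direction = "forward", trace = 0)
formula(fit.s)
```

Passing data = toy to lm() also avoids relying on attach() to find the variables.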

>
> -Young.
>
>
> Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
> On Fri, 24 Jun 2005, Young Cho wrote:
>
> > Hi,
> >
> > I have a problem getting the step function to work.
>
> This is not coming from step(), but (AFAIK) from model.matrix() called by
> lm(). One way to debug it is to try fitting the models directly.
>
> > I am getting the following error:
> >
> >> fit1 <- lm(Response~1)
> >> fmla <- as.formula(paste(" ~ ",paste(colnames,collapse="+")))
> >> sfit <- step(fit1,scope=list(upper= fmla,lower= ~1),k=log(nrow(dat)))
> > Start: AIC= -1646.66
> > Response ~ 1
> > Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
> > contrasts can be applied only to factors with 2 or more levels
> >
> > But if I count the unique values in each column by
> >
> > A <- NULL
> > for (ii in 1:length(colnames)){
> > A[ii] <- length(unique( eval(parse(text=paste('dat$',colnames[ii])))))
> > }
> >
> > I do not see any column with only 1 value. Is there some other possible
> > reason why I am getting the error? Thanks a lot!
>
> It says `levels', not values. So try
>
> sapply(dat, nlevels)
>
> The values can include NA, which is not a level (usually). E.g.
>
> > x <- factor(c(1, NA))
> > nlevels(x)
> [1] 1
> > length(unique(x))
> [1] 2
>
> (Incidentally, you are assuming variables are found in dat, and you should
> use
>
> lm(Response ~ 1, data=dat)
>
> to ensure that. And your calculation can be done more legibly as
>
> sapply(dat, function(x) length(unique(x)))
>
> .)
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Tue Jun 28 04:06:29 2005
