Re: [R] "Error in contrasts" in step wise regression

From: Young Cho <iidn01_at_yahoo.com>
Date: Tue 28 Jun 2005 - 03:49:34 EST


Thanks for the reply. I created a new data frame and ran step() on it, but it still does not work.

> detach(dat)
> attach(ds)
> dat <- ds[,sapply(ds,nlevels)>=2]
> dat$Y <- Response
> detach(ds)
> attach(dat)
> fmla <- as.formula(paste(" ~ ",paste(collist1[sapply(ds,nlevels)>=2],collapse="+")))
> fit.s <- step(fit.1, direction="forward",scope=list(upper= fmla,lower= ~1))
Start: AIC= -1651.18
 Y ~ 1
Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
        contrasts can be applied only to factors with 2 or more levels
>

Also, do you know why the following calls behave differently from the one above?

> fit.s <- step(lm(Y~1),scope=list(upper=~.,lower=~1),)
Start: AIC= -1651.18
 Y ~ 1
> fit.s <- step(fit.1,scope=list(upper=~.,lower=~1),)
Start: AIC= -1651.18
 Y ~ 1

I thought "~." meant "all other variables in the data frame", according to "An Introduction to R".
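A likely explanation, sketched on made-up data: step() resolves the scope with update.formula() against the current model, and there "." stands for "whatever is already in the formula", not for the other columns of a data frame. With a start model of Y ~ 1 the upper scope therefore collapses to Y ~ 1 and nothing can be added. The column names a and b below are hypothetical.

```r
# Toy data; Y, a and b are illustrative names, not the poster's columns.
set.seed(1)
dat <- data.frame(Y = rnorm(20), a = rnorm(20), b = gl(2, 10))

fit.1 <- lm(Y ~ 1, data = dat)

# What step() effectively does with scope$upper = ~ . :
# "." is replaced by the right-hand side of the current model.
upper_dot <- update.formula(fit.1, ~ .)
print(upper_dot)

# In lm() with data=, "." does expand to the remaining columns.
fit.full <- lm(Y ~ ., data = dat)
print(attr(terms(fit.full), "term.labels"))

# Spelling the candidate terms out gives step() an upper scope it can use.
upper_all <- reformulate(setdiff(names(dat), "Y"))
print(upper_all)
```

So the "all other variables" reading applies to a model formula fitted with a data argument, while in a scope the dot refers back to the current model.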

-Young.

Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:

On Fri, 24 Jun 2005, Young Cho wrote:

> Hi,
>
> I have a problem getting the step function to work.

This is not coming from step(), but (AFAIK) from model.matrix() called by lm(). One way to debug it is to try fitting the models directly.
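Fitting the models directly could look roughly like the sketch below: each candidate term is fitted on its own and any error message is collected, so the offending column shows up by name. The data frame and its columns are invented for illustration.

```r
# Toy data; "bad" is a factor with a single level plus NAs.
set.seed(1)
dat <- data.frame(
  Y   = rnorm(6),
  ok  = factor(c("a", "b", "a", "b", "a", "b")),
  bad = factor(c("u", "u", "u", NA, NA, NA)),
  num = rnorm(6)
)

candidates <- setdiff(names(dat), "Y")

# Fit Y ~ <term> for each candidate; keep the error message, or NA if the fit works.
errors <- sapply(candidates, function(v) {
  tryCatch({
    lm(reformulate(v, response = "Y"), data = dat)
    NA_character_
  }, error = function(e) conditionMessage(e))
})

# Only the columns whose single-term fit failed.
print(errors[!is.na(errors)])
```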

> I am getting the following error:
>
>> fit1 <- lm(Response~1)
>> fmla <- as.formula(paste(" ~ ",paste(colnames,collapse="+")))
>> sfit <- step(fit1,scope=list(upper= fmla,lower= ~1),k=log(nrow(dat)))
> Start: AIC= -1646.66
> Response ~ 1
> Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
> contrasts can be applied only to factors with 2 or more levels
>
> But if I count the unique values in each column with
>
> A <- NULL
> for (ii in 1:length(colnames)){
> A[ii] <- length(unique( eval(parse(text=paste('dat$',colnames[ii])))))
> }
>
> I do not see any column with only 1 value. Is there some other possible
> reason why I am getting the error? Thanks a lot!

It says `levels', not values. So try

sapply(dat, nlevels)

The values can include NA, which is not a level (usually). E.g.

> x <- factor(c(1, NA))
> nlevels(x)
[1] 1
> length(unique(x))
[1] 2
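The same comparison can be run over a whole data frame. The sketch below, on invented columns, puts level counts next to unique-value counts and picks out the factors with fewer than two levels. Note that nlevels() is 0 for non-factors, so a filter such as sapply(ds, nlevels) >= 2 would also drop every numeric column.

```r
# Toy data; column names are illustrative only.
dat <- data.frame(
  num = c(1.5, 2.5, 3.5, 4.5),
  two = factor(c("a", "b", "a", "b")),
  one = factor(c("x", "x", NA, NA))
)

n_lev <- sapply(dat, nlevels)                        # 0 for non-factors
n_val <- sapply(dat, function(x) length(unique(x)))  # NA counts as a value

# Factors that would trigger the contrasts error.
bad <- names(dat)[sapply(dat, is.factor) & n_lev < 2]

# Keep non-factors and factors with at least two levels.
keep <- dat[, !(names(dat) %in% bad), drop = FALSE]

print(rbind(levels = n_lev, unique = n_val))
print(bad)
```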

(Incidentally, you are assuming variables are found in dat, and you should use

lm(Response ~ 1, data=dat)

to ensure that. And your calculation can be done more legibly as

sapply(dat, function(x) length(unique(x)))

.)
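Putting the two suggestions together, a forward search with the log(n) penalty from the original call might be set up as below, with every variable looked up in dat rather than in attached objects. This is only a sketch on simulated data; x1, x2 and g are hypothetical column names.

```r
# Simulated data in which Response depends on x1 only.
set.seed(42)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), g = gl(2, n / 2))
dat$Response <- 2 * dat$x1 + rnorm(n)

# data= ensures the variables are found in dat.
fit1 <- lm(Response ~ 1, data = dat)

# Upper scope built from the column names, excluding the response.
fmla <- reformulate(setdiff(names(dat), "Response"))

sfit <- step(fit1, scope = list(upper = fmla, lower = ~ 1),
             direction = "forward", k = log(nrow(dat)), trace = 0)
print(formula(sfit))
```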

-- 
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

		

Received on Tue Jun 28 03:54:22 2005
