[Rd] Unexpected behavior in factor level ordering

From: Paul Johnson <pauljohn32_at_gmail.com>
Date: Sat, 25 Feb 2012 12:16:07 -0600

Hello, Everybody:

This may not be a "bug", but for me it is an unexpected outcome. A factor variable's levels
do not retain their ordering after the levels function is used. I supply an example in which
a factor with values "BC" "AD" (in that order) is unintentionally re-alphabetized by the levels function.

To me, this is very bad behavior. Would you agree?

# Paul Johnson 2012-02-05

x <- c("AD","BC","AD","BC","AD","BC")
xf <- factor(x, levels=c("BC", "AD"), labels=c("Before Christ","After Christ"))
y <- rnorm(6)

m1 <- lm (y ~ xf )

plot(y ~ xf)

abline (m1)
## Just a little problem the line does not "go through" the box
## plot in the right spot because contrasts(xf) is 0,1 but
## the plot uses xf in 1,2.

xlevels <- levels(xf)
newdf <- data.frame(xf=xlevels)

ypred <- predict(m1, newdata=newdf)

## Watch now: the plot comes out "reversed", AC before BC
plot(ypred ~ newdf$xf)

## Ah. Now I see:

## Why doesn't newdf$xf respect the ordering of the levels?
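
A minimal sketch of what seems to be happening, offered as an illustration rather than a definitive diagnosis. levels(xf) returns a plain character vector, so the level order is no longer attached to it. When that vector is turned back into a factor without an explicit levels argument (which data.frame() did by default in R versions before 4.0.0, where stringsAsFactors was TRUE), the unique values are sorted. The names resorted and kept below are not from the script above; they are introduced only for this example.

```r
x <- c("AD", "BC", "AD", "BC", "AD", "BC")
xf <- factor(x, levels = c("BC", "AD"),
             labels = c("Before Christ", "After Christ"))

## levels() hands back a character vector, not a factor
xlevels <- levels(xf)

## Rebuilding a factor without 'levels' sorts the unique values,
## so "After Christ" ends up first
resorted <- factor(xlevels)
levels(resorted)

## Supplying 'levels' explicitly keeps the order used in xf
kept <- factor(xlevels, levels = xlevels)
levels(kept)

## A prediction data frame whose factor matches xf's level order
newdf <- data.frame(xf = kept)
levels(newdf$xf)
```

Building newdf this way does not depend on the stringsAsFactors default, so plot(ypred ~ newdf$xf) should show the levels in the same order as plot(y ~ xf).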

Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

R-devel_at_r-project.org mailing list
Received on Sat 25 Feb 2012 - 18:27:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 26 Feb 2012 - 07:00:21 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.