Thanks to all for the quick replies!

dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))), Y = rnorm(9))
dat.new <- dat[1:6, ]

dat.new$fact <- dat$fact[1:6, drop = TRUE]
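
For anyone finding this thread later: a possible alternative, assuming R 2.12.0 or later (where droplevels() is available), is to drop the unused levels of every factor column in one step instead of fixing each column by hand. This is only a sketch on the same toy data as above.

```r
# Sketch: drop unused levels from all factor columns of a subset at once.
# Assumes R >= 2.12.0, where droplevels() was introduced.
dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
                  Y = rnorm(9))

# Subset first, then drop the levels that no longer occur.
dat.new <- droplevels(dat[1:6, ])

levels(dat.new$fact)  # "A" "B"
levels(dat$fact)      # "A" "B" "C" (the original is untouched)
```

On older versions of R, the drop = TRUE approach above does the same job one column at a time.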

> When I take a subset of a factor the reduced factor still maintains
> all the original levels of the factor when, say, forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels. See below for reproducible code. Does anyone know how to fix
> this?
>
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C ## should only show A B

Just use

> factor(new.fact)
[1] A A A B B B
Levels: A B

or

> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

And, no, it is not a bug. The fact that a subsample happens to consist only of males does not turn gender into a one-level factor... (Apart from the philosophy, it makes a real difference in tabulation.)
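
To make the tabulation point concrete, here is a small sketch. The gender vector is made up for illustration and is not from the original question; it just shows that a subset keeps a zero-count cell for the level that no longer occurs until the level is dropped explicitly.

```r
# Hypothetical data, for illustration only.
gender <- factor(c("male", "female", "male", "male"),
                 levels = c("female", "male"))

# A subsample that happens to contain only males.
males <- gender[gender == "male"]

table(males)               # female 0, male 3: the empty cell is kept
table(factor(males))       # male 3 only: the unused level is gone
table(males[drop = TRUE])  # same result as factor(males)
```

Keeping the empty cell is usually what you want when comparing tables across subsamples, since every table then has the same layout.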


