Re: [R] logistic regression and dummy variable coding

From: Marc Schwartz <marc_schwartz_at_comcast.net>
Date: Thu, 28 Jun 2007 19:41:21 -0500

On Thu, 2007-06-28 at 18:16 -0500, Bingshan Li wrote:
> Hello everyone,
>
> I have a variable with several categories and I want to convert this
> into dummy variables and do logistic regression on it. I used
> model.matrix to create dummy variables but it always picked the
> smallest one as the reference. For example,
>
> model.matrix(~.,data=as.data.frame(letters[1:5]))
>
> will code 'a' as '0 0 0 0'. But I want to code another category as
> reference, say 'b'. How to do it in R using model.matrix? Is there
> other way to do it if model.matrix has no such functionality?
>
> Thanks!

See ?relevel

Note that this (creating dummy variables) is done automatically by R's modeling functions, which default to treatment contrasts for unordered factors. model.matrix() is used internally by modeling functions such as glm(), so you normally do not need to call it yourself.
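A minimal sketch of that point, using made-up data (the data frame `dat`, the response `y`, and the factor `g` are hypothetical and not from the original question). It only shows that glm() expands a factor into dummy columns by itself:

```r
# Made-up data for illustration only: a binary response and a
# five-level factor with 20 observations per level.
set.seed(1)
dat <- data.frame(
  y = rbinom(100, 1, 0.5),
  g = factor(rep(letters[1:5], times = 20))
)

# glm() builds the design matrix itself; 'a' is the reference level
# because it is the first level of the factor.
fit <- glm(y ~ g, data = dat, family = binomial)

coef_names <- names(coef(fit))
coef_names
# "(Intercept)" "gb" "gc" "gd" "ge"

# The same columns appear in the design matrix used by the fit.
colnames(model.matrix(fit))
```

There is one coefficient per non-reference level, each interpreted relative to level 'a'.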

For example using a single factor:

> FL <- factor(letters[1:5])
> FL
[1] a b c d e
Levels: a b c d e

> contrasts(FL)
  b c d e
a 0 0 0 0
b 1 0 0 0
c 0 1 0 0
d 0 0 1 0
e 0 0 0 1

> FL.b <- relevel(FL, "b")
> FL.b
[1] a b c d e
Levels: b a c d e

> contrasts(FL.b)
  a c d e
b 0 0 0 0
a 1 0 0 0
c 0 1 0 0
d 0 0 1 0
e 0 0 0 1
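Applied to the model.matrix() call in the original question, this could look roughly as follows. The data frame `d` and column name `x` are placeholders chosen for the sketch, and the column is created explicitly as a factor:

```r
# Placeholder data frame with one five-level factor column.
d <- data.frame(x = factor(letters[1:5]))

# Make 'b' the first (reference) level.
d$x <- relevel(d$x, ref = "b")

# The design matrix now has dummy columns for a, c, d and e only.
mm <- model.matrix(~ x, data = d)
colnames(mm)
# "(Intercept)" "xa" "xc" "xd" "xe"

# Row 2 is the observation with x == "b": all dummy columns are 0.
mm[2, ]
```

The same releveled factor can be passed straight to glm(), which will then report coefficients relative to 'b'.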

See ?contrasts and the Statistical Models section in "An Introduction to R".
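As an alternative to relevel(), ?contrasts also describes setting the contrast matrix directly. A small sketch, assuming you want to keep the level order a, b, c, d, e and only change which level is the baseline:

```r
FL <- factor(letters[1:5])

# Treatment contrasts with the 2nd level ('b') as the baseline.
# The level order of the factor itself is left unchanged.
contrasts(FL) <- contr.treatment(levels(FL), base = 2)
contrasts(FL)

# The design matrix has no column for 'b'.
mm <- model.matrix(~ FL)
colnames(mm)
# "(Intercept)" "FLa" "FLc" "FLd" "FLe"
```

relevel() is usually the simpler choice; setting contrasts is mainly useful when the level order matters elsewhere, for example in tables or plots.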

HTH,

Marc Schwartz



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 29 Jun 2007 - 00:46:12 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.