[R] converting factors to dummy variables

From: Tim Calkins <tim.calkins_at_gmail.com>
Date: Wed, 5 Dec 2007 14:39:45 +1100


Hi all -

I'm trying to find a way to create dummy variables from factors in a regression. I have been using biglm along the lines of

ff <- log(Price) ~ factor(Colour):factor(Store) + factor(DummyVar):factor(Colour):factor(Store)

lm1 <- biglm(ff, data=my.dataset)

but because there are lots of colours (>100) and lots of stores (>250), I run into memory problems. Not every store sells every colour, though, so it should be possible to build the matrix of dummy variables myself. It seems that lm / biglm use all combinations of factor levels when given factor(Colour):factor(Store), so a hand-built matrix containing only the combinations that actually occur should reduce the size of the problem considerably.
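To make the size argument concrete: with more than 100 colours and more than 250 stores, an all-combinations expansion implies more than 100 * 250 = 25,000 columns for the Colour:Store term alone. The following is only a rough sketch on hypothetical toy vectors (`colour` and `store` are made up for illustration); it compares the number of possible pairs with the number of pairs that actually occur.

```r
# Toy vectors (hypothetical, for illustration only).
colour <- factor(c('red', 'blue', 'green', 'red', 'blue', 'green'))
store  <- factor(c('a', 'b', 'c', 'a', 'c', 'd'))

# Columns an all-combinations expansion would need for this term.
all.pairs <- nlevels(colour) * nlevels(store)

# Pairs that actually occur; drop = TRUE discards the empty ones.
seen.pairs <- nlevels(interaction(colour, store, drop = TRUE))

c(all = all.pairs, seen = seen.pairs)
```

The gap between the two counts is the saving a hand-built dummy matrix could give.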

If I have a data frame

my.dataset <- data.frame(Price=1:12, Colour=c('red','blue','green'),
    Store=c('a', 'b', 'c', 'a', 'c', 'd', 'e', 'e', 'e', 'e', 'b', 'e'),
    DummyVar=sort(rep(c(0,1),6)))

I want to create a data frame of dummy variables, one column per Colour:Store combination that actually occurs, that looks like

red:a	red:e	blue:b	blue:c	blue:e	green:c	green:d	green:e
1	0	0	0	0	0	0	0
0	0	1	0	0	0	0	0
0	0	0	0	0	1	0	0
1	0	0	0	0	0	0	0
0	0	0	1	0	0	0	0
0	0	0	0	0	0	1	0
0	1	0	0	0	0	0	0
0	0	0	0	1	0	0	0
0	0	0	0	0	0	0	1
0	1	0	0	0	0	0	0
0	0	1	0	0	0	0	0
0	0	0	0	0	0	0	1

Any ideas would be appreciated.
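One possible approach, sketched here under the assumption that base R's interaction() and model.matrix() are acceptable, is to collapse Colour and Store into a single factor whose levels are only the observed pairs and then expand that factor into indicator columns. The names `cs` and `dummies` are introduced purely for illustration, and this has not been tried on data of the size described above.

```r
my.dataset <- data.frame(
  Price    = 1:12,
  Colour   = c('red', 'blue', 'green'),
  Store    = c('a', 'b', 'c', 'a', 'c', 'd', 'e', 'e', 'e', 'e', 'b', 'e'),
  DummyVar = sort(rep(c(0, 1), 6))
)

# One factor level per Colour:Store pair; drop = TRUE discards
# pairs that never occur in the data.
cs <- interaction(my.dataset$Colour, my.dataset$Store, sep = ":", drop = TRUE)

# "- 1" removes the intercept, so every observed pair gets its own
# 0/1 column. Columns come out in the order of levels(cs).
dummies <- model.matrix(~ cs - 1)
colnames(dummies) <- levels(cs)

dim(dummies)
```

The columns follow the order of levels(cs) rather than the order in the table above; indexing with a character vector of the wanted names, e.g. dummies[, c("red:a", "red:e")], rearranges them. model.matrix() returns a dense matrix, so for a very large problem a sparse representation such as sparse.model.matrix() from the Matrix package may be worth a look, though that is a suggestion rather than something tested here.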

-- 
Tim Calkins
0406 753 997

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 05 Dec 2007 - 03:52:50 GMT
