[R] two cols in a data frame are the same factor

From: Andres Legarra <legarra_at_gmail.com>
Date: Tue, 18 Mar 2008 10:11:05 +0100


Dear all,
I have a data set (QTL detection) in which two columns of factors in the data frame correspond logically (in my model) to the same factor. In fact, these are haplotype classes. Another real-life example would be family gas consumption as a function of car manufacturer (e.g. Ford, GM, and Honda), assuming 2 cars per family.

An artificial example follows:
set.seed(1234)
L3 <- LETTERS[1:3]
(d <- data.frame(y = rnorm(10),
                 fac = sample(L3, 10, replace = TRUE),
                 fac1 = sample(L3, 10, replace = TRUE)))

lm(y ~ fac + fac1, data = d)

and I get:

Coefficients:
(Intercept)         facB         facC        fac1B        fac1C
     0.3612      -0.9359      -0.2004      -2.1376      -0.5438

However, to respect my model, I need to constrain the effects of fac and fac1 to be the same, i.e. facB=fac1B and facC=fac1C. Logically there are just 4 unknowns (average, A, B, C). With continuous covariates one might use y ~ I(cov1+cov2), but here the variables are factors, not continuous covariates.
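
To make the constraint concrete, here is a rough sketch of what the factor analogue of I(cov1+cov2) might look like: for each level, count how many of the two columns carry it, and regress on those counts. The helper count_levels and the column names nB and nC are made up for illustration, and this is only an assumption about how the shared effects could be coded by hand, not a tested solution for the real QTL data. Since every row has exactly two entries, the counts for A, B and C always sum to 2, so A is dropped as the reference level.

```r
set.seed(1234)
L3 <- LETTERS[1:3]
d <- data.frame(y = rnorm(10),
                fac = sample(L3, 10, replace = TRUE),
                fac1 = sample(L3, 10, replace = TRUE))

# Hypothetical helper (not from any package): for each requested level,
# count how many of the two columns hold that level (0, 1 or 2 per row).
count_levels <- function(f1, f2, levels) {
  counts <- sapply(levels, function(l) (f1 == l) + (f2 == l))
  colnames(counts) <- paste0("n", levels)
  counts
}

# Drop "A" as the reference level; the counts over A, B, C sum to 2 per row.
X <- count_levels(d$fac, d$fac1, L3[-1])

# One shared coefficient per non-reference level, instead of facB/fac1B etc.
fit <- lm(y ~ X, data = d)
coef(fit)
```

Each slope is then the effect of one copy of B (or C) relative to A, applied equally to both columns. This gets unwieldy with many haplotype classes, hence the question.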

Is there any trick to do that?
Thanks,

Andres Legarra
INRA-SAGA
Toulouse, France



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 18 Mar 2008 - 09:17:22 GMT

