Re: [R] Formula with no intercept

From: Gang Chen <gangchen6_at_gmail.com>
Date: Thu, 17 Apr 2008 12:11:44 -0400

Thanks to both Harold Doran and Prof. Ripley for the suggestions. Time*Group - 1 or Time*(Group-1) does seem better. However, as Prof. Ripley pointed out, it gets a little complicated with the interactions. For example,


> set.seed(1)
> group <- as.factor (sample (c("M","F"), 12, T))
> y <- rnorm(12)
> time <- as.factor (rep (1:4, 3))
> summary(fit <- lm ( y ~ time * group - 1))

Call:
lm(formula = y ~ time * group - 1)

Residuals:
         1          2          3          4          5          6          7
-5.122e-01  3.916e-01  5.985e-01  9.547e-01  5.122e-01  1.665e-16 -5.985e-01
         8          9         10         11         12
-9.547e-01  0.000e+00 -3.916e-01 -5.551e-17  2.220e-16

Coefficients:

             Estimate Std. Error t value Pr(>|t|)
time1         1.12493    0.91795   1.225    0.288
time2         0.38984    0.91795   0.425    0.693
time3        -0.02273    0.64909  -0.035    0.974
time4        -1.26004    0.64909  -1.941    0.124
groupM       -0.12533    1.12426  -0.111    0.917
time2:groupM  0.08218    1.58994   0.052    0.961
time3:groupM  0.13187    1.58994   0.083    0.938
time4:groupM  2.32921    1.58994   1.465    0.217

Residual standard error: 0.918 on 4 degrees of freedom
Multiple R-squared: 0.6962,     Adjusted R-squared: 0.08858
F-statistic: 1.146 on 8 and 4 DF,  p-value: 0.4796


There are 8 fixed effects in total listed above. I believe I can interpret time1, time2, time3 and time4 as the fixed effects of the 4 levels of factor Time in groupF. But I'm not so sure about the other 4: are time2:groupM, time3:groupM, and time4:groupM the differences between groupM and groupF at those 3 levels of factor Time? If so, what is groupM (the 5th)? Or are time2:groupM, time3:groupM, and time4:groupM the differences (between groupM and groupF) of the effects of those 3 levels of Time relative to time1, with groupM (the 5th) being the effect of groupM versus groupF at time1?
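One way to check which reading holds is to compare the coefficients against the observed cell means. The sketch below is not from the session above: it assumes a hypothetical balanced data set `d` (two observations per cell, so no cell is empty) and R's default treatment contrasts with F as the reference level of group. Under those assumptions the second reading looks like the right one: groupM is the M - F difference at time1, and each timek:groupM is the M - F difference at time k minus the M - F difference at time1.

```r
# Hypothetical balanced data: 2 observations in each of the 8 time x group cells.
# (An assumption for illustration; not the data from the session above.)
set.seed(1)
d <- expand.grid(time = factor(1:4), group = factor(c("F", "M")), rep = 1:2)
d$y <- rnorm(nrow(d))

fit <- lm(y ~ time * group - 1, data = d)
b <- coef(fit)

# Observed cell means: a 4 x 2 matrix, rows = time 1..4, columns = F, M.
m <- with(d, tapply(y, list(time, group), mean))

# time1..time4 are the cell means of the reference group F.
chk_time <- all.equal(unname(b[paste0("time", 1:4)]), unname(m[, "F"]))

# groupM is the M - F difference at time 1.
diff1 <- m["1", "M"] - m["1", "F"]
chk_group <- all.equal(unname(b["groupM"]), unname(diff1))

# timek:groupM is (M - F at time k) minus (M - F at time 1), for k = 2, 3, 4.
dd <- (m[, "M"] - m[, "F"])[2:4] - diff1
chk_inter <- all.equal(unname(b[paste0("time", 2:4, ":groupM")]), unname(dd))

print(c(time = isTRUE(chk_time), group = isTRUE(chk_group),
        interaction = isTRUE(chk_inter)))
```

Because the model is saturated, the fitted values are just the cell means, which is why the coefficients can be matched to them exactly (up to rounding).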

> packages such as multcomp can post-hoc test any (coherent) set of hypotheses you
> choose, irrespective of the model parametrization.

This does not seem true unless I'm missing something. See the following example:



> set.seed(1)
> group <- as.factor (sample (c("M","F"), 12, T))
> y <- rnorm(12)
> time <- as.factor (rep (1:4, 3))
> fit <- lm(y ~ time * group)
> library(multcomp)
> summary(glht(fit, linfct=c("time1=0", "time2=0")))
Error in chrlinfct2matrix(linfct, names(beta)) :
  variable(s) 'time1' not found
> summary(glht(fit, linfct=c("time2=0", "time3=0")))

         Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = y ~ time * group)

Linear Hypotheses:

           Estimate Std. Error t value p value
time2 == 0  -0.7351     1.2982  -0.566   0.797
time3 == 0  -1.1477     1.1243  -1.021   0.533
(Adjusted p values reported -- single-step method)

The problem is that glht doesn't allow any hypothesis involving time1 if an intercept is included in the model specification, since no coefficient named time1 exists in that parametrization. Any more thoughts?
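A possible workaround, sketched here under assumptions, is to pass linfct as a matrix instead of character strings: each row is a linear combination of the model's actual coefficients, so the time1 cell means can be written in terms of (Intercept) and groupM. The data frame `d` and the two hypotheses chosen (time1 mean in group F, time1 mean in group M) are hypothetical illustrations, and the code assumes default treatment contrasts. The estimates and standard errors are computed in base R first; the glht call only runs if multcomp is installed.

```r
# Hypothetical balanced data (an assumption; not the data from the session above).
set.seed(1)
d <- expand.grid(time = factor(1:4), group = factor(c("F", "M")), rep = 1:2)
d$y <- rnorm(nrow(d))

fit <- lm(y ~ time * group, data = d)   # intercept kept in the model

# Each row of K is one linear combination of the coefficients.
K <- matrix(0, nrow = 2, ncol = length(coef(fit)),
            dimnames = list(c("time1, group F", "time1, group M"),
                            names(coef(fit))))
K["time1, group F", "(Intercept)"] <- 1                 # mean of cell (time1, F)
K["time1, group M", c("(Intercept)", "groupM")] <- 1    # mean of cell (time1, M)

# Base-R estimates and standard errors of the two combinations.
est <- drop(K %*% coef(fit))
se  <- sqrt(diag(K %*% vcov(fit) %*% t(K)))
print(cbind(Estimate = est, Std.Error = se))

# Same hypotheses through multcomp, if it is available.
if (requireNamespace("multcomp", quietly = TRUE)) {
  print(summary(multcomp::glht(fit, linfct = K)))
}
```

The same matrix approach should carry over to other combinations, for example the time1 mean averaged over the two groups, by putting 1 on (Intercept) and 0.5 on groupM.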

Thanks,
Gang

On 4/16/08, Prof Brian Ripley <ripley_at_stats.ox.ac.uk> wrote:
> On Wed, 16 Apr 2008, Doran, Harold wrote:
>
>
> > R may not be giving you what you want, but it is doing the right thing.
> > You can change what the base category is through contrasts but you can't
> > get the marginal effects for every level of all factors because this
> > creates a linear dependence in the model matrix.
> >
>
> I suspect that Time*Group - 1 or Time*(Group-1) come closer to the aim. It
> is the first factor in the model which is coded without contrasts in a
> no-intercept model.
>
> Once you include interactions I think the 'convenience' is largely lost,
> and packages such as multcomp can post-hoc test any (coherent) set of
> hypotheses you choose, irrespective of the model parametrization.
>
>
>
> >
> > > -----Original Message-----
> > > From: r-help-bounces_at_r-project.org
> > > [mailto:r-help-bounces_at_r-project.org] On Behalf Of Gang Chen
> > > Sent: Monday, April 14, 2008 5:38 PM
> > > To: r-help_at_stat.math.ethz.ch
> > > Subject: [R] Formula with no intercept
> > >
> > > I'm trying to analyze a model with two variables, one is
> > > Group with two levels (male and female), and other is Time
> > > with four levels (T1, T2, T3 and T4). And for the convenience
> > > of post-hoc testing I wanted to consider a model with no
> > > intercept for factor Time, so I tried formula
> > >
> > > Group*(Time-1)
> > >
> > > However this seems to give me the following terms in the model
> > >
> > > GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4,
> > > GroupMale:TimeT2, GroupMale:TimeT3, GroupMale:TimeT4,
> > > GroupFemale:TimeT2, GroupFemale:TimeT3, GroupFemale:TimeT4
> > >
> > > which is not exactly what I wanted. Also it seems (Group-1)*Time and
> > > (Group-1)*(Time-1) also give me exactly the same set of terms
> > > as Group*(Time-1).
> > >
> > > So I have some conceptual trouble understanding this. And how
> > > could I create a model with terms including all the levels of
> > > factor Time?
> > >
> > > Thanks,
> > > Gang



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 17 Apr 2008 - 17:09:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 17 Apr 2008 - 17:30:29 GMT.
