From: Daniel Malter <daniel_at_umd.edu>

Date: Wed, 5 Dec 2007 16:02:48 -0500

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

You estimate a model with an intercept in which the factors A and B are each
either present (1) or not present (0). The model therefore predicts:

For both A and B not present: Intercept

For A only present: Intercept+coef(A)

For B only present: Intercept+coef(B)

For both present: Intercept+coef(A)+coef(B).
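To make the four cases concrete, here is a minimal sketch that plugs in the coefficients quoted in the original question below (2.716667, 6.266667 and 5.166667). The names b0, bA, bB and cells are only for illustration, and the 0/1 columns stand for level 1 / level 2 of each factor.

```r
# Coefficients copied from the coef(summary(lm(...))) output in the question
b0 <- 2.716667   # (Intercept)
bA <- 6.266667   # A2
bB <- 5.166667   # B2

# All four combinations: 0 = baseline level, 1 = second level (illustrative coding)
cells <- expand.grid(A = c(0, 1), B = c(0, 1))

# Additive model: intercept plus whichever effects are "switched on"
cells$predicted <- b0 + bA * cells$A + bB * cells$B
print(cells)
```

The first row (A = 0, B = 0) is just the intercept; every other row adds one or both coefficients to it.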

Again, you would interpret the intercept as the value of "fruit" when A and B are not present (or inactive). If the intercept is not meaningful in your setting and you just want to know whether the groups differ, then you probably want the function aov. What is your "fruit" variable? I would also suggest inspecting your data visually. That always helps :) The code for that is included below as well.
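One way to see why a single intercept can serve both factors is to look at the design matrix R builds under its default treatment contrasts. This is only a sketch: the data frame d is made up, with factor levels 1 and 2 named to match the A2 and B2 rows in the question.

```r
# Hypothetical 2 x 2 layout with factor levels named as in the question
d <- expand.grid(A = factor(c(1, 2)), B = factor(c(1, 2)))

# Default treatment contrasts: a column of ones plus one dummy per non-baseline level
X <- model.matrix(~ A + B, data = d)
print(cbind(d, X))
```

The (Intercept) column is 1 in every row, while A2 and B2 are 1 only where that factor is at its second level. The row with A = 1 and B = 1 has both dummies at 0, so the intercept is the only term left.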

Look at the following example, in which 4 x 10 Ys are drawn randomly from normal distributions with equal variance but different means. The first ten observations have both A and B absent (i.e. 0), as specified in the vectors "a" and "b". The mean for these observations is 1, as specified in y1=rnorm(10, -> 1 <-,1). When I ran this code, the estimated intercept was 1.0512, which is close to 1 (the true mean); your number will differ somewhat because the draws are random. This confirms what was said above: the intercept is the model's estimate of the mean of the baseline (or reference group, if you will), where both A and B are absent.

y1=rnorm(10,1,1)

y2=rnorm(10,2,1)

y3=rnorm(10,3,1)

y4=rnorm(10,4,1)

a=c(rep(0,20),rep(1,20))

b=c(rep(0,10),rep(1,10),rep(0,10),rep(1,10))

y=c(y1,y2,y3,y4)

data=data.frame(cbind(y,a,b))

####Plot####

interaction.plot(a,b,y)

####Models####

summary(lm(y~factor(a)+factor(b),data=data))

####Compare this to####

summary(aov(y~factor(a)+factor(b),data=data))
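For a run that gives the same numbers every time, the simulation can be repeated with a fixed seed and the fitted cell values compared with the observed cell means. This is a sketch under assumptions: the seed and the names dat, fit and cells are my own choices and were not part of the original post.

```r
set.seed(1)  # assumption: arbitrary seed, only so the run can be repeated

# Same design as above: 4 groups of 10 with true means 1, 2, 3, 4
y <- c(rnorm(10, 1, 1), rnorm(10, 2, 1), rnorm(10, 3, 1), rnorm(10, 4, 1))
a <- factor(rep(c(0, 1), each = 20))
b <- factor(rep(c(0, 1, 0, 1), each = 10))
dat <- data.frame(y, a, b)

fit <- lm(y ~ a + b, data = dat)

# Fitted value for each of the four a/b combinations
cells <- expand.grid(a = factor(c(0, 1)), b = factor(c(0, 1)))
cells$fitted <- predict(fit, newdata = cells)

# Observed mean of y in each cell, in the same row order as 'cells'
cells$observed <- as.vector(tapply(dat$y, list(dat$a, dat$b), mean))

print(coef(fit))
print(cells)
```

The fitted value in the a = 0, b = 0 row is exactly the intercept. Because the model has no interaction term, the fitted values will generally be close to, but not identical with, the observed cell means.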

Cheers,

Daniel

cuncta stricte discussurus

coef( summary ( lm ( fruit ~ A + B, data = test)))

            Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 2.716667  0.5484828 4.953058 7.879890e-04
A2          6.266667  0.6333333 9.894737 3.907437e-06
B2          5.166667  0.6333333 8.157895 1.892846e-05

I understand that the mean of A2 is 6.3 higher than that of A1, and that B2 is 5.2 higher than B1.

So the question is: Is the intercept A1 and B1 combined as one mean ("the baseline")? or is it something else? Does this number actually tell me anything useful (2.716)??

What does the model (y = intercept + ??) look like then? I can't understand how both factors (A and B) can have the same intercept?

Thanks in advance!!

Gustaf Granath

Dept of Plant Ecology

Uppsala University, Sweden

Received on Wed 05 Dec 2007 - 21:05:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 05 Dec 2007 - 22:30:17 GMT.
