From: John Maindonald <john.maindonald_at_anu.edu.au>

Date: Tue 13 Sep 2005 - 10:48:02 EST

For the record, it turns out that EXPNO ran from 1 to 20, i.e., it identified subject.

Thus EXPNO/COND was parsed into two error terms (additional to the residual), EXPNO and EXPNO:COND. This second error term accounts for all variation between levels of COND, so there is no COND sum of squares. (In SPSS the fixed effect COND may have taken precedence; I do not know for sure.)
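The expansion of the a/b form can be checked directly from the formula. The following is a minimal sketch, not code from the original exchange; it relies only on terms() from the stats package, and the factor names are used purely as symbols.

```r
## Sketch: how R expands the nesting operator in a model formula.
## EXPNO and COND are just symbols here; no data are needed.
labs <- attr(terms(~ EXPNO/COND), "term.labels")
print(labs)   # the two terms that Error(EXPNO/COND) turns into strata
```

The same check applies to any a/b term before it is placed inside Error().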

In R, if this was a completely randomized design, the term Error(EXPNO), or in the mock-up example I gave Error(subj), would be enough on its own.

The R implementation can handle error terms akin to Error(REPNO/subj) but, because the REPNO:subj term generates redundant model matrix columns, it complains that the Error() model is singular.

In general, terms of the form a/b should be used only if b is nested within a, i.e.,

  REPNO/IndividualWithinBlock

(where IndividualWithinBlock runs from 1 to 4), not REPNO/subj.

(Either of these causes REPNO to be treated as a blocking factor.)
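The redundancy can be seen by comparing the number of model matrix columns with the matrix rank for the two forms. This is a rough sketch under stated assumptions: the layout copies the mock-up (5 blocks by 4 conditions, one subject per cell), and IndividualWithinBlock is a hypothetical label built here from the condition index.

```r
## Sketch: 5 blocks (REPNO) x 4 conditions, one subject per cell, 2 times.
xy <- expand.grid(REPNO = letters[1:5], COND = letters[1:4],
                  TIME = factor(1:2))
xy$subj <- factor(paste(xy$REPNO, xy$COND, sep = ":"))  # 20 distinct subjects
## Hypothetical within-block label running from 1 to 4 inside every block
xy$IndividualWithinBlock <- factor(as.integer(factor(xy$COND)))

m_subj   <- model.matrix(~ REPNO/subj, data = xy)
m_nested <- model.matrix(~ REPNO/IndividualWithinBlock, data = xy)

## Columns versus rank: the subj version carries redundant columns,
## the properly nested version does not.
print(c(columns = ncol(m_subj),   rank = qr(m_subj)$rank))
print(c(columns = ncol(m_nested), rank = qr(m_nested)$rank))
```

Both formulas describe the same 20 subjects; only the second labels them in a way that matches the a/b nesting convention.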

> xy <- expand.grid(REPNO=letters[1:5], COND=letters[1:4],
+                   TIME=factor(paste(1:2)))
> xy$subj <- factor(paste(xy$REPNO, xy$COND, sep=":"))
> ## Below subj becomes EXPNO
> xy$COND <- factor(xy$COND)
> xy$REPNO <- factor(xy$REPNO)
> xy$y <- rnorm(40)
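With mock-up data of this shape, a fit using Error(subj) alone might look like the sketch below. The seed and the name fit are arbitrary choices made for illustration, and y is pure noise, so only the degrees of freedom are of interest. With 20 subjects and 4 conditions the subj stratum leaves 16 residual degrees of freedom, the figure on the SPSS error line.

```r
## Sketch: repeated measures fit with subject as the only error term.
set.seed(1)                                   # arbitrary seed; y is noise
xy <- expand.grid(REPNO = letters[1:5], COND = letters[1:4],
                  TIME = factor(paste(1:2)))
xy$subj <- factor(paste(xy$REPNO, xy$COND, sep = ":"))  # stands in for EXPNO
xy$y <- rnorm(nrow(xy))

fit <- aov(y ~ COND * TIME + Error(subj), data = xy)
print(summary(fit))

## Residual degrees of freedom in each stratum
df_between <- fit$subj$df.residual    # COND is tested in this stratum
df_within  <- fit$Within$df.residual  # TIME and COND:TIME are tested here
print(c(between = df_between, within = df_within))
```

COND is tested against between-subject variation, while TIME and COND:TIME are tested against within-subject variation.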

Plea to those who post such questions to the list:

Please include either a toy data set or, if the actual data set is small, lists of factor values. If you are happy to make the information public, give the result vector also (this is less important!). Or you can put the data and, where relevant, your code, on a web site.

Be careful about the use of the word "groups" in an experimental design context; speak of "treatment groups" if that is the meaning, or "blocks" if that is what is intended. I suspect that confusion between these two contexts in which the word groups is wont to be used lay behind the use of the EXPNO/COND form of model formula.

John Maindonald.

On 10 Sep 2005, at 8:00 PM, Larry A Sonna wrote:

> From: "Larry A Sonna" <larry_sonna@hotmail.com>

> Date: 10 September 2005 12:10:06 AM
> To: <r-help@stat.math.ethz.ch>
> Subject: [R] Discrepancy between R and SPSS in 2-way, repeated
> measures ANOVA
>
>
> Dear R community,
>
> I am trying to resolve a discrepancy between the way SPSS and R
> handle 2-way, repeated measures ANOVA.
>
> An experiment was performed in which samples were drawn before and
> after treatment of four groups of subjects (control and disease
> states 1, 2 and 3). Each group contained five subjects. An
> experimental measurement was performed on each sample to yield a
> "signal". The before and after treatment signals for each subject
> were treated as repeated measures. We desire to obtain P values
> for disease state ("CONDITION"), and the interaction between signal
> over time and disease state ("CONDITION*TIME").
>
> Using SPSS, the following output was obtained:
>
>             DF  SumSq (Type 3)  Mean Sq  F value  P=
> COND         3           42861    14287    3.645  0.0355
> TIME         1             473      473    0.175  0.681
> COND*TIME    3             975      325    0.120  0.947
> Error       16           43219     2701
>
> By contrast, using the following R command:
>
> summary(aov(SIGNAL~(COND+TIME+COND*TIME)+Error(EXPNO/COND),
> Type="III"))
>
> the output was as follows:
>
>             Df  Sum Sq  Mean Sq  F value  Pr(>F)
> COND         3   26516     8839   3.2517  0.03651 *
> TIME         1     473      473   0.1739  0.67986
> COND:TIME    3     975      325   0.1195  0.94785
> Residuals   28   76107     2718
>
> I don't understand why the two results are discrepant. In
> particular, I'm not sure why R is yielding 28 DF for the residuals
> whereas SPSS only yields 16. Can anyone help?

John Maindonald             email: john.maindonald@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Sep 13 11:03:08 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 16:57:26 EST