# Re: [R] Lack of independence in anova()

From: Phillip Good <pigood_at_verizon.net>
Date: Tue 05 Jul 2005 - 02:58:15 EST

My bad. The example of a call should read, sim2(size,10000,.16). Originally, the intent of the program was to compare ANOV (the gold standard) with synchronized permutation tests when data is from contaminated normal distributions or the design is unbalanced. The tests for main effects and interactions are always independent with synchronized permutations and ought to be for ANOV with normal data and balanced designs.

-----Original Message-----
From: Douglas Bates [mailto:dmbates@gmail.com] Sent: Monday, July 04, 2005 9:24 AM
To: pigood@verizon.net
Cc: r-help@r-project.org
Subject: Re: [R] Lack of independence in anova()

A couple more comments on this simulation. Notice that the sim2 function is defined with arguments size, N and p but is being called with four arguments. It appears as if the value of p will be 10 in that call.

If you decide to do such a simulation yourself you can save a lot of time by just building the model matrix once and using lm.fit in subsequent calls.

Also, there is no need to do the counting in the body of the sim2 function. Just save the 3 p-values from each replication. The test of independence is equivalent to showing that the distribution of the p-values is uniform over the unit cube.

On 7/4/05, Phillip Good <pigood@verizon.net> wrote:
> If the observations are normally distributed and the 2xk design is
> balanced, theory requires that the tests for interaction and row effects
be
> independent. In my program, appended below, this would translate to cntT
> (approx)= cntR*cntI/N if all R routines were functioning correctly. They
> aren't.
>
> sim2=function(size,N,p){
> cntR=0
> cntC=0
> cntI=0
> cntT=0
> cntP=0
> for(i in 1:N){
> #generate data
> v=gendata(size)
> #analyze after build(ing) design containing data
> lm.out=lm(yield~c*r,build(size,v))
> av.out=anova(lm.out)
> #if column effect is significant, increment cntC
> if (av.out[[5]][1]<=p)cntC=cntC+1
> #if row effect is significant, increment cntR
> if (av.out[[5]][2]<=p){
> cntR=cntR+1
> tmp = 1
> }
> else tmp =0
> if (av.out[[5]][3]<=p){
> #if interaction is significant, increment cntI
> cntI=cntI+1
> #if both interaction and row effect are significant, increment
cntT
> cntT=cntT + tmp
> }
> }
> list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
> }
>
> build=function(size,v){
> #size is a vector containing the sample sizes
> col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
> rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
> row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
> +size[7]+size[8]))
> return(data.frame(c=factor(col), r=factor(row),yield=v))
> }
>
> gendata=function(size){
> ssize=sum(size);
> return (rnorm(ssize))
> }
>
> #Example
> size=c(3,3,3,0,3,3,3,0)
> sim2(size,10000,10,.16)
>
>
>
> Phillip Good
> Huntington Beach CA
>
>
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help