[R] Effect of data set size on calculation

From: <Peter.Watkins_at_foodscience.afisc.csiro.au>
Date: Thu 08 Sep 2005 - 12:21:28 EST


Dear listers,  

I have a piece of code that performs an ANOVA-type analysis on 2D GC data. The code is shown below:

# ANOVA 2D GC analysis
# maxc  <- number of classes
# nreps <- number of replicate samples per class
maxc <- 2
nreps <- 4
sscl <- NULL
cmean <- NULL
#
# Initial stat. variables
#
dftot <- nrow(mat)-1
dfcl <- maxc - 1
dferr <- dftot - dfcl
totmean <- mean(mat)
sstot <- sd(mat)^2*dftot
#
# Calculate class-to-class variance
#
for (j in 1:maxc) {
  cmean <- rbind(cmean,mean(mat[((j-1)*nreps+1):((j-1)*nreps+nreps),]))
}
for (j in 1:ncol(mat)) {
  cmean[,j] <- cmean[,j]-totmean[j]
}
cmean <- (cmean)^2*nreps
for (i in 1:ncol(mat)) {
  sscl[i] <- sum(cmean[,i])
}
#
# sserr <- sstot-sscl
#
ratios <- (sscl/dfcl)/((sstot-sscl)/dferr)
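For comparison, the same steps (per-column total sum of squares, class means, class-to-class sum of squares, F ratio) can be sketched without explicit loops. This is only a sketch and not the code that produced the output in this post. The helper name f_ratios() is hypothetical, and it assumes the rows of mat are ordered class by class with nreps rows per class. It uses colMeans(), colSums(), sweep() and rowsum() from base R, which work per column whether mat is a matrix or a data frame.

```r
# Sketch only: f_ratios() is a hypothetical helper, not part of the original code.
# Assumes rows of `mat` are samples ordered class by class, nreps rows per class.
f_ratios <- function(mat, maxc, nreps) {
  mat <- as.matrix(mat)                  # accept a data frame or a matrix
  stopifnot(nrow(mat) == maxc * nreps)

  dftot <- nrow(mat) - 1
  dfcl  <- maxc - 1
  dferr <- dftot - dfcl

  totmean <- colMeans(mat)                        # grand mean of each column
  sstot   <- colSums(sweep(mat, 2, totmean)^2)    # total sum of squares per column

  # Class means: one row per class, one column per variable
  cls   <- rep(seq_len(maxc), each = nreps)
  cmean <- rowsum(mat, cls) / nreps

  # Class-to-class (between-class) sum of squares per column
  sscl <- nreps * colSums(sweep(cmean, 2, totmean)^2)

  (sscl / dfcl) / ((sstot - sscl) / dferr)
}

# Usage on made-up data: 2 classes x 4 replicates, 5 columns
set.seed(1)
demo <- matrix(rnorm(8 * 5, mean = 27), nrow = 8)
ratios_demo <- f_ratios(demo, maxc = 2, nreps = 4)
length(ratios_demo)   # one ratio per column of demo
```

Because every step is computed column by column, the result always has one ratio per column, regardless of how many columns mat has.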

I have tested this code on a small data set (the data averaged over the second dimension) and it produced a meaningful result. However, when I analyse the data with both dimensions (a much larger data set), the analysis fails. I've narrowed the problem down to the calculation of cmean, but I have no idea why it goes wrong. If anyone has any suggestions, please feel free to comment. The relevant output is given below.

Many thanks, Peter.  

# Averaged dataset

> ncol(mat)
[1] 636
> nrow(mat)
[1] 8

[SNIP]

> for (j in 1:maxc) {
+ cmean <- rbind(cmean,mean(mat[((j-1)*nreps+1):((j-1)*nreps+nreps),]))
+ }
> cmean
           V2       V3       V4       V5       V6       V7       V8       V9
[1,] 27.38970 27.68816 27.80730 27.72688 27.68044 27.33749 6667.038 15537.47
[2,] 26.36001 26.72920 26.64940 26.82506 26.54539 26.30811 8029.746 13656.60

... [SNIP]

         V634     V635     V636     V637
[1,] 27.51868 27.51270 27.52344 27.52127
[2,] 26.45830 26.45837 26.46089 26.46407
>

# Full dataset

> nrow(mat)
[1] 8
> ncol(mat)
[1] 390010

[SNIP]

> for (j in 1:maxc) {
+ cmean <- rbind(cmean,mean(mat[((j-1)*nreps+1):((j-1)*nreps+nreps),]))
+ }
> cmean
         [,1]
[1,] 54.48274
[2,] 63.14705
>
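The one-column cmean in the full-dataset output can be reproduced on made-up data, as a sketch. It assumes, which the post does not state, that the full dataset is held as a plain numeric matrix. The column names V2, V3, ... in the averaged output suggest that object may instead be a data frame, for which mean() in R versions of this period returned one mean per column. On a matrix, mean() averages every element of the selected block, so each pass of the loop contributes a single number. colMeans() returns one mean per column in both cases. The object small below is hypothetical.

```r
# Made-up 8 x 3 matrix standing in for the real data (hypothetical values)
small <- matrix(as.numeric(1:24), nrow = 8)
maxc  <- 2
nreps <- 4

by_mean     <- NULL   # built with mean(), as in the loop in the post
by_colmeans <- NULL   # built with colMeans()
for (j in 1:maxc) {
  rows <- ((j - 1) * nreps + 1):(j * nreps)
  by_mean     <- rbind(by_mean, mean(small[rows, ]))
  by_colmeans <- rbind(by_colmeans, colMeans(small[rows, ]))
}

is.matrix(small)   # TRUE: mean() collapses each block of rows to one value
dim(by_mean)       # 2 rows, 1 column, the same shape as the full-dataset cmean
dim(by_colmeans)   # 2 rows, 3 columns, one per column of small
```

Checking is.matrix(mat) and is.data.frame(mat) for both datasets would show whether this assumption holds for the real data.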




R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Sep 08 12:27:58 2005
