[R] 'mean' and 'sd' calculations do not match

From: Ulrich Leopold <uleopold_at_science.uva.nl>
Date: Thu 08 Dec 2005 - 21:50:25 EST


Dear list,

I am using R 2.1.1 on Fedora 3 Linux, on a 32-bit PC.

When I compute the aggregated mean and the aggregated standard deviation, I get standard deviation values for factor combinations where the mean was not computed (it is NA). It seems to me that this is somehow related to the NA values, but I don't quite understand what is going wrong.

Could it already be related to the data import? In some of the imported columns the missing values show up as NA and in others as <NA>, although they all come from the same value, -9999.

I used the code below. Below the code are parts of the results.

Cheers, Ulrich

Data import:

chemicS <- read.table("ChemieUlli_4_Quellen.csv", header = TRUE, sep = ",", na.strings = "-9999")

Count         EC    NO3    NO2   NH4
 3504   630.0000  33.00  0.001  0.01
 3505         NA  26.66   <NA>  <NA>
 3506         NA   0.72   <NA>  <NA>
 3507         NA     NA   <NA>  <NA>
 3508         NA     NA   <NA>  <NA>
 3509         NA     NA   <NA>  <NA>
 3510  1210.0000  14.00  0.001  0.01
 3511  1265.0000  12.00  0.001  0.01
 3512  1400.0000  14.00  0.001  0.01
 3513  1427.0000  12.00  0.001  0.01
 3514  1410.0000   7.00      0     0
 3515  1520.0000   8.00  0.001  0.01
 3516  1470.0000   7.60      0     0
 3517  1170.0000  10.00  0.001  0.01
 3518  4570.0000  20.00  0.001  0.45
 3519  8560.0000   0.50   0.14  0.31
 3520   708.0000  39.00  0.001  0.01
 3521   833.0000  40.00   0.01  0.01
 3522         NA     NA   <NA>  <NA>
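
One possible reading of the NA versus <NA> difference, sketched on made-up input: R prints missing values as NA in numeric columns and as <NA> in factor or character columns, so a column showing <NA> was probably not read as numeric. The CSV text below is hypothetical, and the "<0.001" entry is only an assumed example of a non-numeric value that would cause this; it is not taken from the original file.

```r
# Hypothetical CSV text, not the original file. The "<0.001" entry is an
# assumed example of a non-numeric value in an otherwise numeric column.
csv_text <- "EC,NO2
630,0.001
-9999,-9999
1210,<0.001"

d <- read.table(text = csv_text, header = TRUE, sep = ",",
                na.strings = "-9999")

str(d)             # EC is numeric; NO2 is factor or character (depends on R version)
print(d)           # the missing EC prints as NA, the missing NO2 as <NA>
is.numeric(d$EC)   # TRUE
is.numeric(d$NO2)  # FALSE
```

Running str() on the imported data frame would show which columns, if any, are affected in the real data.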

Computing the mean:

aggregate(chemicS$EC, by = list(east=chemicS$EST, north=chemicS$NORD), FUN = mean)

Count     east    north       Mean
  350    89885   103160  318.50000
  351    55870   103510  400.00000
  352    82570   104845  637.33333
  353    79119   107433         NA
  354    79160   107462  362.77778
  355    83010   108990         NA
  356    82810   109010         NA
  357    69135   112992         NA
  358    55490   120140  142.25000
  359    56580   120600         NA
  360    56582   120607         NA
  361    58050   125350         NA
  362    58059   125360         NA
  363    60360   128191         NA
  364    65448   128293  252.50000
  365  65472.5 128308.1         NA
  366    61412   131141         NA

Computing the standard deviation:

aggregate(chemicS$EC, by = list(east=chemicS$EST, north=chemicS$NORD), FUN = sd, na.rm = TRUE)

Count     east    north       Stdev.
  350    89885   103160    4.9497475
  351    55870   103510           NA
  352    82570   104845   19.6553640
  353    79119   107433           NA
  354    79160   107462   73.6745848
  355    83010   108990           NA
  356    82810   109010   15.6950098
  357    69135   112992           NA
  358    55490   120140    5.3150729
  359    56580   120600           NA
  360    56582   120607   22.4435801
  361    58050   125350           NA
  362    58059   125360   23.3108523
  363    60360   128191   20.9789577
  364    65448   128293   10.6066017
  365  65472.5 128308.1           NA
  366    61412   131141    8.6184556
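
The two calls above differ in one argument: sd is passed na.rm = TRUE, mean is not. A minimal sketch on made-up data (the site labels and EC values are hypothetical, not from the CSV file) suggests this alone may produce the pattern seen in the two tables: a group containing an NA gets a mean of NA but still a standard deviation, while a group with a single value gets a mean but a standard deviation of NA.

```r
# Made-up data: group "a" is complete, group "b" contains one NA,
# group "c" has only a single observation.
d <- data.frame(
  site = c("a", "a", "b", "b", "b", "c"),
  EC   = c(10, 14, 100, NA, 120, 400)
)

# Mean without na.rm (as in the mean call above): NA for group "b"
m_default <- aggregate(d$EC, by = list(site = d$site), FUN = mean)

# Mean with na.rm = TRUE: the NA in group "b" is dropped first
m_narm <- aggregate(d$EC, by = list(site = d$site), FUN = mean, na.rm = TRUE)

# Standard deviation with na.rm = TRUE (as in the sd call above):
# defined for "b", but NA for the single-value group "c"
s_narm <- aggregate(d$EC, by = list(site = d$site), FUN = sd, na.rm = TRUE)

cbind(m_default, mean_narm = m_narm$x, sd_narm = s_narm$x)
```

If this is what is happening, passing na.rm = TRUE to both calls should make the NA patterns of the two results comparable.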

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Dec 08 21:57:08 2005