Re: [R] 'mean' and 'sd' calculations do not match

From: Petr Pikal <petr.pikal_at_precheza.cz>
Date: Fri 09 Dec 2005 - 00:18:12 EST


Hi

what you are seeing is the difference between factors and numbers.

Columns printed with <NA> are factors,
columns printed with NA are numeric.

You can check this with

str(chemicS)

which will show the structure of your data.
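
As a small illustration of the point above (toy data made up for this example, not the original file; the data frame d and its values are assumptions), a numeric column prints missing values as NA, a factor column prints them as <NA>, and str() shows the type of each column:

```r
# Toy data, NOT the original file: one numeric and one factor column.
d <- data.frame(
  EC  = c(630, NA, 1210),                  # numeric column
  NO2 = factor(c("0.001", NA, "0.14"))     # factor column
)

print(d)          # missing values show as NA in EC and as <NA> in NO2
str(d)            # reports EC as num and NO2 as Factor
sapply(d, class)  # class of every column at a glance
```

The same sapply(chemicS, class) call should list which columns of the real data ended up as factors.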

So either convert the factors with
as.numeric(as.character())
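
A minimal sketch of why both steps are needed, using a made-up factor f (not taken from the data in this thread): as.numeric() applied directly to a factor returns the internal level codes, so the values have to go through as.character() first.

```r
# Hypothetical factor standing in for a column such as chemicS$NO2.
f <- factor(c("0.001", "0.14", NA, "0.01"))

codes <- as.numeric(f)                 # internal level codes, not the measurements
x     <- as.numeric(as.character(f))   # the actual values, NA stays NA

print(codes)
print(x)

# Applied to a real column this would look like (assumed column name):
# chemicS$NO2 <- as.numeric(as.character(chemicS$NO2))
```

Entries that cannot be read as numbers become NA with a warning, so it is worth checking levels(f) before converting.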

or read the data in again, forcing those columns to numeric, see

?read.table
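
For this second option, a sketch under the assumption that every column in the file is meant to be numeric (the inline text below is invented and only stands in for the csv file): the colClasses argument tells read.table the type of each column up front.

```r
# Invented csv content, standing in for the real file.
csv <- "EC,NO2\n630,0.001\n-9999,-9999\n1210,0.14\n"

d <- read.table(text = csv, header = TRUE, sep = ",",
                na.strings = "-9999",      # same missing-value code as in the thread
                colClasses = "numeric")    # recycled to all columns

str(d)   # both columns are num, -9999 has become NA
```

If a column contains an entry that cannot be parsed as a number, read.table stops with an error instead of quietly creating a factor, which helps to locate the offending value.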

HTH
Petr

On 8 Dec 2005 at 11:50, Ulrich Leopold wrote:

From:           	Ulrich Leopold <uleopold@science.uva.nl>
To:             	R-help <R-help@stat.math.ethz.ch>
Organization:   	University of Amsterdam
Date sent:      	Thu, 08 Dec 2005 11:50:25 +0100
Subject:        	[R] 'mean' and 'sd' calculations do not match

> Dear list,
>
> I am using R 2.1.1 on a Fedora 3 Linux, 32 bit PC.
>
> If I compute the aggregated mean and the standard deviation I get
> standard deviation values for factors where the mean was not computed.
> It seems to me that this is somehow related to the NA values. But I
> don't quite understand what is going wrong?
>
> Could it be related to the data import already? Some of the imported
> data got the character strings NA and others <NA>. But they are
> defined from the same values, -9999.
>
> I used the code below. Below the code are parts of the results.
>
> Cheers, Ulrich
>
> Data import:
>
> chemicS <- read.table("ChemieUlli_4_Quellen.csv", header = TRUE,
>                       sep = ",", na.strings = "-9999")
>
> Count        EC   NO3   NO2  NH4
>  3504  630.0000 33.00 0.001 0.01
>  3505        NA 26.66  <NA> <NA>
>  3506        NA  0.72  <NA> <NA>
>  3507        NA    NA  <NA> <NA>
>  3508        NA    NA  <NA> <NA>
>  3509        NA    NA  <NA> <NA>
>  3510 1210.0000 14.00 0.001 0.01
>  3511 1265.0000 12.00 0.001 0.01
>  3512 1400.0000 14.00 0.001 0.01
>  3513 1427.0000 12.00 0.001 0.01
>  3514 1410.0000  7.00     0    0
>  3515 1520.0000  8.00 0.001 0.01
>  3516 1470.0000  7.60     0    0
>  3517 1170.0000 10.00 0.001 0.01
>  3518 4570.0000 20.00 0.001 0.45
>  3519 8560.0000  0.50  0.14 0.31
>  3520  708.0000 39.00 0.001 0.01
>  3521  833.0000 40.00  0.01 0.01
>  3522        NA    NA  <NA> <NA>
>
> Computing the mean:
>
> aggregate(chemicS$EC,
>           by = list(east = chemicS$EST, north = chemicS$NORD),
>           FUN = mean)
>
> Count    east    north      Mean
>   350   89885   103160 318.50000
>   351   55870   103510 400.00000
>   352   82570   104845 637.33333
>   353   79119   107433        NA
>   354   79160   107462 362.77778
>   355   83010   108990        NA
>   356   82810   109010        NA
>   357   69135   112992        NA
>   358   55490   120140 142.25000
>   359   56580   120600        NA
>   360   56582   120607        NA
>   361   58050   125350        NA
>   362   58059   125360        NA
>   363   60360   128191        NA
>   364   65448   128293 252.50000
>   365 65472.5 128308.1        NA
>   366   61412   131141        NA
>
> Computing the standard deviation:
>
> aggregate(chemicS$EC,
>           by = list(east = chemicS$EST, north = chemicS$NORD),
>           FUN = sd, na.rm = TRUE)
>
> Count    east    north     Stdev.
>   350   89885   103160  4.9497475
>   351   55870   103510         NA
>   352   82570   104845 19.6553640
>   353   79119   107433         NA
>   354   79160   107462 73.6745848
>   355   83010   108990         NA
>   356   82810   109010 15.6950098
>   357   69135   112992         NA
>   358   55490   120140  5.3150729
>   359   56580   120600         NA
>   360   56582   120607 22.4435801
>   361   58050   125350         NA
>   362   58059   125360 23.3108523
>   363   60360   128191 20.9789577
>   364   65448   128293 10.6066017
>   365 65472.5 128308.1         NA
>   366   61412   131141  8.6184556
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Petr Pikal
petr.pikal@precheza.cz



Received on Fri Dec 09 00:31:50 2005
