Re: [R] How to join data.frames and vectors of different length, in an intelligent way?

From: Chuck Cleland <ccleland_at_optonline.net>
Date: Tue, 10 Jun 2008 10:24:43 -0400

   You could put the group averages back into dafSamp using ave():

dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                            c(117,73,92,113,80,78,98,106,99)))

dafSamp$Ay <- ave(dafSamp$X2, dafSamp$X1, FUN=mean)

dafSamp$vecAA <- dafSamp$X2 * (dafSamp$Ay / mean(dafSamp$X2))

dafSamp

    X1  X2       Ay     vecAA
1 1972 117 117.0000 143.92640
2 1984  73  89.5000  68.69334
3 1969  92  92.0000  88.99065
4 1976 113 103.3333 122.76869
5 1999  80  80.0000  67.28972
6 1996  78  78.0000  63.96729
7 1976  98 103.3333 106.47196
8 1984 106  89.5000  99.74650
9 1976  99 103.3333 107.55841

?ave
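
In case it helps to see why ave() fits here: it returns one value per row, in the original row order, so the result has the same length as the column it was computed from. The sketch below compares it with a per-year lookup built from tapply(). The names Ay_rows, Ay_tab, Ay_lookup and n_rows are only illustrative, and the data frame is rebuilt with explicit column names X1 and X2 so the block runs on its own.

```r
# Same sample data as above (X1 = year, X2 = value)
dafSamp <- data.frame(X1 = c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                      X2 = c(117,73,92,113,80,78,98,106,99))

# ave(): one group mean per ROW, aligned with the rows of dafSamp
Ay_rows <- ave(dafSamp$X2, dafSamp$X1, FUN = mean)

# Alternative: one mean per YEAR, then look each row's year up by name
Ay_tab    <- tapply(dafSamp$X2, dafSamp$X1, mean)
Ay_lookup <- as.vector(Ay_tab[as.character(dafSamp$X1)])

# Both approaches give a vector with one entry per row
length(Ay_rows) == nrow(dafSamp)
all.equal(Ay_rows, Ay_lookup)

# ave() can also attach the number of samples per year to each row
n_rows <- ave(dafSamp$X2, dafSamp$X1, FUN = length)
```

Because the result is already row-aligned, it can be assigned straight into a new column, which is what the dafSamp$Ay line above does.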

On 6/10/2008 9:05 AM, Hvidberg, Martin wrote:
> I have a data set something like this:
>
> "YYYY", "Value"
> 1972 , 117
> 1984 , 73
> 1969 , 92
> 1976 , 113
> 1999 , 80
> 1996 , 78
> 1976 , 98
> 1984 , 106
> 1976 , 99
>
> It could be created with:
>
> > dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
>
> The real dataset is of course much larger, approx. 100,000 samples.
>
> I need to adjust each value to remove any tendency of some years generally having higher values and others lower, since this is an unwanted artifact of different measuring traditions.
>
> My plan is to generate an average for each year, Ay, as well as a global average, Ag. Then each value should be multiplied by Ay/Ag.
>
> I can make the averages like this:
>
> > Ag <- mean(dafSamp[,2])
> > Ag
> [1] 95.11111
>
> > Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
> > Ay
>   Group.1        x
> 1    1969  92.0000
> 2    1972 117.0000
> 3    1976 103.3333
> 4    1984  89.5000
> 5    1996  78.0000
> 6    1999  80.0000
>
> To see how many samples there are from each year I could write:
>
> > Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
> > Cy
>   Group.1 x
> 1    1969 1
> 2    1972 1
> 3    1976 3
> 4    1984 2
> 5    1996 1
> 6    1999 1
>
> I would like to create a new vector with the adjusted values (dafSamp[,2] * Ay(for the relevant year) / Ag).
>
> I tried to write:
>
> vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag
>
> but the result is all NAs :-( I might have seen that coming: they are not the same length...
>
> Question: How do I go about making such a calculation?
>
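
A possible explanation for the NAs, offered as a sketch rather than a definitive diagnosis: == compares the two vectors element by element and recycles the shorter one (6 years against 9 rows), so it never looks each row's year up in Ay. With this sample the only position where the recycled years happen to agree is 9, and Ay has no row 9, so the lookup returns NA and the whole product becomes NA. If you want to keep the separate Ay table, match() is one way to do the lookup. The names idx_bad and idx are illustrative; Group.1 and x are the column names aggregate() printed above.

```r
dafSamp <- data.frame(X1 = c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                      X2 = c(117,73,92,113,80,78,98,106,99))
Ag <- mean(dafSamp$X2)
Ay <- aggregate(x = dafSamp$X2, by = list(dafSamp$X1), FUN = mean)

# Element-wise comparison with recycling: not a per-row lookup.
# R warns here because 9 is not a multiple of 6 (suppressed for the demo).
idx_bad <- suppressWarnings(which(Ay$Group.1 == dafSamp$X1))
Ay[idx_bad, 2]                 # indexes a row that does not exist in Ay

# match() gives, for every row of dafSamp, the position of its year in Ay
idx   <- match(dafSamp$X1, Ay$Group.1)
vecAA <- dafSamp$X2 * Ay$x[idx] / Ag
vecAA
```

This should give the same vecAA column as the ave() version at the top of this message; ave() just skips the intermediate table.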
> :-) Martin Hvidberg
>
> Here is the code in full, if you want to try it...
>
> dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
> Ag <- mean(dafSamp[,2])
> Ag
> Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
> Ay
> Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
> Cy
> vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag
>
> University of Aarhus <http://www.au.dk/en> Danmarks Miljøundersøgelser <http://www.dmu.dk/>
>
> Hvidberg, Martin <http://www2.dmu.dk/1_Om_DMU/2_medarbejdere/cv/employee2_NH.asp?PersonID=MHV>
> Senior Geographer (Climatology, Spatial modeling) <http://www.geogr.ku.dk/>
> N 55°41m43.48s E 12°06m05.13s ETRS89
> National Environmental Research Inst. <http://www.dmu.dk/International/>
> P.O. Box 358
> Frederiksborgvej 399
> DK-4000 Roskilde
> Martin.Hvidberg_at_dmu.dk
> www.dmu.dk/AtmosphericEnvironment/ tel:
> fax: +45 46 30 11 55
> +45 46 30 12 14
>
> ------------------------------------------------------------------------
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Received on Tue 10 Jun 2008 - 14:28:26 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 11 Jun 2008 - 10:30:42 GMT.
