[R] How to join data.frames and vectors of different length, in an intelligent way?

From: Hvidberg, Martin <mhv_at_dmu.dk>
Date: Tue, 10 Jun 2008 15:05:41 +0200


I have a data set something like this:  

"YYYY", "Value"
1972 , 117
1984 , 73
1969 , 92
1976 , 113
1999 , 80
1996 , 78
1976 , 98
1984 , 106
1976 , 99

It can be created with:

> dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
 

The real dataset is of course much larger, approx. 100,000 samples.

I need to adjust each value to remove any tendency of some years generally having higher values and others lower, since this is an unwanted artifact from different measuring traditions.

My plan is to generate an average for each year, Ay, as well as a global average, Ag. Then each value should be multiplied by Ay/Ag.
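As a minimal sketch of this plan (not necessarily the best approach for 100,000 rows), base R's ave() returns the group mean repeated for every row, so the result already has the same length as the data. The column names YYYY and Value and the names AyRow and Adjusted are assumptions made for this illustration; the original data frame has default column names.

```r
# Sample data from the post; the column names are assumed for readability
dafSamp <- data.frame(
  YYYY  = c(1972, 1984, 1969, 1976, 1999, 1996, 1976, 1984, 1976),
  Value = c(117, 73, 92, 113, 80, 78, 98, 106, 99)
)

# Global average over all samples
Ag <- mean(dafSamp$Value)

# Year average repeated once per row (ave() uses mean by default)
AyRow <- ave(dafSamp$Value, dafSamp$YYYY)

# Adjustment exactly as stated in the plan: value * Ay / Ag
dafSamp$Adjusted <- dafSamp$Value * AyRow / Ag
```

If the aim is to cancel the year effect rather than reinforce it, the reciprocal factor Ag / AyRow may be what is wanted; that is a guess about the intent, not something stated in the post.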

I can make the averages like this:  

> Ag <- mean(dafSamp[,2])
> Ag
[1] 95.11111

> Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
> Ay
  Group.1        x
1    1969  92.0000
2    1972 117.0000
3    1976 103.3333
4    1984  89.5000
5    1996  78.0000
6    1999  80.0000

To see how many samples from each year I could write:  

> Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
> Cy
  Group.1 x
1    1969 1
2    1972 1
3    1976 3
4    1984 2
5    1996 1
6    1999 1
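For the counts alone, table() is a shorter alternative to aggregate() with FUN='length'. This is only a sketch; the names yrs and Cy2 are made up for the illustration.

```r
# Year column of the sample data from the post
yrs <- c(1972, 1984, 1969, 1976, 1999, 1996, 1976, 1984, 1976)

# Number of samples per year; the result is a named table, sorted by year
Cy2 <- table(yrs)
Cy2
```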

I would like to create a new vector with the adjusted values (dafSamp[,2] * Ay (for the relevant year) / Ag)

I tried to write:  

vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag  

but the result is all NAs :-( I might have seen that coming: the two vectors are not the same length...

Question: How do I go about making such a calculation?
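One possible way, sketched here under the assumption that Ay is built as in the session above: `==` compares the two vectors element by element (recycling the shorter one) rather than looking each year up, whereas match() returns, for every sample year, the row of Ay that holds that year. The name idx is introduced only for this illustration.

```r
dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                            c(117,73,92,113,80,78,98,106,99)))
Ag <- mean(dafSamp[,2])
Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN=mean)

# For each sample, the row number in Ay whose year equals the sample's year
idx <- match(dafSamp[,1], Ay[,1])

# Ay[idx, 2] now has one year average per sample, so the lengths agree
vecAA <- dafSamp[,2] * Ay[idx, 2] / Ag
```

merge(dafSamp, Ay, by.x = 1, by.y = 1) should give a comparable lookup as a data frame, though note that merge() may reorder the rows.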

:-) Martin Hvidberg  

Here is the code in full, if you want to try it...

dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
Ag <- mean(dafSamp[,2])
Ag
Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
Ay
Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
Cy
vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag

University of Aarhus <http://www.au.dk/en>
Danmarks Miljøundersøgelser <http://www.dmu.dk/>

Hvidberg, Martin <http://www2.dmu.dk/1_Om_DMU/2_medarbejdere/cv/employee2_NH.asp?PersonID=MHV>
Senior Geographer (Climatology, Spatial modeling) <http://www.geogr.ku.dk/>
N 55°41m43.48s E 12°06m05.13s ETRS89
National Environmental Research Inst. <http://www.dmu.dk/International/>
P.O. Box 358
Frederiksborgvej 399
DK-4000 Roskilde
Martin.Hvidberg_at_dmu.dk
www.dmu.dk/AtmosphericEnvironment/
tel/fax: +45 46 30 11 55, +45 46 30 12 14




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 10 Jun 2008 - 14:12:23 GMT
