From: Mendiburu, Felipe (CIP) <F.MENDIBURU_at_CGIAR.ORG>

Date: Thu, 31 May 2007 15:06:23 -0500

Felipe de Mendiburu

Statistician

# Spearman correlation "rs" with ties or no ties
rs<-function(x,y) {

d<-rank(x)-rank(y)

tx<-as.numeric(table(x))
ty<-as.numeric(table(y))
Lx<-sum((tx^3-tx)/12)
Ly<-sum((ty^3-ty)/12)

N<-length(x)

SX2<- (N^3-N)/12 - Lx

SY2<- (N^3-N)/12 - Ly

rs<- (SX2+SY2-sum(d^2))/(2*sqrt(SX2*SY2))
return(rs)

}
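
As a rough illustration of the terms inside the function, the intermediate values can be worked out for the six-row IQ/Hours data quoted further down. This is only a sketch: the names iq, hours and rs_value are made up for the example, and the data are typed in by hand rather than read from the file.

```r
# Data from the quoted example; IQ has no ties, Hours has one tie (12, 12)
iq    <- c(106, 86, 97, 113, 120, 110)
hours <- c(7, 0, 20, 12, 12, 17)

N  <- length(iq)                      # 6
tx <- as.numeric(table(iq))           # tie-group sizes: all 1
ty <- as.numeric(table(hours))        # tie-group sizes: 1 1 2 1 1
Lx <- sum((tx^3 - tx) / 12)           # 0
Ly <- sum((ty^3 - ty) / 12)           # (2^3 - 2)/12 = 0.5

SX2 <- (N^3 - N) / 12 - Lx            # 17.5
SY2 <- (N^3 - N) / 12 - Ly            # 17
d2  <- sum((rank(iq) - rank(hours))^2)  # 26.5

rs_value <- (SX2 + SY2 - d2) / (2 * sqrt(SX2 * SY2))
rs_value                              # 0.2319084
```

Only the tied pair in Hours contributes a correction (0.5), and that alone is enough to move the result away from the uncorrected value.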

# Application

> cor(y[,1],y[,2],method="spearman")

[1] 0.2319084

> rs(y[,1],y[,2])

[1] 0.2319084
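
A quick cross-check, offered as a sketch rather than a description of R's internals: with the default average ranks, the value above coincides with the ordinary Pearson correlation of the ranks, whereas the shortcut 1 - 6*sum(d^2)/(N*(N^2-1)) is exact only when there are no ties. The variable names below are illustrative and the data are typed in by hand.

```r
# Same six rows as in the quoted example
iq    <- c(106, 86, 97, 113, 120, 110)
hours <- c(7, 0, 20, 12, 12, 17)

N <- length(iq)
d <- rank(iq) - rank(hours)           # rank() averages tied ranks by default

# Shortcut formula, valid only without ties
rho_shortcut <- 1 - 6 * sum(d^2) / (N * (N^2 - 1))

# Pearson correlation computed on the average ranks
rho_ranks <- cor(rank(iq), rank(hours))

# Built-in Spearman correlation
rho_builtin <- cor(iq, hours, method = "spearman")

rho_shortcut                          # 0.2428571
rho_ranks                             # 0.2319084
rho_builtin                           # 0.2319084
```

If the tie is removed (say, by changing one of the two 12s to a value not already present), all three numbers should agree.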

> y=read.table(file="tmp",header=TRUE,sep="\t")

> y
   IQ Hours
1 106     7
2  86     0
3  97    20
4 113    12
5 120    12
6 110    17

> cor(y[1],y[2],method="spearman")
       Hours
IQ 0.2319084

[It's an abbreviated example of one I took from Wikipedia.] I calculated by hand (apologies if the table looks strange when pasted into e-mail):

   IQ  Hours  rank(IQ)  rank(Hours)  diff  diff^2
1  106     7       3         2        1     1
2   86     0       1         1        0     0
3   97    20       2         6       -4    16
4  113    12       5         3.5      1.5   2.25
5  120    12       6         3.5      2.5   6.25
6  110    17       4         5       -1     1

sum(diff^2) = 26.5
rho = 0.242857

where rho = 1 - (6 * 26.5) / (6 * (6^2 - 1)). I kept modifying the table and realized that the difference in results comes from ties: if I remove the tie in rows 4 and 5, I get the same result from both cor and the hand calculation. Perhaps I'm handling ties wrong. Does anyone know how R does it, or do I need to change how I'm using it?

Thank you!

Ray

