Dear all,

I wrote some code to estimate the overlap between two kernel density estimates. The script must estimate the overlap between every pair of columns of a data frame. With S sampled species (columns) in my data frame, I want to obtain S(S-1)/2 pairwise overlap values between species. However, the code is not well written (only one overlap value is produced) and I can't find the solution.
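To make the count concrete, here is a small illustration (not part of my script) of how the S(S-1)/2 column pairs can be enumerated with combn(); the object names S and pairs are only for this example.

```r
S <- 5                   # number of species (columns), as in the example data
pairs <- combn(S, 2)     # 2-row matrix; each column is one (i, j) pair with i < j
n_pairs <- ncol(pairs)   # number of pairs, equal to S * (S - 1) / 2
n_pairs
```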

To illustrate the calculations, I use the data frame "tdon" and the bandwidth vector "h" (one value per species), which was estimated in another part of the script.

tdon <- data.frame(sp.1=c(5, 9, NA, 5, 11), sp.2=c(4, 2, 4, NA, 11), sp.3=c(5, 4, 2, 6, 13), sp.4=c(3, 11, NA, 5, 3), sp.5=c(2, 5, 2, 9, 9))

> h
[1] 1.047 2.973 0.887 1.520 2.955

Here is the code:

for (i in 1:(nbcol-1)) # nbcol<-ncol(tdon)
{
  tdon1 <- tdon[,i]
  tdon11 <- subset(tdon1, tdon1!="NA")
  fctk1 <- function(x)
  {density(tdon11, bw=h[i], kernel="gaussian")$y}

  for (j in (i+1):nbcol)
  {
    tdon2 <- tdon[,j]
    tdon21 <- subset(tdon2, tdon2!="NA")
    fctk2 <- function(x)
    {density(tdon21, bw=h[j], kernel="gaussian")$y}

    diffctk <- function(x) {abs(fctk1(x)-fctk2(x))}
    intctk <- approxfun(diffctk(x), rule=2)
    int <- integrate(diffctk, -Inf, Inf)$value
    overlap <- 1 - 0.5*int
  }
}

The use of "approxfun" to integrate the difference in the estimated density values (my "diffctk" function) was suggested by Thomas Lumley, but I'm not sure whether I have implemented it correctly or whether this approach is right for my problem.
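As I understand the suggestion, for a single pair it might look roughly like the sketch below. This is only my reading of it, under some assumptions: each density is turned into a function of x by interpolating the (x, y) output of density() with approxfun(), the density is treated as 0 outside the estimated range, and the integral is taken over the finite range covered by the two estimates instead of (-Inf, Inf). The names f1, f2, lo and hi are made up for the example.

```r
# First two species, NA values removed, with their bandwidths from h
x1 <- c(5, 9, 5, 11)
x2 <- c(4, 2, 4, 11)
h1 <- 1.047
h2 <- 2.973

d1 <- density(x1, bw = h1, kernel = "gaussian")
d2 <- density(x2, bw = h2, kernel = "gaussian")

# Interpolating functions of x; 0 is returned outside each estimated range
f1 <- approxfun(d1$x, d1$y, yleft = 0, yright = 0)
f2 <- approxfun(d2$x, d2$y, yleft = 0, yright = 0)

diffctk <- function(x) abs(f1(x) - f2(x))

# Integrate over the finite range spanned by both estimates
lo <- min(d1$x, d2$x)
hi <- max(d1$x, d2$x)
int <- integrate(diffctk, lo, hi, subdivisions = 1000L)$value

overlap <- 1 - 0.5 * int
overlap
```

The difference from my code is that here fctk1 and fctk2 would actually depend on x; in my version they ignore x and always return the full vector of 512 density values.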

I need "overlap" to be a vector of length 10 containing the overlap values for all pairs of species.
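To show the shape of the result I am after, here is a minimal sketch that fills such a vector. It swaps the approxfun/integrate step for a simpler technique, so it is an illustration and not the method suggested to me: all densities are evaluated on one common grid (via the from, to and n arguments of density()) and the integral is approximated with the trapezoidal rule. The grid limits (3 bandwidths beyond the data), n = 1024 and the pair labels are assumptions of the example.

```r
tdon <- data.frame(sp.1 = c(5, 9, NA, 5, 11), sp.2 = c(4, 2, 4, NA, 11),
                   sp.3 = c(5, 4, 2, 6, 13), sp.4 = c(3, 11, NA, 5, 3),
                   sp.5 = c(2, 5, 2, 9, 9))
h <- c(1.047, 2.973, 0.887, 1.520, 2.955)

# Common evaluation grid (assumption: 3 times the largest bandwidth beyond the data)
rng  <- range(unlist(tdon), na.rm = TRUE)
from <- rng[1] - 3 * max(h)
to   <- rng[2] + 3 * max(h)
n    <- 1024
dx   <- (to - from) / (n - 1)   # grid spacing

# One column of density values per species, all on the same grid
dens <- sapply(seq_along(tdon), function(i) {
  x <- tdon[[i]]
  density(x[!is.na(x)], bw = h[i], kernel = "gaussian",
          from = from, to = to, n = n)$y
})

# All S(S-1)/2 pairs of columns
pairs <- combn(ncol(tdon), 2)

overlap <- apply(pairs, 2, function(p) {
  d <- abs(dens[, p[1]] - dens[, p[2]])
  int <- dx * (sum(d) - 0.5 * (d[1] + d[n]))   # trapezoidal rule
  1 - 0.5 * int
})
names(overlap) <- apply(pairs, 2, function(p) paste(names(tdon)[p], collapse = "-"))
overlap
```

The point of the sketch is that each pair writes to its own element of "overlap", whereas my loop overwrites a single value on every iteration.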

Any help or advice on improving this code will be appreciated.

With kind regards,

Rogério

