From: Peter Dalgaard <P.Dalgaard_at_biostat.ku.dk>

Date: Wed, 23 May 2007 15:00:12 +0200

Mike White wrote:

> I am trying to use Fisher's z' transformation of the Pearson's r but the
> standard error does not appear to be correct. I have simulated an example
> using the R code below. The z' data appears to have a reasonably normal
> distribution but the standard error given by the formula 1/sqrt(N-3) (from
> http://davidmlane.com/hyperstat/A98696.html) gives a different result than
> sd(z). Can anyone tell me where I am going wrong?

You're not calculating 3000 independent sample correlations. In one case
you are calculating the correlation of each row of "dat" with the column
means, and in the other a bunch of pairwise correlations. In addition,
and this may be the important part, you are generating "correlation" not
by sampling from a two-dimensional normal distribution but by adding in
specific means.
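
To make the point about the two-dimensional normal concrete, here is a minimal sketch of the setting the 1/sqrt(N-3) formula is meant for: every replicate draws n fresh pairs from a bivariate normal with one fixed population correlation and yields a single sample correlation. The names sim_r, n and rho, and the value rho = 0.5, are illustrative assumptions, not taken from the code under discussion.

```r
set.seed(1)
n   <- 10    # pairs per sample (the N in 1/sqrt(N-3))
rho <- 0.5   # assumed population correlation, chosen only for illustration

# One sample correlation from n bivariate normal pairs with correlation rho
sim_r <- function(n, rho) {
  x <- rnorm(n)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # y has unit variance, cor(x, y) = rho
  cor(x, y)
}

r <- replicate(30000, sim_r(n, rho))  # independent sample correlations
z <- atanh(r)                         # identical to 0.5*log((1+r)/(1-r))
sd(z)
1/sqrt(n - 3)
```

Under these assumptions sd(z) should come out close to 1/sqrt(n - 3), since each r is an independent sample correlation from the same bivariate normal.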

Contrast

> r <- replicate(30000, {Z<-rnorm(10,55,30);cor(rnorm(10)+Z,rnorm(10)+Z)})
> z <- 0.5*log((1+r)/(1-r))
> sd(z)
[1] 0.3648677
> 1/sqrt(7)
[1] 0.3779645

with

> r <- replicate(30000, cor(rnorm(10)+10*(1:10), rnorm(10)+10*(1:10)))
> z <- 0.5*log((1+r)/(1-r))
> sd(z)
[1] 0.2662451
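
The second simulation differs in that the shared part, 10*(1:10), is a fixed set of means rather than a random draw, so the pairs are not a sample from one bivariate normal. A rough sketch of the consequence, assuming a hypothetical helper sd_z_fixed_trend() and an illustrative choice of slopes: sd(z) moves with the size of the fixed trend, while 1/sqrt(N-3) depends only on N.

```r
set.seed(1)
n <- 10

# sd of Fisher's z when both variables share a FIXED trend slope*(1:n)
# plus independent N(0,1) noise; slope = 10 is the case simulated above
sd_z_fixed_trend <- function(slope, n, reps = 30000) {
  trend <- slope * seq_len(n)
  r <- replicate(reps, cor(rnorm(n) + trend, rnorm(n) + trend))
  sd(atanh(r))  # atanh(r) equals 0.5*log((1+r)/(1-r))
}

slopes <- c(0, 1, 10)  # illustrative values; 0 means no shared trend at all
res <- sapply(slopes, sd_z_fixed_trend, n = n)
names(res) <- slopes
res
1/sqrt(n - 3)          # the textbook value, for comparison
```

With slope 10 this repeats the setup of the second simulation, so the value should sit near the 0.266 reported there rather than near 0.378.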

> library(amap)
>
> ## SIMULATED DATA #########################################################
> p<-10
> n<-3000
> means<-1000*c(1:p)
> SDs<-rep(100,p)
> set.seed(1)
> dat<-mapply(rnorm, mean=means, sd=SDs, n=n)
> colnames(dat)<-paste("V",1:p, sep="")
> rownames(dat)<-1:n
> # calculate centroid of simulated data
> dat.mean<-apply(dat,2,mean)
> # calculated Pearson's r to centroid
> r<-apply(dat,1,cor, y=dat.mean)
> plot(density(r))
> # Fisher's z' transformation
> z<-0.5*log((1+r)/(1-r))
> plot(density(z))
> sd(z)
> # [1] 0.2661779
>
> 1/sqrt(p-3)
> # [1] 0.3779645
>
> ## alternatively use comparisons for all possible pairs
> ## Centred Pearson's r on raw data
> r<-1-Dist(dat,"corr")
> plot(density(r))
> z<-0.5*log((1+r)/(1-r))
> plot(density(z))
> sd(z)
> # [1] 0.2669787
> 1/sqrt(p-3)
> # [1] 0.3779645
>
> Many thanks
> Mike White
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)                  FAX: (+45) 35327907

Received on Wed 23 May 2007 - 13:08:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 23 May 2007 - 13:31:18 GMT.
