From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>

Date: Sat 02 Jul 2005 - 21:22:19 EST

> ...
>
> Well, the covariance is (everything has mean zero, of course)
>
> E(XY) = (1+rho)/2*EX^2 + (1-rho)/2*E(X*(-X)) = rho*EX^2
>
> The marginal distribution of Y is a mixture of two identical uniforms
> (X and -X) so is uniform and in particular has the same variance as X.
>
> In summary, EXY/sqrt(EX^2EY^2) == rho
>
> So as I said, it satisfies the formal requirements. X and Y are
> uniformly distributed and their correlation is rho.
>
> If for nothing else, I suppose that this example is good for
> demonstrating that independence and uncorrelatedness is not the same
> thing.
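The quoted argument can be checked numerically. The following is only a minimal sketch, assuming the construction under discussion is y = x * s, where x is uniform on (-0.5, 0.5) and s is an independent random sign equal to +1 with probability (1+rho)/2 and -1 with probability (1-rho)/2. The seed, the sample size and the names s, r_hat, var_ratio and on_diagonals are illustrative choices, not part of the thread.

```r
# Sketch: check that y = x * (random sign) has correlation close to rho
# with x, the same variance as x, and yet is completely determined by x
# up to sign.
set.seed(2005)                       # arbitrary seed, for reproducibility only
N   <- 100000
rho <- 0.6

x <- runif(N, -0.5, 0.5)
s <- sample(c(1, -1), N, replace = TRUE,
            prob = c((1 + rho) / 2, (1 - rho) / 2))
y <- x * s

r_hat        <- cor(x, y)            # sample correlation, should be near rho
var_ratio    <- var(y) / var(x)      # should be near 1 (same marginal variance)
on_diagonals <- all(abs(y) == abs(x))  # every point lies on y = x or y = -x

print(c(r_hat = r_hat, var_ratio = var_ratio))
print(on_diagonals)
```

Setting rho <- 0 in the same sketch gives a pair that is (approximately) uncorrelated in the sample while |y| = |x| still holds exactly, which is the point about independence versus zero correlation.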

E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sat Jul 02 21:45:57 2005


On 02-Jul-05 Peter Dalgaard wrote:

> "Jim Brennan" <jfbrennan@rogers.com> writes:

>
>> OK now I am skeptical especially when you say in a weird way:-)
>> This may be OK but look at plot(x,y) and I am suspicious. Is it still
>> alright with this kind of relationship?
>>
>> N <- 10000
>> rho <- .6
>> x <- runif(N, -.5,.5)
>> y <- x * sample(c(1,-1), N, replace=T, prob=c((1+rho)/2,(1-rho)/2))
>
> Well, the covariance is (everything has mean zero, of course)

That was a nice sneaky solution! I was toying with something similar, but less sneaky, until I saw Peter's, on the lines of

x <- runif(2*N, -0.5, 0.5); ix <- (N-k):(N+k); y <- x; y[ix] <- (-y[ix])

(which makes the same point about independence and correlation). The larger k as a fraction of N, the more you swing from rho = 1 to rho = -1, but you cannot achieve, as Peter did, an arbitrary correlation coefficient rho since the value depends on k which can only take discrete values.
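To see how the correlation moves with k, here is a rough sketch of the block-flip idea. It assumes 0 <= k <= N-1 so that the index range stays inside 1..2N. The helper name flip_cor and the "expected" column are illustrative additions: the latter is a back-of-envelope value 1 - 2f, where f = (2k+1)/(2N) is the fraction of flipped points, and is not a formula taken from the thread.

```r
# Sketch: flip the sign of a central block of 2k+1 values out of 2N
# and see how the sample correlation between x and y changes with k.
set.seed(1)                          # arbitrary seed
N <- 5000
x <- runif(2 * N, -0.5, 0.5)

flip_cor <- function(k) {
  ix <- (N - k):(N + k)              # indices whose sign is flipped
  y <- x
  y[ix] <- -y[ix]
  cor(x, y)
}

ks   <- c(0, 1000, 2500, 4000, 4999)
frac <- (2 * ks + 1) / (2 * N)       # fraction of flipped points
res  <- data.frame(k        = ks,
                   frac     = frac,
                   expected = 1 - 2 * frac,   # rough approximation only
                   sample   = sapply(ks, flip_cor))
print(res)
```

Because k is an integer, the flipped fraction moves in steps of 1/N, so only a discrete grid of correlations is reachable for a given N.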

Another approach which leads to a less "special" joint distribution is

x<-sort(runif(N, -0.5,0.5)); y<-sort(runif(N, -0.5,0.5))

followed by a rho-dependent permutation of y. I'm still pondering a way of choosing the permutation so as to get a desired rho.

The extremes are the identity, which for a given sample will give as close as you can get to rho = +1, and reversal, which gives as close as you can get to rho = -1.
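The two extreme permutations can be tried directly. This is only a sketch of the extremes described above; it does not attempt the open problem of choosing a permutation for a given rho. The seed and the names r_identity and r_reversed are illustrative.

```r
# Sketch: two independently drawn, sorted uniform samples paired
# (a) in the same order (identity) and (b) in opposite order (reversal).
set.seed(42)                         # arbitrary seed
N <- 1000
x <- sort(runif(N, -0.5, 0.5))
y <- sort(runif(N, -0.5, 0.5))

r_identity <- cor(x, y)              # as close to +1 as this sample allows
r_reversed <- cor(x, rev(y))         # as close to -1 as this sample allows

print(c(identity = r_identity, reversed = r_reversed))
```

Any other permutation of y, for example y[sample(N)], gives a sample correlation between these two values, since the marginal values of y are unchanged and only the pairing differs.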

However, the maximum theoretical rho which you can get (as opposed to what is possible for particular samples, which may get arbitrarily close to +1) depends on N. For instance, with N=3, it looks as though the theoretical rho is about 0.9 with the "identity" permutation (for N=1000, however, just about all samples give rho > 0.99).
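A small simulation gives a feel for the dependence on N. The sketch below assumes that "theoretical rho" can be read as the average sample correlation over many replications of the identity pairing; that reading is an assumption, as are the helper name mean_identity_cor and the replication counts.

```r
# Sketch: average sample correlation of two sorted uniform samples
# (identity pairing), estimated by repeated simulation.
set.seed(123)                        # arbitrary seed

mean_identity_cor <- function(N, reps = 2000) {
  mean(replicate(reps,
                 cor(sort(runif(N, -0.5, 0.5)),
                     sort(runif(N, -0.5, 0.5)))))
}

m3    <- mean_identity_cor(3)                # small-sample case
m1000 <- mean_identity_cor(1000, reps = 200) # large-sample case

print(c(N3 = m3, N1000 = m1000))
```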

I smell a source of interesting exam questions ...

Over to you!

Best wishes,

Ted.


Date: 02-Jul-05 Time: 12:22:09
------------------------------ XFMail ------------------------------
______________________________________________
R-help@stat.math.ethz.ch mailing list


This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:11 EST