Re: [R] How to simulate correlated data

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Fri 16 Dec 2005 - 04:04:21 EST


On 15-Dec-05 Lisa Wang wrote:
> Hello there,
>
> I would like to simulate X --Normal (20, 5)
> Y-- Normal (40, 10)
>
> and the correlation between X and Y is 0.6. How do I do it in R?

... and, as well as using mvrnorm (MASS) or rmvnorm (mvtnorm), as have been suggested, you could simply do it "by hand":

If U, V are independent and N(0,1), then

  E[(U + a*V)*(U - a*V)] = E(U^2) - a^2*E(V^2) = 1 - a^2

  E[(U + a*V)^2] = E[(U - a*V)^2] = 1 + a^2

so the correlation between (U + a*V) and (U - a*V) is

  r = (1 - a^2)/(1 + a^2)
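
As a quick sanity check, the formula for r can be compared against a simulated correlation. This is only a sketch: the seed, the sample size and the value a = 0.3 are arbitrary choices made for illustration and are not part of the argument above.

```r
# Monte Carlo check of r = (1 - a^2)/(1 + a^2).
# Seed, n and a are arbitrary illustrative choices.
set.seed(42)
n <- 1e5
a <- 0.3

u <- rnorm(n)   # U ~ N(0,1)
v <- rnorm(n)   # V ~ N(0,1), independent of U

r_empirical <- cor(u + a * v, u - a * v)
r_formula   <- (1 - a^2) / (1 + a^2)

# The two values should agree up to sampling error
print(c(empirical = r_empirical, formula = r_formula))
```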

Hence, for -1 < r < 1, choose

  a = sqrt((1 - r)/(1 + r))

which, for r = 0.6, gives a = sqrt(0.4/1.6) = sqrt(1/4) = 1/2 (how nice! ... ).
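
The choice of a can be wrapped in a small helper. The function names a_from_r and r_from_a below are made up for this sketch; they simply encode the two formulas above and check that they invert each other.

```r
# Hypothetical helpers (names are illustrative only).
a_from_r <- function(r) {
  stopifnot(all(r > -1), all(r < 1))   # the formula needs -1 < r < 1
  sqrt((1 - r) / (1 + r))
}

r_from_a <- function(a) (1 - a^2) / (1 + a^2)

a <- a_from_r(0.6)
print(a)                                    # should be 1/2, up to rounding
print(isTRUE(all.equal(r_from_a(a), 0.6)))  # round trip recovers r
```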

Then Var(U + a*V) = 1 + a^2 = 1 + 1/4 = 5/4 (I smell more smooth numbers coming ... ).

Then, since the correlation between two variables is unchanged if you add a constant to either, or multiply either by a positive constant, you can give (U + a*V) variance 5 by multiplying it by 2, and give (U - a*V) variance 10 by multiplying by 2*sqrt(2), both still having expectation 0. (This reads the second parameter in Normal(20, 5) and Normal(40, 10) as the variance.) So finally add the means 20 and 40:

  X = 20 + 2*(U + V/2) ; Y = 40 + 2*sqrt(2)*(U - V/2)

So you can get U and V by sampling from rnorm(), and then X and Y as described.

(Which is how I used to do it before starting to use R, e.g. in matlab/octave).
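
The "by hand" recipe can be sketched in R roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name simulate_pair and its arguments are invented here, the second parameter of each Normal is treated as a variance (as in the scaling argument above), and the means 20 and 40 are those asked for in the original question.

```r
# Hypothetical helper: simulate n pairs (X, Y) with the requested
# means, variances and correlation r, using only rnorm().
simulate_pair <- function(n, r, mean_x, var_x, mean_y, var_y) {
  stopifnot(r > -1, r < 1)
  a <- sqrt((1 - r) / (1 + r))
  u <- rnorm(n)
  v <- rnorm(n)
  s <- sqrt(1 + a^2)   # common sd of (u + a*v) and (u - a*v)
  # Rescale each combination to the target sd, then shift to the target mean
  x <- mean_x + sqrt(var_x) * (u + a * v) / s
  y <- mean_y + sqrt(var_y) * (u - a * v) / s
  data.frame(x = x, y = y)
}

set.seed(1)   # arbitrary seed, for reproducibility only
sim <- simulate_pair(1e5, r = 0.6,
                     mean_x = 20, var_x = 5,
                     mean_y = 40, var_y = 10)

print(colMeans(sim))        # should be close to 20 and 40
print(sapply(sim, var))     # should be close to 5 and 10
print(cor(sim$x, sim$y))    # should be close to 0.6
```

For r = 0.6 the scale factors sqrt(5)/s and sqrt(10)/s reduce to 2 and 2*sqrt(2), matching the multipliers derived above. If the second parameter is meant as a standard deviation instead, pass var_x = 25 and var_y = 100.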

Best wishes,
Ted.



