From: Huntsinger, Reid <reid_huntsinger_at_merck.com>

Date: Thu 16 Jun 2005 - 06:27:29 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!

http://www.R-project.org/posting-guide.html

Received on Thu Jun 16 06:52:16 2005


This is almost brute force; you could also use tapply, as follows:

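sxy - 2*sx*(sy/n) + n*(sx/n)*(sy/n) = sxy - sx*sy/n

where sx, sy and sxy are the within-id sums of x, y and x*y, and n is the number of observations for that id.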

lapply(split(mydata,f=mydata$id),function(z) (length(z$x) - 1)*cov(z$x,z$y))

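and take sum(, na.rm=TRUE) of the result (lapply returns a list, so wrap it in unlist() first). The na.rm=TRUE removes the NAs produced for ids with a single observation, which you want to count as zeros.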
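As a rough sketch of the two routes mentioned above, the following compares the split/cov version with a tapply version built on the identity sxy - sx*sy/n, and checks both against the definition of the double sum. The data frame here is made up for illustration (the original data set is not shown), and the column names id, x and y are assumed from the lapply call; sapply is used in place of lapply only so that sum() can be applied directly.

```r
# Hypothetical example data; the original poster's data set is not available.
mydata <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  x  = c(1, 2, 4, 3, 5, 7),
  y  = c(2, 1, 5, 4, 8, 6)
)

# Route 1: per-id cross product as (n - 1) * cov(x, y).
# cov() gives NA for an id with a single observation (id 3 here).
per_id <- sapply(split(mydata, f = mydata$id),
                 function(z) (length(z$x) - 1) * cov(z$x, z$y))
total_cov <- sum(per_id, na.rm = TRUE)

# Route 2: the identity sxy - sx*sy/n, with the per-id sums from tapply.
sxy <- tapply(mydata$x * mydata$y, mydata$id, sum)
sx  <- tapply(mydata$x, mydata$id, sum)
sy  <- tapply(mydata$y, mydata$id, sum)
n   <- tapply(mydata$x, mydata$id, length)
total_tapply <- sum(sxy - sx * sy / n)

# Check: the double sum written out directly, using ave() for per-id means.
total_direct <- sum((mydata$x - ave(mydata$x, mydata$id)) *
                    (mydata$y - ave(mydata$y, mydata$id)))
```

On this toy data all three totals agree (28/3). The tapply route needs no na.rm, since a single-observation id contributes exactly zero under the identity.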

> Suppose I have the following data set:

> ......

> Now I want to compute the following double summation:
>
> sum_{i=1}^k sum_{j=1}^{n_i} (x_{ij}-mean(x_i))*(y_{ij}-mean(y_i))
>
> i is from 1 to k, indexing the ith subject id; and j is from 1 to n_i,
> indexing the jth observation for the ith subject.
>
> In the above expression, mean(x_i) is the mean of the x values for the ith
> subject, and mean(y_i) is the mean of the y values for the ith subject.
>
> Is there a simple way to do this in R?


This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:43 EST