# Re: [R] need help on computing double summation

From: Huntsinger, Reid <reid_huntsinger_at_merck.com>
Date: Thu 16 Jun 2005 - 06:27:29 EST

ids <- unique(mydata$id)
ans <- vector(length=length(ids), mode="list")
names(ans) <- ids
for (i in seq_along(ids)) {
  # index by position, so ids need not be 1..k
  g <- which(mydata$id == ids[i])
  ans[[i]] <- (length(g) - 1)*cov(mydata$x[g], mydata$y[g])
}
ans

But cov() returns NA for length-1 vectors, so you'd want an if (length(g) == 1) ans[[i]] <- 0 else ans[[i]] <- ... construction.
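For illustration, here is a minimal sketch of the loop with that guard in place. The toy data frame is made up (the data set from the original question is not shown in this thread), and the loop indexes by position in ids rather than by id value, which is my assumption about how you would want to handle ids that are not 1..k.

```r
# Hypothetical toy data: id 3 has only one observation.
mydata <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  x  = c(1, 2, 3, 4, 6, 5),
  y  = c(2, 4, 7, 1, 5, 9)
)

ids <- unique(mydata$id)
ans <- numeric(length(ids))
names(ans) <- ids

for (k in seq_along(ids)) {
  g <- which(mydata$id == ids[k])
  if (length(g) == 1) {
    # A single point has zero deviation from its own mean.
    ans[k] <- 0
  } else {
    # (n - 1) * cov(x, y) equals sum((x - mean(x)) * (y - mean(y))).
    ans[k] <- (length(g) - 1) * cov(mydata$x[g], mydata$y[g])
  }
}

# Outer sum over ids gives the double summation.
total <- sum(ans)
```

Here ans holds one inner sum per id and total is the full double sum.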

This is almost brute force; you could also use tapply, as follows:

sx <- tapply(mydata$x, INDEX=mydata$id, FUN=sum)
sy <- tapply(mydata$y, INDEX=mydata$id, FUN=sum)
sxy <- tapply(mydata$x*mydata$y, INDEX=mydata$id, FUN=sum)
n <- tapply(mydata$id, INDEX=mydata$id, FUN=length) # or use table()!

and now your inner sum is

sxy - 2*sx*(sy/n) + n*(sx/n)*(sy/n) = sxy - sx*sy/n

so

sum(sxy - sx*sy/n) should do.
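As a sanity check, the tapply version can be compared against the definition computed directly. This is only a sketch on made-up toy data; the use of ave() to centre within each id is my choice for the cross-check, not something from the original question.

```r
# Hypothetical toy data: id 3 has only one observation.
mydata <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  x  = c(1, 2, 3, 4, 6, 5),
  y  = c(2, 4, 7, 1, 5, 9)
)

# Per-id sums and counts.
sx  <- tapply(mydata$x, INDEX = mydata$id, FUN = sum)
sy  <- tapply(mydata$y, INDEX = mydata$id, FUN = sum)
sxy <- tapply(mydata$x * mydata$y, INDEX = mydata$id, FUN = sum)
n   <- tapply(mydata$id, INDEX = mydata$id, FUN = length)

# Shortcut formula: one inner sum per id, then the outer sum.
inner_shortcut <- sxy - sx * sy / n
total_shortcut <- sum(inner_shortcut)

# Direct definition: centre x and y within each id
# (ave() uses the group mean by default), then sum the products.
dx <- mydata$x - ave(mydata$x, mydata$id)
dy <- mydata$y - ave(mydata$y, mydata$id)
total_direct <- sum(dx * dy)

# The two routes should agree up to floating-point error.
stopifnot(isTRUE(all.equal(total_shortcut, total_direct)))
```

Note that ids with a single observation need no special handling here: for n = 1 the term sxy - sx*sy/n is zero on its own.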

One more approach is to make your dataset into a list of data frames, one for each id, then use lapply(). The list can be created by split(). In one line,

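lapply(split(mydata, f=mydata$id), function(z) (length(z$x) - 1)*cov(z$x, z$y))

and take sum(unlist(...), na.rm=TRUE) of the result: lapply() returns a list, which sum() won't accept directly, and na.rm=TRUE drops the NAs from ids with a single observation, which you want to count as zeros.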
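A sketch of the split() route end to end, again on made-up toy data. I have swapped in sapply() for lapply() here so the per-id results come back as a plain numeric vector that sum() can take directly; that substitution is mine.

```r
# Hypothetical toy data: id 3 has only one observation.
mydata <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  x  = c(1, 2, 3, 4, 6, 5),
  y  = c(2, 4, 7, 1, 5, 9)
)

# One data frame per id, then one inner sum per data frame.
pieces <- split(mydata, f = mydata$id)
per_id <- sapply(pieces, function(z) (nrow(z) - 1) * cov(z$x, z$y))

# cov() gives NA for the single-observation id; na.rm = TRUE
# drops it, which amounts to counting that id as zero.
total <- sum(per_id, na.rm = TRUE)
```

The names of per_id are the id values, so you can see which ids contributed an NA before summing.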

Reid Huntsinger


-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch
[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Kerry Bush
Sent: Wednesday, June 15, 2005 11:41 AM
To: r-help@stat.math.ethz.ch
Subject: [R] need help on computing double summation

Dear helpers in this forum,

This is a clarified version of my previous questions in this forum. I really need your generous help on this issue.

> Suppose I have the following data set:
>
>
> ......
>

Now I want to compute the following double summation:

sum_{i=1}^k
sum_{j=1}^{n_i}(x_{ij}-mean(x_i))*(y_{ij}-mean(y_i))

i is from 1 to k,
indexing the ith subject id; and j is from 1 to n_i, indexing the jth observation for the ith subject.

in the above expression, mean(x_i) is the mean of x values for the ith
subject, mean(y_i) is the mean of y values for the ith subject.

Is there a simple way to do this in R?

R-help@stat.math.ethz.ch mailing list