Re: [R] need help on computing double summation

From: Huntsinger, Reid <reid_huntsinger_at_merck.com>
Date: Thu 16 Jun 2005 - 06:27:29 EST


You could do something like

ids <- unique(mydata$id)
ans <- vector(length=length(ids), mode="list") for (i in ids) {
  g <- which(mydata$id == i)
  ans[[i]] <- (length(g) - 1)*cov(mydata$x[g], mydata$y[g]) }
ans

but cov() returns NA for length 1 vectors, so you'd want an if (length(g) == 1) ans[i] <- 0 else ans[i] <- ... construction.

This is almost brute force; you could also use tapply, as follows:

sx <- tapply(mydata$x,INDEX=mydata$id,FUN=sum) sy <- tapply(mydata$y,INDEX=mydata$id,FUN=sum) sxy <- tapply(mydata$x*mydata$y, INDEX=mydata$id, FUN=sum) n <- tapply(mydata$id,INDEX=mydata$id,FUN=length) # or use table()!

and now your inner sum is

sxy - 2*sx*(sy/n) + n*(sx/n)*(sy/n) = sxy - sx*sy/n

so

sum(sxy - sx*sy/n) should do.

One more approach is to make your dataset into a list of data frames, one for each id, then use lapply(). The list can be created by split(). In one line,

lapply(split(mydata,f=mydata$id),function(z) (length(z$x) - 1)*cov(z$x,z$y))

and take sum(,na.rm=TRUE) to remove the NAs due to single ids that you want to be zeros.

Reid Huntsinger

Reid Huntsinger

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch
[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Kerry Bush Sent: Wednesday, June 15, 2005 11:41 AM
To: r-help@stat.math.ethz.ch
Subject: [R] need help on computing double summation

Dear helpers in this forum,

   This is a clarified version of my previous questions in this forum. I really need your generous help on this issue.

> Suppose I have the following data set:
>
>
> ......
>

Now I want to compute the following double summation:

sum_{i=1}^k
sum_{j=1}^{n_i}(x_{ij}-mean(x_i))*(y_{ij}-mean(y_i))

i is from 1 to k,
indexing the ith subject id; and j is from 1 to n_i, indexing the jth observation for the ith subject.

in the above expression, mean(x_i) is the mean of x values for the ith
subject, mean(y_i) is the mean of y values for the ith subject.

Is there a simple way to do this in R?



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jun 16 06:52:16 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:43 EST