[R] Simplify formula for heterogeneity

From: Stefaan Lhermitte <stefaan.lhermitte_at_biw.kuleuven.be>
Date: Fri 27 May 2005 - 01:06:22 EST


Dear R-ians,

I'm looking for a computationally simplified formula to calculate a measure of heterogeneity (let's say H):

H = sqrt [ ( Si (Sj (Xi - Xj)^2 ) ) / n ]

where:
sqrt = square root

Si = summation over i (= 1 to n)
Sj = summation over j (= 1 to n)
Xi = element of X with index i
Xj = element of X with index j

I can simplify the formula to:

H = sqrt [ ( 2 * n * Si (Xi^2) - 2 * Si (Sj (Xi * Xj)) ) / n ]
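
As a quick sanity check of the expansion above, here is a minimal R sketch that compares the direct double sum with the expanded form. The vector X and the names H_direct and H_expanded are made up for illustration only.

```r
# Hypothetical example data, used only to check the algebra
X <- c(2, 5, 1, 7, 4)
n <- length(X)

# Direct definition: sum over all pairs (i, j) of (Xi - Xj)^2
H_direct <- sqrt(sum(outer(X, X, "-")^2) / n)

# Expanded form: 2 * n * Si(Xi^2) - 2 * Si(Sj(Xi * Xj))
H_expanded <- sqrt((2 * n * sum(X^2) - 2 * sum(outer(X, X, "*"))) / n)

# TRUE when both forms agree up to floating-point tolerance
all.equal(H_direct, H_expanded)
```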

Unfortunately this formula remains difficult to use in iterative programming, because I have to keep every element of X to calculate H.

I know a computationally simplified formula exists for the standard deviation (sd) that is much easier to use in iterative programming. I therefore wondered if anybody knows of an analogous simplification for H. The sd simplification is:

sd = sqrt [ ( Si (Xi - mean(X))^2 ) / n ] -> simplified computation -> sqrt [ ( n * Si(Xi^2) - (Si(Xi))^2 ) / n^2 ]

This simplified formula is much easier to use in iterative programming, since I don't have to keep every element of X.
E.g.: I have a vector X[1:10] and I have already calculated Si( X[1:10]^2 ) (I will call this A) and Si( X[1:10] ) (I will call this B). When X gets extended by 1 element (e.g. X[11]), it is fairly simple to calculate sd(X[1:11]) without having to reuse the elements of X[1:10]. I just have to calculate (with n = 11):

sd = sqrt [ ( n * (A + X[11]^2) - (B + X[11])^2 ) / n^2 ]

This is fairly easy in an iterative process, since before we continue with the next step we set:
A = (A + X[11]^2)
B = (B + X[11])
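
The update step described above could be sketched in R roughly as follows. This assumes A holds the running sum of squares, B the running sum, and n the number of elements seen so far; the function name update_sd and the example data are hypothetical. Note that this gives the sd with divisor n, whereas R's sd() uses n - 1, so the comparison rescales accordingly.

```r
# One update step: fold a new element x into the running sums.
# A = running sum of squares, B = running sum, n = count so far.
update_sd <- function(state, x) {
  state$A <- state$A + x^2
  state$B <- state$B + x
  state$n <- state$n + 1
  # sd with divisor n: sqrt((n * A - B^2) / n^2)
  state$sd <- sqrt(state$n * state$A - state$B^2) / state$n
  state
}

# Hypothetical data: elements arrive one at a time and are not stored
X <- c(3, 8, 1, 6, 9, 2, 7, 4, 5, 10, 12)
state <- list(A = 0, B = 0, n = 0, sd = NA)
for (x in X) state <- update_sd(state, x)

# Cross-check against R's sd(), rescaled from divisor n - 1 to n
n <- length(X)
all.equal(state$sd, sd(X) * sqrt((n - 1) / n))
```

One caveat with this kind of formula: n * A - B^2 subtracts two large, nearly equal numbers when the values are large relative to their spread, so some precision can be lost.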

Can anybody help me to do something comparable for H? Any other help to calculate H easily in an iterative process is also welcome!
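
One possible approach, offered as a sketch rather than a definitive answer: the double sum Si (Sj (Xi * Xj)) factorizes into (Si (Xi))^2, so H depends only on n, the sum of squares and the sum, just like sd. Under that identity, H^2 = (2 * n * A - 2 * B^2) / n, with A the running sum of squares and B the running sum. The function name update_H and the example data below are hypothetical.

```r
# One update step for H, using the same running sums as for sd.
# A = running sum of squares, B = running sum, n = count so far.
update_H <- function(state, x) {
  state$A <- state$A + x^2
  state$B <- state$B + x
  state$n <- state$n + 1
  # Relies on Si(Sj(Xi * Xj)) == (Si(Xi))^2
  state$H <- sqrt((2 * state$n * state$A - 2 * state$B^2) / state$n)
  state
}

# Hypothetical data, processed one element at a time
X <- c(3, 8, 1, 6, 9, 2, 7, 4, 5, 10, 12)
state <- list(A = 0, B = 0, n = 0, H = NA)
for (x in X) state <- update_H(state, x)

# Cross-check against the brute-force double sum over all pairs
H_direct <- sqrt(sum(outer(X, X, "-")^2) / length(X))
all.equal(state$H, H_direct)
```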

Thanx in advance!

Kind regards,
Stef



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri May 27 01:42:03 2005
