[R] clustering or homogeneity approaches?

From: Weiwei Shi <helprhelp_at_gmail.com>
Date: Fri 12 Aug 2005 - 07:36:11 EST


Hi, there:
I have a question about the following dataset:

> rbind(t2[which(t4>0.3),][1:3,], t2[1:3,]) # don't worry about what this line means

          [,1]      [,2]       [,3]       [,4]       [,5]
[1,] 34.216166 96.928587 330.125990 330.183222 330.201215
[2,]  2.819183  8.134491   8.275841   8.525256   8.828448
[3,]  2.819183  7.541680   7.550333   8.374636   8.690998
[4,]  4.672551  5.036353   5.072710   5.152218   5.223204
[5,]  5.470131  5.500513   5.674139   5.689151   5.770423
[6,]  4.480287  4.628300   4.797686   4.814106   4.823345

I want to filter out the first 3 cases from the rest; the criterion is that I am looking for a "gap" between the values within a row.

My current approach is to compute sd(each row)/median(each row) and apply a threshold, which is very naive, but fast and good enough. However, I would like something better and more "academic". Please advise. I think clustering might help, but it needs to be quick since t2 has 30000 rows.
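
For illustration only, here is a minimal sketch of the sd/median score described above, next to one possible "gap" score: the largest jump between consecutive sorted values in a row, divided by the row median. The matrix x is just the six rows printed above, retyped by hand. The names cv_score, gap_score and has_gap, the gap score itself, and the 0.3 cutoff applied to it are assumptions made for this example, not something taken from the original data or analysis.

```r
# The six example rows from the post: rows 1-3 show a gap, rows 4-6 do not.
x <- matrix(c(
  34.216166, 96.928587, 330.125990, 330.183222, 330.201215,
   2.819183,  8.134491,   8.275841,   8.525256,   8.828448,
   2.819183,  7.541680,   7.550333,   8.374636,   8.690998,
   4.672551,  5.036353,   5.072710,   5.152218,   5.223204,
   5.470131,  5.500513,   5.674139,   5.689151,   5.770423,
   4.480287,  4.628300,   4.797686,   4.814106,   4.823345
), ncol = 5, byrow = TRUE)

# Row-wise median, used to scale both scores.
row_med <- apply(x, 1, median)

# Score 1: the naive ratio from the post, sd of each row over its median.
cv_score <- apply(x, 1, sd) / row_med

# Score 2 (assumption): largest difference between consecutive sorted
# values in the row, relative to the row median.
max_gap <- apply(x, 1, function(r) max(diff(sort(r))))
gap_score <- max_gap / row_med

# Hypothetical cutoff; it would need tuning on the real 30000-row matrix.
threshold <- 0.3
has_gap <- gap_score > threshold

print(round(cbind(cv_score, gap_score), 3))
print(which(has_gap))
```

Both scores need only one pass over each row, so they should stay cheap for 30000 rows. The gap score looks at the single largest jump rather than the overall spread, which may match the idea of a "gap" more directly, but whether it separates the rows better on the full data is untested here.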

Thanks,

Weiwei

-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Aug 12 07:40:59 2005
