[R] cluster

From: Weiwei Shi <helprhelp_at_gmail.com>
Date: Tue 26 Jul 2005 - 07:45:12 EST

Dear listers:

I have a question about the clustering methods available in R. I am trying to down-sample the majority class in a classification problem on an imbalanced dataset. Because I don't want to lose information from the original dataset, I would rather not use naive down-sampling; instead, I think clustering the majority class and selecting "representative" samples from the clusters might help. So my question is: which clustering method should I try to get the best result? I think the key issue may be the choice of "distance", given that the next step will be fitting decision trees.
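
One way to sketch the idea in R, purely as an illustration and not a recommendation of k-means over other methods: cluster the majority class and keep, for each cluster, the real observation closest to the centroid. Everything specific here is an assumption: the toy data are made up, the helper name downsample_by_kmeans is hypothetical, k = 20 is arbitrary, and Euclidean distance on standardised numeric features is only a placeholder for whatever "distance" turns out to be appropriate.

```r
# Toy majority-class data, invented purely for illustration.
set.seed(1)
majority <- data.frame(x1 = rnorm(200), x2 = rnorm(200))

# Hypothetical helper: keep one real observation per cluster,
# namely the row closest to its cluster centroid.
downsample_by_kmeans <- function(X, k) {
  # Standardise so that no single feature dominates the Euclidean distance.
  Xs <- scale(as.matrix(X))
  km <- kmeans(Xs, centers = k, nstart = 10)
  reps <- vapply(seq_len(k), function(j) {
    idx <- which(km$cluster == j)  # rows assigned to cluster j
    # Squared Euclidean distance of each member to the centroid of cluster j.
    d2 <- rowSums(sweep(Xs[idx, , drop = FALSE], 2, km$centers[j, ])^2)
    idx[which.min(d2)]             # row index of the nearest member
  }, integer(1))
  X[reps, , drop = FALSE]
}

representatives <- downsample_by_kmeans(majority, k = 20)
```

The result is a subset of the original rows, so it can be combined with the minority class before fitting a tree. With categorical or mixed features, Euclidean distance is not meaningful; a dissimilarity such as Gower's from cluster::daisy, passed to cluster::pam, would be one alternative to test.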

Please share your experience with clustering for this purpose. Implementations outside R are also welcome.


Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Jul 26 07:55:26 2005
