From: Spencer Graves <spencer.graves_at_pdf.com>

Date: Sun 07 Aug 2005 - 12:10:31 EST

I'm not certain what you are asking. PLEASE do read the posting guide! "http://www.R-project.org/posting-guide.html". If you formulate your question in terms of a simple example, showing where you got stuck as suggested in the posting guide, it might help others understand your question and inspire suggestions.

TANSTAAFL = There ain't no such thing as a free lunch (Heinlein, The Moon Is a Harsh Mistress)

spencer graves

Weiwei Shi wrote:

> Dear listers:
>
> I have an idea for outlier detection and I need to use R to
> implement it first. I hope I can get some input from all the
> gurus here.
>
> I chose a distance-based approach:
> step 1:
> Calculate the distance between every pair of rows of a data frame.
> To account for the different scales of the variables, I chose the
> Mahalanobis distance, using the variance as the scaler.
>
> step 2:
> Let k be the number of points in one "cluster". k is decided by
> answering the following question: how many neighbors does a point
> need in order not to be an outlier?
>
> For each point, get the smallest (k-1) distances from step 1. Among
> the (k-1) distances of each point, take the maximum for that point.
>
> step 3:
> Get the distribution of those maxima over all the points. Thus, the
> multivariate problem becomes a univariate one. An outlier among
> those maxima then marks the corresponding point as an outlier.
>
> My questions are:
> 1. I don't know whether using Mahalanobis is proper or not, since most
> clustering algorithms implemented in R (like pam or clara) use
> Euclidean or Manhattan.
> 2. Is there a way to get the Mahalanobis distance matrix for any two
> rows of a data frame or matrix?
> 3. My approach does allow a point to belong to more than one
> k-cluster. Is there a similar algorithm in R or published?
>
> Thanks for any suggestions,
>
> weiwei
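
A minimal sketch of steps 1 to 3 above in base R, which also touches question 2. It is one possible reading of the description, not a tested or published method. The function name knn_mahalanobis_score, the default k = 5, and the simulated data are made up for illustration. It also assumes the full sample covariance matrix is used for scaling; if only the per-variable variances are wanted, dist(scale(x)) would replace the whitening step.

```r
# Illustrative sketch: names and defaults are assumptions, not from the thread.
knn_mahalanobis_score <- function(x, k = 5) {
  x <- as.matrix(x)
  stopifnot(k >= 2, k <= nrow(x))

  # Step 1: whiten the data with the Cholesky factor of the covariance,
  # so that Euclidean distance between whitened rows equals the
  # Mahalanobis distance between the original rows.
  z <- x %*% solve(chol(cov(x)))
  d <- as.matrix(dist(z))            # n x n pairwise distance matrix

  # Step 2: for each point, the largest of its (k - 1) smallest distances
  # to other points. After sorting a row, the zero self-distance comes
  # first, so position k holds the (k - 1)-th nearest neighbour.
  score <- apply(d, 1, function(row) sort(row)[k])

  list(distance = d, score = score)
}

# Simulated example: 100 bivariate normal points plus one planted outlier.
set.seed(1)
x <- rbind(matrix(rnorm(200), ncol = 2), c(8, -8))
res <- knn_mahalanobis_score(x, k = 5)

# Step 3: the problem is now univariate; inspect the scores.
head(sort(res$score, decreasing = TRUE))
```

For step 3, any univariate rule could then be applied to res$score, for example boxplot.stats(res$score)$out. Note that mahalanobis() in base R returns squared distances from a single centre, which is why the pairwise matrix is built with dist() on whitened data here.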

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

spencer.graves@pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Sun Aug 07 12:17:47 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 15:07:49 EST