Re: [R] Very Slow Gower Similarity Function

From: Anon. <>
Date: Tue 19 Apr 2005 - 03:36:56 EST

Jari Oksanen wrote:

> On 18 Apr 2005, at 19:10, Tyler Smith wrote:
>> Hello,
>> I am a relatively new user of R. I have written a basic function to
>> calculate the Gower similarity function. I was motivated to do so
>> partly as an exercise in learning R, and partly because the existing
>> option (vegdist in the vegan package) does not accept missing values.
> Speed is the reason to use C instead of R. It should be easy, almost
> trivial, to modify the vegdist.c so that it handles missing values. I
> guess this handling means ignoring the value pair if one of the values
> is missing -- which is not so gentle to the metric properties so dear
> to Gower. Package vegan is designed for ecological community data
> which generally do not have missing values (except in environmental
> data), but contributions are welcome.
The only reason you never see ecological community data with missing values is that the ecologists remove those species/sites from their Excel sheets before they give the data to you to sort out their mess. This is actually one of the few things they know how to do in Excel - I'm dreading the day when a paper appears in JAE saying that you can use Excel to produce P-values.

To be slightly more serious, as an exercise the OP could consider writing a wrapper function in R that removes the missing data and then calls vegdist to calculate his Gower similarity index.
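
A minimal sketch of such a wrapper, under some assumptions: the data are a numeric matrix or data frame with sites in rows, "removing the missing data" means dropping every row that contains an NA, and the name gower_complete_cases and the dist_fun argument are made up for illustration. The default calls vegdist with method = "gower", which returns a dissimilarity, so a similarity would be 1 minus the result.

```r
# Hypothetical wrapper: drop rows with missing values, then pass the
# remaining rows to a distance function (by default vegan's vegdist).
gower_complete_cases <- function(x,
                                 dist_fun = function(m) vegan::vegdist(m, method = "gower")) {
  x <- as.matrix(x)
  keep <- stats::complete.cases(x)   # TRUE for rows without any NA
  if (sum(keep) < 2) {
    stop("fewer than two complete rows remain after removing missing values")
  }
  if (!all(keep)) {
    message(sum(!keep), " row(s) with missing values removed")
  }
  dist_fun(x[keep, , drop = FALSE])
}

# Usage, only run if vegan is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  sites <- rbind(a = c(1, 4, 0), b = c(2, NA, 3), c = c(5, 1, 2))
  d <- gower_complete_cases(sites)   # Gower dissimilarity between a and c
  s <- 1 - d                         # corresponding similarity
}
```

Dropping whole rows throws away more data than ignoring only the affected value pairs, but it leaves the calculation itself untouched.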


Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB:

Received on Tue Apr 19 03:41:45 2005