# Re: [R] calculating dissimilarities in R

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Tue 26 Sep 2006 - 07:55:50 GMT

Hi Elvina,

>>>>> "Elvina" == Elvina Payet <virgin@seychelles.sc> >>>>> on Tue, 26 Sep 2006 05:48:01 GMT writes:

```
Elvina> Dear All,
Elvina> I’ve got a statistical question on calculating
Elvina> dissimilarities in R.
Elvina> I want to calculate the different types of dissimilarities
Elvina> on the ‘flower’ dataset found in the package
Elvina> ‘cluster’. Flower is a data frame with 18 observations
Elvina> on 8 variables. Variable 1 and 2 are binary, variable 3 is
Elvina> asymmetric binary, variable 4 is nominal, variable 5 and 6
Elvina> are ordered and variable 7 and 8 are interval scaled.

```

Elvina> Commands to load the dataset in R.

> library(cluster)
> data(flower)

or simply data(flower, package = "cluster"), which loads the dataset without attaching the package.

```
Elvina> What are the different types of dissimilarities that can be
Elvina> calculated on such a dataset?
Elvina> Do I need to group the types of variables first i.e. all
Elvina> binary together then run the calculation?  Do I use
Elvina> dissimilarity indices such as Jaccard or should it be
Elvina> classification function such as ‘daisy’ which should be
Elvina> used?

```

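Yes, you should use daisy() to calculate the dissimilarities. It handles mixed variable types in a single call, so you do not need to group the variables by type first, and it is particularly useful when you care about the difference between symmetric and asymmetric binary variables.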

Do read help(daisy) and look at its examples.
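
As a minimal sketch of what such a call could look like for the flower data: the assignment of columns 1 and 2 to symmetric binary and column 3 to asymmetric binary is taken from your description of the variables, not checked against the data, and the object name `d` is only illustrative.

```r
library(cluster)
data(flower)

# Assumption (from the description above): V1 and V2 are symmetric binary,
# V3 is asymmetric binary. The remaining columns keep the type implied by
# their class (factor -> nominal, ordered -> ordinal, numeric -> interval).
d <- daisy(flower, metric = "gower",
           type = list(symm = c(1, 2), asymm = 3))

summary(d)               # overview of the dissimilarities and variable types
as.matrix(d)[1:4, 1:4]   # dissimilarities among the first four flowers
```

The result is an object of class "dissimilarity", which can be passed on to functions such as pam() or agnes() from the same package.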


Regards,
Martin Maechler, ETH Zurich

