Re: [R] Specifying medoids in PAM?

From: Martin Maechler <>
Date: Thu 09 Jun 2005 - 09:08:50 EST

>>>>> "MM" == Martin Maechler <>
>>>>> on Wed, 8 Jun 2005 18:57:55 +0200 writes:

>>>>> "David" == David Finlayson <>
>>>>> on Wed, 8 Jun 2005 09:24:54 -0700 writes:

    David> Sorry, I wasn't trying to submit a bug report just yet.

    MM> the posting guide asks you to provide reproducible examples, in
    MM> any case, not just for bug reports ...
    MM> {and strictly speaking, you still haven't provided one, since
    MM> it's a bit painful to read in your table below -- because of the
    MM> extra row names ... but here I'm nit picking a bit }

    David> I wanted to see if I was using the command correctly.

    MM> Yes, you were.

    >>> pam(stats.table, metric="euclidean", stand=TRUE, medoids=c(1,3,20,2,5), k=5)

    David> This command crashes RGUI.exe and windows sends an error report to
    David> Microsoft. It also crashes if I first subtract the NA rows from
    David> stats.table.

    MM> I can confirm to get segmentation faults using this example data
    MM> with k=5, so effectively, it seems you've uncovered a bug in pam().
    MM> I will investigate and patch eventually.

I found and fixed the bug:
Some part of the C code was assuming that the indices in 'medoids' were sorted (increasingly).

I.e., for the moment you can easily work around the problem by using

   pam(stats.table, ...., medoids=c(1,2,3,5,20), k=5) instead of

   pam(stats.table, ...., medoids=c(1,3,20,2,5), k=5)
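
If the initial medoids come from somewhere else in your script and you do not want to reorder them by hand, wrapping them in sort() should have the same effect. The following is only a minimal sketch: the matrix below is made-up stand-in data (the real stats.table is not reproduced here), and 'init' and 'fit' are just illustrative names.

```r
library(cluster)  # recommended package that provides pam()

# Hypothetical stand-in for stats.table: 30 observations, 2 variables
set.seed(1)
stats.table <- matrix(rnorm(60), nrow = 30, ncol = 2)

# Initial medoid indices in the (unsorted) order of the original call
init <- c(1, 3, 20, 2, 5)

# Workaround: hand the indices to pam() in increasing order
fit <- pam(stats.table, k = 5, metric = "euclidean", stand = TRUE,
           medoids = sort(init))

fit$id.med  # indices of the medoids pam() ends up with
```

Note that the medoids only serve as starting values, so the order in which they are given should not matter for the resulting clustering once the bug is fixed.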

The next version of the cluster package, which will also allow specifying the "fuzziness exponent" in fanny(), will have this problem fixed.

Martin Maechler,
