[R] Varying results of sammon(), for the same data set

From: Ole Edsberg <edsberg_at_stud.ntnu.no>
Date: Mon 30 Jan 2006 - 19:39:15 EST


Hello,

I have a data set on which I run the sammon algorithm as follows:

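library(MASS)                           # provides sammon()
data = read.table('problemforr.dat')    # the data set linked below
y = cmdscale(data, add=TRUE)            # classical MDS, used as the starting configuration
s = sammon(data, y$points)              # Sammon mapping started from the cmdscale() points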

(In case it should be relevant, I make the data available at
http://idi.ntnu.no/~edsberg/problemforr.dat)

With R 2.2.1 on Debian Sid I always get one of two solutions (stress 1.74288 after 10 iterations or stress 1.33629 after 9 iterations). I always get the same result within the same R session, even if I read the data again. With R 2.2.0 on SunOS 5.9 I always get the same result (stress 0.13186 after 74 iterations).

I understand that the sammon algorithm is very sensitive to even tiny variations in the starting point, but the observed behaviour seems strange to me. Differences between machines could perhaps be explained by floating point portability issues, but that would not explain differences on the same machine, nor the fact that I get the same result within the same R session.
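
One way to narrow this down might be to check whether the variation already appears in the starting configuration from cmdscale() or only inside sammon(). The following is a minimal sketch under stated assumptions: it uses the built-in eurodist distances as a stand-in for problemforr.dat, the file name start_config.rds is hypothetical, and saveRDS()/readRDS() need a more recent R than 2.2.x.

```r
library(MASS)

# Stand-in data (assumption): built-in road distances, not the poster's file.
d <- as.matrix(eurodist)

# Starting configuration from classical scaling, as in the original call.
y <- cmdscale(d, add = TRUE)

# Run sammon() twice from the identical start within one session.
s1 <- sammon(d, y$points, trace = FALSE)
s2 <- sammon(d, y$points, trace = FALSE)
same_in_session <- identical(s1$points, s2$points)
print(same_in_session)
print(c(s1$stress, s2$stress))

# Save the start at full precision so that another session or machine
# can compare its own cmdscale() output against it.
saveRDS(y$points, "start_config.rds")   # hypothetical file name

# In the other session, after recomputing 'd' the same way:
#   y_other <- cmdscale(d, add = TRUE)$points
#   max(abs(y_other - readRDS("start_config.rds")))
```

If the saved and recomputed starting points differ between sessions, even in the last few digits or by a sign flip of a column, the difference arises before sammon() is called; if they are bit-for-bit identical, the place to look is sammon() itself.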

I read in the documentation
(http://stat.ethz.ch/R-manual/R-patched/library/MASS/html/sammon.html)
that "Further, since the configuration is only determined up to rotations and reflections (by convention the centroid is at the origin), the result can vary considerably from machine to machine." This doesn't make sense to me. If the data and the algorithm are the same, the result should be the same. What differences between machines do they refer to here? Floating point issues?
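
The "rotations and reflections" part of that sentence can be illustrated with a small sketch. The four points and the rotation angle below are made-up values, chosen only to show that rotating or reflecting a configuration leaves all interpoint distances unchanged, so any criterion that depends only on those distances cannot tell such configurations apart.

```r
# A small made-up configuration of 4 points in the plane.
x <- matrix(c(0, 0,
              1, 0,
              1, 2,
              0, 3), ncol = 2, byrow = TRUE)

# A rotation by 30 degrees and a reflection across the first axis.
theta <- pi / 6
rot  <- matrix(c(cos(theta), -sin(theta),
                 sin(theta),  cos(theta)), ncol = 2, byrow = TRUE)
refl <- diag(c(1, -1))

x_rot  <- x %*% rot
x_refl <- x %*% refl

# Interpoint distances as plain vectors (as.vector() drops dist attributes).
d_orig <- as.vector(dist(x))
d_rot  <- as.vector(dist(x_rot))
d_refl <- as.vector(dist(x_refl))

# The coordinates differ, but the distances agree up to rounding.
print(all.equal(d_orig, d_rot))
print(all.equal(d_orig, d_refl))
```

This only shows that equally good configurations can have different coordinates. It does not by itself account for different stress values, since rotated or reflected copies of one solution would have the same stress.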

I must admit that I am a beginner, both in R and in statistics. I'm very curious about the cause of this strangeness. Does anybody have an explanation?

Best Regards,

Ole Edsberg



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Mon Jan 30 19:48:25 2006
