Re: [R] Calculating the distance samples using distance metics method

From: <Bill.Venables_at_csiro.au>
Date: Wed, 20 Feb 2008 09:58:46 +1000

Distance matrices are not usually an end in themselves but a means to some other end. Rather than ask what is the best way to calculate such a huge distance matrix, maybe the question you should ask yourself is what you are going to do with it if you ever did manage to calculate it.

Maybe you can bypass the distance matrix calculation and get to the end point by some other means. For example, if the eventual goal is clustering, then perhaps something like clara() in the 'cluster' package will do the job more effectively. It is designed to handle situations of this kind.
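As a rough illustration of that suggestion, here is a minimal sketch of clara() on synthetic data. The simulated matrix, the choice of k = 2 and the samples value are assumptions made up for the example; they are not taken from the original poster's microarray data. clara() works on subsamples, so it never forms the full distance matrix, which for 10000 rows would hold about 50 million pairwise distances (roughly 400 MB as doubles).

```r
library(cluster)

set.seed(1)

# Synthetic stand-in for a filtered expression matrix (assumption):
# 10000 rows to be clustered, 20 columns, two groups with shifted means.
n <- 10000
p <- 20
x <- rbind(matrix(rnorm(n / 2 * p, mean = 0), ncol = p),
           matrix(rnorm(n / 2 * p, mean = 3), ncol = p))

# clara() draws subsamples, runs PAM on each, and assigns every row to
# its nearest medoid; metric may be "euclidean" or "manhattan".
fit <- clara(x, k = 2, metric = "euclidean", samples = 20)

table(fit$clustering)   # cluster sizes
dim(fit$medoids)        # one medoid (a row of x) per cluster
```

Replacing metric = "euclidean" with metric = "manhattan" switches the distance used; the number of clusters k would have to be chosen for the real data.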

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA

Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables_at_csiro.au
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Keizer_71
Sent: Wednesday, 20 February 2008 9:35 AM
To: r-help_at_r-project.org
Subject: [R] Calculating the distance samples using distance metics method

# *********** reading in data **********

data<-read.table("microarray.txt",header=T, sep="\t")

head(data)

dim(data)

attach(data)

# *********** creating matrix and calculating variance across probesets ********

x<-1:20000

y<-2:141

data.matrix<-data.matrix(data[,y])

variableprobe<-apply(data.matrix[x,],1,var)

hist(variableprobe)

# ************** filter out low variance *************

data.sub = data.matrix[order(variableprobe,decreasing=TRUE),][1:10000,]

dim(data.sub)
[1] 10000 140

What is the best way to calculate the distances between the samples using
the Euclidean or Manhattan distance metric?

Any suggestions?

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 20 Feb 2008 - 00:01:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 20 Feb 2008 - 02:30:15 GMT.
