[R] manipulating (extracting) data from distance matrices

From: Michael Rennie <mdrennie_at_gmail.com>
Date: Tue, 15 Jul 2008 09:07:09 -0400


Hi all,

Does anyone have any tips for extracting chunks of data from a distance matrix?

For instance, if one was interested in only a subset of distance comparisons (i.e., that of rows 4 thru 6, and no others), is there a simple way to pull that data out?

From some playing around with an example (below), I've been able to figure out that a distance matrix in R is stored as a single vector holding the lower triangle, running top to bottom within each column and then left to right across columns. So if you know the size of your distance matrix, you can figure out which elements to query and stick them together using c().
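The column-by-column layout described above can be turned into a small index helper. This is only a sketch: dist_index is a hypothetical name, not something R provides, and the formula is the one given in the Details section of ?dist for a dist object of size n. The data are rebuilt here with explicit column names so the block runs on its own.

```r
# Hypothetical helper (not part of R): position of the distance between
# observations i and j in the vector underlying a dist object of size n.
dist_index <- function(i, j, n) {
  stopifnot(all(i != j))      # a dist object stores no diagonal
  lo <- pmin(i, j)            # column of the lower triangle
  hi <- pmax(i, j)            # row of the lower triangle
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")
n <- attr(dmat, "Size")

# All pairs among rows 1:4; combn() returns one pair per column
pairs <- combn(1:4, 2)
idx <- dist_index(pairs[1, ], pairs[2, ], n)
idx        # 1 2 3 10 11 18
dmat[idx]  # the six distances among rows 1:4
```

The positions returned for rows 1:4 are the same ones picked by hand in the example code, so the helper just automates the column-length bookkeeping.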

However, all this stuff is still indexed by the "labels" attribute. Does anyone know of a way to use that to pull out subsets from the distance matrix in a simpler manner than my example code below?
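One possible route through the labels, sketched here rather than offered as the definitive answer: expand to a full matrix with as.matrix(), subset by name, and convert back with as.dist(). The names keep, m and sub are illustrative only, and the data are rebuilt so the block runs on its own.

```r
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

# as.matrix() gives a full symmetric matrix whose dimnames are the
# labels of the dist object (or 1..n as characters if it has none)
m <- as.matrix(dmat)

keep <- c("4", "5", "6")       # labels of the rows of interest
sub <- as.dist(m[keep, keep])  # back to a lower-triangle dist object
sub
attr(sub, "Labels")            # "4" "5" "6"
```

The round trip costs a full n x n matrix, which is fine for small problems but may matter when the number of observations is large.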

##############
# ex_dist.R
# example for
# manipulating
# distance matrices
####################

set.seed(12345) #call the function; set.seed<-12345 would only create a variable named set.seed

a<-sample(20:40, 10)
b<-sample(80:100, 10)
c<-sample(0:40, 10)

dat<-data.frame(a,b,c)
dat

dmat<-dist(dat, method="euclidean")
dmat

dmat[1:6] #vector that stores the distance matrix runs descending down columns, left to right

#in a 10-element distance matrix, column lengths are 9,8,7,6....1

#get comparisons of rows 1:4 (from dat) ONLY
#top-left matrix will consist of top 3 of first column, top 2 of second col, top 1 of third col.

topleft<-c(dmat[1:3],dmat[10:11],dmat[18])
topleft

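#get comparisons of rows 9:10 against rows 1:2 (from dat)
#these are the bottom-left 4 of the printed matrix: bottom 2 of first column, bottom 2 of second col.
#(the single distance between rows 9 and 10 is the last element, dmat[45])

bottomleft<-c(dmat[8:9],dmat[16:17])
bottomleft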

#######end#####

I'm sure there's a simpler way to do this using the labels of the distance matrix, but I can't see it. I've thought of converting it using as.matrix(), which would allow me to pull out particular rows, but I'm only interested in the triangular matrix. Now, if there was a way to as.matrix(dmat) such that I got the bottom triangular matrix and zeros elsewhere, then I'd be in business. Any suggestions on how to pull that off would be helpful.
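A minimal sketch of the "bottom triangle with zeros elsewhere" idea, assuming base R's upper.tri() is acceptable for the job: convert with as.matrix() and then zero the upper triangle. The data are rebuilt so the block runs on its own, and m is just an illustrative name.

```r
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

m <- as.matrix(dmat)     # full symmetric matrix, zeros on the diagonal
m[upper.tri(m)] <- 0     # keep the lower triangle, zero everything above it

m[4:6, 4:6]              # comparisons among rows 4:6 only
m[4:6, ]                 # rows 4:6 against all lower-numbered rows
```

Because only the upper triangle is overwritten, the non-zero entries of m are exactly the values stored in dmat, in the same column-by-column order.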

I'm certainly interested in any tips or tricks anyone might have for working with distance matrices, or any material that people can point me towards.

Cheers,

Mike

--
Michael D. Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452 Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 15 Jul 2008 - 13:26:00 GMT
