Re: [R] manipulating (extracting) data from distance matrices

From: Michael Rennie <mdrennie_at_gmail.com>
Date: Tue, 15 Jul 2008 11:56:05 -0400

Hi Jon,

That only controls how the matrix is printed, not how its elements can be accessed. I think my solution will be to index into as.matrix(dmat), keeping in mind that every distance then appears twice, mirrored across the diagonal.
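
A minimal sketch of that approach, rebuilt on example data of the same shape as in the thread so it runs on its own. The names within_half, within_low, within_dist and between are only illustrative, and zeroing the upper triangle is just one possible way to avoid the double counting:

```r
# Example data: 10 observations, three variables
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))

dmat <- dist(dat, method = "euclidean")
f <- as.matrix(dmat)   # full symmetric 10 x 10 matrix, zeros on the diagonal

# upper= and diag= change only how a dist object prints, not what it stores
same_values <- all(as.vector(dist(dat, upper = TRUE, diag = TRUE)) ==
                   as.vector(dmat))

# Within-group block (observations 4 to 6): each pair appears twice
within_half <- sum(f[4:6, 4:6]) / 2

# Same total, counting only the lower triangle
low <- f
low[upper.tri(low)] <- 0          # lower triangle kept, zeros elsewhere
within_low <- sum(low[4:6, 4:6])

# Same total again, via a dist object built from the sub-block
within_dist <- sum(as.dist(f[4:6, 4:6]))

# Between-group block (7 to 10 versus 4 to 6): no pair is repeated
between <- sum(f[7:10, 4:6])
```

The three within-group totals should agree, and the between-group block needs no correction because it does not cross the diagonal.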

Cheers, and thanks all,

Mike

On Tue, Jul 15, 2008 at 11:43 AM, Jon Olav Skoien <j.skoien_at_geo.uu.nl> wrote:
> Maybe
>
> dmat<-dist(dat, method="euclidean",upper = TRUE,diag = TRUE)
>
> can fix your problem with the triangular matrix?
>
> Cheers
> Jon
>
> Michael Rennie wrote:
>>
>> Not really,
>>
>> I'd actually want
>>
>> f[4:6,4:6]
>>
>> to get comparisons of observations 4 to 6 only. And that block still
>> includes the upper triangle, so every distance appears twice. This is a
>> problem since I want to sum the distances over the blocks that I am
>> extracting.
>>
>> Then again, I could just divide the sum by two and get the answer....
>>
>> And, if I want to sum blocks comparing distances among two groups, say
>>
>> f[7:10,4:6]
>>
>> then I'm in the triangular matrix and not crossing the diagonal
>> anymore, so I should be okay.
>>
>> I think I may have my answer, but any other tips are more than welcome.
>>
>> Cheers,
>>
>> Mike
>>
>> On Tue, Jul 15, 2008 at 9:35 AM, stephen sefick <ssefick_at_gmail.com> wrote:
>>
>>>
>>> how about this
>>> f <- as.matrix(dmat)
>>> f[,4:6]
>>> #you get repeats but I think this is what you want
>>>
>>> On Tue, Jul 15, 2008 at 9:07 AM, Michael Rennie <mdrennie_at_gmail.com>
>>> wrote:
>>>
>>>>
>>>> Hi all,
>>>>
>>>> Does anyone have any tips for extracting chunks of data from a distance
>>>> matrix?
>>>>
>>>> For instance, if one was interested in only a subset of distance
>>>> comparisons (i.e., that of rows 4 thru 6, and no others), is there a
>>>> simple way to pull that data out?
>>>>
>>>> From some playing around with an example (below), I've been able to
>>>> figure out that a distance matrix in R is stored as a single vector,
>>>> running top to bottom and left to right, so if you know the size of
>>>> your distance matrix, you can figure out which elements to query and
>>>> stick them together using c().
>>>>
>>>> However, all this stuff is still indexed by the "labels" attribute.
>>>> Does anyone know of a way to use that to pull out subsets from the
>>>> distance matrix in a simpler manner than my example code below?
>>>>
>>>> ##############
>>>> # ex_dist.R
>>>> # example for
>>>> # manipulating
>>>> # distance matrices
>>>> ####################
>>>>
>>>> set.seed(12345)
>>>>
>>>> a<-sample(20:40, 10)
>>>> b<-sample(80:100, 10)
>>>> c<-sample(0:40, 10)
>>>>
>>>> dat<-data.frame(a,b,c)
>>>> dat
>>>>
>>>> dmat<-dist(dat, method="euclidean")
>>>> dmat
>>>>
>>>> dmat[1:6] #vector that stores the distance matrix runs down the
>>>> #columns, left to right
>>>>
>>>> #in a 10-element distance matrix, column lengths are 9,8,7,6....1
>>>>
>>>> #get comparisons of rows 1:4 (from dat) ONLY
>>>> #top-left matrix will consist of top 3 of first column, top 2 of
>>>> #second col, top 1 of third col.
>>>>
>>>> topleft<-c(dmat[1:3],dmat[10:11],dmat[18])
>>>> topleft
>>>>
>>>> #get comparisons of rows 9:10 against rows 1:2 (from dat)
>>>> #bottom-left 4
>>>>
>>>> bottomleft<-c(dmat[8:9],dmat[16:17])
>>>> bottomleft
>>>>
>>>> #######end#####
>>>>
>>>> I'm sure there's a simpler way to do this using the labels of the
>>>> distance matrix, but I can't see it. I've thought of converting it
>>>> using as.matrix(), which would allow me to pull out particular rows,
>>>> but I'm only interested in the triangular matrix. Now, if there was a
>>>> way to as.matrix(dmat) such that I got the bottom triangular matrix
>>>> and zeros elsewhere, then I'd be in business. Any suggestions on how
>>>> to pull that off would be helpful.
>>>>
>>>> I'm certainly interested in any tips or tricks anyone might have for
>>>> working with distance matrices, or any material that people can point
>>>> me towards.
>>>>
>>>> Cheers,
>>>>
>>>> Mike
>>>>
>>>> --
>>>> Michael D. Rennie
>>>> Ph.D. Candidate
>>>> University of Toronto at Mississauga
>>>> 3359 Mississauga Rd. N.
>>>> Mississauga, ON L5L 1C6
>>>> Ph: 905-828-5452 Fax: 905-828-3792
>>>> www.utm.utoronto.ca/~w3rennie
>>>>
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> --
>>> Let's not spend our time and resources thinking about things that are so
>>> little or so large that all they really do for us is puff us up and make us
>>> feel like gods. We are mammals, and have not exhausted the annoying little
>>> problems of being mammals.
>>>
>>> -K. Mullis
>>>
>>
>>
>>
>>
>
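
For anyone who would rather stay with the dist object itself: the hand-counted positions in the original example (1:3, 10:11, 18) follow the formula given in ?dist, where for i < j the distance between rows i and j sits at position n*(i-1) - i*(i-1)/2 + j-i of the underlying vector. A rough sketch under that assumption; dist_index is a made-up helper name, not part of R:

```r
# Position of the pair (i, j) in the vector behind a dist object of size n.
# dist_index is a hypothetical helper; the formula is the one stated in ?dist.
dist_index <- function(i, j, n) {
  lo <- pmin(i, j)   # column in the printed lower triangle
  hi <- pmax(i, j)   # row in the printed lower triangle
  stopifnot(all(lo >= 1), all(lo < hi), all(hi <= n))
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

# Example data of the same shape as in the thread
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")
n <- attr(dmat, "Size")

# All pairs among observations 1 to 4
pairs <- t(combn(1:4, 2))
idx <- dist_index(pairs[, 1], pairs[, 2], n)
idx                  # 1 2 3 10 11 18
topleft <- dmat[idx]
```

This avoids counting column lengths by hand, though indexing as.matrix(dmat) is probably easier to read for anything beyond a few pairs.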

Received on Tue 15 Jul 2008 - 15:59:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 15 Jul 2008 - 17:31:29 GMT.
