Re: [R] partial matches across rows not columns

From: Jannis <bt_jannis_at_yahoo.de>
Date: Tue, 08 Jun 2010 23:39:17 +0200

I did not go too deeply into your zoology problem ;-) but as far as I understand, you want to omit all rows where ID and TO_ID are A1 and A1.1 (or A2 and A2.1, and so on), correct?

If the data you sent us are representative and no other situations occur, the following should be sufficient:

Reduce the vectors ID and TO_ID to values without the "." and the number following it (e.g. A1.1 -> A1):

ID.clean <- gsub("\\..*$", "", data$ID)
TO_ID.clean <- gsub("\\..*$", "", data$TO_ID)

And then use logical indexing to keep only the rows where the two cleaned IDs differ (rows where they match are a female and her own pup):
data.clean = data[ID.clean != TO_ID.clean,]
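
To make this concrete, here is a minimal sketch on a few rows copied from your post. It assumes the data frame is called seal_dist, as in your message, and that a pup's ID is always its mother's ID followed by "." and a number. The names family_id, family_to_id and no_family are only illustrative.

```r
# A few rows from the posted data (hour 1 only), reduced to the relevant columns
seal_dist <- data.frame(
  ID    = c("A1", "A1", "A1", "A1.1", "A1.1", "A2"),
  TO_ID = c("MALE1", "A2", "A1.1", "A1", "MALE1", "A1.1"),
  DIST  = c(4.81803, 2.53468, 7.57332, 7.57332, 7.89665, 6.47847),
  stringsAsFactors = FALSE
)

# Strip the first "." and everything after it, so "A1.1" becomes "A1"
# (assumption: the part before the dot identifies the mother)
family_id    <- sub("\\..*$", "", seal_dist$ID)
family_to_id <- sub("\\..*$", "", seal_dist$TO_ID)

# Keep only the rows whose two animals belong to different families
no_family <- seal_dist[family_id != family_to_id, ]

# Mean distance without the mother-pup pairs
mean(no_family$DIST)
```

Note that this would also drop pairs of pups sharing the same mother (e.g. A1.1 and A1.2), should such IDs exist in the full data. If you want to keep those, the comparison needs to be adjusted.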

HTH
Jannis

RCulloch schrieb:
> Hi R users,
>
> I am trying to omit rows of data based on partial matches. An example of my
> data (seal_dist) is below:
>
> A quick break down of my coding and why I need to answer this - I am dealing
> with a colony of seals where for example A1 is a female with pup and A1.1 is
> that female's pup, the important part of the data here is DIST which tells
> the distance between one seal (ID) and another (TO_ID). What I want to do is
> take a mean for these data for a nearest neighbour analysis but I want to
> omit any cases where there is the distance between a female and her pup,
> i.e. in the previous e.g. omit rows where A1 and A1.1 occur.
>
> I have looked at grep and pmatch but these appear to work across columns and
> don't appear to do what I'm looking to do,
>
> If anyone can point me in the right direction, I'd be most grateful,
>
> Best wishes,
>
> Ross
>
>
> FROM TO DIST ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
> 2 1 2 4.81803 A1 1 30 9 9 1 1 MALE1 12
> 3 1 3 2.53468 A1 1 30 9 9 1 1 A2 3
> 4 1 4 7.57332 A1 1 30 9 9 1 1 A1.1 7
> 5 1 1 7.57332 A1.1 1 30 9 9 7 1 A1 1
> 6 1 2 7.89665 A1.1 1 30 9 9 7 1 MALE1 12
> 7 1 3 6.47847 A1.1 1 30 9 9 7 1 A2 3
> 9 1 1 2.53468 A2 1 30 9 9 3 1 A1 1
> 10 1 2 2.59051 A2 1 30 9 9 3 1 MALE1 12
> 12 1 4 6.47847 A2 1 30 9 9 3 1 A1.1 7
> 13 1 1 4.81803 MALE1 1 30 9 9 12 1 A1 1
> 15 1 3 2.59051 MALE1 1 30 9 9 12 1 A2 3
> 16 1 4 7.89665 MALE1 1 30 9 9 12 1 A1.1 7
> 17 1 1 3.85359 A1 2 30 9 9 1 1 MALE1 12
> 19 1 3 4.88826 A1 2 30 9 9 1 1 A2 3
> 20 1 4 7.25773 A1 2 30 9 9 1 1 A1.1 7
> 21 1 1 9.96431 A1.1 2 30 9 9 7 1 MALE1 12
> 22 1 2 7.25773 A1.1 2 30 9 9 7 1 A1 1
> 23 1 3 5.71725 A1.1 2 30 9 9 7 1 A2 3
> 25 1 1 8.73759 A2 2 30 9 9 3 1 MALE1 12
> 26 1 2 4.88826 A2 2 30 9 9 3 1 A1 1
> 28 1 4 5.71725 A2 2 30 9 9 3 1 A1.1 7
> 30 1 2 3.85359 MALE1 2 30 9 9 12 1 A1 1
> 31 1 3 8.73759 MALE1 2 30 9 9 12 1 A2 3
> 32 1 4 9.96431 MALE1 2 30 9 9 12 1 A1.1 7
> 33 1 1 7.95399 A1 3 30 9 9 1 1 MALE1 12
> 35 1 3 0.60443 A1 3 30 9 9 1 1 A1.1 7
> 36 1 4 1.91136 A1 3 30 9 9 1 1 A2 3
> 37 1 1 8.29967 A1.1 3 30 9 9 7 1 MALE1 12
> 38 1 2 0.60443 A1.1 3 30 9 9 7 1 A1 1
> 40 1 4 1.43201 A1.1 3 30 9 9 7 1 A2 3
> 41 1 1 9.71659 A2 3 30 9 9 3 1 MALE1 12
> 42 1 2 1.91136 A2 3 30 9 9 3 1 A1 1
> 43 1 3 1.43201 A2 3 30 9 9 3 1 A1.1 7
> 46 1 2 7.95399 MALE1 3 30 9 9 12 1 A1 1
> 47 1 3 8.29967 MALE1 3 30 9 9 12 1 A1.1 7
> 48 1 4 9.71659 MALE1 3 30 9 9 12 1 A2 3
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 08 Jun 2010 - 21:41:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 Jun 2010 - 21:50:27 GMT.
