Re: [R] partial matches across rows not columns

From: jim holtman <jholtman_at_gmail.com>
Date: Tue, 08 Jun 2010 17:15:13 -0400

Is this what you are looking for?

> # assume females start with "A"
> # extract first part if female from ID
> x.id <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
> # now see if this pattern matches first part of TO_ID
> x.match <- x.id == substring(x$TO_ID, 1, nchar(x.id))
> # here are the ones that would be eliminated
> x[x.match,]

   FROM TO    DIST   ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
4     1  4 7.57332   A1  1 30  9  9      1   1  A1.1         7
5     1  1 7.57332 A1.1  1 30  9  9      7   1    A1         1
20    1  4 7.25773   A1  2 30  9  9      1   1  A1.1         7
22    1  2 7.25773 A1.1  2 30  9  9      7   1    A1         1
35    1  3 0.60443   A1  3 30  9  9      1   1  A1.1         7
38    1  2 0.60443 A1.1  3 30  9  9      7   1    A1         1
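
To actually drop those rows before averaging, the logical vector can simply be negated. A minimal sketch, assuming the data frame is called x as in the code above; only a handful of rows from the posted data (rows 2 to 7, columns ID, TO_ID and DIST) are rebuilt here so that it runs on its own:

```r
# Small subset of the posted data: rows 2-7, three columns only
x <- data.frame(
  ID    = c("A1", "A1", "A1", "A1.1", "A1.1", "A1.1"),
  TO_ID = c("MALE1", "A2", "A1.1", "A1", "MALE1", "A2"),
  DIST  = c(4.81803, 2.53468, 7.57332, 7.57332, 7.89665, 6.47847),
  stringsAsFactors = FALSE
)

# Same two steps as in the reply
x.id    <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
x.match <- x.id == substring(x$TO_ID, 1, nchar(x.id))

# Keep everything that is NOT a female/pup pair, then average
x.keep <- x[!x.match, ]
mean(x.keep$DIST)

# Mean distance per animal (ID) on the filtered data
aggregate(DIST ~ ID, data = x.keep, FUN = mean)
```

Whether an overall mean or a per-animal mean is wanted for the nearest neighbour analysis is not stated in the post, so both are shown.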

>
>
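
One caveat with a pure prefix comparison: "A1" is also a prefix of an ID such as "A10". No such ID appears in the posted sample, so this is only an assumption about the full data. If it can happen, a stricter variant is to strip the pup suffix from both columns and compare what remains. A sketch, where mother_id is a hypothetical helper name and the IDs are made up for illustration:

```r
# Hypothetical helper: "A1.1" -> "A1"; IDs without a ".n" suffix are unchanged
mother_id <- function(id) sub("\\.[[:digit:]]+$", "", as.character(id))

# Illustrative IDs only; "A10" does not appear in the posted data
x <- data.frame(
  ID    = c("A1", "A1.1", "A1", "A1", "MALE1"),
  TO_ID = c("A1.1", "A1", "A10", "A2", "A1.1"),
  stringsAsFactors = FALSE
)

# TRUE where both animals belong to the same female/pup family
same.family <- mother_id(x$ID) == mother_id(x$TO_ID)

# Prefix comparison from the reply, for contrast
x.id     <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
x.prefix <- x.id == substring(x$TO_ID, 1, nchar(x.id))

data.frame(x, same.family, x.prefix)
```

The two columns differ only for the A1/A10 row. Note that same.family would also flag two pups of the same female (say A1.1 and A1.2), which may or may not be what is wanted.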

On Tue, Jun 8, 2010 at 1:43 PM, RCulloch <ross.culloch_at_dur.ac.uk> wrote:
>
> Hi R users,
>
> I am trying to omit rows of data based on partial matches. An example of my
> data (seal_dist) is below.
>
> A quick breakdown of my coding and why I need to answer this: I am dealing
> with a colony of seals where, for example, A1 is a female with a pup and A1.1
> is that female's pup. The important part of the data here is DIST, which gives
> the distance between one seal (ID) and another (TO_ID). What I want to do is
> take a mean of these data for a nearest neighbour analysis, but I want to
> omit any cases where the distance is between a female and her pup,
> i.e. in the example above, omit rows where A1 and A1.1 occur together.
>
> I have looked at grep and pmatch, but these appear to work across columns and
> don't appear to do what I'm looking to do.
>
> If anyone can point me in the right direction, I'd be most grateful.
>
> Best wishes,
>
> Ross
>
>
>     FROM TO     DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
> 2      1  2  4.81803    A1  1 30  9  9      1   1 MALE1        12
> 3      1  3  2.53468    A1  1 30  9  9      1   1    A2         3
> 4      1  4  7.57332    A1  1 30  9  9      1   1  A1.1         7
> 5      1  1  7.57332  A1.1  1 30  9  9      7   1    A1         1
> 6      1  2  7.89665  A1.1  1 30  9  9      7   1 MALE1        12
> 7      1  3  6.47847  A1.1  1 30  9  9      7   1    A2         3
> 9      1  1  2.53468    A2  1 30  9  9      3   1    A1         1
> 10     1  2  2.59051    A2  1 30  9  9      3   1 MALE1        12
> 12     1  4  6.47847    A2  1 30  9  9      3   1  A1.1         7
> 13     1  1  4.81803 MALE1  1 30  9  9     12   1    A1         1
> 15     1  3  2.59051 MALE1  1 30  9  9     12   1    A2         3
> 16     1  4  7.89665 MALE1  1 30  9  9     12   1  A1.1         7
> 17     1  1  3.85359    A1  2 30  9  9      1   1 MALE1        12
> 19     1  3  4.88826    A1  2 30  9  9      1   1    A2         3
> 20     1  4  7.25773    A1  2 30  9  9      1   1  A1.1         7
> 21     1  1  9.96431  A1.1  2 30  9  9      7   1 MALE1        12
> 22     1  2  7.25773  A1.1  2 30  9  9      7   1    A1         1
> 23     1  3  5.71725  A1.1  2 30  9  9      7   1    A2         3
> 25     1  1  8.73759    A2  2 30  9  9      3   1 MALE1        12
> 26     1  2  4.88826    A2  2 30  9  9      3   1    A1         1
> 28     1  4  5.71725    A2  2 30  9  9      3   1  A1.1         7
> 30     1  2  3.85359 MALE1  2 30  9  9     12   1    A1         1
> 31     1  3  8.73759 MALE1  2 30  9  9     12   1    A2         3
> 32     1  4  9.96431 MALE1  2 30  9  9     12   1  A1.1         7
> 33     1  1  7.95399    A1  3 30  9  9      1   1 MALE1        12
> 35     1  3  0.60443    A1  3 30  9  9      1   1  A1.1         7
> 36     1  4  1.91136    A1  3 30  9  9      1   1    A2         3
> 37     1  1  8.29967  A1.1  3 30  9  9      7   1 MALE1        12
> 38     1  2  0.60443  A1.1  3 30  9  9      7   1    A1         1
> 40     1  4  1.43201  A1.1  3 30  9  9      7   1    A2         3
> 41     1  1  9.71659    A2  3 30  9  9      3   1 MALE1        12
> 42     1  2  1.91136    A2  3 30  9  9      3   1    A1         1
> 43     1  3  1.43201    A2  3 30  9  9      3   1  A1.1         7
> 46     1  2  7.95399 MALE1  3 30  9  9     12   1    A1         1
> 47     1  3  8.29967 MALE1  3 30  9  9     12   1  A1.1         7
> 48     1  4  9.71659 MALE1  3 30  9  9     12   1    A2         3
> --
> View this message in context: http://r.789695.n4.nabble.com/partial-matches-across-rows-not-columns-tp2247757p2247757.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Received on Tue 08 Jun 2010 - 21:19:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 Jun 2010 - 21:20:28 GMT.
