Re: [R] How to get multiple partial matches?

From: jim holtman <jholtman_at_gmail.com>
Date: Thu 07 Sep 2006 - 00:01:38 GMT

Try using 'grep' and regular expressions:

> x <- "72     5S_F_1            501          567
+ 7700   5S_F_2            338          611
+ 7517   5S_F_3            412          467
+ 10687  5S_F_4            380          428
+ 4870   5S_F_5            315          368
+ 6035   5S_F_6            300          359
+ 3826   5S_F_7            350          386
+ 8754   5S_F_8            450          473
+ 6399   5S_F_9            439          494
+ 749   5S_F_10            334          384
+ "

> df <- read.table(textConnection(x))
> df
      V1      V2  V3  V4
1     72  5S_F_1 501 567
2   7700  5S_F_2 338 611
3   7517  5S_F_3 412 467
4  10687  5S_F_4 380 428
5   4870  5S_F_5 315 368
6   6035  5S_F_6 300 359
7   3826  5S_F_7 350 386
8   8754  5S_F_8 450 473
9   6399  5S_F_9 439 494
10   749 5S_F_10 334 384
> # select only ones with '5S_F_1'
> df[grep('5S_F_1', as.character(df$V2)),]
    V1      V2  V3  V4
1   72  5S_F_1 501 567
10 749 5S_F_10 334 384
>
>
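Note that the pattern '5S_F_1' above also returns '5S_F_10', because grep matches the pattern anywhere in the string. The following is only a minimal sketch of two variations, assuming the same ten rows as above (rebuilt by hand here so the snippet runs on its own; the names `probes`, `all_5s`, and `only_1` are made up for illustration). Anchoring with '^' keeps every probe that starts with "5S_F_", and adding '$' restricts the match to exactly "5S_F_1":

```r
# Rebuild the example data (same values as in the transcript above)
df <- data.frame(
  V1 = c(72, 7700, 7517, 10687, 4870, 6035, 3826, 8754, 6399, 749),
  V2 = paste0("5S_F_", 1:10),
  V3 = c(501, 338, 412, 380, 315, 300, 350, 450, 439, 334),
  V4 = c(567, 611, 467, 428, 368, 359, 386, 473, 494, 384),
  stringsAsFactors = FALSE
)
probes <- as.character(df$V2)

# All rows whose probe name starts with "5S_F_"
all_5s <- df[grepl("^5S_F_", probes), ]

# Only the row whose probe name is exactly "5S_F_1" (excludes "5S_F_10")
only_1 <- df[grepl("^5S_F_1$", probes), ]
```

grepl returns a logical vector rather than row indices, so it can be combined with other conditions using & and |.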

On 9/6/06, Sarah Tucker <sltucker15@yahoo.com> wrote:
> Hi,
>
> I'm very new to R, and am not at all a software
> programmer of any sort. I appreciate any help you
> may have. I have figured out how to get my data into
> a dataframe and order it alphabetically according to a
> particular column. Now, I would like to separate out
> certain rows based on partial character matches. Here
> is an (extremely) abbreviated example of my data set:
>
>         Probe     Ch1 Median - B   Ch1 Mean - B
> 72      5S_F_1    501              567
> 7700    5S_F_2    338              611
> 7517    5S_F_3    412              467
> 10687   5S_F_4    380              428
> 4870    5S_F_5    315              368
> 6035    5S_F_6    300              359
> 3826    5S_F_7    350              386
> 8754    5S_F_8    450              473
> 6399    5S_F_9    439              494
> 749     5S_F_10   334              384
>
> I would like to be able to select out all rows with,
> for example, "5S_F_" in the Probe column (there are
> non-"5S_F_" containing values in the real, larger data
> set).
>
> I think pmatch does this for instances where there is
> only 1 match, but I would like to recover all the
> matches. I have tried to use charmatch, match,
> pmatch, agrep and grep for this purpose, but with no
> luck.
>
> When I grep for "5S_F_" with value = T, I get
> "character(0)"
> Adding wildcards (either "*" or ".") does not change
> this outcome.
>
> I thought maybe the underscores were messing it up, so
> I tried to grep "5S*" with value = T, and I get a long
> list of numbers back
>
> [1] "55" "95" "56" "57" "58" "59" "65"
> "75" "85" "105"
> [11] "115" "125" "135" "5" "5" "5" "5"
> "5" "5" "5"
>
> These numbers make no sense to me. They don't seem to
> correlate with where the "5S"'s occur in the
> dataframe, and they don't look like any values in the
> Probe column (there are no numeric values in the Probe
> column, just strings of character digit combinations).
>
> How can I select out all the rows with the same
> partial character match?
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
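
On the "5S*" result: in a regular expression, '*' is a quantifier rather than a shell-style wildcard, so "5S*" means "a 5 followed by zero or more S", which matches any string containing a 5. Strings like "55" and "95" coming back also suggest that grep was run on something other than the Probe column, though that is only a guess from the output shown. A small sketch of the regex behaviour, using hypothetical values (`vals`, `star_hits`, and `prefix_hits` are illustrative names, not from your data):

```r
# Hypothetical values, chosen only to illustrate the regex behaviour
vals <- c("55", "95", "12", "5S_F_1")

# "5S*" is "5" followed by zero or more "S": any string containing a 5
star_hits <- grep("5S*", vals, value = TRUE)

# A literal prefix needs no wildcard at all; fixed = TRUE turns off
# regular-expression interpretation of the pattern
prefix_hits <- grep("5S_F_", vals, value = TRUE, fixed = TRUE)
```

Here star_hits contains "55", "95", and "5S_F_1", while prefix_hits contains only "5S_F_1".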

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Thu Sep 07 10:08:54 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 07 Sep 2006 - 07:51:18 GMT.
