[R] help with regexpr in gsub

From: Kimpel, Mark William <mkimpel_at_iupui.edu>
Date: Thu 18 Jan 2007 - 00:26:32 GMT

I have a very long vector of character strings of the format "GO:0008104.ISS" and need to strip off the dot and anything that follows it. There are always 10 characters before the dot. The characters after the dot, and how many of them there are, vary.

So, I would like to return strings in the format "GO:0008104". I could do this with substr and a loop over the entire vector, but I thought there might be a more elegant (and faster) way to do this.

I have tried gsub using regular expressions without success. The code

gsub(pattern= "\.*?" , replacement="", x=character.vector)

correctly locates the positions in the vector that contain the dot, but replaces all of the strings with "". Obviously not what I want. Is there a regular expression for replacement that would accomplish what I want?

Or, does R have a better way to do this?
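
For reference, here is a minimal sketch of two ways this might be done. It is not a definitive answer. The sample values and the names stripped.regex and stripped.substr are made up for illustration. Note that inside an R string literal the backslash itself has to be escaped, so a literal dot is written "\\.". The sketch also uses sub rather than gsub, since only one match per string is needed, and it relies on substr being vectorized, so no explicit loop is required.

```r
# Hypothetical sample data in the format described above
character.vector <- c("GO:0008104.ISS", "GO:0005515.IPI", "GO:0016020")

# Option 1: remove the first literal dot and everything after it.
# "\\." is a literal dot, ".*" matches the rest, "$" anchors at the end.
# Strings without a dot are returned unchanged.
stripped.regex <- sub(pattern = "\\..*$", replacement = "", x = character.vector)

# Option 2: keep the first 10 characters of every element.
# substr() works on the whole vector at once.
stripped.substr <- substr(character.vector, 1, 10)

print(stripped.regex)
print(stripped.substr)
```

Option 2 depends on the stated guarantee that there are always exactly 10 characters before the dot. Option 1 does not.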



Mark W. Kimpel MD  

(317) 490-5129 Work, & Mobile

(317) 663-0513 Home (no voice mail please)

1-(317)-536-2730 FAX

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu Jan 18 11:32:59 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 18 Jan 2007 - 01:30:27 GMT.
