Re: [R] Pattern Matching Replacement

From: Gabor Grothendieck
Date: Thu, 19 Jun 2008 15:04:47 -0400

On Thu, Jun 19, 2008 at 2:17 PM, ppatel3026 wrote:
> I would like to replace "\r\n" with "" in a character string, where "\r\n"
> exists only between < and >, how could I do that?
> Initial:
> characterString = "<XML><tag1
> id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"
> Result:
> characterString = "<XML><tag1 id=\"F2\"></tag1>\r\n<tag2></tag2></XML>"
> Tried with sub(below) but it only replaces the first instance and I am not
> sure how to pattern match so that it only replaces \r\n that exist within
> tags(< and >).
> sub("\r\n", "", charStream)

I assume you want to delete all \r and all \n in tags, not just the sequence \r\n. If it is only \r\n, modify the second regular expression accordingly and the rest should work the same.

gsubfn, from the package of the same name, is like gsub, except that instead of replacing each occurrence of the regular expression with a fixed string, it feeds each match into the function specified as the second argument and replaces the match with that function's output. The function can alternatively be specified as a formula, as it is here, in which case the right side of the formula is the function body and the formal arguments are constructed from its free variables, in this case just x. See the gsubfn home page.

library(gsubfn)

characterString <-
"<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"

# For each <...> tag, delete every \r and \n found inside it
gsubfn("<[^>]*>", ~ gsub("[\r\n]", "", x), characterString)
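If installing gsubfn is not an option, roughly the same idea can be sketched in base R with gregexpr and regmatches<-. This is only an illustration, not part of the original answer: the helper name strip_in_tags and its pattern argument are made up here, and it assumes a tag is anything matched by "<[^>]*>", as in the gsubfn call.

```r
# Hypothetical helper (not from the original post); uses base R only.
strip_in_tags <- function(s, pattern = "[\r\n]") {
  # Locate every <...> tag; with R's default regex engine,
  # [^>] also matches line breaks, so tags may span lines.
  m <- gregexpr("<[^>]*>", s)
  # Delete the pattern inside each matched tag and write the tags back,
  # leaving the text between tags untouched.
  regmatches(s, m) <- lapply(regmatches(s, m),
                             function(tags) gsub(pattern, "", tags))
  s
}

characterString <-
  "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"

# Remove every \r and every \n that occurs inside a tag
strip_in_tags(characterString)

# Remove only the two-character sequence \r\n inside a tag
strip_in_tags(characterString, "\r\n")
```

In both calls the \r\n between </tag1> and <tag2> is kept, because it lies outside any tag.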

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 19 Jun 2008 - 20:32:41 GMT.
