From: Gabor Grothendieck <ggrothendieck_at_myway.com>

Date: Fri 30 Jul 2004 - 13:25:24 EST

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Jul 30 13:31:57 2004


Marc Schwartz <MSchwartz <at> MedAnalytics.com> writes:

> On Thu, 2004-07-29 at 21:08, Gabor Grothendieck wrote:
> > Bulutoglu Dursun A Civ AFIT/ENC <Dursun.Bulutoglu <at> afit.edu> writes:
> >
> > > I was wondering if there is a way of editing strings in R. I
> > > have a set of strings, and each string is a row of numbers and parentheses.
> > > For example, the first row is:
> > > (0 2)(3 4)(7 9)(5 9)(1 5)
> > > and I have a thousand or so such rows. I was wondering how I
> > > could get the corresponding string obtained by adding 1 to all the
> > > numbers in the string above.
> >
> > First do the 1 character translations simultaneously using chartr and
> > then use gsub for the remaining one to two character translation:
> >
> > gsub("0","10",chartr("0123456789","1234567890","(0 2)(3 4)(7 9)(5 9)(1 5)"))

> Gabor,
>
> One problem: Multi-digit numbers in the source string:
>
> > gsub("0","10",chartr("0123456789","1234567890",
>        "(10 99)(3 4)(7 9)(5 9)(1 5)"))
> [1] "(21 1010)(4 5)(8 10)(6 10)(2 6)"
>
> Note the first number "10" gets transformed to "21" and the "99" goes to
> "1010".
>
> I made a quick update to NewRow, which is not faster, but gets it to two
> lines, instead of three, and is a bit cleaner:
>
> NewRow <- function(x)
> {
>   TempMat <- matrix(as.numeric(unlist(strsplit(x, "([\\(\\) ])"))),
>                     ncol = 3, byrow = TRUE) + 1
>
>   paste("(", TempMat[, 2], " ", TempMat[, 3], ")", sep = "",
>         collapse = "")
> }
>
> Note that with multi digit numbers, it gives a correct result:
>
> > NewRow("(10 99)(101 4)(7 9)(5 9)(1 5)")
> [1] "(11 100)(102 5)(8 10)(6 10)(2 6)"

The above assumes a particular pattern of parentheses, based on the poster's example, just as mine assumed one-digit numbers based on the same example. Both of our solutions assume the numbers are non-negative integers.

The poster can advise us on which additional assumptions, if any, are allowable. Just in case, here is a one-line solution that handles multi-digit numbers and does not assume a particular pattern of parentheses and spaces.

The gsub replaces each number, say 99, with ",99+1,". The inner paste then adds c(" to the front and ") to the end, which turns the whole string into a valid R expression. We evaluate that expression and paste the resulting pieces back together with the outer paste.
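Unpacked on a shorter input, the steps look roughly like this. It is only an illustrative sketch: the input string and the names s, step1, step2, pieces and result are made up here and do not come from the thread.

```r
s <- "(9 10)"                                  # hypothetical short input

# Wrap every run of digits as ",<number>+1," so it ends up outside the quotes
step1 <- gsub("([0-9]+)", '",\\1+1,"', s)

# Add c(" in front and ") at the end to form the text of a call to c()
step2 <- paste('c("', step1, '")', sep = "")   # c("(",9+1," ",10+1,")")

# Evaluate the call; c() coerces the incremented numbers to character
pieces <- eval(parse(text = step2))            # "(" "10" " " "11" ")"

# Glue the pieces back into a single string
result <- paste(pieces, collapse = "")
print(result)                                  # "(10 11)"
```

Applied in one line to a longer test string, the same idea reads: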

R> line <- "(10 99)(101 4)(7 9)()((5 9)(1 5))" # test data

R> paste(eval(parse(text = paste('c("', gsub("([0-9]+)", '",\\1+1,"', line), '")', sep = ""))), collapse = "")

[1] "(11 100)(102 5)(8 10)()((6 10)(2 6))"
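If evaluating text built from the input is a concern, a similar result can probably be had without eval and parse. The sketch below swaps in a different technique, gregexpr together with the replacement form of regmatches. The helper name add_one is hypothetical, and like the solutions above it assumes non-negative integers (here also small enough to fit in R's integer type).

```r
# Hypothetical helper, not from the thread: add 1 to every run of digits.
add_one <- function(x) {
  m <- gregexpr("[0-9]+", x)                 # locate all digit runs in each string
  regmatches(x, m) <- lapply(
    regmatches(x, m),                        # list of matched numbers, per string
    function(d) as.character(as.integer(d) + 1L)
  )
  x
}

line <- "(10 99)(101 4)(7 9)()((5 9)(1 5))"  # same test data as above
res <- add_one(line)
print(res)                                   # "(11 100)(102 5)(8 10)()((6 10)(2 6))"
```

Because nothing is parsed as R code, characters such as quotes in the input are left alone, and the function works on a whole character vector of rows at once.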
