[R] regular expressions, sub

From: Christian Hoffmann <christian.hoffmann_at_wsl.ch>
Date: Fri 27 Jan 2006 - 20:54:11 EST

Hi,

I am trying to use sub, regexpr on expressions like

    log(D) ~ log(N)+I(log(N)^2)+log(t)

being a model specification.

The aim is to produce:

    "ln D ~ ln N + ln^2 N + ln t"

The variable names N, t may change, the number of terms too.
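
One possible way to reach that target string is sketched below. It is only a sketch under assumptions: the helper name to_ln is hypothetical, variable names are assumed to consist of letters, digits, "." and "_", and powers are assumed to be written as I(log(x)^k) with an integer k.

```r
# Hypothetical helper (not part of the original post): rewrite the log()
# terms of a model formula in "ln" notation.
to_ln <- function(f) {
  # Turn the formula into a single character string.
  s <- paste(deparse(f), collapse = " ")
  # I(log(x)^k) -> ln^k x ; \\1 is the variable name, \\2 the exponent.
  s <- gsub("I\\(log\\(([[:alnum:]._]+)\\)\\^([[:digit:]]+)\\)",
            "ln^\\2 \\1", s)
  # Remaining log(x) -> ln x
  gsub("log\\(([[:alnum:]._]+)\\)", "ln \\1", s)
}

result <- to_ln(log(D) ~ log(N) + I(log(N)^2) + log(t))
print(result)   # "ln D ~ ln N + ln^2 N + ln t"
```

The powered terms are handled first so that the plain log(x) rule does not rewrite the inside of I(...) before the exponent has been moved.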

I succeeded only partially: the help on regular expressions is hard for me to understand, and examples for my case are rare. The help page on R-help for grep etc. and "regular expressions"

What I am doing:

(f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
(ft <- sub("","",f)) # creates string with parts of formula, how to do it simpler?
(fu <- paste(ft[c(2,1,3)],collapse=" ")) # converts to one string
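
On the "simpler" question: sub("", "", f) presumably works only because sub coerces the formula with as.character. A minimal sketch of two more direct routes, using only base R (the name parts is just illustrative):

```r
f <- log(D) ~ log(N) + I(log(N)^2) + log(t)

# as.character() gives the three parts: operator, left side, right side.
parts <- as.character(f)
print(parts)

# deparse() gives the whole formula as text; paste() guards against
# deparse() splitting a long formula over several strings.
fu <- paste(deparse(f), collapse = " ")
print(fu)
```

Both routes give the same single string as paste(ft[c(2,1,3)], collapse=" ").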

Then I want to use \1 for backreferences, something like

(fv <- sub("log( [:alpha:] N )^ [:alpha:)","ln \\1^\\2",fu))

to change "log(g)^7" to "ln^7 g",
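
A hedged sketch of how that backreference substitution could be written: literal parentheses and the caret are escaped with a doubled backslash, the POSIX classes sit inside an outer pair of brackets, and each unescaped (...) group is available as \\1, \\2 in the replacement. The assumption here is that the argument is purely alphabetic and the exponent purely numeric.

```r
x <- "log(g)^7"

# group 1: the variable name, group 2: the exponent
out <- sub("log\\(([[:alpha:]]+)\\)\\^([[:digit:]]+)", "ln^\\2 \\1", x)
print(out)   # "ln^7 g"
```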

and to eliminate I(): sub("I(blabla)","\\1",fv) # I(xxx) -> xxx
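
The I(xxx) -> xxx step might look like the following sketch. It assumes that the content of I() contains exactly one nested pair of parentheses, as in I(log(N)^2); regular expressions cannot match arbitrarily deep nesting in general.

```r
fu <- "log(D) ~ log(N) + I(log(N)^2) + log(t)"

# "I\\(" is a literal "I(". The captured group allows non-parenthesis
# characters, one inner "(...)" pair, and more non-parenthesis characters.
# The final "\\)" is the parenthesis that closes I().
fv <- gsub("I\\(([^()]*\\([^()]*\\)[^()]*)\\)", "\\1", fu)
print(fv)   # "log(D) ~ log(N) + log(N)^2 + log(t)"
```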

The special characters are causing trouble; sub accepts "(" and ")" only in pairs. Code for experimentation:
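
For reference, an unescaped "(" opens a group in a regular expression, which is why it must be balanced. A small sketch of three ways to match a literal "(" in base R:

```r
x <- "log(N)"

a <- sub("log\\(", "ln ", x)              # escape: "\\(" in R source is \( in the regex
b <- sub("log[(]", "ln ", x)              # inside a bracket expression "(" is literal
d <- sub("log(", "ln ", x, fixed = TRUE)  # fixed = TRUE: no regex interpretation at all

print(c(a, b, d))   # all three give "ln N)"
```

With fixed = TRUE backreferences such as \\1 are not available, so that variant only suits plain text replacement.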

trysub <- function(s, t, e) {
  ii <- 0
  for (i1 in c(TRUE, FALSE)) for (i2 in c(TRUE, FALSE))
    for (i3 in c(TRUE, FALSE)) for (i4 in c(TRUE, FALSE))
      print(paste(ii <- ii + 1,
                  ifelse(i1, " ", " ~"), "ext",
                  ifelse(i2, " ", " ~"), "perl",
                  ifelse(i3, " ", " ~"), "fixed ",
                  ifelse(i4, " ", " ~"), "useBytes: ",
                  try(sub(s, t, e, extended = i1, perl = i2,
                          fixed = i3, useBytes = i4)),
                  sep = ""))
  invisible(0)
}

trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases 5,6,13..16, the rest unsubstituted

trysub("log(","ln ",fu) # B: no substitutions; errors for cases 1..4,7.. 12 # typical errors:
"3 ext perl ~fixed useBytes: Error in sub.perl(pattern, replacement, x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"

trysub("log\(","ln ",fu) # C: same as A

trysub("log\\(","ln ",fu) # D: no substitutions; errors for cases 15,16 # typical errors:

"15 ~ext ~perl ~fixed useBytes: Error in sub(pattern, replacement, x, ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression 'log\\('\n"

trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors # E: typical errors:
"3 ext perl ~fixed useBytes: Error in sub.perl(pattern, replacement, x, ignore.case, useBytes) : \n\tinvalid regular expression 'log\\(([:alpha:]+)\\)'\n"

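Two details in attempt E are worth isolating in a sketch. First, [:alpha:] is only a character class when it sits inside an outer bracket expression, i.e. [[:alpha:]]; on its own, the default engine reads it as a set of the literal characters ":", "a", "l", "p", "h". Second, in R source "\1" is the single control character with code 1, so the backreference has to be written "\\1".

```r
fu <- "log(D) ~ log(N) + I(log(N)^2) + log(t)"

# sub() replaces the first match only, gsub() replaces all matches.
first <- sub("log\\(([[:alpha:]]+)\\)", "ln \\1", fu)
all   <- gsub("log\\(([[:alpha:]]+)\\)", "ln \\1", fu)
print(first)   # "ln D ~ log(N) + I(log(N)^2) + log(t)"
print(all)     # "ln D ~ ln N + I(ln N^2) + ln t"

# "\1" is one character (octal escape), "\\1" is a backslash plus "1".
print(nchar("\1"))    # 1
print(nchar("\\1"))   # 2
```

The gsub result still contains I(...) with the exponent in the wrong place, so powered terms would need their own rule applied before this one.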
Thanks for help
Christian

PS. The explanations in the documents

-- 
Dr. Christian W. Hoffmann,
Swiss Federal Research Institute WSL
Mathematics + Statistical Computing
Zuercherstrasse 111
CH-8903 Birmensdorf, Switzerland

Tel +41-44-7392-277  (office)   -111(exchange)
Fax +41-44-7392-215  (fax)
christian.hoffmann@wsl.ch
http://www.wsl.ch/staff/christian.hoffmann

International Conference 5.-7.6.2006 Ekaterinburg Russia
"Climate changes and their impact on boreal and temperate forests"
http://ecoinf.uran.ru/conference/

Received on Fri Jan 27 21:00:55 2006
