Re: [R] regular expression

From: Uwe Ligges <ligges_at_statistik.uni-dortmund.de>
Date: Sat 07 Apr 2007 - 11:18:44 GMT

Laurent Rhelp wrote:
> Uwe Ligges wrote:
>

>>
>>
>> Laurent Rhelp wrote:
>>
>>> Dear R-List,
>>>
>>>      I have a great many files in a directory, and in every file I 
>>> would like to replace the character " by the character '. At the 
>>> same time, I have to change ' to '' (i.e. the character ' twice, 
>>> not the single character ") wherever ' occurs inside "....."
>>>   So, "....." becomes '.....' and ".....'......" becomes '.....''......'
>>> Regular expressions could certainly help me, but I am not able to use them.
>>>
>>> How can I do that with R ?
>>
>>
>>
>> In fact, you do not need to know anything about regular expressions in 
>> this case, since you are simply going to replace certain characters by 
>> others without any fuzzy restrictions:
>>
>> x <- "\".....'......\""
>> cat(x, "\n")
>> xn <- gsub('"', "'", gsub("'", "''", x))
>> cat(xn, "\n")
>>
>>
>> Uwe Ligges
>>
>>
>>> Thank you very much
>>>
>>> ______________________________________________
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>

>
> Yes, you are right. So I wrote the code below (which I find a little
> awkward, but it works).
>
> ##-----
>
> dirdata <- getwd()
> fichnames <- list.files(path=paste(dirdata,"\\initial\\",sep=""))

See ?file.path to improve the above: it builds the path from its components, so no hand-escaped backslashes are needed.
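
A minimal sketch of the difference, assuming the subdirectory is called "initial" as in your code (the variable names p_paste and p_fp are only for illustration). file.path() joins its arguments with "/", which R also accepts on Windows:

```r
# Build the path to the "initial" subdirectory of the working directory.
dirdata <- getwd()

# Hand-assembled path with escaped backslashes: only meaningful on Windows.
p_paste <- paste(dirdata, "\\initial\\", sep = "")

# file.path() inserts the separator itself.
p_fp <- file.path(dirdata, "initial")

print(p_paste)
print(p_fp)
```

The second form can be passed directly to list.files() and extended with further components, e.g. file.path(dirdata, "initial", fichnames[i]).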

> for( i in 1:length(fichnames)){

See ?seq to improve the above: seq(along = fichnames) is safer, because 1:length(fichnames) yields c(1, 0) when there are no files at all. Or, even better, just loop over the names themselves (see below).
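
To illustrate why the index matters, here is a small sketch using an empty character vector, which is what list.files() returns for an empty directory (idx_colon and idx_seq are names made up for this example):

```r
# An empty result, as list.files() returns for an empty directory.
fichnames <- character(0)

# 1:length(x) counts down from 1 to 0, so a loop over it would run twice.
idx_colon <- 1:length(fichnames)

# seq(along = x) gives a zero-length index, so a loop over it is skipped.
idx_seq <- seq(along = fichnames)

print(idx_colon)   # [1] 1 0
print(idx_seq)     # integer(0)
```

seq_along(fichnames) behaves the same way as seq(along = fichnames) here.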

> filein <- paste(dirdata,"\\initial\\",fichnames[i],sep="")

Again, file.path() is your friend.

> conin <- file(filein)
> open(conin)

>      nbrows <- length( readLines(conin,n=-1) )

> close(conin)

You can simply call readLines() with the file name; it opens and closes the connection to the file itself. And I do not see why you want to read the whole file here just to count its lines. Since your code is becoming rather complicated, let me suggest the following procedure (untested!):

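dirdata <- getwd()
fichnames <- list.files(file.path(dirdata, "initial"))
# Loop over the file names directly; the "result" directory must already exist.
for(i in fichnames){
     temp <- readLines(file.path(dirdata, "initial", i))
     # Double each ' first, then turn each " into '.
     temp <- gsub('"', "'", gsub("'", "''", temp))
     writeLines(temp, con = file.path(dirdata, "result", i))
}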
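
Since the procedure is untested, here is one way it could be tried out safely. This is only a sketch under assumptions: it works in a scratch directory created with tempfile(), the input file example.txt and its contents are made up, and fixed = TRUE is an addition that makes gsub() treat the patterns as literal strings (the result is the same without it for these two characters).

```r
# Work in a scratch directory so that no real files are touched.
dirdata <- tempfile("quotes")
dir.create(file.path(dirdata, "initial"), recursive = TRUE)
dir.create(file.path(dirdata, "result"))   # writeLines() will not create it

# One hypothetical input file containing both kinds of quote.
writeLines(c("\".....\"", "\".....'......\""),
           file.path(dirdata, "initial", "example.txt"))

for (i in list.files(file.path(dirdata, "initial"))) {
    temp <- readLines(file.path(dirdata, "initial", i))
    # Inner call first: ' becomes ''. Outer call second: " becomes '.
    # In the opposite order the freshly created ' would be doubled as well.
    temp <- gsub('"', "'", gsub("'", "''", temp, fixed = TRUE), fixed = TRUE)
    writeLines(temp, con = file.path(dirdata, "result", i))
}

# Read the converted file back and show it.
out <- readLines(file.path(dirdata, "result", "example.txt"))
cat(out, sep = "\n")
```

The two lines written to the result file should be '.....' and '.....''......', matching the transformation asked for in the original question.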

Uwe Ligges

> fileout <- paste(dirdata,"\\result\\",fichnames[i],sep="")
> conout <- file(fileout,"w")
>
> conin <- file(filein)
> open(conin)
>
>
> for( l in 1:nbrows )
> {
> text <- gsub('"',"'",gsub("'","''",readLines(conin,n=1)))
> writeLines(con=conout,text=text)
> }
>
> close(conin)
> close(conout)
> }
>
> ##------



Received on Sat Apr 07 21:24:00 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 07 Apr 2007 - 18:31:19 GMT.
