Re: [R] regular expression

From: Laurent Rhelp <laurentRhelp_at_free.fr>
Date: Sat 07 Apr 2007 - 15:25:34 GMT

Uwe Ligges wrote:

>Laurent Rhelp wrote:
>
>
>>Uwe Ligges wrote:
>>
>>
>>
>>>Laurent Rhelp wrote:
>>>
>>>
>>>
>>>>Dear R-List,
>>>>
>>>> I have a great many files in a directory and I would like to
>>>>replace, in every file, the character " with the character ' and, at the
>>>>same time, change ' to '' (i.e. the character ' twice, not the
>>>>single character ") when the character ' is embedded in "....."
>>>> So, "....." becomes '.....' and ".....'......" becomes '.....''......'
>>>>Regular expressions could certainly help me, but I do not know how to use them.
>>>>
>>>>How can I do that with R ?
>>>>
>>>>
>>>
>>>In fact, you do not need to know anything about regular expressions in
>>>this case, since you are simply going to replace certain characters by
>>>others without any fuzzy restrictions:
>>>
>>>x <- "\".....'......\""
>>>cat(x, "\n")
>>>xn <- gsub('"', "'", gsub("'", "''", x))
>>>cat(xn, "\n")
>>>
>>>
>>>Uwe Ligges
>>>
>>>
>>>
>>>
>>>>Thank you very much
>>>>
>>>>______________________________________________
>>>>R-help@stat.math.ethz.ch mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide
>>>>http://www.R-project.org/posting-guide.html
>>>>and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>>
>>>
>>Yes, you are right. So I wrote the code below (which I find a little
>>awkward, but it works).
>>
>>##-----
>>
>>dirdata <- getwd()
>>fichnames <- list.files(path=paste(dirdata,"\\initial\\",sep=""))
>>
>>
>
>see ?file.path to improve the above.
>
>
>
>
>>for( i in 1:length(fichnames)){
>>
>>
>
>see ?seq to improve the above: seq(along = fichnames)
>Or even better, just work on the names (see below).
>
>
>
>> filein <- paste(dirdata,"\\initial\\",fichnames[i],sep="")
>>
>>
>
>again, file.path() is your friend
>
>
>
>> conin <- file(filein)
>> open(conin)
>>
>>
>> nbrows <- length( readLines(conin,n=-1) )
>
>
>> close(conin)
>>
>>
>
>You can simply use readLines() with the filename, which opens the
>connection to the file itself. And I do not see why you want to read the
>file here. Since your code becomes really complicated now, let me
>suggest the following procedure (untested!):
>
>dirdata <- getwd()
>fichnames <- list.files(file.path(dirdata, "initial"))
>for(i in fichnames){
> temp <- readLines(file.path(dirdata, "initial", i))
> temp <- gsub('"', "'", gsub("'", "''", temp))
> writeLines(temp, con = file.path(dirdata, "result", i))
>}
>
>Uwe Ligges
>
>
>
>
>
>
>
>> fileout <- paste(dirdata,"\\result\\",fichnames[i],sep="")
>> conout <- file(fileout,"w")
>>
>> conin <- file(filein)
>> open(conin)
>>
>>
>> for( l in 1:nbrows )
>> {
>> text <- gsub('"',"'",gsub("'","''",readLines(conin,n=1)))
>> writeLines(con=conout,text=text)
>> }
>>
>> close(conin)
>> close(conout)
>> }
>>
>>##------
>>
>>
>
>
>
>
>
I had to modify the line as follows to use a connection:

    temp <- readLines(file(file.path(dirdata, "initial", i)))
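
For the record, here is a minimal sketch of how the whole loop might look with that change. It is only an illustration: temporary directories stand in for the real "initial" and "result" folders, the file name sample.txt and its contents are made up, and the connection is opened and closed explicitly, which is one possible way to handle it.

```r
# Sketch only: temporary directories stand in for the real "initial" and
# "result" folders; the sample file name and its contents are made up.
dirdata <- tempfile("quotes")
dir.create(file.path(dirdata, "initial"), recursive = TRUE)
dir.create(file.path(dirdata, "result"))
writeLines(c('"abc"', '"it\'s"'), file.path(dirdata, "initial", "sample.txt"))

fichnames <- list.files(file.path(dirdata, "initial"))
for (i in fichnames) {
  # Open the input connection explicitly and close it once the file is read
  conin <- file(file.path(dirdata, "initial", i), open = "r")
  temp <- readLines(conin)
  close(conin)
  # Double every ' first, then turn every " into '
  temp <- gsub('"', "'", gsub("'", "''", temp))
  writeLines(temp, con = file.path(dirdata, "result", i))
}

# Read the converted file back to inspect it
result <- readLines(file.path(dirdata, "result", "sample.txt"))
print(result)
```

The order of the two gsub() calls matters: the inner call doubles the existing ' characters before the outer call introduces new ones in place of ".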

I had not understood that readLines reads the whole file in one go; I thought it read only one line! So I looped over the lines of every file, which is not necessary.
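
To make the difference concrete, here is a small sketch comparing the two ways of reading. The three-line temporary file is made up for the illustration; only readLines(), file() and close() from base R are used.

```r
# Sketch only: a throwaway three-line file, made up for illustration.
tmp <- tempfile(fileext = ".txt")
writeLines(c("first", "second", "third"), tmp)

# With the default n = -1, readLines() returns every line at once,
# one element of the character vector per line.
all_lines <- readLines(tmp)
print(length(all_lines))

# Reading line by line needs an open connection and n = 1;
# each call then continues where the previous one stopped.
con <- file(tmp, open = "r")
line1 <- readLines(con, n = 1)
line2 <- readLines(con, n = 1)
close(con)
print(c(line1, line2))
```

Since gsub() is vectorised over its input, the character vector returned by a single readLines() call can be converted in one step, without a loop over the lines.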

Thank you very much.



Received on Sun Apr 08 01:30:06 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 07 Apr 2007 - 17:31:00 GMT.
