Re: [R] "Special" characters in URI

From: Henrik Bengtsson <hb_at_maths.lth.se>
Date: Tue 03 May 2005 - 23:20:31 EST

Gregor GORJANC wrote:
> Henrik Bengtsson wrote:
>

>>Gregor GORJANC wrote:

>
> ...
>
>>>What do you think about this sketch, which of course doesn't handle all
>>>"special" characters:
>>>
>>>fixURLchar <- function(URL,
>>>                       from = c(" ", "\"", ",", "#"),
>>>                       to = c("%20", "%22", "%2c", "%23"))
>>
>>
>>Just a comment. It is much safer/easier to use named vectors for
>>mapping, e.g.
>>
>> map <- c(" "="%20", "\""="%22", ","="%2c", "#"="%23")
>>

>
> ...
>
> Henrik, thanks. So you suggest something like
>
> for (i in seq(along=map)) {
> URL <- gsub(pattern=names(map)[i], replacement=map[i], x=URL)
> }
>

Yes, something like that. To optimize, you might want to do

patterns <- names(map)
for (i in seq(along=map)) {
   URL <- gsub(pattern=patterns[i], replacement=map[i], x=URL)
}
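For reference, the loop could be wrapped in a small function along the following lines. This is only a sketch: the name escapeChars and the default map are assumptions made for illustration, and fixed=TRUE is added so that each name in the map is matched as a literal string rather than as a regular expression.

```r
# Hypothetical wrapper around the loop; name and default map are illustrative only.
escapeChars <- function(url,
                        map = c(" " = "%20", "\"" = "%22", "," = "%2c", "#" = "%23")) {
  patterns <- names(map)
  for (i in seq_along(map)) {
    # fixed = TRUE: treat the pattern as a literal string, not a regular expression
    url <- gsub(patterns[i], map[[i]], url, fixed = TRUE)
  }
  url
}

escapeChars("a b,c#d")
# [1] "a%20b%2cc%23d"
```

With a named vector, adding another character is a one-entry change, and the pattern and its replacement cannot get out of step.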

More important is that you treat a literal "%" differently from a "%" used in an encoding, e.g. how do you want to convert the string "100% %20"? You probably have to use "fancier" regular expressions to detect a literal "%". Maybe "%[^0-9a-fA-F]" will do. There should be many more details in the document Brian Ripley referred you to.
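One possible way to pick out a literal "%" is a negative lookahead, sketched below. The helper name escapePercent is hypothetical. Unlike "%[^0-9a-fA-F]", this variant does not consume the character after the "%", and it also catches a "%" at the very end of the string. It cannot resolve the truly ambiguous case: a literal "%" that happens to be followed by two hex digits (say "100%ab") is left alone.

```r
# Hypothetical helper: escape a "%" only if it does not already start a %XX sequence.
escapePercent <- function(x) {
  # (?![0-9a-fA-F]{2}) is a negative lookahead; lookaheads need perl = TRUE in gsub()
  gsub("%(?![0-9a-fA-F]{2})", "%25", x, perl = TRUE)
}

escapePercent("100% %20")
# [1] "100%25 %20"
```

If "%" is handled together with the other characters, it presumably has to be escaped before them, since every other replacement introduces new "%" signs.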

In other words, you have to be careful and try to think through all the cases in which your function may be called. A good test is to call it twice, once on your original string and then on the escaped one; you should get the same result both times. It depends on how complete you want your function to be.
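The call-it-twice test can be written down directly. The function escapeURL below is a hypothetical stand-in for whatever you end up with; it assumes the four-character map from earlier in the thread plus a lookahead-based treatment of "%", neither of which is meant as a complete solution.

```r
# Hypothetical escaping function, used here only to demonstrate the test.
escapeURL <- function(x) {
  # Escape a literal "%" first, so the "%" signs added below are not touched again
  x <- gsub("%(?![0-9a-fA-F]{2})", "%25", x, perl = TRUE)
  map <- c(" " = "%20", "\"" = "%22", "," = "%2c", "#" = "%23")
  for (i in seq_along(map)) {
    x <- gsub(names(map)[i], map[[i]], x, fixed = TRUE)
  }
  x
}

x <- "100% %20, \"a#b\""
once  <- escapeURL(x)     # escape the original string
twice <- escapeURL(once)  # escape the already escaped string
identical(once, twice)
# [1] TRUE
```

If the second call changes the string, something (typically "%") is being encoded twice.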

Good luck

Henrik



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue May 03 23:48:41 2005
