Re: [Rd] subRaw?

From: Spencer Graves <spencer.graves_at_structuremonitoring.com>
Date: Fri, 20 Jul 2012 09:22:20 -0700

Hi, Hervé:

On 7/19/2012 10:19 PM, Hervé Pagès wrote:
> Hi Spencer,
>
> On 07/19/2012 08:29 PM, Spencer Graves wrote:
>> Hello, All:
>>
>>
>> Do you know of any capability to substitute more than one byte in
>> an object of class Raw?
>>
>>
>> Consider the following:
>>
>>
>> > let4 <- paste(letters[1:4], collapse='')
>> > (let4Raw <- charToRaw(let4))
>> [1] 61 62 63 64
>> > (let. <- sub('bc', '--', let4Raw))
>> [1] "61" "62" "63" "64"
>> > # no substitution
>> > (bc <- charToRaw('bc'))
>> [1] 62 63
>> > (ef <- charToRaw('ef'))
>> [1] 65 66
>> > (let. <- sub(bc, ef, let4Raw))
>> [1] "61" "65" "63" "64"
>> Warning messages:
>> 1: In sub(bc, ef, let4Raw) :
>> argument 'pattern' has length > 1 and only the first element will be
>> used
>> 2: In sub(bc, ef, let4Raw) :
>> argument 'replacement' has length > 1 and only the first element will
>> be used
>
> It makes no sense to use sub(), grep(), and family (i.e. all the stuff
> based on the regex code) *directly* on a raw vector because all these
> functions will start by coercing their 'x', 'text', 'pattern',
> 'replacement' args to character with as.character (if they are not
> already character).
>
> But the way as.character() operates on a raw vector won't give good
> results in that context. You'd rather do the coercion yourself first
> with rawToChar(), and coerce back the result with charToRaw():
>
> > charToRaw(sub("bc", "--", rawToChar(let4Raw)))
> [1] 61 2d 2d 64
>
> IMO it would make much more sense for sub(), grep(), and family to
> raise an error rather than blindly try to coerce to character, but these
> functions (like many functions in R) are too polite to tell the
> user s/he's doing something wrong.
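
The coercion described above can be seen directly at the console. The following is a small illustrative sketch using only base R; the variable names hex and roundTrip are made up for this example and do not appear in the original messages.

```r
let4Raw <- charToRaw("abcd")

# as.character() on a raw vector yields one two-digit hex string per byte,
# so the regex functions see four separate strings, not one byte sequence.
hex <- as.character(let4Raw)
print(hex)

# The pattern "bc" occurs in none of those hex strings, so nothing changes.
print(sub("bc", "--", hex))

# Converting to a single character string first gives the intended result.
roundTrip <- charToRaw(sub("bc", "--", rawToChar(let4Raw)))
print(roundTrip)
```

This round trip is only safe when every byte can be carried through a character string; rawToChar() signals an error on an embedded nul byte, for example.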

       Thanks for the reply.

       It sounds like you agree that a function "subRaw" to facilitate this would be useful. In my testing, charToRaw(sub(pattern, replacement, rawToChar(x))) did NOT preserve binary codes that did not match legitimate characters. I tried several things before finding one that seemed to work.
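
For readers who want something concrete, here is a minimal sketch of a byte-level substitution that never converts to character. It is not the subRaw function mentioned in this thread (that code is not shown here); the name subRawSketch and its argument order are assumptions chosen to mirror sub(), and it replaces only the first match.

```r
# Hypothetical helper, not the author's subRaw: replace the first occurrence
# of the raw vector `pattern` in the raw vector `x` with `replacement`.
subRawSketch <- function(pattern, replacement, x) {
  stopifnot(is.raw(pattern), is.raw(replacement), is.raw(x))
  np <- length(pattern)
  nx <- length(x)
  if (np == 0L || np > nx) return(x)
  # Candidate start positions: where the first pattern byte matches.
  starts <- which(x[seq_len(nx - np + 1L)] == pattern[1L])
  for (s in starts) {
    if (all(x[s:(s + np - 1L)] == pattern)) {
      # Splice: bytes before the match, the replacement, bytes after it.
      before <- x[seq_len(s - 1L)]
      after <- x[seq_len(nx - s - np + 1L) + s + np - 1L]
      return(c(before, replacement, after))
    }
  }
  x  # no match: return the input unchanged
}

# Bytes 0x00 and 0xff are left untouched around the replaced pair.
x <- as.raw(c(0x00, 0x62, 0x63, 0xff))
res <- subRawSketch(charToRaw("bc"), charToRaw("ef"), x)
print(res)
```

Because the comparison is done on raw bytes, the replacement may differ in length from the pattern and the input may contain bytes that are not valid characters in the current locale.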

       Best Wishes,
       Spencer

>
> Cheers,
> H.
>
>>
>>
>> In this example, "b" was replaced by "e", but "bc" was not
>> replaced by "ef". Do you know of any function to do this?
>>
>>
>> I ask, because I need it. I've written such a function, subRaw
>> for my own use. If I don't hear that another exists, I plan to add the
>> one I've written to the oro.dicom package.
>>
>>
>> Thanks,
>> Spencer
>>
>>
>> > sessionInfo()
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 20 Jul 2012 - 16:32:00 GMT
