Re: [Rd] Writing character vectors with embedded nulls to a connection

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Fri 31 Mar 2006 - 16:48:35 GMT

The following approach

sobject <- charToRaw(serialize(object,NULL))
len <- length(sobject)
writeBin(sobject, outcon)

would appear to work. As from 2.3.0 you will then be able to do

unserialize(readBin(incon, "raw", n=len))
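
A minimal sketch of the full round trip, under some assumptions that are not in the message above: it targets a current version of R, where serialize(object, NULL) returns a raw vector directly (so the charToRaw() step is unnecessary); it uses an in-memory rawConnection() as a stand-in for the real connection; and it announces the size with a 4-byte big-endian length header, which is just one illustrative way to tell the reader how many bytes follow. The example object is hypothetical.

```r
# Hypothetical example object; any R object would do.
object <- list(a = 1:3, b = "text")

# In current R, serialize(x, NULL) returns a raw vector,
# so its length is the exact number of bytes to send.
sobject <- serialize(object, NULL)
len <- length(sobject)

# Writer side: a raw connection stands in for a socket or pipe.
outcon <- rawConnection(raw(0), "wb")
writeBin(len, outcon, size = 4L, endian = "big")  # assumed length header
writeBin(sobject, outcon)                         # payload, embedded nuls included
bytes <- rawConnectionValue(outcon)
close(outcon)

# Reader side: read the header, then exactly that many raw bytes.
incon <- rawConnection(bytes, "rb")
n <- readBin(incon, "integer", n = 1L, size = 4L, endian = "big")
restored <- unserialize(readBin(incon, "raw", n = n))
close(incon)
```

Because the payload stays a raw vector throughout, no character string ever has to carry the embedded nuls.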

On Fri, 31 Mar 2006, Prof Brian Ripley wrote:

> I think you should be using a raw type to hold such data in R. It is not
> intentional that readChar handles embedded nuls (and in fact it might not in
> an MBCS).
>
> As ?serialize says
>
> For 'serialize', 'NULL' unless 'connection=NULL', when the result
> is stored in the first element of a character vector (but is not a
> normal character string unless 'ascii = TRUE' and should not be
> processed except by 'unserialize').
>
> so you have been told this is not intended to work as you tried.
>
> serialize predates the raw type, or it would have made use of it. In these
> days of MBCS character strings it is increasingly unsafe to use them to hold
> anything other than valid character data.
>
>
> On Thu, 30 Mar 2006, Jeffrey Horner wrote:
>
>> Is this possible? I've tried both writeChar() and writeBin() to no avail.
>>
>> My goal is to serialize(ascii=FALSE) an object to a connection but
>> determine the size of the serialized object beforehand:
>>
>> sobject <- serialize(object,NULL,ascii=FALSE)
>> len <- nchar(sobject)
>> #
>> # run some code here to notify listener on other end of connection
>> # how many bytes I'm getting ready to send
>> #
>> writeChar(sobject,con)
>>
>> The other option is to serialize twice:
>>
>> len <- nchar(serialize(object,NULL,ascii=FALSE))
>> #
>> # run some code here to notify listener on other end of connection
>> # how many bytes I'm getting ready to send
>> #
>> serialize(object,con,ascii=FALSE)
>>
>> Object stores, like memcache (http://danga.com/memcached/), need to know
>> object sizes before storing. RDBMSs which support large objects (CLOBs
>> or BLOBs) don't necessarily need to know object sizes beforehand, but
>> they do have max column size limits which must be honored.
>>
>> BTW, readChar() can read strings with embedded nulls; I figured
>> writeChar() should be able to write them.
>>
>>
>
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat Apr 01 03:11:39 2006
