Re: [Rd] Decompressing raw vectors in memory

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 02 May 2012 17:16:29 +0100

On 02/05/2012 16:43, Hadley Wickham wrote:
>>> I'm struggling to decompress a gzip'd raw vector in memory:
>>>
>>> content <- readBin("http://httpbin.org/gzip", "raw", 1000)
>>>
>>> memDecompress(content, type = "gzip")
>>> # Error in memDecompress(content, type = "gzip") :
>>> # internal error -3 in memDecompress(2)
>>>
>>> I'm reasonably certain that the file is correctly compressed, because
>>> if I save it out to a file, I can read the uncompressed data:
>>>
>>> tmp <- tempfile()
>>> writeBin(content, tmp)
>>> readLines(tmp)
>>>
>>> So that suggests I'm using memDecompress incorrectly. Any hints?
>>
>> Headers.
>
> Looking at http://tools.ietf.org/html/rfc1952:
>
> * the first two bytes are id1 and id2, which are 1f 8b as expected
>
> * the third byte is the compression: deflate (as.integer(content[3]))
>
> * the fourth byte is the flag
>
> rawToBits(content[4])
> [1] 00 00 00 00 00 00 00 00
>
> which indicates no extra header fields are present
>
> So the header looks ok to me (with my limited knowledge of gzip)
>
> Stripping off the header doesn't seem to help either:
>
> memDecompress(content[-(1:10)], type = "gzip")
> # Error in memDecompress(content[-(1:10)], type = "gzip") :
> # internal error -3 in memDecompress(2)
>
> I've read the help for memDecompress but I don't see anything there to help me.
>
> Any more hints?

Well, it seems what you get there depends on the client, but I did

tystie% curl -o foo "http://httpbin.org/gzip"
tystie% file foo
foo: gzip compressed data, last modified: Wed May 2 17:06:24 2012, max compression

and the final part worried me: I do not know if memDecompress() knows about that format. The help page does not claim it can do anything other than de-compress the results of memCompress() (although past experience has shown that it can in some cases). gzfile() supports a much wider range of formats.
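
For illustration only, here is a minimal sketch of the two routes that go through R's gzip-aware connections rather than memDecompress(). It is not from the original exchange: the sample data are built locally with gzfile() as a stand-in for the httpbin response, the variable names are made up, and the gzcon(rawConnection()) route is an assumption about what works for this kind of data rather than something verified against that server.

```r
# Build a gzip-format raw vector locally (a stand-in for the HTTP response body).
src <- tempfile(fileext = ".gz")
con <- gzfile(src, "w")
writeLines(c("hello", "world"), con)
close(con)
content <- readBin(src, "raw", file.size(src))

# RFC 1952: the first two bytes are ID1 and ID2, the third is the method (8 = deflate).
header_ok <- identical(content[1:2], as.raw(c(0x1f, 0x8b))) &&
  as.integer(content[3]) == 8L

# Route 1: write the bytes out and read them back through gzfile().
tmp <- tempfile()
writeBin(content, tmp)
con <- gzfile(tmp, "r")
lines_file <- readLines(con)
close(con)

# Route 2 (assumed to work for gzip-format data): stay in memory by wrapping
# a raw connection in gzcon(), which decompresses as it reads.
con <- gzcon(rawConnection(content))
lines_mem <- readLines(con)
close(con)
```

Route 1 is the temporary-file approach already shown earlier in the thread, with gzfile() made explicit; route 2 avoids the file at the cost of relying on gzcon() accepting a raw connection.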

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 02 May 2012 - 16:19:08 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 02 May 2012 - 17:00:53 GMT.
