[Rd] Consistency of serialize(): please enlighten me

From: Henrik Bengtsson <hb_at_stat.berkeley.edu>
Date: Fri, 31 Aug 2007 12:45:34 -0700


I am puzzled by serialize(). It comes down to generating identical hash codes for (apparently) identical objects using digest::digest(), which in turn relies on serialize(). Here is an example illustrating the issue:

ser <- function(object, ...) {
  # Collect the names, their raw bytes, and their serialization
  list(
    names = names(object),
    namesRaw = charToRaw(names(object)),
    ser = serialize(names(object), connection=NULL, ascii=FALSE)
  )
} # ser()

# Object to be serialized

key <- key0 <- list(abc="Hello");

# Store results

d <- list();

# 1. As is

d[[1]] <- ser(key);

# 2. Set names and redo (hardwired: identical to what's already there)
names(key) <- "abc";
d[[2]] <- ser(key);

# 3. Set names and redo (generic: char->raw->char)
key <- key0;
names(key) <- sapply(names(key), FUN=function(name) rawToChar(charToRaw(name)));
d[[3]] <- ser(key);

# All names are identical

for (kk in 2:length(d))
  stopifnot(identical(d[[1]]$names, d[[kk]]$names));

# All raw names are identical

for (kk in 2:length(d))
  stopifnot(identical(d[[1]]$namesRaw, d[[kk]]$namesRaw));

# But, the serialized names differ.

print(identical(d[[1]]$ser, d[[2]]$ser));
print(identical(d[[1]]$ser, d[[3]]$ser));
print(identical(d[[2]]$ser, d[[3]]$ser));

So, it seems like there is some extra information in the names attribute that is part of the serialization. Is it possible to show they differ at the R level? What is that extra information? Promises...?
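
One way to narrow down where two serializations diverge is to compare the raw vectors byte by byte. The following is only a minimal sketch: rawDiff() is a hypothetical helper, not part of base R or digest, and the two strings serialized at the end are a stand-in illustration rather than the objects from the example above. It reports the differing positions but does not explain what the differing bytes mean.

```r
# Hypothetical helper (an assumption for illustration, not an existing
# R function): report the positions at which two raw vectors differ.
rawDiff <- function(a, b) {
  stopifnot(is.raw(a), is.raw(b))
  # Compare only the common prefix; record separately whether lengths match
  n <- min(length(a), length(b))
  pos <- which(a[seq_len(n)] != b[seq_len(n)])
  list(
    sameLength = (length(a) == length(b)),
    positions  = pos,     # indices of differing bytes
    a          = a[pos],  # bytes of 'a' at those indices
    b          = b[pos]   # bytes of 'b' at those indices
  )
}

# Stand-in illustration: two strings that differ in their last character
x <- serialize("abc", connection = NULL, ascii = FALSE)
y <- serialize("abd", connection = NULL, ascii = FALSE)
print(rawDiff(x, y)$positions)
```

Applied to the example above, rawDiff(d[[1]]$ser, d[[2]]$ser) would list the byte offsets at which the first two serializations disagree, which may help to decide whether the difference sits in a header or flag field rather than in the character data itself.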

Please enlighten me.


R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 31 Aug 2007 - 19:48:22 GMT