Re: [Rd] row names and identical

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Sun, 15 Jul 2007 23:30:24 -0400

The question is whether the way identical works is desirable or intended.
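
For anyone who wants to look at the stored attribute directly rather than through dput, here is a minimal sketch using the same three data frames as in the quoted example below. It assumes a version of R whose base package provides .row_names_info(); same_visible() is a hypothetical helper, not part of R, and only illustrates one way to compare data frames by their user-visible row names.

```r
# Same three data frames as in the quoted example below.
x1 <- x2 <- x3 <- data.frame(x = 11:14)
row.names(x2) <- 1:4
row.names(x3) <- as.character(1:4)

# type = 0L asks for the stored "row.names" attribute as is,
# without expanding a compact c(NA, n) form.
internal <- lapply(list(x1 = x1, x2 = x2, x3 = x3),
                   .row_names_info, type = 0L)
str(internal)

# Hypothetical helper (an assumption, not part of R): force both sets of
# row names into character storage, then compare with identical().
same_visible <- function(a, b) {
  row.names(a) <- as.character(row.names(a))
  row.names(b) <- as.character(row.names(b))
  identical(a, b)
}

same_visible(x1, x3)
same_visible(x2, x3)
```

Normalizing to character is only one possible choice; it trades the space saving of the compact form for a comparison that depends solely on what row.names() reports.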

On 7/15/07, miguel manese <jjonphl_at_gmail.com> wrote:
> I mean, starting with 2.4, not all row names are stored as characters;
> some are stored as integers.
>
> On 7/16/07, miguel manese <jjonphl_at_gmail.com> wrote:
> > I was bitten by this before. row.names are supposed to be characters, and
> > they were until around 2.3, iirc. Then in 2.4 they started being stored
> > as integers, presumably to save space (and probably because ints are
> > lower-maintenance data types than strings). So to the user (e.g.
> > through row.names()) they are still characters, but internally they
> > are not. I don't know what c(NA, 4L) stands for though (I see that it
> > has 4 elems...), or why the other is c(NA, -4L). However, in my code I
> > access them as a vector of integers:
> >
> > rownames = getAttrib(df, R_RowNamesSymbol);
> > if (IS_INTEGER(rownames)) {
> >     for (int i = 0; i < LENGTH(rownames); i++) foo(INTEGER(rownames)[i] ...);
> > }
> >
> > Cheers,
> > M. Manese
> >
> > On 7/15/07, Gabor Grothendieck <ggrothendieck_at_gmail.com> wrote:
> > > Below x1, x2 and x3 all have the same data and all have the same value
> > > for row.names(x); however, the internal values of their row.names differ.
> > > The internal value of row.names is c(NA, -4L) for x1, c(NA, 4L) for x2 and
> > > c("1", "2", "3", "4") for x3; nevertheless, identical regards x1 and x2 as
> > > identical while x3 is not identical to either of x1 or x2.
> > >
> > > Is this intended?
> > > Desirable?
> > > Why do we need different internal representations for row.names
> > > for x1 and x2?
> > >
> > >
> > > > x1 <- x2 <- x3 <- data.frame(x = 11:14)
> > > > row.names(x2) <- 1:4
> > > > row.names(x3) <- as.character(1:4)
> > > >
> > > > # they all have the same value for row.names()
> > > > row.names(x1)
> > > [1] "1" "2" "3" "4"
> > > > row.names(x2)
> > > [1] "1" "2" "3" "4"
> > > > row.names(x3)
> > > [1] "1" "2" "3" "4"
> > > >
> > > > # but internally they are all different
> > > > dput(x1)
> > > structure(list(x = 11:14), .Names = "x", row.names = c(NA, -4L
> > > ), class = "data.frame")
> > > > dput(x2)
> > > structure(list(x = 11:14), .Names = "x", row.names = c(NA, 4L
> > > ), class = "data.frame")
> > > > dput(x3)
> > > structure(list(x = 11:14), .Names = "x", row.names = c("1", "2",
> > > "3", "4"), class = "data.frame")
> > > >
> > > > # identical regards x1 and x2 as the same while x3 differs
> > > > identical(x1, x2)
> > > [1] TRUE
> > > > identical(x1, x3)
> > > [1] FALSE
> > > > identical(x2, x3)
> > > [1] FALSE
> > > >
> > > > R.version.string # XP
> > > [1] "R version 2.5.1 (2007-06-27)"
> > >
> > > ______________________________________________
> > > R-devel_at_r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
>



Received on Mon 16 Jul 2007 - 03:35:29 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.