[Rd] rbind.data.frame: bug?

From: <Mark.Bravington_at_csiro.au>
Date: Sat, 07 Jul 2007 11:10:39 +1000


Consider the following, which is new behaviour under R 2.5+:

> df1 <- data.frame( x=2, y='cat')
> df2 <- data.frame( x=3, y='dog')
> rbind( df1[-1,], df2)$y == rbind( df1, df2)[-1,]$y
Error in Ops.factor(rbind(df1[-1, ], df2)$y, rbind(df1, :
        Level sets of factors are different
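
The error comes from the factor comparison itself, not from rbind directly. A minimal sketch, using two hand-built factors (`f1` and `f2` are illustrative names, not from the example above) that hold the same value but carry different level sets:

```r
# Two factors with the same single value but different level sets,
# mimicking the two sides of the comparison above.
f1 <- factor("dog")                            # levels: "dog"
f2 <- factor("dog", levels = c("cat", "dog"))  # levels: "cat", "dog"

# `==` on two factors stops with an error when their level sets differ;
# capture the message instead of aborting.
res <- tryCatch(f1 == f2, error = function(e) conditionMessage(e))

# Comparing the labels as character strings sidesteps the level check.
same <- as.character(f1) == as.character(f2)
```

Comparing via as.character() hides the symptom only; the two rbind results still disagree about their levels.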
        

To me this seems illogical; it shouldn't matter whether you remove the first row of the data.frame before or after augmenting the rows (and didn't, in R prior to v2.5). And there's more!

> levels( rbind( df1[-1,], df2)$y)
[1] "dog"

> levels( rbind( df1[-1,], df2[-1,])$y)
[1] "cat"

But why should rbind acknowledge the levels of its first argument in the second case and not in the first?

If the data.frames have more than one row, these issues don't arise; they occur because (as the documentation says) "The 'rbind' data frame method first drops all zero-column and zero-row arguments. (If that leaves none, it returns the first argument with columns otherwise a zero-column zero-row data frame.)...". So this behaviour is documented, but isn't it nevertheless a bug?

From the release notes, I gather that the new zero-row-stripping behaviour was introduced to get round a specific problem with 0*0 data.frames. Given the above, though, wouldn't it be preferable to just include special code to deal with the 0*0 case, leaving 0*N cases unaffected?
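
Until something like that is done, one possible user-level workaround is to restore the dropped levels after the fact. The sketch below is not part of base R: `rbind_keep_levels` is a hypothetical helper, and it assumes every argument is a data.frame with the same column names. `stringsAsFactors = TRUE` is spelled out because it is no longer the default from R 4.0.0 onwards.

```r
# Hypothetical helper (not in base R): rbind the arguments, then rebuild each
# factor column with the union of levels from ALL arguments, including
# zero-row ones that rbind may have dropped.
rbind_keep_levels <- function(...) {
  args <- list(...)
  out <- rbind(...)
  for (nm in names(out)) {
    if (is.factor(out[[nm]])) {
      # Levels in order of first appearance across the arguments
      lev <- unique(unlist(lapply(args, function(d) levels(d[[nm]]))))
      out[[nm]] <- factor(as.character(out[[nm]]), levels = lev)
    }
  }
  out
}

df1 <- data.frame(x = 2, y = "cat", stringsAsFactors = TRUE)
df2 <- data.frame(x = 3, y = "dog", stringsAsFactors = TRUE)

# Removing the first row before or after binding now gives the same levels,
# so the comparison no longer stops with an error.
a <- rbind_keep_levels(df1[-1, ], df2)
b <- rbind_keep_levels(df1, df2)[-1, ]
levels(a$y)
a$y == b$y
```

With this wrapper the level set depends only on the arguments supplied, not on how many rows each happens to have.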

Mark Bravington



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 07 Jul 2007 - 01:13:22 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 07 Jul 2007 - 01:35:53 GMT.
