Re: [Rd] as.data.frame.matrix() returns an invalid object

From: peter dalgaard <pdalgd_at_gmail.com>
Date: Sat, 13 Oct 2012 09:10:47 +0200

On Oct 11, 2012, at 16:02 , Bert Gunter wrote:

> ... and further
> 

>> identical(as.list(df2),as.list(df1))
> [1] TRUE
> 
> in R 2.15.0
> 
> Not sure whether these sorts of degenerate cases are of much value,
> though. But I'll leave that for the wizards.

Looks like this is easier to fix than to argue pro/con fixing it...

AFAICS, there's a gap in the logic in as.data.frame.matrix:

    if (length(row.names) != nrows)
        row.names <- .set_row_names(nrows)

but length(NULL) is 0, so when nrows is also 0 the test is FALSE, and we can end up leaving row.names at NULL and eventually nulling it in the result. An explicit check for is.null(row.names) should help.
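
To make the gap concrete, here is a minimal sketch of the check in isolation. The helper choose_row_names() and its patched argument are hypothetical names invented for illustration; they are not part of as.data.frame.matrix, and the guard shown is just one way an explicit is.null() test could be written, not necessarily the change that goes into R. Only .set_row_names() is an existing base function.

```r
# Hypothetical helper that isolates the row-name check discussed above.
# 'patched = FALSE' mimics the current test; 'patched = TRUE' adds the
# explicit is.null() guard.
choose_row_names <- function(row.names, nrows, patched = TRUE) {
    needs_default <- length(row.names) != nrows
    if (patched)
        needs_default <- needs_default || is.null(row.names)
    if (needs_default)
        row.names <- .set_row_names(nrows)
    row.names
}

# Zero rows, no row names supplied: length(NULL) == 0 == nrows,
# so the unpatched test never fires and NULL survives.
unpatched <- choose_row_names(NULL, 0L, patched = FALSE)

# With the guard, the zero-length integer row names are filled in.
patched <- choose_row_names(NULL, 0L)

# With at least one row the unpatched test already fires, which is
# why only the degenerate case is affected.
nonempty <- choose_row_names(NULL, 3L, patched = FALSE)
```

In this sketch is.null(unpatched) is TRUE, while patched is a zero-length integer vector, matching the $row.names entry that attributes(df1) shows in the original report.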

-pd

> 
> -- Bert
> 
> On Wed, Oct 10, 2012 at 11:22 PM, Hervé Pagès <hpages_at_fhcrc.org> wrote:

>> Hi,
>>
>> Two ways to create what should normally be the same data frame:
>>
>>> df1 <- data.frame(a=character(0), b=character(0))
>>> df1
>> [1] a b
>> <0 rows> (or 0-length row.names)
>>
>>> df2 <- as.data.frame(matrix(character(0), ncol=2, dimnames=list(NULL, letters[1:2])))
>>> df2
>> [1] a b
>> <0 rows> (or 0-length row.names)
>>
>> unique() works as expected except that I get a warning on 'df2':
>>
>>> unique(df1)

>> [1] a b
>> <0 rows> (or 0-length row.names)
>>
>>> unique(df2)

>> [1] a b
>> <0 rows> (or 0-length row.names)
>> Warning message:
>> In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL'
>>
>> Looks like the two data frames are not identical:
>>
>>> identical(df1, df2)

>> [1] FALSE
>>
>>> all.equal(df1, df2)

>> [1] "Attributes: < Length mismatch: comparison on first 1 components >"
>>
>>> attributes(df1)

>> $names
>> [1] "a" "b"
>>
>> $row.names
>> integer(0)
>>
>> $class
>> [1] "data.frame"
>>
>>> attributes(df2)

>> $names
>> [1] "a" "b"
>>
>> $class
>> [1] "data.frame"
>>
>> Actually 'df2' is considered broken by validObject():
>>
>>> validObject(df1)

>> [1] TRUE
>>
>>> validObject(df2)

>> Error in validObject(df2) :
>> invalid class “data.frame” object: slots in class definition but not in
>> object: "row.names"
>>
>> This is with R 2.15 and recent R devel.
>>
>> Cheers,
>> H.
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages_at_fhcrc.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
>>
>> ______________________________________________
>> R-devel_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes_at_cbs.dk  Priv: PDalgd_at_gmail.com

Received on Sat 13 Oct 2012 - 07:14:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 13 Oct 2012 - 22:00:47 GMT.