RE: [Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Sat 12 Feb 2005 - 04:24:18 EST


> From: Gorjanc Gregor
>
> ! Look for my replies after the '!' character.
>
> From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]
> You too have not given a reproducible example!
> ! Yes, I was not able to make one from my data, but one is below. It is
> ! a silly example, but it works. The problem is the use of as.data.frame in
> ! tmp1$L <- as.data.frame(tmp$L), which appears to produce a corrupted
> ! data.frame. If I use just tmp1$L <- tmp$L, write.table and
> ! as.matrix.data.frame work OK. I still think that my proposal can
> ! be of benefit, since it also works on corrupted data frames.
>
> data(warpbreaks)
> tmp <- as.data.frame(with(warpbreaks, tapply(breaks, list(wool, tension), mean)))
> tmp1 <- data.frame(level=rownames(tmp))
> tmp1$L <- as.data.frame(tmp$L)

Here's the problem that Brian is referring to: why do you make one variable in the data frame a data frame itself? That's what caused the problem in write.table()!
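
A minimal sketch of how one might spot such a column before calling write.table(), assuming a current R session. The helper name has_df_column is hypothetical and not part of base R, and the example is rebuilt with with() so that breaks, wool and tension are found inside warpbreaks:

```r
# Rebuild the example from the thread.
tmp <- as.data.frame(with(warpbreaks, tapply(breaks, list(wool, tension), mean)))
tmp1 <- data.frame(level = rownames(tmp))
tmp1$L <- as.data.frame(tmp$L)   # column L is now itself a data frame

# Hypothetical helper: TRUE for every column that is a data frame.
has_df_column <- function(df) vapply(df, is.data.frame, logical(1))

nested <- has_df_column(tmp1)
print(nested)
```

Any TRUE entry marks a column worth inspecting with str() before the data frame is written out or converted to a matrix.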

Andy

> write.table(tmp1)
> Error in as.matrix.data.frame(x) : dim<- : dims [product 2]
> do not match the length of object [3]
>
> tmp1$L <- tmp$L
> write.table(tmp1)
> "level" "L"
> "1" "A" 44.55556
> "2" "B" 28.22222
>
> If you have a corrupt data frame, the function may fail, which is what
> happened in the PR# you quote.
>
> Please note: you should not be calling as.matrix.data.frame, but as.matrix.
> ! I called it because I had problems with write.table, and that function
> ! calls as.matrix.data.frame.
>
> On Fri, 11 Feb 2005, Gorjanc Gregor wrote:
>
> > Hello R developers.
> >
> > I encountered the same problem as Uwe Ligges with as.matrix.data.frame()
> > in bug reports 3229 and 3242, filed under the not-reproducible section.
> >
> > Example I have is:
> >
> >> tmp
> > level 2100-D
> > 1 biological_process unknown NA
> > 2 cellular process -5.88
> > 3 development -8.42
> > 4 physiological process -6.55
> > 5 regulation of biological process NA
> > 6 viral life cycle NA
> >
> >> str(tmp)
> > `data.frame': 6 obs. of 2 variables:
> > $ level : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
> > $ 2100-D_mean:`data.frame': 6 obs. of 1 variable:
> > ..$ 2100-D: num NA -5.88 -8.42 -6.55 NA NA
>
> I think you have a data frame column in a data frame, and that cannot be
> made directly into a matrix. It's the steps that got you here that are
> the problem.
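
One possible way to repair such a data frame after the fact is sketched below, under the assumption that each nested column holds exactly one variable. The helper flatten_df_columns is a hypothetical name, not an existing R function, and the values are taken from the write.table() output quoted earlier in the thread:

```r
# Hypothetical helper: replace each one-column data frame column by the
# plain vector it contains; all other columns are left untouched.
flatten_df_columns <- function(df) {
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.data.frame(col) && ncol(col) == 1L) {
      df[[nm]] <- col[[1L]]
    }
  }
  df
}

bad <- data.frame(level = c("A", "B"))
bad$L <- data.frame(L = c(44.55556, 28.22222))  # nested data frame column

good <- flatten_df_columns(bad)
str(good)

m <- as.matrix(good)  # an ordinary 2 x 2 matrix
dim(m)
```

Avoiding the nested column in the first place (tmp1$L <- tmp$L) remains the simpler fix; the helper only tidies up afterwards.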
>
> >> as.matrix.data.frame(tmp)
> > Error in as.matrix.data.frame(tmp) : dim<- : dims [product 6] do not
> > match the length of object [7]
> >
> > The error arises at the end of the function as.matrix.data.frame,
> > in the line:
> >
> > dim(X) <- c(n, length(X)/n)
> >
> > ?dim says
> > 'dim' has a method for 'data.frame's, which returns the length of
> > the 'row.names' attribute of 'x' and the length of 'x' (the
> > numbers of "rows" and "columns").
> >
> > This part is OK. The problem is with X, which is heavily
> > modified throughout the function. Just before the dim(X) <- ... call,
> > X in my case is:
> >
> >> x <- tmp
> >> "code from as.matrix.data.frame down to dim(X) <- ..."
> >> X
> > [[1]]
> > [1] "biological_process unknown"
> >
> > [[2]]
> > [1] "cellular process"
> >
> > [[3]]
> > [1] "development"
> >
> > [[4]]
> > [1] "physiological process"
> >
> > [[5]]
> > [1] "regulation of biological process"
> >
> > [[6]]
> > [1] "viral life cycle"
> >
> > [[7]]
> > [1] NA -5.88 -8.42 -6.55 NA NA
> >
> > So we can see that X is mangled: the first column of tmp has been split
> > into single elements, while the second is kept as one vector. For the
> > dim<- call this should really be one long vector. So the problem lies
> > in the line
> >
> > X <- unlist(X, recursive = FALSE, use.names = FALSE)
> >
> > where it should be
> >
> > X <- unlist(X, recursive = TRUE, use.names = FALSE)
> > ^^^^
> >
> > I have checked the source code for that function in R as well as
> > in the R-devel sources. I was not successful in reproducing the above
> > with the data frame below, though. It did not report any problems
> > with the old as.matrix.data.frame. There must be some trick with the
> > first column in my data. So I am quite sure my suggestion is
> > OK.
> >
> > tmp1 <- data.frame(level=c("A A", "B B"), x=c(NA, -5.8))
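
The length mismatch described above can be illustrated outside as.matrix.data.frame with a small stand-in for X. This is only a sketch: the list below is an assumption about what X roughly looks like, not the function's actual internal state, and the variable names are made up for illustration:

```r
# Stand-in for the internal list X: one character column and one column
# that is itself a list (as a data frame column would be).
X <- list(c("A", "B"), list(c(44.55556, 28.22222)))
n <- 2L  # number of rows

flat_one_level <- unlist(X, recursive = FALSE, use.names = FALSE)
flat_fully <- unlist(X, recursive = TRUE, use.names = FALSE)

length(flat_one_level)  # 3: "A", "B", and the untouched numeric vector
length(flat_fully)      # 4: everything coerced to one character vector

# Setting dim on the one-level result fails: 3 is not a multiple of n.
msg <- tryCatch({
  dim(flat_one_level) <- c(n, length(flat_one_level) / n)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

With recursive = TRUE the lengths happen to line up, but that only hides the nested column; it does not say whether the resulting matrix is what was wanted.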
> >
> > --
> > Lep pozdrav / With regards,
> > Gregor GORJANC
> >
> > ---------------------------------------------------------------
> > University of Ljubljana
> > Biotechnical Faculty URI: http://www.bfro.uni-lj.si
> > Zootechnical Department email: gregor.gorjanc <at> bfro.uni-lj.si
> > Groblje 3 tel: +386 (0)1 72 17 861
> > SI-1230 Domzale fax: +386 (0)1 72 17 888
> > Slovenia
> >
> > ______________________________________________
> > R-devel@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>



Received on Sat Feb 12 03:59:12 2005