Re: [Rd] apply: new behaviour for factors in R-2.4.0

From: Christoph Buser <buser_at_stat.math.ethz.ch>
Date: Thu 28 Sep 2006 - 14:32:45 GMT

Dear Brian

Thank you for your answer and the comment you included on the apply() help page.

1)

You are correct: my data.frame is first coerced to a character matrix in apply().
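
A minimal sketch of that coercion step, using 'dat' from my original post. Here as.matrix() stands in for what apply() does to a data frame before calling FUN; the comments describe the behaviour of current R versions.

```r
# 'dat' as in the original post: one integer column and one factor column
dat <- data.frame(x = 1:10, f1 = gl(2, 5, labels = c("A", "B")))

# apply() converts a data frame with as.matrix(); a data frame that mixes
# an integer column and a factor column can only become a character matrix
m <- as.matrix(dat)
typeof(m)   # "character"
dim(m)      # 10 2
```

So FUN = as.factor never sees the original integer column or the original factor, only character columns.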

2)

I agree that the new version of unlist() is better and works correctly, and that the coercion happens in array(): because of as.vector(), the factor "ans" is turned into a character matrix.
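
The two steps can be reproduced outside apply(). This is only a sketch with a made-up list 'fl' standing in for the per-column results of as.factor; it assumes R >= 2.4.0 behaviour of unlist().

```r
# A list of factors, similar to what FUN = as.factor yields per column
# ('fl' is an illustrative stand-in, not the object used inside apply())
fl <- list(factor(c("A", "B")), factor(c("B", "C")))

# unlist() combines the factors into a single factor
u <- unlist(fl)
is.factor(u)   # TRUE
levels(u)      # "A" "B" "C"

# as.vector() drops the factor class and returns the labels as character
v <- as.vector(u)
typeof(v)      # "character"

# array() applies as.vector() to its data, so the result is character too
a <- array(u, dim = c(2, 2))
typeof(a)      # "character"
```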

Nevertheless, I disagree that "feature freeze" settles the matter, since the behaviour differs from R version 2.3.1:

In R-2.3.1, unlist() on a list of factors returned an integer vector, so the result of apply() was an integer matrix and not a character matrix.

My question is therefore whether it would be desirable to change apply() so that it returns an integer matrix. One could add code that detects the case in which the output "should" be a factor matrix and coerces it to an integer matrix instead.

Then the outcome would be consistent with R-2.3.1 without changing anything in unlist() or array().
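
To make the idea concrete, here is a rough sketch written as a user-level wrapper rather than as a change to apply() itself. The name apply_factor_codes() is hypothetical and not part of R; it only illustrates returning the integer codes instead of the labels.

```r
# Hypothetical helper (not part of R): behaves like apply(X, 2, as.factor)
# but returns each column's integer factor codes instead of its labels.
apply_factor_codes <- function(X) {
  X <- as.matrix(X)   # same coercion that apply() performs on a data frame
  codes <- apply(X, 2, function(col) as.integer(as.factor(col)))
  storage.mode(codes) <- "integer"
  codes
}

dat <- data.frame(x = 1:10, f1 = gl(2, 5, labels = c("A", "B")))
d1 <- apply_factor_codes(dat)
typeof(d1)   # "integer"
d1[, "f1"]   # 1 1 1 1 1 2 2 2 2 2
```

The codes refer to the levels of each column's own factor, so the level labels themselves are lost in such a matrix.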

But in the end I am not sure whether an integer matrix is better than a character matrix or a factor matrix; I am not sure which output is best when one uses as.factor in apply().
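
One point against the integer matrix follows from Brian's hint, quoted below, to try x=11:20: factor codes index the levels and need not equal the original values. A small sketch, where 'dat2' is just my example with the shifted x column:

```r
x <- 11:20

# Codes of a factor built from the character representation of x
codes <- as.integer(as.factor(as.character(x)))
codes                 # 1 2 3 4 5 6 7 8 9 10
identical(codes, x)   # FALSE

# data.matrix() keeps the numeric values of a numeric column
dat2 <- data.frame(x = x, f1 = gl(2, 5, labels = c("A", "B")))
data.matrix(dat2)[, "x"]   # 11 12 13 14 15 16 17 18 19 20
```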

Regards,

Christoph



Christoph Buser <buser@stat.math.ethz.ch> Seminar fuer Statistik, LEO C13
ETH Zurich	8092 Zurich	 SWITZERLAND
phone: x-41-44-632-4673		fax: 632-1228

http://stat.ethz.ch/~buser/

Prof Brian Ripley writes:
> Christoph,
>
> This is more complicated than your analysis.
>
> 1) apply takes a matrix as an argument, not a data frame, and so first
> coerced 'dat' to a character matrix.
>
> 2) unlist is working quite correctly. The issue is array(), which
> contains as.vector(data). Thus although the result could be a factor
> matrix, as.vector is coercing it to a character matrix. It might be
> desirable to return a factor matrix, but we are not going to do that in
> feature freeze (if ever) and I really don't think it would be what you
> wanted.
>
> Perhaps the help page should contain an explicit statement that the result
> will be coerced to a basic vector type by as.vector().
>
> On Mon, 25 Sep 2006, Christoph Buser wrote:
>
> > Dear R-core
> >
> > There is a different output for the apply function due to the
> > change of unlist as mentioned in the R news.
> >
> > Applying as.factor() (or factor()) in
> >
> > str(dat <- data.frame(x = 1:10, f1 = gl(2,5,labels = c("A", "B"))))
> > (d1 <- apply(dat,2,as.factor))
> >
> > newly returns a character matrix while in R-2.3.1 the same
> > command resulted in an integer matrix that was consistent (up to
> > the ordering of the factor levels) with data.matrix().
>
> That's coincidence -- try x=11:20.
>
> > The change is caused by the change of unlist() that, used for a
> > list of factors, newly returns a single factor instead of an
> > integer. I am happy with this change, but:
> >
> > Is it desirable to change apply so that it does not return a
> > character matrix in the example above or include a warning for
> > such a case?
> >
> > Thank you very much for an answer.
> >
> > Regards,
> >
> > Christoph Buser
> >
> > --------------------------------------------------------------
> > Christoph Buser <buser@stat.math.ethz.ch>
> > Seminar fuer Statistik, LEO C13
> > ETH Zurich 8092 Zurich SWITZERLAND
> > phone: x-41-44-632-4673 fax: 632-1228
> > http://stat.ethz.ch/~buser/
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595



Received on Fri Sep 29 00:49:22 2006