Re: [Rd] apply: new behaviour for factors in R-2.4.0

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu 28 Sep 2006 - 15:23:55 GMT

On Thu, 28 Sep 2006, Christoph Buser wrote:

> Dear Brian
>
> Thank you for your answer and the comment you included on the
> apply() help page.
>
> 1)
>
> You are correct. My data.frame is coerced into a matrix in
> apply()
>
> 2)
>
> I agree that the new version of unlist() is better and works
> correctly and that in array() (due to as.vector()) the factor
> "ans" is coerced into a character matrix.
>
> Nevertheless I disagree that this is "feature freeze" with
> R version 2.3.1:

It is R 2.4.0 that is in 'feature freeze': we cannot change 2.4.0 now (and would not have done so when I answered).

See developer.r-project.org for a fuller explanation.

> Since in R-2.3.1, unlist() on a list of factors returned an
> integer vector, the result of apply was an integer matrix and
> not a character matrix.
>
> Therefore my question is whether it would be desirable to return an
> integer matrix by changing apply. One could include additional
> code to handle the case where the output "should" be a factor
> matrix and coerce it into an integer matrix.
>
> Then the outcome would be consistent with R-2.3.1 without
> changing something in unlist() or array().
>
>
> But in the end I am not sure if an integer matrix is better than
> a character matrix or a factor matrix. I am not sure what output
> is best if one uses as.factor in apply.

It seems best not to do so!
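
A minimal sketch of the alternative, assuming the aim is simply to have every column of 'dat' as a factor or as integer codes: work column by column with lapply(), or use data.matrix(), so that nothing passes through as.matrix() and array(). The names dat_f and m are made up for illustration.

```r
dat <- data.frame(x = 1:10, f1 = gl(2, 5, labels = c("A", "B")))

# apply() calls as.matrix() first; with a factor column present the
# whole data frame becomes a character matrix.
is.character(as.matrix(dat))

# as.vector() on a factor gives its labels as character, which is what
# array() does to the unlisted result inside apply().
as.vector(dat$f1)

# Column-wise conversion keeps a data frame and never builds a matrix.
dat_f <- dat
dat_f[] <- lapply(dat, as.factor)
sapply(dat_f, is.factor)

# If integer codes are wanted, data.matrix() gives them directly.
m <- data.matrix(dat)
storage.mode(m)
```

Which of the two is appropriate depends on what is to be done with the result; neither goes through apply().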

>
> Regards,
>
> Christoph
>
> --------------------------------------------------------------
> Christoph Buser <buser@stat.math.ethz.ch>
> Seminar fuer Statistik, LEO C13
> ETH Zurich 8092 Zurich SWITZERLAND
> phone: x-41-44-632-4673 fax: 632-1228
> http://stat.ethz.ch/~buser/
> --------------------------------------------------------------
>
> Prof Brian Ripley writes:
> > Christoph,
> >
> > This is more complicated than your analysis.
> >
> > 1) apply takes a matrix as an argument, not a data frame, and so first
> > coerces 'dat' to a character matrix.
> >
> > 2) unlist is working quite correctly. The issue is array(), which
> > contains as.vector(data). Thus although the result could be a factor
> > matrix, as.vector is coercing it to a character matrix. It might be
> > desirable to return a factor matrix, but we are not going to do that in
> > feature freeze (if ever) and I really don't think it would be what you
> > wanted.
> >
> > Perhaps the help page should contain an explicit statement that the result
> > will be coerced to a basic vector type by as.vector().
> >
> > On Mon, 25 Sep 2006, Christoph Buser wrote:
> >
> > > Dear R-core
> > >
> > > There is a different output for the apply function due to the
> > > change of unlist as mentioned in the R news.
> > >
> > > Applying as.factor() (or factor()) in
> > >
> > > str(dat <- data.frame(x = 1:10, f1 = gl(2,5,labels = c("A", "B"))))
> > > (d1 <- apply(dat,2,as.factor))
> > >
> > > now returns a character matrix while in R-2.3.1 the same
> > > command resulted in an integer matrix that was consistent (up to
> > > the ordering of the factor levels) with data.matrix().
> >
> > That's coincidence -- try x=11:20.
> >
> > > The change is caused by the change in unlist(), which, applied to a
> > > list of factors, now returns a single factor instead of an
> > > integer vector. I am happy with this change, but:
> > >
> > > Is it desirable to change apply so that it does not return a
> > > character matrix in the example above or include a warning for
> > > such a case?
> > >
> > > Thank you very much for an answer.
> > >
> > > Regards,
> > >
> > > Christoph Buser
> > >
> > > ______________________________________________
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > --
> > Brian D. Ripley, ripley@stats.ox.ac.uk
> > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford, Tel: +44 1865 272861 (self)
> > 1 South Parks Road, +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK Fax: +44 1865 272595
>
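
To illustrate the earlier remark that agreement with data.matrix() was a coincidence, here is a small sketch (dat2 and codes are names made up here). With x = 1:10 the internal factor codes happen to equal the values; with x = 11:20 they do not.

```r
dat2 <- data.frame(x = 11:20, f1 = gl(2, 5, labels = c("A", "B")))

# Internal integer codes of the factor built from x: 1..10, not 11..20.
codes <- as.integer(as.factor(dat2$x))
codes

# data.matrix() keeps the integer column as it is.
data.matrix(dat2)[, "x"]

# To recover the values from a factor, go through the labels.
as.integer(as.character(as.factor(dat2$x)))
```

So an integer matrix of factor codes would in general not agree with data.matrix() for the numeric columns.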


Received on Fri Sep 29 01:38:04 2006
