Re: [Rd] [Fwd: Re: [R-downunder] Beware unclass(factor)] (PR#9641)

From: <ripley_at_stats.ox.ac.uk>
Date: Tue, 01 May 2007 15:33:20 +0200 (CEST)


It really is unclear what is claimed to be a bug here. But see

https://stat.ethz.ch/pipermail/r-devel/2007-May/045592.html

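for why the bug is not in R: your old and new data do not match. The model was fitted to doselin as stored, that is, codes that still carry a "levels" attribute (reported as class "other"), whereas the new data supply a plain numeric value.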
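
As a rough illustration of the mismatch (a minimal sketch, not taken from the linked post; the object names codes and plain are made up for the example), unclass() on a factor removes only the class and leaves the "levels" attribute on the integer codes, whereas as.integer() returns bare codes:

```r
# Same Dose factor as in the quoted example: six doses, repeated twice.
Dose <- factor(rep(2^(0:5), 2))

# unclass() drops the "factor" class but keeps the levels attribute.
codes <- unclass(Dose)
is.factor(codes)        # FALSE
attr(codes, "levels")   # the six level labels are still attached

# as.integer() returns the codes with no attributes at all.
plain <- as.integer(Dose)
attributes(plain)       # NULL
```

A column like codes is neither a factor nor a plain numeric vector, which is consistent with the class "other" reported in the quoted error under R 2.4.1.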

[The problem with the web interface to R-bugs was reported last week: it is being worked on.]

On Mon, 30 Apr 2007, r.darnell_at_uq.edu.au wrote:

> The following "issue" was found using
>
> > version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          4.1
> year           2006
> month          12
> day            18
> svn rev        40228
> language       R
> version.string R version 2.4.1 (2006-12-18)
> >
>
>
> and discussed on the R-downunder mailing list.
>
> I hope I have provided enough info. I tried to look at the Bugs
> Tracking page but got:
>
>   The system encountered a fatal error
>
>   cannot open config file /home/sfe/r-bugs/jitterbug/R : No such file or directory
>
>   The last error code was: No such file or directory
>
>   uid/gid=30/8
>
>
> Regards
>
> Ross Darnell
>
> Date: Mon, 30 Apr 2007 19:25:09 +1000
> From: John Maindonald <john.maindonald_at_anu.edu.au>
> Subject: Re: [R-downunder] Beware unclass(factor)
> In-reply-to: <46359373.50504_at_uq.edu.au>
> To: Ross Darnell <r.darnell_at_uq.edu.au>
> Cc: r-downunder_at_stat.auckland.ac.nz
>
> Observe the following
>
> > z <- model.frame(cbind(moths, 20 - moths) ~ sex + doselin, data = worms)
> > class(z$doselin)
> [1] "other"
> > levels(z$doselin)
> [1] "1" "2" "4" "8" "16" "32"
> > attributes(z$doselin)
> $levels
> [1] "1" "2" "4" "8" "16" "32"
>
> $class
> [1] "other"
>
> The problem surfaces in the call to model.frame() from predict.lm()
> when it is called by predict.glm(). That call jumps to conclusions
> when it treats the presence of a levels attribute as an indication that
> doselin is a factor. This is ironic, since it was the call initiated by
> glm() that seems to have given the column doselin of the object
> returned by model.frame() the class "other".
>
> This seems to me to be a bug. The call to unclass() does not
> strip the levels attribute from doselin. (This is not, I think, the
> bug; rather the problem is in the model matrix that is created.)
> The column worms$doselin does, though, have class "integer",
> at least as far as the function class() is concerned.
>
> You can fix the problem by setting:
>
> worms$doselin <- as.vector(unclass(worms$Dose))
>
> This strips off the levels attribute.
>
> In my view model.frame ought to have stripped the levels
> attribute from the column doselin in the object that it
> returned.
>
> I consider that this should be reported as a bug, or at least
> as an undesirable feature.
>
> John Maindonald email: john.maindonald_at_anu.edu.au
> phone : +61 2 (6125)3473 fax : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
>
>
> On 30 Apr 2007, at 4:57 PM, Ross Darnell wrote:
>
>> Just an observation about the use of unclass() to generate codes
>> for factors.
>>
>> As an example take the dataset from the MASS4 book
>>
>>> worms <- data.frame(sex = gl(2, 6),
>>+                     Dose = factor(rep(2^(0:5), 2)),
>>+                     moths = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16))
>>
>>> worms$doselin <- unclass(worms$Dose)
>>
>>> worms.glm <- glm(cbind(moths, 20 - moths) ~ sex + doselin,
>>+                  data = worms, family = binomial)
>>
>>> predict(worms.glm, newdata = data.frame(sex = "1", doselin = 6))
>> Error: variable 'doselin' was fitted with class "other" but class "numeric" was supplied
>> In addition: Warning message:
>> variable 'doselin' is not a factor in: model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels)
>>>
>>
>>
>> The /doselin/ vector is "atomic": good enough for the glm()
>> function but not acceptable to predict()
>>
>>> str(worms$doselin)
>> atomic [1:12] 1 2 3 4 5 6 1 2 3 4 ...
>> - attr(*, "levels")= chr [1:6] "1" "2" "4" "8" ...
>>>
>>
>> Cheers
>>
>> Ross Darnell
>>
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
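
For reference, the workaround suggested in the quoted message can be sketched as follows. This is only an illustration under stated assumptions: the data are copied from the quoted example, the name newd is hypothetical, and sex is supplied as a factor with the fitted levels so that the new data match the old in every column.

```r
# Data as in the quoted example.
worms <- data.frame(
  sex   = gl(2, 6),
  Dose  = factor(rep(2^(0:5), 2)),
  moths = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
)

# Store plain codes: as.vector() strips the levels attribute left by unclass().
worms$doselin <- as.vector(unclass(worms$Dose))
stopifnot(is.null(attributes(worms$doselin)))

# Fit as before; doselin is now an ordinary numeric covariate.
worms.glm <- glm(cbind(moths, 20 - moths) ~ sex + doselin,
                 data = worms, family = binomial)

# New data of the same kind as the fitted data (hypothetical name newd).
newd <- data.frame(sex = factor("1", levels = levels(worms$sex)),
                   doselin = 6)
p <- predict(worms.glm, newdata = newd)   # prediction on the link scale
```

With the attribute removed before fitting, the fitted and supplied doselin columns are both plain numeric vectors.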

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Tue 01 May 2007 - 17:13:51 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 01 May 2007 - 17:33:10 GMT.
