Re: [Rd] delete.response leaves response in attribute dataClasses

From: William Dunlap <wdunlap_at_tibco.com>
Date: Fri, 06 Jan 2012 20:23:38 +0000

> -----Original Message-----
> From: Paul Johnson [mailto:pauljohn32_at_gmail.com]
> Sent: Friday, January 06, 2012 11:17 AM
> To: William Dunlap
> Cc: R Devel List
> Subject: Re: [Rd] delete.response leaves response in attribute dataClasses
>
> Thanks, Bill
>
> Counter-arguments at the end
>
> On Thu, Jan 5, 2012 at 3:15 PM, William Dunlap <wdunlap_at_tibco.com> wrote:
> > My feeling that everyone would index dataClasses by name was
> > wrong.  I looked through the packages that used dataClasses
> > and saw code that would break if the first (response) entry
> > were omitted.  (I didn't check to see if passing the output
> > of delete.response to these functions would be appropriate.)
> > E.g.,
> > file: AICcmodavg/R/predictSE.mer.r
> >  ##matrix with info on factors
> >  fact.frame <- attr(attr(orig.frame, "terms"), "dataClasses")[-1]
> >
> >  ##continue if factors
> >  if(any(fact.frame == "factor")) {
> >    id.factors <- which(fact.frame == "factor")
> >    fact.name <- names(fact.frame)[id.factors] #identify the rows for factors
> >
> > Some packages create a dataClasses attribute for a model.frame
> > (not its terms attribute) that does not have any names:
> > file: caper/R/macrocaic.R
> >   attr(mf, "dataClasses") <- rep("numeric", dim(termFactors)[2])
> > .checkMFClasses() does not throw an error for that, but it
> > doesn't do any real checking either.
> >
> > Most users of dataClasses do pass it to .checkMFClasses() to
> > compare it with newdata and that doesn't care if you have extra
> > entries in dataClasses.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
>
> I can't understand what your point is. I agree we can work around the
> problem, but why should we have to?

I guess my point was that it would make sense for delete.response to drop the response element from dataClasses, since that element serves no purpose once the response is gone. It was almost certainly an oversight that it wasn't dropped, since most terms objects don't have a dataClasses attribute at all.
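
For illustration only, here is a minimal sketch of what that cleanup could look like as a wrapper around delete.response. The name delete_response_clean is hypothetical and not part of R; the sketch assumes a terms object taken from a fitted lm, where model.frame has already attached a named dataClasses attribute.

```r
# Hypothetical wrapper (not part of R): call delete.response(), then also
# drop the response entry from the "dataClasses" attribute, if there is one.
delete_response_clean <- function(termobj) {
  resp <- attr(termobj, "response")
  # Variable names in order; drop the leading "list" of the call.
  vars <- as.character(attr(termobj, "variables"))[-1L]
  out <- delete.response(termobj)
  dc <- attr(out, "dataClasses")
  if (!is.null(dc) && !is.null(names(dc)) && !is.null(resp) && resp > 0) {
    # Remove by name, so this is a no-op if the entry is already absent.
    attr(out, "dataClasses") <- dc[setdiff(names(dc), vars[resp])]
  }
  out
}

fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt_clean <- delete_response_clean(terms(fit))
names(attr(tt_clean, "dataClasses"))
```

Removing the entry by name rather than by position means the wrapper gives the same result whether or not delete.response itself is ever changed to drop the entry.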

Properly written code, which subscripts dataClasses only by name (not by number), would not be affected by the change. Improperly written code (e.g., AICcmodavg's predictSE, which assumes the response is in position 1) would be adversely affected, but only in the unlikely case that someone passed it the output of delete.response.
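
As a rough sketch of the name-based style, the following looks up factor predictors without assuming where (or whether) the response sits in dataClasses. The helper factor_predictors is a made-up name for this example, and the lm fit on mtcars is just a convenient test case.

```r
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt <- terms(fit)

# Name of the response, read from the full terms object.
resp_name <- as.character(attr(tt, "variables"))[attr(tt, "response") + 1L]

# Hypothetical helper: names of the factor predictors, selected by name.
# It excludes the response by name instead of dropping position 1.
factor_predictors <- function(dc, resp) {
  dc <- dc[setdiff(names(dc), resp)]
  names(dc)[dc == "factor"]
}

# Same answer for the full terms object and for the delete.response() output,
# regardless of whether the response entry is still present in dataClasses.
by_name_full <- factor_predictors(attr(tt, "dataClasses"), resp_name)
by_name_pred <- factor_predictors(attr(delete.response(tt), "dataClasses"),
                                  resp_name)
```

By contrast, code that uses [-1] to skip the response would silently drop the first predictor if the response entry were no longer there.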

I don't know how much you want to cater to "errors" by package writers.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> If you confine yourself to the output of "delete.response" applied to
> a terms object from a regression, can you point to any package or
> usage that depends on leaving the response variable in the dataClasses
> attribute? I can't find one. In R base, these are all the references
> to delete.response:
>
> stats/R/models.R:delete.response <- function (termobj)
> stats/R/lm.R: Terms <- delete.response(tt)
> stats/R/lm.R: Terms <- delete.response(tt)
> stats/R/ppr.R: Terms <- delete.response(object$terms)
> stats/R/loess.R:
> as.matrix(model.frame(delete.response(terms(object)), newdata,
> stats/R/dummy.coef.R: Terms <- delete.response(Terms)
>
> I've looked it over carefully and predict.lm (in lm.R) would not be
> affected by the change I propose. I can't find any usage in loess.R of
> the dataClasses attribute.
>
> Furthermore, I can't see how a person would use the dataClasses
> attribute at all, after the other markers of the response are
> eliminated. How is a method to find which variable is the response,
> after response=0?
>
> I'm not disagreeing with you that I can workaround the peculiarity
> that the response is left in the dataClasses attribute of the output
> object from delete.response. I'm just saying it is a complication
> that programmers should not have to put up with, because I think
> delete.response should delete the response from all attributes of a
> terms object.
>
> pj
>
>
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 06 Jan 2012 - 20:27:05 GMT
