Re: [Rd] delete.response leaves response in attribute dataClasses

From: Paul Johnson <pauljohn32_at_gmail.com>
Date: Fri, 06 Jan 2012 13:17:01 -0600

Thanks, Bill

Counter-arguments at the end

On Thu, Jan 5, 2012 at 3:15 PM, William Dunlap <wdunlap_at_tibco.com> wrote:
> My feeling that everyone would index dataClasses by name was
> wrong.  I looked through the packages that used dataClasses
> and saw code that would break if the first (response) entry
> were omitted.  (I didn't check to see if passing the output
> of delete.response to these functions would be appropriate.)
> E.g.,
> file: AICcmodavg/R/predictSE.mer.r
>  ##matrix with info on factors
>  fact.frame <- attr(attr(orig.frame, "terms"), "dataClasses")[-1]
>
>  ##continue if factors
>  if(any(fact.frame == "factor")) {
>    id.factors <- which(fact.frame == "factor")
>    fact.name <- names(fact.frame)[id.factors] #identify the rows for factors
>
> Some packages create a dataClass attribute for a model.frame
> (not its terms attribute) that does not have any names:
> file: caper/R/macrocaic.R
>   attr(mf, "dataClasses") <- rep("numeric", dim(termFactors)[2])
> .checkMFClasses() does not throw an error for that, but it
> doesn't do any real checking either.
>
> Most users of dataClasses do pass it to .checkMFClasses() to
> compare it with newdata and that doesn't care if you have extra
> entries in dataClasses.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>

I can't understand what your point is. I agree we can work around the problem, but why should we have to?

If you confine yourself to the output of "delete.response" applied to a terms object from a regression, can you point to any package or usage that depends on leaving the response variable in the dataClasses attribute? I can't find one. In R base, these are all the references to delete.response:

stats/R/models.R:delete.response <- function (termobj)
stats/R/lm.R:        Terms <- delete.response(tt)
stats/R/lm.R:        Terms <- delete.response(tt)
stats/R/ppr.R:        Terms <- delete.response(object$terms)
stats/R/loess.R:        as.matrix(model.frame(delete.response(terms(object)), newdata,
stats/R/dummy.coef.R:    Terms <- delete.response(Terms)

I've looked it over carefully and predict.lm (in lm.R) would not be affected by the change I propose. I can't find any usage in loess.R of the dataClasses attribute.

Furthermore, I can't see how a person would use the dataClasses attribute at all once the other markers of the response have been eliminated. How is a method supposed to find which variable is the response after the "response" attribute has been set to 0?

I'm not disagreeing with you that I can work around the peculiarity that the response is left in the dataClasses attribute of the object returned by delete.response. I'm just saying it is a complication that programmers should not have to put up with, because I think delete.response should delete the response from all attributes of a terms object.
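
For concreteness, here is a minimal sketch of the kind of workaround I mean. The wrapper name delete_response_fully is hypothetical (it is not part of R), and the mtcars model is only an illustration. It assumes dataClasses is a named vector, and it matches the response by name rather than by position, so it does nothing harmful if the response entry is already absent.

```r
# Illustrative model on a built-in data set.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt  <- terms(fit)

# Hypothetical helper (not in base R): call delete.response() and then
# also drop the response entry from the "dataClasses" attribute.
delete_response_fully <- function(termobj) {
  resp <- attr(termobj, "response")
  dc   <- attr(termobj, "dataClasses")
  out  <- delete.response(termobj)
  if (!is.null(dc) && !is.null(names(dc)) && !is.null(resp) && resp > 0) {
    # Element 1 of "variables" is the symbol `list`, so the response
    # sits at position resp + 1.
    resp_expr <- attr(termobj, "variables")[[resp + 1L]]
    resp_name <- paste(deparse(resp_expr, width.cutoff = 500L), collapse = " ")
    # Drop by name, so an already-missing entry is simply left alone.
    attr(out, "dataClasses") <- dc[names(dc) != resp_name]
  }
  out
}

tt_noresp <- delete_response_fully(tt)
names(attr(tt_noresp, "dataClasses"))
```

Code that indexes dataClasses by name is unaffected by this; code that relies on position (for example, dropping the first element with [-1]) would need to be checked.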

pj

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 06 Jan 2012 - 19:29:39 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 06 Jan 2012 - 20:30:07 GMT.