Re: [Rd] model.matrix.default chokes on backquote (PR#7202)

From: Gabor Grothendieck <ggrothendieck_at_myway.com>
Date: Sun 29 Aug 2004 - 01:15:00 EST

>
> From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
>
> "Gabor Grothendieck" <ggrothendieck@myway.com> writes:
>
> > > ggrothendieck@myway.com writes:
> > >
> > > > The following gives an error:
> > > >
> > > > > `a(b)` <- 1:4
> > > > > `c(d)` <- (1:4)^2
> > > > > lm(`a(b)` ~ `c(d)`)
> > > > Error in model.matrix.default(mt, mf, contrasts) :
> > > > model frame and formula mismatch in model.matrix()
> > > >
> > > > To fix it replace this line in model.matrix.default:
> > > >
> > > > reorder <- match(attr(t, "variables")[-1], names(data))
> > > >
> > > > with these two lines:
> > > >
> > > > strip.backquote <- function(x) gsub("^`(.*)`", "\\1", x)
> > > > reorder <- match(strip.backquote(attr(t, "variables"))[-1],
> > > > strip.backquote(names(data)))
> > >
> > > Hmm.. Yes, there's a bug (and it's likely not the only one we have
> > > relating to odd variable names in model formulas), but I suspect that
> > > the fix is wrong.
> > >
> > > The backquotes are not part of the variable names, but get added by
> > > deparsing -- sometimes! Other times they do not: Try for instance
> > > as.character(quote(`a(b)`)). (Which is as it should be. Other pieces
> > > of logic relating to nonsyntactical names represent some rather
> > > awkward compromises.)
> > >
> > > When backquotes have found their way into names(data) or the
> > > "variables" attribute, I would rather suspect that they were created
> > > by the wrong tool and fix that, not cure the symptom by stripping them
> > > off at a later stage.
> >
> > In model.frame.default there is a line:
> >
> > varnames <- as.character(vars[-1])
> >
> > that turns part of a call object, vars, into a character string.
> > We could change that to:
> >
> > varnames <- strip.backquote(as.character(as.list(vars[-1])))
> >
> > or perhaps as.character should not return the backquotes in the
> > first place in which case the fix would be to fix as.character.
>
> Or not use it in this way. I forget what the reasoning was behind the
> current behaviour of as.character, but the point is that
>
> > as.character(attr(terms(`a(b)`~`c(d)`),"variables"))
> [1] "list" "`a(b)`" "`c(d)`"
>
> whereas for instance
>
> > sapply(attr(terms(`a(b)`~`c(d)`),"variables")[-1],as.character)
> [1] "a(b)" "c(d)"

  1. That is quite subtle, but a fix based on it would appear to solve the problem.
  2. Your example, and possibly some explanatory text, should be added to ?as.character.
  3. While looking for the offending spot, I seem to remember (though I did not keep track of it) that one or more of lm, model.frame.default, terms.formula, etc. also apply as.character directly to a list, as in your first example. These should probably be changed to follow your second example as well, where as.character is applied to the elements of the list rather than to the list itself.
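
For concreteness, here is a minimal sketch of the element-wise conversion described in point 3. The helper name var_names is hypothetical (it is not part of R, and this is not necessarily how the fix should be implemented in model.frame.default). It assumes that each entry of the "variables" attribute is either a plain name, which as.character converts without backquotes, or a call, which is deparsed as a whole.

```r
# Hypothetical helper, for illustration only: convert the entries of the
# "variables" attribute one at a time instead of calling as.character()
# on the whole call object.
var_names <- function(tt) {
  vars <- as.list(attr(tt, "variables"))[-1]   # drop the leading `list`
  vapply(vars, function(v) {
    if (is.symbol(v)) {
      as.character(v)                    # plain name: no backquotes added
    } else {
      paste(deparse(v), collapse = " ")  # e.g. log(x): deparse the call
    }
  }, character(1))
}

tt <- terms(`a(b)` ~ `c(d)`)
nms <- var_names(tt)
print(nms)

# The resulting names can be matched directly against the column names
# of a data frame that uses the same nonsyntactic names.
dat <- data.frame(1:4, (1:4)^2)
names(dat) <- c("a(b)", "c(d)")
idx <- match(nms, names(dat))
print(idx)
```

With names obtained this way, the match() against names(data) in model.matrix.default would not need any backquote stripping, at least for variables that are plain names.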

R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sun Aug 29 01:17:52 2004
