Re: [Rd] y ~ X -1 , X a matrix

From: Peter Dalgaard <pdalgd_at_gmail.com>
Date: Thu, 18 Mar 2010 08:17:18 +0100

Ross Boylan wrote:

> On Thu, 2010-03-18 at 00:57 +0000, Ted.Harding_at_manchester.ac.uk wrote:
>> On 17-Mar-10 23:32:41, Ross Boylan wrote:

>>> While browsing some code I discovered a call to lm that used
>>> a formula y ~ X - 1, where X was a matrix.
>>>
>>> Looking through the documentation of formula, lm, model.matrix
>>> and maybe some others I couldn't find this usage (R 2.10.1).
>>> Is it anything I can count on in future versions? Is there
>>> documentation I've overlooked?
>>>
>>> For the curious: model.frame on the above equation returns a
>>> data.frame with 2 columns. The second "column" is the whole X
>>> matrix. model.matrix on that object returns the expected matrix,
>>> with the transition from the odd model.frame to the regular
>>> matrix happening in an .Internal call.
>>>
>>> Thanks.
>>> Ross
>>>
>>> P.S. I would appreciate cc's, since mail problems are preventing
>>> me from seeing list mail.
>> Hmmm ... I'm not sure what is the problem with what you describe.
> There is no problem in the "it doesn't work" sense.
> There is a problem that it seems undocumented--though the help you quote
> could rather indirectly be taken as a clue--and thus, possibly, subject
> to change in later releases.

I'm pretty sure that it is per original design that data frames can have matrix columns, although data.frame() and as.data.frame() are quite trigger-happy when it comes to converting them to individual columns. You need things like d <- data.frame(X=I(X)) to prevent it.
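
As a minimal sketch of that difference (the data and the names X, d1 and d2 are made up for illustration, not taken from the code under discussion):

```r
# Illustrative data: a 5 x 2 matrix with named columns
X <- matrix(1:10, nrow = 5, dimnames = list(NULL, c("a", "b")))

d1 <- data.frame(X = X)     # the matrix is split into separate columns
d2 <- data.frame(X = I(X))  # I() keeps the matrix as a single column

ncol(d1)    # 2
names(d1)   # "X.a" "X.b"
ncol(d2)    # 1
dim(d2$X)   # 5 2
```

Here d2 has one component, X, which is itself a 5 x 2 matrix.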

As you have seen, matrices can be handy on the RHS of formulas, but there are at least two cases where they are crucial on the LHS: multivariate linear models and one version of glm(Y~..., binomial).
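
The two LHS cases could look roughly like this. It is only a sketch on simulated data; all variable names (x, Y, succ, trials, fit_mlm, fit_bin) are invented for the example:

```r
set.seed(1)
n <- 20
x <- rnorm(n)

# Multivariate linear model: the response is a two-column matrix
Y <- cbind(y1 = 1 + 2 * x + rnorm(n), y2 = -1 + x + rnorm(n))
fit_mlm <- lm(Y ~ x)
class(fit_mlm)       # "mlm" "lm"
dim(coef(fit_mlm))   # 2 2: one column of coefficients per response

# Binomial glm: the response is a two-column matrix (successes, failures)
trials <- 10
succ <- rbinom(n, size = trials, prob = plogis(0.5 * x))
fit_bin <- glm(cbind(succ, trials - succ) ~ x, family = binomial)
length(coef(fit_bin))  # 2
```

In both fits the matrix response travels through the model frame as a single component.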

Without being able to store matrices as individual components in a data frame, I don't think you can avoid internally expanding a model formula into (say) Y ~ X1 + X2 - 1, which could get rather unwieldy, so I don't think the feature will be going away. (Someone with too much time on his/her hands might want to rationalize the whole data frame concept, but that should go in the direction of handling all matrix-like structures consistently, including date-time objects etc.)
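
To illustrate the equivalence, here is a small sketch with simulated data (the names X1, X2, fit_mat and fit_exp are assumptions made for the example) comparing the matrix form with the hand-expanded formula:

```r
set.seed(2)
X <- cbind(X1 = rnorm(10), X2 = rnorm(10))
y <- drop(X %*% c(1, -1)) + rnorm(10)

# Matrix on the RHS: the model frame holds X as one matrix component
fit_mat <- lm(y ~ X - 1)
mf <- model.frame(y ~ X - 1)
ncol(mf)     # 2: the response and the whole X matrix
dim(mf$X)    # 10 2

# Hand-expanded version with one term per matrix column
X1 <- X[, "X1"]
X2 <- X[, "X2"]
fit_exp <- lm(y ~ X1 + X2 - 1)

names(coef(fit_mat))  # "XX1" "XX2"
all.equal(unname(coef(fit_mat)), unname(coef(fit_exp)))  # TRUE
```

The two fits give the same estimates; only the coefficient labels differ, and the expanded formula grows with every column of X.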

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes_at_cbs.dk  Priv: PDalgd_at_gmail.com

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 18 Mar 2010 - 07:19:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 18 Mar 2010 - 17:51:08 GMT.

