From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>

Date: Mon, 16 Jul 2007 15:21:44 +0100

On Mon, 2007-07-16 at 14:57 +0100, ted.harding_at_nessie.mcc.ac.uk wrote:

> On 16-Jul-07 13:28:50, Gabor Grothendieck wrote:

> > The formula attribute of the builtin CO2 dataset seems a bit strange:
> >
> >> formula(CO2)
> > Plant ~ Type + Treatment + conc + uptake
> >
> > What is one supposed to do with that? Certainly its not suitable
> > for input to lm and none of the examples in ?CO2 use the above.
>
> I think one is supposed to ignore it! (Or maybe be inspired to
> write a mail to the list ... ).
>
> I couldn't find anything that looked like the above formula from
> str(CO2). But I did spot that the order of terms in the formula:
> Plant, Type, treatment, conc, uptake, is the same as the order
> of the "columns" in the dataframe.

CO2 is a groupedData object, not a plain data.frame per se:

> class(CO2)

[1] "nfnGroupedData" "nfGroupedData" "groupedData" "data.frame"

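What Gabor saw was the result of formula.data.frame being applied to CO2. As you surmise, Ted, that formula is built from the columns of CO2: the first column becomes the response and the remaining columns become the terms on the right-hand side.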

> formula(CO2)

Plant ~ Type + Treatment + conc + uptake

> stats:::formula.data.frame(CO2)

Plant ~ Type + Treatment + conc + uptake
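To make the column-order behaviour concrete, here is a minimal sketch of the same idea. The helper formula_from_columns and the toy data frame are hypothetical names of mine, and this is not the actual source of stats:::formula.data.frame, only an imitation of what its output suggests.

```r
# Hypothetical helper (not the stats source): the first column becomes the
# response and the remaining columns become additive right-hand-side terms.
formula_from_columns <- function(df) {
  nm <- names(df)
  stopifnot(length(nm) >= 2)
  reformulate(nm[-1], response = nm[1])
}

toy <- data.frame(y = 1:3, a = 4:6, b = 7:9)  # made-up data
f <- formula_from_columns(toy)
deparse(f)  # "y ~ a + b"
```

Applied to CO2, the same rule would give Plant as the response simply because Plant happens to be the first column.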

But if we load nlme, formula() now dispatches to a groupedData method instead, and a more useful formula is displayed.

> require(nlme)

Loading required package: nlme

[1] TRUE

> formula(CO2)

uptake ~ conc | Plant
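The switch in behaviour is ordinary S3 dispatch: formula() walks the class vector from left to right and uses the first method it finds. A rough sketch with a toy class follows. The class name toyGrouped, the data, and the method body are my own assumptions for illustration; I am only assuming, from Gabor's remark about the "formula attribute", that the real groupedData method does something along these lines.

```r
# Made-up data frame carrying a formula as an attribute, with an extra
# class placed in front of "data.frame" (as with the groupedData classes).
toy <- data.frame(y = c(1.2, 2.3, 3.1, 4.8),
                  x = 1:4,
                  g = factor(c("a", "a", "b", "b")))
attr(toy, "formula") <- y ~ x | g
class(toy) <- c("toyGrouped", "data.frame")

# No formula method for "toyGrouped" exists yet, so dispatch falls through
# to a method for a later class in the vector.
before <- formula(toy)

# Hypothetical method: return the stored attribute.
formula.toyGrouped <- function(x, ...) attr(x, "formula")

# Now the more specific method is found first.
after <- formula(toy)
deparse(after)  # "y ~ x | g"
```

Loading nlme plays the role of defining formula.toyGrouped here: until a method for one of the groupedData classes is available, the data.frame method is the first match.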

This formula can then be used directly in lme:

> lme(CO2)

Linear mixed-effects model fit by REML
  Data: CO2
  Log-restricted-likelihood: -283.1447
  Fixed: uptake ~ conc
(Intercept)        conc
19.50028981  0.01773059

<snip />
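For anyone wondering how a single formula ends up supplying both the fixed part and the grouping, nlme has extractor functions that pull the three pieces apart. A small sketch, assuming nlme is installed; the variable names are mine, and the explicit lme call in the final comment is my understanding of what lme(CO2) amounts to rather than something I have checked against the source.

```r
library(nlme)

# The display formula shown by formula(CO2) once nlme is loaded.
f <- uptake ~ conc | Plant

resp <- getResponseFormula(f)   # one-sided formula for the response
covs <- getCovariateFormula(f)  # one-sided formula for the covariate
grp  <- getGroupsFormula(f)     # one-sided formula for the grouping factor

deparse(resp)  # "~uptake"
deparse(covs)  # "~conc"
deparse(grp)   # "~Plant"

# Assumption: written out in full, the fit above would be something like
#   lme(fixed = uptake ~ conc, random = ~ conc | Plant, data = CO2)
```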

But, like Gabor, I'm struggling to see where this [formula(data.frame)] might be used, or be useful.

G

