Re: R-beta: Powers ('x^2') in lm/glm

Martin Maechler (maechler@stat.math.ethz.ch)
Fri, 15 Aug 1997 15:07:07 +0200


Date: Fri, 15 Aug 1997 15:07:07 +0200
Message-Id: <199708151307.PAA02583@sophie.ethz.ch>
From: Martin Maechler <maechler@stat.math.ethz.ch>
To: mrn@superh.hg.med.umich.edu
Subject: Re: R-beta: Powers ('x^2') in lm/glm


>>>>> "Matt" == Matthew R Nelson <mrn@superh.hg.med.umich.edu> writes:

    Matt> R users, I was a bit surprised to find that when I attempted to
    Matt> add a polynomial term to a linear model using either lm or glm as
    Matt> could be done in S resulted in a fit without that term included
    Matt> and without warning(!!), e.g.

    >> lm(response ~ x + x^2, data).

    Matt> As far as I can gather, there is no poly() yet in R, and if
    Matt> lm/glm do not allow functions of variables as their formula
    Matt> arguments, is our only option to add variables to our dataframe
    Matt> before using lm/glm?

    >> data2 <- cbind(data,x2=data$x^2)

    Matt> I did search the archive for a discussion of this topic, but did
    Matt> not come up with any discussion on this topic.  Have I missed
    Matt> something?

This is an older topic that has been discussed a bit, but not resolved.

You would have found the answer in the archives, i.e., in
	ftp://ftp.stat.math.ethz.ch/Mail-archives/r-help-97-04-28--97-05-09

by searching for 'x^2'.

------------------
From that discussion, my feeling was that it was accepted to be a bug
rather than a feature.....

Unfortunately, it did not make its way into
either  $RHOME/TASKS  or the R-FAQ.
		(I think it should go into BOTH -- yes, it IS important!!!)

---


The gist is :  You  MUST use  I(x^2)   etc. instead of  'x^2' etc
		--------------------
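
To make the difference concrete, here is a minimal sketch. It assumes a
reasonably current R, where a bare 'x^2' in a formula is still read as
formula algebra rather than arithmetic. The data frame `dat` and its
columns are simulated purely for illustration; they are not from the
original question.

```r
# Simulated data; `dat`, `x` and `response` are made-up names.
set.seed(1)
dat <- data.frame(x = seq(-2, 2, length.out = 50))
dat$response <- 1 + 2 * dat$x + 3 * dat$x^2 + rnorm(50, sd = 0.1)

# Bare x^2 is formula algebra, so the model has no quadratic term.
fit_plain <- lm(response ~ x + x^2, data = dat)

# I() protects the power so it is evaluated as ordinary arithmetic.
fit_quad <- lm(response ~ x + I(x^2), data = dat)

names(coef(fit_plain))  # intercept and x only
names(coef(fit_quad))   # intercept, x and I(x^2)
```

Comparing the two sets of coefficient names is a quick check that the
quadratic term really made it into the fit.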

As Bill Venables had pointed out, the problem really becomes a pain
for anova with several factors, e.g.
	(w + x + y + z)^3
gives something very different in R than in S
[and you would have to replace it by

	(w + x + y + z)^3 +
		 I(w^2) + I(x^2) + I(y^2) + I(z^2) +
		 I(w^3) + I(x^3) + I(y^3) + I(z^3)

 which is quite long, especially when the variables have more than 1-letter
 names ...
]
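
One way to see what such a formula expands to is to look at the term
labels directly. This is only a sketch of what a current R does with
the formula (it says nothing about S); the response name `resp` is a
placeholder, and no data are needed because terms() works on the
formula alone.

```r
# `resp` is a hypothetical response name; w, x, y, z are as in the text.
f <- resp ~ (w + x + y + z)^3

# Term labels after R has expanded the power of the sum.
labs <- attr(terms(f), "term.labels")
print(labs)

# Main effects plus two- and three-way interactions, no pure powers.
length(labs)
any(grepl("^", labs, fixed = TRUE))
```

The labels contain main effects and interactions only, so any pure
power such as I(w^2) has to be added to the formula by hand.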
	

As Ross had pointed out, it's really the terms(.) function which
either needs fixing or should be replaced by another function for
extracting terms from model formulas.

This is R :

> attr(terms(formula(y ~ x + x^2)),"variables")
			     ---
model.data.frame(y, x)
		   ===
> attr(terms(formula(y ~ x + I(x^2))),"variables")
			     ------
model.data.frame(y, x, I(x^2))
		   ==========
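
The same comparison can be made on the term labels, which may be easier
to read than the "variables" attribute. This is a small sketch assuming
a current R, where the printed form of the "variables" attribute
differs from the 1997 output shown above.

```r
# Term labels with and without I() around the power.
labs_plain <- attr(terms(y ~ x + x^2), "term.labels")
labs_prot  <- attr(terms(y ~ x + I(x^2)), "term.labels")

print(labs_plain)  # "x"
print(labs_prot)   # "x" "I(x^2)"
```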
