Re: [R] keeping interaction terms

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Sat 08 Oct 2005 - 23:14:55 EST


Adding a bit to Frank Harrell's good comments.

  1. Regarding HTML infection: I rolled up my sleeves, washed my hands carefully, took a fine sharp knife, cut it all out, and then sewed up the incisions.
  2. For the rest, see below.

On 08-Oct-05 Christian Jones wrote:
> Hello,
>
> while doing my thesis in habitat modelling I've come across a
> problem with interaction terms. My question concerns the usage
> of interaction terms for linear regression modelling with R.
> If an interaction-term (predictor) is chosen for a multiple model,
> then, according to Crawley its single term has to be added to the
> multiple model: lrm(N~a*b+a+b).
>
> This nearly always leads to high correlation rates between the
> interaction term a*b and its single term a or b. With regards to
> the law of collinearity modelling should not include correlated
> variables with a Spearman index > 0.7. Does this mean that the
> interaction term has to be discarded or can the variables stay
> within the model when correlated?
> I do not necessarily want to do a PCA on this issue.

There's more than a suggestion in your statements that you tend to be drawn along by people's prescriptions. Instead, try to think simply about it.

If, after fitting "a+b", further including "a:b" makes a "significant difference" to the fit, then the interaction between a and b matters, even if you observe high correlations between the interaction term and its single terms. The latter should not lead you to ignore the former.
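A rough sketch of that comparison follows. It assumes an ordinary lm() fit rather than lrm(), and uses simulated data: the data frame d, the coefficients and the seed are all invented for illustration. Note that in R's formula syntax "a*b" already expands to "a + b + a:b", so writing "a*b+a+b" adds nothing further.

```r
# Simulated data, purely illustrative: a and b are deliberately correlated,
# and the response N includes a genuine a:b effect.
set.seed(1)
n <- 200
a <- rnorm(n)
b <- 0.6 * a + rnorm(n)
N <- 1 + 0.5 * a + 0.5 * b + 0.4 * a * b + rnorm(n)
d <- data.frame(N, a, b)

fit_add <- lm(N ~ a + b, data = d)   # single terms only
fit_int <- lm(N ~ a * b, data = d)   # same as N ~ a + b + a:b

# F test for what the extra a:b term contributes beyond a + b
cmp <- anova(fit_add, fit_int)
print(cmp)

# Rank correlation between the product term and one of its single terms.
# A high value here is not, by itself, a reason to drop a:b.
rho <- cor(d$a * d$b, d$a, method = "spearman")
print(rho)
```

For a logistic fit the analogous comparison would be a likelihood-ratio test between the two nested models rather than an F test.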

How much it matters is of course another question. You could examine this, in R, by comparing the predicted values from the "a+b" model with the predicted values from the "a*b" model. They will differ, but you will have to judge whether the difference is large enough to be of real importance in your application. (It is possible to get highly "significant" results, i.e. small P-values, from small effects.)
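One way to look at this, sketched under the same assumptions as before (lm() instead of lrm(), simulated data, all names and numbers hypothetical), is to take the difference between the two sets of fitted values and summarise it on the scale of the response:

```r
# Simulated data, purely illustrative.
set.seed(2)
n <- 200
a <- rnorm(n)
b <- 0.6 * a + rnorm(n)
N <- 1 + 0.5 * a + 0.5 * b + 0.4 * a * b + rnorm(n)
d <- data.frame(N, a, b)

fit_add <- lm(N ~ a + b, data = d)
fit_int <- lm(N ~ a * b, data = d)

# Fitted values from each model, for the observations actually in the data
p_add <- predict(fit_add)
p_int <- predict(fit_int)
delta <- p_int - p_add

# How big are the differences, in the units of N?
print(summary(delta))
print(max(abs(delta)))

# The same, relative to the spread of the response itself
rel <- sd(delta) / sd(d$N)
print(rel)
```

Whether differences of that size matter is a judgement about the application, not something the P-value can settle.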

Even if it does matter, in real terms, you are left with the fundamental difficulty, indicated by Frank, that interpreting interaction between variables a and b is simple only when the variables a and b are orthogonal in the data (either by accident or by design). If they are non-orthogonal, then you have to think carefully about how to interpret it, and this depends on what the variables mean in your application.
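To make "orthogonal in the data" concrete, here is a small sketch contrasting a balanced designed layout with observational-style data. Both data sets are invented for illustration, and the -1/+1 coding is just one convenient choice.

```r
# Orthogonal by design: a balanced 2 x 2 layout, coded -1/+1, 5 replicates each
design <- expand.grid(a = c(-1, 1), b = c(-1, 1))
design <- design[rep(1:4, each = 5), ]
X <- cbind(a = design$a, b = design$b, ab = design$a * design$b)

# Cross-products of the columns: the off-diagonal entries are all zero,
# so a, b and their product are mutually orthogonal
XtX <- crossprod(X)
print(XtX)

# Non-orthogonal: simulated "observational" data in which a and b are correlated
set.seed(3)
a <- rnorm(100, mean = 5)
b <- 0.6 * a + rnorm(100)
C <- cor(cbind(a, b, ab = a * b))
print(round(C, 2))
```

In the first case the contributions of a, b and a:b separate cleanly; in the second they overlap, which is where the careful thought about interpretation comes in.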

Maybe we could help more with this if we knew more about your investigation (perhaps off-list, if you prefer).

Best wishes,
Ted.



E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861
Date: 08-Oct-05                                       Time: 14:14:48
------------------------------ XFMail ------------------------------

Received on Sat Oct 08 23:25:13 2005
