# Re: [R] Help - linear regression

From: Mark Difford <mark_difford_at_yahoo.co.uk>
Date: Fri, 25 Jan 2008 09:43:28 -0800 (PST)

Hi All,

Thanjavur wrote:
>> model2<-lm(lavi~age+sex+age*race+diabetes+hypertension, data=tb1)

David wrote:

>> in the second equation you are only including the interaction of
>> age*race, the main effect of age, but not the main effect of race
>> which is what came out significant

I am sorry, but this is wrong. Read up about model formulae in http://cran.r-project.org/doc/manuals/R-intro.html#Statistical-models-in-R

The expression age * race expands to age + race + age:race. That is, main effects of age and race, plus the interaction between age and race [age:race]. The expansion is done automatically.
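
To see the expansion for yourself, you can ask R for the term labels of each formula. This is only a small sketch: it uses the two formulae from the original post, needs no data, and the object names (f1, f2, labels1, labels2, extra) are mine.

```r
# Formulae copied from the original post; no data are needed to expand them.
f1 <- lavi ~ age + sex + race + diabetes + hypertension
f2 <- lavi ~ age + sex + age * race + diabetes + hypertension

# terms() expands the formula; "term.labels" lists the terms that get fitted.
labels1 <- attr(terms(f1), "term.labels")
labels2 <- attr(terms(f2), "term.labels")
print(labels2)

# Terms that are in the second formula but not in the first.
extra <- setdiff(labels2, labels1)
print(extra)
```

The main effect of race appears among the terms of the second formula even though it was never typed explicitly.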

Thanjavur: Model selection is a huge subject. However, once you have taken in the above fact, you will see that the __only__ difference between your two models is that you have added an interaction term for age:race. You have two simple, but still very effective, approaches.

## 1: Test the two models by doing:
anova(model1, model2)

## 2: Use stepAIC (you need MASS installed) on model 2, and see what happens to the
## interaction term
require(MASS)
stepAIC(model2, test="Chi")$anova

See:
?anova
?stepAIC
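
If you want to try both approaches before running them on your own data, here is a rough, self-contained sketch. Please note that the data frame below is simulated: I do not have tb1, so every number in it is invented and the output says nothing about the real LAVI data. The names n, cmp and sel are mine, and I spell the test argument out as "Chisq".

```r
library(MASS)  # provides stepAIC()

set.seed(1)
n <- 200

# Simulated stand-in for tb1; all values are made up for illustration only.
tb1 <- data.frame(
  age          = rnorm(n, mean = 56, sd = 10),
  sex          = factor(sample(c("female", "male"), n, replace = TRUE)),
  race         = factor(sample(c("european", "non-european"), n, replace = TRUE)),
  diabetes     = factor(sample(c("no", "yes"), n, replace = TRUE)),
  hypertension = factor(sample(c("no", "yes"), n, replace = TRUE))
)
tb1$lavi <- 20 + 0.1 * tb1$age + 2 * (tb1$race == "european") + rnorm(n, sd = 4)

model1 <- lm(lavi ~ age + sex + race + diabetes + hypertension, data = tb1)
model2 <- lm(lavi ~ age + sex + age * race + diabetes + hypertension, data = tb1)

# Approach 1: F test for the one extra term (age:race) in model2.
cmp <- anova(model1, model2)
print(cmp)

# Approach 2: AIC-based selection starting from model2.
# trace = 0 suppresses the step-by-step log; test is passed on to dropterm().
sel <- stepAIC(model2, test = "Chisq", trace = 0)
print(sel$anova)
```

In the anova table the second row carries the test for the interaction; in the stepAIC output, look at whether age:race is among the terms that were dropped.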

HTH,
Mark.

David Young-18 wrote:
>
> Thanjavur,
>
> I'm new to R, so it is possible I'm interpreting your syntax
> incorrectly, but it looks like in the second equation you are only
> including the interaction of age*race, the main effect of age, but
> not the main effect of race which is what came out significant in your
> first model.
>
> In effect you have measured two different things and one of them is
> significant. In the first regression you have measured a general
> shift in the regression giving each racial group a different
> intercept. In the second, you are measuring whether there should be
> two different slopes for the line relating to age. One for european
> ages and one for non-european ages, which did not turn out to be
> significant.
>
> Based on the information you have presented you should not include the
> interaction, but should include the main effect for race. HOWEVER, as
> a general rule, you should include the main effects along with your
> test for interactions between them. age,race,age*race
> When you do this it is possible that the interaction will then also be
> significant.
>
> Hope that helps.
>
> Dave
>
> Tuesday, January 22, 2008, 11:20:01 AM, you wrote:
>
>
> TB> Hi,
>
> TB> I am trying a linear regression model where the dependent variable is
> the size of the heart corrected for the patient's height and weight. This
> is labelled as LAVI. The independent variables are
> TB> race (european or non-european), age, sex (male or female) of the
> patient and whether they have diabetes and high blood pressure. sample
> size 2000 patients selected from a community.
>
> TB> when I model
> TB> model1<-lm(lavi~age+sex+race+diabetes+hypertension, data=tb1)
> TB> and
> TB> model2<-lm(lavi~age+sex+age*race+diabetes+hypertension, data=tb1)
>
> TB> in the first model race comes out as a significant predictor (p<0.005)
> where as in the second model race is not a significant predictor of lavi
> (p=.076)
>
> TB> in my dataset mean age is 55.2 years in the non-europeans and 56.7
> years in the europeans (p <0.0001 by t.test).
>
> TB> should I or should I not include the interaction (age*race) in the
> model. Is it an acceptable rule to put in interactions if there is a
> significant relation between the independent variables in
> TB> univariate analyses.
>
> TB> Many thanks
>
> --
> Best regards,
>
> David Young
> mailto:dyoung_at_telefonica.net
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

--
View this message in context: http://www.nabble.com/Help---linear-regression-tp15016515p15093097.html
Sent from the R help mailing list archive at Nabble.com.
