# [R] Ancova_non-normality of errors

From: Tobias Erik Reiners <Tobias.Reiners_at_bio.uni-giessen.de>
Date: Sun, 04 May 2008 11:56:09 +0200

I have some problems fitting a model to my data.
According to my literature (Crawley's textbook), the diagnosis is non-normality of errors: I get a banana-shaped Q-Q plot with the opening of the banana pointing downwards.

Structure of data:

```
     origin   wt   pes gender
1      wild 5.35 147.0   male
2      wild 5.90 148.0   male
3      wild 6.00 156.0   male
4      wild 7.50 157.0   male
5      wild 5.90 148.0   male
6      wild 5.95 148.0   male
7      wild 8.55 160.5   male
8      wild 5.90 148.0   male
9      wild 8.45 161.0   male
10     wild 4.90 147.0   male
11     wild 6.80 153.0   male
12     wild 5.75 146.0   male
13     wild 8.60 160.0   male
14  captive 6.85 159.0   male
15  captive 7.00 160.0   male
16  captive 6.80 155.0   male
..
...
283    site 4.10 130.4 female
284    site 3.55 131.1 female
285    site 4.20 135.7 female
286    site 3.45 128.0 female
287    site 3.65 125.3 female
```

The goal of my analysis is to work out what effect the categorical factors (origin, gender) have on the relationship log(wt) ~ log(pes) (i.e. condition, fat reserves). Does the source (origin) of translocated animals have an effect on their performance (condition) in the new area?
I already have a best-fit model and it looks quite good (or does it not? See below).

Two slopes (gender difference) and 6 intercepts (3 origin levels * 2 gender levels):

lm(formula = log(wt) ~ log(pes) + origin + gender + gender:log(pes))

```
Residuals:
     Min       1Q   Median       3Q      Max
-0.54181 -0.07671  0.01520  0.09474  0.28818

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)          -7.39879    1.97605  -3.744 0.000219 ***
log(pes)              1.78020    0.40118   4.437 1.31e-05 ***
originsite            0.06572    0.01935   3.397 0.000781 ***
originwild            0.07655    0.03552   2.155 0.032011 *
gendermale           -9.32418    2.37476  -3.926 0.000109 ***
log(pes):gendermale   1.90393    0.47933   3.972 9.06e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1433 on 281 degrees of freedom
Multiple R-Squared: 0.7227,     Adjusted R-squared: 0.7177
F-statistic: 146.4 on 5 and 281 DF,  p-value: < 2.2e-16
```
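For anyone who wants to reproduce the structure of this fit, here is a minimal sketch. The full data set is not included in this post, so the data frame `dat` and every value in it are simulated assumptions; only the model formula is taken from the call above.

```r
# Simulated stand-in for the real data (all values below are made up).
set.seed(1)
n <- 120
dat <- data.frame(
  origin = factor(sample(c("captive", "site", "wild"), n, replace = TRUE)),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  pes    = runif(n, 125, 165)
)
# Hypothetical relationship: a steeper log-log slope for males.
dat$wt <- exp(-8 + 1.9 * log(dat$pes) +
              ifelse(dat$gender == "male", -4 + 0.85 * log(dat$pes), 0) +
              rnorm(n, sd = 0.1))

# Same formula as in the call above.
fit <- lm(log(wt) ~ log(pes) + origin + gender + gender:log(pes), data = dat)
coef(fit)  # intercept, slope, two origin shifts, gender shift, gender slope
```

The formula estimates six coefficients: origin shifts the intercept additively, while gender changes both the intercept and the slope.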

When I plot this model I get a banana shape in the Normal Q-Q plot (with the open side pointing downwards), indicating non-normality of the errors. How should I handle this?
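To illustrate the kind of plot I mean, here is a small sketch with an artificial, left-skewed sample. The vector `r` is made up and is not my residuals; a downward-opening curve like this typically goes along with a long lower tail, which would also fit the residual summary above (Min -0.54 versus Max 0.29).

```r
set.seed(2)
# Made-up residual-like sample with a long lower tail (left skew).
r <- -(rexp(200) - 1) * 0.1

qqnorm(r, main = "Normal Q-Q Plot")  # sample quantiles against normal quantiles
qqline(r)                            # reference line through the quartiles

# Sample skewness; negative values correspond to a long lower tail.
skew <- mean((r - mean(r))^3) / sd(r)^3
skew
```

With a fitted model object, the same plot would come from `qqnorm(resid(fit)); qqline(resid(fit))` or `plot(fit, which = 2)`, where `fit` stands for the `lm` object.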

Do I have unbalanced data?

```
     captive    site    wild
n        119     149      19
```
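A short sketch of how these counts can be tabulated. The factor is rebuilt from the three group sizes given above; the gender split within each origin is not shown in this post, so the two-way table appears only as a commented line with a hypothetical data frame name.

```r
# Group sizes as reported above; the gender split per origin is not given here.
origin <- factor(rep(c("captive", "site", "wild"), times = c(119, 149, 19)))

counts <- table(origin)
counts
round(prop.table(counts), 2)  # share of each origin level

# With the full data frame, a two-way table would show the origin-by-gender cells:
# xtabs(~ origin + gender, data = dat)   # 'dat' is a hypothetical name
```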

My problem is that I can see that my data are not as good as the model summary suggests.
Should I include another term in my model formula?

I think I have to differentiate further, but I don't know how (contrasts? TukeyHSD? Akaike Information Criterion? or lme()?). There are too many different approaches out there.
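As one example of what I mean by including another term, here is a sketch that compares the current formula with one that adds origin-specific slopes. The data are simulated, and both the data frame `dat` and the choice of `origin:log(pes)` as the extra term are assumptions for illustration only.

```r
# Simulated stand-in for the real data (all values below are made up).
set.seed(3)
n <- 150
dat <- data.frame(
  origin = factor(sample(c("captive", "site", "wild"), n, replace = TRUE)),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  pes    = runif(n, 125, 165)
)
dat$wt <- exp(-8 + 1.9 * log(dat$pes) + rnorm(n, sd = 0.1))

m1 <- lm(log(wt) ~ log(pes) + origin + gender + gender:log(pes), data = dat)
m2 <- update(m1, . ~ . + origin:log(pes))  # adds origin-specific slopes

aic <- AIC(m1, m2)  # lower AIC indicates the preferred model
aic
anova(m1, m2)       # F test for the two extra slope terms
```

Both comparisons address whether the extra term improves the fit; neither one checks the normality of the residuals.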

Cheers,
Tobi

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 04 May 2008 - 10:09:35 GMT

