[R] meaning of tests presented in anova(ols(...)) {Design package}

From: Dylan Beaudette <dylan.beaudette_at_gmail.com>
Date: Mon, 14 Jul 2008 21:34:33 -0700


Hi,

I am curious about how to interpret the table produced by anova(ols(...)), from the Design package. I have a multiple linear regression model, with some interaction, defined by:

ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
    log(conc) * pol(sand, 3), data = sm.clean, x = TRUE, y = TRUE)

         n Model L.R.       d.f.         R2      Sigma
      1834       1203         14       0.48        1.2

Residuals:
   Min     1Q Median     3Q    Max
-5.033 -0.859  0.016  0.739  4.868

Coefficients:
                       Value Std. Error     t        Pr(>|t|)
Intercept         11.3886790  2.0220171  5.63 0.0000000205580
sar               -4.3991263  1.0157588 -4.33 0.0000156609226
activity         -40.0591221  5.6907822 -7.04 0.0000000000027
activity^2        33.0570116  5.0578520  6.54 0.0000000000819
activity^3        -8.1645147  1.3750370 -5.94 0.0000000034548
conc               0.3841260  0.0813200  4.72 0.0000024942478
sand              -0.0096212  0.0327415 -0.29 0.7689032898947
sand^2             0.0008495  0.0008589  0.99 0.3227487169683
sand^3             0.0000025  0.0000066  0.39 0.6994987342042
sar * activity    12.8134698  2.9513942  4.34 0.0000149300007
sar * activity^2  -9.9981381  2.6310765 -3.80 0.0001494462966
sar * activity^3   2.1481278  0.7168339  3.00 0.0027662261037
conc * sand       -0.0157426  0.0076013 -2.07 0.0384966958735
conc * sand^2      0.0003419  0.0001989  1.72 0.0857381555491
conc * sand^3     -0.0000027  0.0000015 -1.77 0.0777025949762


Looking at what I 'think' are "marginal p-values", i.e. the results of testing H0: coef_i = 0 against coef_i != 0 for each coefficient, there are several terms with non-significant coefficients (p > 0.05). Does a non-significant coefficient warrant removal from the model, or perhaps a mention in the discussion?
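
For reference, the t and Pr(>|t|) columns above can be reproduced from Value and Std. Error alone. The following is a minimal sketch in base R, assuming the residual degrees of freedom are n - d.f. - 1 = 1834 - 14 - 1 = 1819; the variable names are for illustration only.

```r
# Values copied from the "sand" row of the coefficient table
est <- -0.0096212   # Value
se  <-  0.0327415   # Std. Error
df_resid <- 1834 - 14 - 1   # assumption: n minus model d.f. minus intercept

# Wald t statistic for H0: coefficient = 0
t_stat <- est / se

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df = df_resid)

print(round(c(t = t_stat, p = p_val), 3))
```

Each such test concerns one coefficient with all other terms left in the model, which is why it says little on its own about a variable that enters through several polynomial and interaction terms.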

Compared to the above example, what tests are performed when calling anova() on this object? Here is the output in R:

               Analysis of Variance          Response: log(ksat * 60 * 60)

 Factor                                        d.f. Partial SS      MS      F      P
 sar  (Factor+Higher Order Factors)               4     168.43   42.11   27.0 <.0001
  All Interactions                                3     142.13   47.38   30.4 <.0001
 activity  (Factor+Higher Order Factors)          6     536.84   89.47   57.3 <.0001
  All Interactions                                3     142.13   47.38   30.4 <.0001
  Nonlinear (Factor+Higher Order Factors)         4     257.25   64.31   41.2 <.0001
 conc  (Factor+Higher Order Factors)              4     443.02  110.75   71.0 <.0001
  All Interactions                                3      76.74   25.58   16.4 <.0001
 sand  (Factor+Higher Order Factors)              6    1906.29  317.71  203.6 <.0001
  All Interactions                                3      76.74   25.58   16.4 <.0001
  Nonlinear (Factor+Higher Order Factors)         4     263.00   65.75   42.1 <.0001
 sar * activity  (Factor+Higher Order Factors)    3     142.13   47.38   30.4 <.0001
  Nonlinear                                       2      95.32   47.66   30.5 <.0001
  Nonlinear Interaction : f(A,B) vs. AB           2      95.32   47.66   30.5 <.0001
 conc * sand  (Factor+Higher Order Factors)       3      76.74   25.58   16.4 <.0001
  Nonlinear                                       2       4.98    2.49    1.6  0.203
  Nonlinear Interaction : f(A,B) vs. AB           2       4.98    2.49    1.6  0.203
 TOTAL NONLINEAR                                  8     455.20   56.90   36.5 <.0001
 TOTAL INTERACTION                                6     218.87   36.48   23.4 <.0001
 TOTAL NONLINEAR + INTERACTION                   10     573.36   57.34   36.7 <.0001
 REGRESSION                                      14    2631.53  187.97  120.4 <.0001
 ERROR                                         1819    2839.25    1.56
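
The F and P values in this table follow from the Partial SS and d.f. columns together with the ERROR row. A rough check in base R, using two rows copied from the table, is sketched below. It reproduces the printed arithmetic only and is not the Design package's internal code; pooled_test is a made-up helper name.

```r
# ERROR row: residual sum of squares and d.f.
sse <- 2839.25
dfe <- 1819
mse <- sse / dfe                      # about 1.56, the MS of the ERROR row

# Helper (hypothetical name): F and P for a pooled test with given SS and d.f.
pooled_test <- function(ss, df) {
  f <- (ss / df) / mse
  c(F = f, P = pf(f, df, dfe, lower.tail = FALSE))
}

# "sar * activity": all 3 interaction coefficients tested jointly
int_sar_act <- pooled_test(142.13, 3)

# "conc * sand", Nonlinear Interaction: 2 coefficients tested jointly
nl_conc_sand <- pooled_test(4.98, 2)

print(round(rbind(int_sar_act, nl_conc_sand), 3))
```

Each line of the table is therefore a joint test on d.f. coefficients at once, rather than a test on a single coefficient.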

Are more of the 'terms' significant (at p<0.05) due to pooling of model terms? I have looked through Frank's book on the topic, but can't quite wrap my head around what the above is telling me. I am mostly interested in presenting a model for use as an applied tool, so interpretation of the terms and interactions is very important.
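
One way to picture a pooled line such as "All Interactions" is as a comparison between the full model and a model with that whole group of coefficients dropped. The following is a minimal sketch with simulated data and base R's lm(), not ols() and not the sm.clean data. The variables a, b and y are made up, and raw polynomial terms from poly() are used as a stand-in for pol().

```r
set.seed(1)                       # simulated data, for illustration only
n <- 500
a <- runif(n)
b <- runif(n)
y <- 1 + 2 * a + sin(3 * b) + a * b + rnorm(n)

# Full model: a interacts with a cubic polynomial in b (3 interaction terms)
full    <- lm(y ~ a * poly(b, 3, raw = TRUE))
# Reduced model: the same main effects, all interaction terms dropped
reduced <- lm(y ~ a + poly(b, 3, raw = TRUE))

# Pooled 3-d.f. F test of all interaction coefficients at once
cmp <- anova(reduced, full)
print(cmp)

# The same F statistic computed by hand from the two residual sums of squares
f_manual <- ((deviance(reduced) - deviance(full)) / 3) /
  (deviance(full) / df.residual(full))

# Single-coefficient t-tests for the interaction terms, for comparison
print(coef(summary(full))[6:8, ])
```

The pooled test asks whether the group as a whole can be dropped, so its result need not agree with the t-tests on the individual coefficients in that group, which are typically strongly correlated when raw polynomial terms are used.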

Thanks,

Dylan



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Tue 15 Jul 2008 - 05:45:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 16 Jul 2008 - 02:32:10 GMT.
