Re: [R] retrieving p-values in lm

From: Marc Schwartz <MSchwartz_at_mn.rr.com>
Date: Sat 10 Dec 2005 - 01:00:02 EST

On Fri, 2005-12-09 at 14:19 +0100, Patrick Kuss wrote:
> Dear list,
>
> I want to retrieve the p-value of a two-polynomial regression. For a
> one-polynomial lm I can easily do this with:
> summary(lm(b~a, data=c)[[4]][[8]].
>
> But how do I find the final p-value in the two-polynomial regression? Under
> $coefficients I don't find it
>
> Any suggestions?
>
> Patrick
>
> alt <-(2260,2183,2189,1930,2435,
> 2000,2100,2050,2020,2470,
> 1700,2310,2090,1560,2060,
> 1790,1940,2100,2250,2010)
>
> H <- c(0.2034,0.1845,0.2053,0.1788,0.2196,
> 0.2037,0.1655,0.2176,0.1844,0.2033,
> 0.1393,0.2019,0.1975,0.1490,0.1917,
> 0.2180,0.2064,0.1943,0.2139,0.1320)
>
> X <- data.frame(alt,H)
>
> lm.res <- summary(lm(H~alt,data=X))
> lm.res
> p1 <- lm.res[[4]][[8]]
> p1
>
> lm.res.2 <- summary(lm(H~alt+I(alt^2),data=X))
> lm.res.2
> str(lm.res.2) # where is p
>
> p2 <- lm.res.2[[???]][[????]]

First, you might want to review Chapter 11: Statistical Models in R in An Introduction to R, which is available with your R installation or from the main R web site under Documentation. Specifically, page 53 describes the extractor functions to be used for getting model information.

In this case using coef() will extract the model coefficients in both cases:

> coef(lm.res)

                Estimate   Std. Error  t value   Pr(>|t|)
(Intercept) 6.245371e-02 4.713400e-02 1.325024 0.20173833
alt         6.179038e-05 2.261665e-05 2.732074 0.01368545


> coef(lm.res.2)
Estimate Std. Error t value Pr(>|t|) (Intercept) -9.433748e-02 3.133627e-01 -0.3010488 0.7670283 alt 2.178857e-04 3.091330e-04 0.7048283 0.4904618
I(alt^2) -3.838002e-08 7.579576e-08 -0.5063610 0.6191070

In both models, the coefficients are present if you review the structure as you have in your code above:

> names(lm.res)

 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 


> names(lm.res.2)
[1] "call" "terms" "residuals" "coefficients" [5] "aliased" "sigma" "df" "r.squared"
 [9] "adj.r.squared" "fstatistic" "cov.unscaled"

So, you can get the term p values by using:

> coef(lm.res)[, 4]

(Intercept) alt
 0.20173833 0.01368545

> coef(lm.res.2)[, 4]

(Intercept) alt I(alt^2)
  0.7670283 0.4904618 0.6191070

In terms of the overall model p value, this is actually calculated when you display (print) the model. It is not stored as part of the model object itself. If you review the code for print.summary.lm() using:

> stats:::print.summary.lm

...

   pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3],

      lower.tail = FALSE)
...

Where the first argument is the F statistic and the other two are the degrees of freedom:

> lm.res$fstatistic

    value numdf dendf
 7.464231 1.000000 18.000000

> lm.res.2$fstatistic

    value numdf dendf
 3.706139 2.000000 17.000000

So, in the case of your two models:

> x <- lm.res
> pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3],

     lower.tail = FALSE)
     value 

0.01368545

> x <- lm.res.2
> pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3],

     lower.tail = FALSE)
    value
0.0461472

HTH, Marc Schwartz



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sat Dec 10 01:06:09 2005

This archive was generated by hypermail 2.1.8 : Sat 10 Dec 2005 - 02:26:16 EST