# Re: [R] Standard error of coefficient in linear regression

From: Greg Snow <Greg.Snow_at_intermountainmail.org>
Date: Mon 18 Sep 2006 - 15:41:53 GMT

I believe that your confusion is due to a typo in the formula in [3]: it is missing a summation sign (and, if you want to be picky, a subscript on x). To get the denominator, subtract the mean of your x variable from each x value, square the differences, sum them up (the missing summation sign), and take the square root; in symbols, the denominator is sqrt(sum((x_i - mean(x))^2)). This is essentially the standard deviation of your x variable, but without the division by (n - 1) inside the square root.

If you want to do this in R (a good thing while learning; there are better ways for actual analysis), you could use code like:

> x.e <- exped - mean(exped)
> x.e2 <- x.e^2
> sx2 <- sqrt(sum(x.e2))
> sb <- Se/sx2 # where Se is your residual standard error from below
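Putting the pieces together with the data from the original post below, this sketch checks that the hand computation reproduces the "Std. Error" that summary() reports for the slope (the variable names here are my own):

```r
exped <- c(4.2, 6.1, 3.9, 5.7, 7.3, 5.9)
sales <- c(27.1, 30.4, 25.0, 29.7, 40.1, 28.8)
fit <- lm(sales ~ exped)

Se  <- summary(fit)$sigma                   # residual standard error (2.637)
sx2 <- sqrt(sum((exped - mean(exped))^2))   # sqrt of the summed squared deviations
sb  <- Se / sx2                             # standard error of the slope

# agrees with the Std. Error column for exped in summary(fit)
all.equal(sb, summary(fit)$coefficients["exped", "Std. Error"])
```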

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow@intermountainmail.org
(801) 408-8111

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Maciej Blizinski
Sent: Sunday, September 17, 2006 12:22 PM
To: R - help
Subject: [R] Standard error of coefficient in linear regression

Hello R users,

I have a substantial question about statistics, not about R itself, but I would love an answer from an R user, in the form of an example in R syntax. I have spent the whole of Sunday searching Google and browsing books. I've come really close to the answer, but there are at least three standard errors one can talk about in linear regression, and I'm really confused. The question is:

How exactly are standard errors of coefficients calculated in the linear regression?

Here's an example from a website I've read [1]. A company wants to know if there is a relationship between its advertising expenditures and its sales volume.

========================================================

> exped <- c(4.2, 6.1, 3.9, 5.7, 7.3, 5.9)
> sales <- c(27.1, 30.4, 25.0, 29.7, 40.1, 28.8)
> S <- data.frame(exped, sales)
> summary(lm(sales ~ exped, data = S))

Call:
lm(formula = sales ~ exped, data = S)

Residuals:
1       2       3       4       5       6
1.7643 -1.9310  0.7688 -1.1583  3.3509 -2.7947

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.8725     5.2394   1.884   0.1326
exped         3.6817     0.9295   3.961   0.0167 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.637 on 4 degrees of freedom
Multiple R-Squared: 0.7968,     Adjusted R-squared: 0.7461
F-statistic: 15.69 on 1 and 4 DF,  p-value: 0.01666
========================================================

I can calculate the standard error of the estimate, according to the equation [2]...

> S.m <- lm(sales ~ exped, data = S)
> S$pred <- predict(S.m)
> S$ye <- S$sales - S$pred
> S$ye2 <- S$ye ^ 2
> Se <- sqrt(sum(S$ye2)/(length(S$sales) - 1 - 1))
> Se
[1] 2.636901
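
As a cross-check, R already stores these quantities in the fitted-model summary, so the hand computation above can be compared against them directly:

```r
exped <- c(4.2, 6.1, 3.9, 5.7, 7.3, 5.9)
sales <- c(27.1, 30.4, 25.0, 29.7, 40.1, 28.8)
S <- data.frame(exped, sales)
S.m <- lm(sales ~ exped, data = S)

summary(S.m)$sigma    # the residual standard error, 2.636901
coef(summary(S.m))    # the full coefficient table, including the Std. Error column
```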

...which matches the "Residual standard error", so I'm on the right track. The next step would be to use equation [3] to calculate the standard error of the regression coefficient (here: exped). Equation [3] uses two variables whose meaning I can't really figure out. Since the calculated value Sb is a scalar, all the parameters must also be scalars. I've already calculated Se, so I'm missing x and \bar{x}. The latter could be the estimated coefficient. What is x then?

Regards,
Maciej

[1] http://www.statpac.com/statistics-calculator/correlation-regression.htm

--
Maciej Bliziński <m.blizinski_at_wit.edu.pl> http://automatthias.wordpress.com

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.
