Re: [R] Understanding output of summary(glm(...))

From: <Bill.Venables_at_csiro.au>
Date: Wed, 20 Aug 2008 17:03:48 +1000

The 'Std. Error' values listed in the coefficients table of the summary have nothing to do with the sub-class standard deviations. They are the standard errors associated with the estimates of the class means (the way you have fitted the model) and as the design has equal replication and the estimated standard errors are based on the pooled estimate of variance from all samples, they are equal. That's why.
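As a minimal sketch of the point above, the following recomputes the common standard error by hand from the pooled variance. It reuses the poster's set.seed(5) data; the names group_var, pooled_var, se_by_hand and se_from_fit are introduced here for illustration only, and lm(...) is used in place of glm(...) on the assumption that the two give the same fit for this model.

```r
# Recreate the poster's data: 5 groups of 4 replicates
set.seed(5)
tmp <- rnorm(20)
gp  <- as.factor(rep(1:5, each = 4))

fit <- lm(tmp ~ -1 + gp)

# Per-group sample variances (these differ from group to group)
group_var <- tapply(tmp, gp, var)

# Pooled variance: with equal replication this is simply the
# average of the group variances (15 residual degrees of freedom)
n_per_group <- 4
pooled_var  <- mean(group_var)

# Standard error of each class mean: sqrt(pooled variance / replicates)
se_by_hand <- sqrt(pooled_var / n_per_group)

# The same quantity as reported in the coefficients table
se_from_fit <- unname(coef(summary(fit))[, "Std. Error"])

print(se_by_hand)
print(se_from_fit)
```

Every entry of se_from_fit should agree with se_by_hand, because each uses the same pooled variance and the same number of replicates, whatever the individual group standard deviations are.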

Your second 'example' was incomplete and I couldn't follow it, but the answer is almost certainly "hell no!".
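One likely reason for that answer, sketched under assumptions: in a model fitted as y ~ -1 + group, each coefficient is a group mean and its t test asks only whether that mean differs from zero. Data that sit far from zero therefore give tiny p-values regardless of how noisy they are. The shift of 10 below is hypothetical, chosen only because the tmp1 values in the question lie near 10; shifted, p_orig and p_shifted are illustrative names.

```r
# Same noise, different location: p-values change, standard errors do not
set.seed(5)
tmp <- rnorm(20)
gp  <- as.factor(rep(1:5, each = 4))

# Hypothetical shift: move every observation away from zero by 10
shifted <- tmp + 10

tab_orig    <- coef(summary(lm(tmp ~ -1 + gp)))
tab_shifted <- coef(summary(lm(shifted ~ -1 + gp)))

p_orig    <- tab_orig[, "Pr(>|t|)"]
p_shifted <- tab_shifted[, "Pr(>|t|)"]

# Standard errors are unchanged by the shift ...
print(cbind(se_orig = tab_orig[, "Std. Error"],
            se_shifted = tab_shifted[, "Std. Error"]))

# ... but the p-values for "mean = 0" become far smaller
print(cbind(p_orig, p_shifted))
```

The smaller p-values here say nothing about data quality; they only reflect that the shifted means are many standard errors away from zero.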

Finally, a question for you. Why do you use glm(...) when all you are doing is fitting linear models? Either lm(...) or aov(...) would have been much more sensible.
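For illustration, here is a short sketch comparing the three calls on the poster's first data set. It assumes the default gaussian family for glm(...), which is what the original call uses; dat, fit_glm, fit_lm and same_table are names made up for this example.

```r
set.seed(5)
tmp <- rnorm(20)
gp  <- as.factor(rep(1:5, each = 4))
dat <- data.frame(tmp, gp)

# glm() with its default gaussian family fits an ordinary linear model
fit_glm <- glm(tmp ~ -1 + gp, data = dat)
fit_lm  <- lm(tmp ~ -1 + gp, data = dat)

# The two coefficient tables should agree up to numerical tolerance
same_table <- all.equal(coef(summary(fit_glm)), coef(summary(fit_lm)))
print(same_table)

# aov() with the usual intercept gives the one-way analysis of variance,
# i.e. a single F test of whether the group means differ at all
print(summary(aov(tmp ~ gp, data = dat)))
```

The lm(...) and aov(...) forms report the quantities one normally wants from a linear model (residual standard error, F tests) directly, without the deviance-based output of glm(...).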

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Daren Tan
Sent: Wednesday, 20 August 2008 4:37 PM
To: r-help_at_stat.math.ethz.ch
Subject: [R] Understanding output of summary(glm(...))

Simple example of 5 groups of 4 replicates.  

> set.seed(5)
> tmp <- rnorm(20)
> gp <- as.factor(rep(1:5,each=4))
> summary(glm(tmp ~ -1 + gp, data=data.frame(tmp, gp)))$coefficients
         Estimate Std. Error       t value  Pr(>|t|)
gp1 -0.1604613084  0.4899868 -0.3274809061 0.7478301
gp2  0.0002487984  0.4899868  0.0005077655 0.9996016
gp3  0.0695463698  0.4899868  0.1419352018 0.8890200
gp4 -0.6121682841  0.4899868 -1.2493567852 0.2306791
gp5 -0.6999545014  0.4899868 -1.4285171713 0.1736348

> m <- data.frame(tmp, gp)
> sapply(gp, function(x) sd(m[m[,"gp"]==x,1]))
 [1] 1.169284 1.169284 1.169284 1.169284 1.142974 1.142974 1.142974 1.142974
 [9] 0.862423 0.862423 0.862423 0.862423 0.535740 0.535740 0.535740 0.535740
[17] 1.047538 1.047538 1.047538 1.047538

Why doesn't the standard deviation of each group correlate with the Pr? E.g., gp = 4 has the smallest sd (0.535740), but its Pr is not the lowest (0.23 vs 0.1736 for gp = 5).

Another example with new tmp1  

> tmp1
 [1]  9.577969  9.310792  9.666767  9.610164 10.181692 10.155899 10.025943
 [8]  9.971243 10.177766  9.265793  9.415818 10.099874 10.238829  9.575591
[15]  9.560879  9.617891  9.617891 10.158160 10.592377 10.068443
 

> summary(glm(tmp1 ~ -1 + age, data=data.frame(as.vector(as.matrix(tmp1)), age)))$coefficients
      Estimate Std. Error  t value     Pr(>|t|)
age1  9.541423  0.1611603 59.20456 3.380085e-19
age2 10.083694  0.1611603 62.56935 1.479781e-19
age3  9.739813  0.1611603 60.43557 2.485380e-19
age4  9.748297  0.1611603 60.48821 2.453251e-19
age5 10.109218  0.1611603 62.72773 1.424913e-19
> m1 <- data.frame(tmp1, gp)

> sapply(age, function(x) sd(m1[m1[,"age"]==x,1]))
 [1] 0.1580745 0.1580745 0.1580745 0.1580745 0.1013207 0.1013207 0.1013207
 [8] 0.1013207 0.4658736 0.4658736 0.4658736 0.4658736 0.3279128 0.3279128
[15] 0.3279128 0.3279128 0.3995426 0.3995426 0.3995426 0.3995426
 

Can I conclude from the Pr values of the summary that the tmp1 data are of better "quality" than tmp, given that their Pr values are so much smaller?





R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Wed 20 Aug 2008 - 07:13:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 20 Aug 2008 - 07:34:10 GMT.
