RE: [R] How to intepret a factor response model?

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Wed 04 May 2005 - 18:21:11 EST


On 04-May-05 Maciej Bliziński wrote:
> Hello,
>
> I'd like to create a model with a factor-type response variable.
> This is an example:

>

>> mydata <- data.frame(factor_var = as.factor(c(rep('one', 100),
>> rep('two', 100), rep('three', 100))), real_var = c(rnorm(150),
>> rnorm(150) + 5))
>> summary(mydata)

> factor_var real_var
> one :100 Min. :-2.742877
> three:100 1st Qu.:-0.009493
> two :100 Median : 2.361669
> Mean : 2.490411
> 3rd Qu.: 4.822394
> Max. : 6.924588
>> mymodel = glm(factor_var ~ real_var, family = 'binomial', data = >> mydata) >> summary(mymodel)

>
> Call:
> glm(formula = factor_var ~ real_var, family = "binomial", data =
> mydata)
>
> Deviance Residuals:
> Min 1Q Median 3Q Max
> -1.7442 -0.6774 0.1849 0.3133 2.1187
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.6798 0.1882 -3.613 0.000303 ***
> real_var 0.8971 0.1066 8.417 < 2e-16 ***
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 381.91 on 299 degrees of freedom
> Residual deviance: 213.31 on 298 degrees of freedom
> AIC: 217.31
>
> Number of Fisher Scoring iterations: 6

Have you noticed that you get identical results with

set.seed(214354)
mydata <- data.frame(factor.var = as.factor(c(rep('one', 100),

   rep('two',100), rep('three', 100))),
   real.var = c(rnorm(150), rnorm(150) + 5))

mymodel <- glm(factor.var ~ real.var, family='binomial', data=mydata) summary(mymodel)

and

set.seed(214354)
mydata <- data.frame(factor.var = as.factor(c(rep('one', 100),

   rep('two',200))),real.var = c(rnorm(150),rnorm(150) + 5))

mymodel <- glm(factor.var ~ real.var, family='binomial', data=mydata) summary(mymodel)

(I've left out the "summary(mydata)" since these do naturally differ, and I've replaced "factor_var" with "factor.var" and "real_var" with "real.var" because of potential complications with "_"; also "mymodel =" to "mymodel <-").

So I think the interpretation of the results from your first model is that, because of the "family='binomial'", glm is treating "factor.var='one'" as binomial response "0", say, and "factor.var='two'" or "factor.var='three'" as binomial response "1".

You're trying to fit a multinomial response, but you've specified a binomial family to 'glm'. 'glm' does not have a multinomial response family.

You could try 'multinom' from package 'nnet' which fits loglinear models to factor responses with more than 2 levels.

E.g.

  library(nnet)
  mymodel <- multinom(factor.var ~ real.var,data=mydata)

   ### weights:  9 (4 variable)
   ##  initial  value 329.583687 
   ##  iter  10 value 209.780666
   ##  final  value 209.779951 
   ##  converged

  summary(mymodel)
   ## Re-fitting to get Hessian
   ## Call:
   ## multinom(formula = factor.var ~ real.var, data = mydata)
   ##  Coefficients:
   ##        (Intercept)  real.var
   ##  three  -3.4262565 1.3838231
   ##  two    -0.6754253 0.7116955
   ##
   ## Std. Errors:
   ##   (Intercept)  real.var
   ## three   0.5028541 0.1480138
   ## two     0.1846827 0.1068821
   ##
   ## Residual Deviance: 419.5599 
   ## AIC: 427.5599 
   ##
   ## Correlation of Coefficients:
   ##             three:(Intercept) three:real.var two:(Intercept)
   ## three:real.var  -0.7286258                                      
   ## two:(Intercept)  0.1986995        -0.1261034                    
   ## two:real.var    -0.1411377         0.7012481     -0.3285741

This output does suggest a fairly clear interpretation!

Hoping this helps,
Ted.



E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861
Date: 04-May-05                                       Time: 09:18:03
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Wed May 04 18:34:32 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:31:33 EST