[R] Using VGAM's vglm function for ordinal logistic regression

From: Inman, Brant A. M.D. <Inman.Brant_at_mayo.edu>
Date: Sat 06 Jan 2007 - 22:56:43 GMT

R-Experts:

I am using the vglm function of the VGAM library to perform proportional odds ordinal logistic regression. The issue that I would like help with concerns the format in which the response variable must be provided for this function to work correctly. Consider the following example:


library(VGAM)
library(MASS)

attach(pneumo)
pneumo # Inspect the format of the original dataset

# Restructure the pneumo dataset into a different format pneumo2 <- data.frame(matrix(ncol=3, nrow=24)) colnames(pneumo2) <- c('exposure.time', 'severity', 'freq') pneumo2[,1] <- rep(pneumo[,1],3)
pneumo2[,2] <-
as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8)))

pneumo2[1:8,3]   <- pneumo[,2]
pneumo2[9:16,3]  <- pneumo[,3]
pneumo2[17:24,3] <- pneumo[,4]
pneumo2	# Inspect the format of the new modified dataset


The problem occurs when I try to analyze these two datasets, which are identical in content, with the proportional odds model using vglm:


# Analyze the original dataset with vglm, get one set of results

> vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time),
data=pneumo,
+ family=cumulative(parallel=T))

Coefficients:

     (Intercept):1      (Intercept):2 log(exposure.time) 
          9.676093          10.581725          -2.596807 

Degrees of Freedom: 16 Total; 13 Residual Residual Deviance: 5.026826
Log-likelihood: -204.2742

# Analyzing the modified dataset with vglm gives another set of results

> vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2,
+ family=cumulative(parallel=T))

Coefficients:

     (Intercept):1      (Intercept):2 log(exposure.time) 
        -1.6306499          2.5630170         -0.1937956 

Degrees of Freedom: 48 Total; 45 Residual Residual Deviance: 503.7251
Log-likelihood: -251.8626
Warning messages:
1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon)
2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon)
3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon)
4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon)
5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon)

# Analyzing the modified dataset with polr reproduces these second results

> polr(severity ~ log(exposure.time), weights=freq, data=pneumo2)

Coefficients:
log(exposure.time)

         0.1938154

Intercepts:
  mild|normal normal|severe
    -1.630612 2.563055

Residual Deviance: 503.7251
AIC: 509.7251


Can someone explain what I am doing wrong when using vglm and polr with the modified dataset? I do not understand why these two formulations should give different results.

Brant Inman



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun Jan 07 10:03:03 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 07 Jan 2007 - 09:30:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.