Re: [R] MICE for Cox model

From: Inman, Brant A. M.D. <Inman.Brant_at_mayo.edu>
Date: Thu, 17 May 2007 17:30:48 -0500

Thanks Adai, I got it to work. You were right, I had called the wrong pool function.

Brant

-----Original Message-----
From: Adaikalavan Ramasamy [mailto:ramasamy_at_cancer.org.uk] Sent: Thursday, May 17, 2007 1:56 PM
To: Inman, Brant A. M.D.
Cc: r-help_at_stat.math.ethz.ch
Subject: Re: [R] MICE for Cox model

Are you sure you used my pool function? Because as I just have discovered, it had a minor typo in the code. After replacing " df <- (m - 1) * (1 + 1/r)2" with "df <- (m - 1) * (1 + 1/r)^2" in my pool() function, I get

  library(survival); data(pbc)
  d <- pbc[,c('time', 'status', 'age', 'sex',

              'hepmeg', 'platelet', 'trt', 'trig')]   d[d==-9] <- NA
  d[,c(4,5,7)] <- lapply(d[,c(4,5,7)], FUN=as.factor)

  library(mice)
  imp <- mice(d, m=10, maxit=10, diagnostics=T, seed=500,

               defaultImputationMethod=c('norm', 'logreg', 'polyreg'))   fit <- coxph.mids( Surv(time,status) ~ age + sex + hepmeg + platelet

                                          + trt + trig, imp)

  pool(fit)

Call: pool(object = fit)

Pooled coefficients:

          age sex1 hepmeg1 platelet trt2    trig
  0.034924182 -0.208897827 0.987641362 -0.001559426 0.070124108 0.004122421

Fraction of information about the coefficients missing due to nonresponse:

        age sex1 hepmeg1 platelet trt2 trig 0.06624167 0.19490517 0.27300965 0.21950332 0.27768153 0.40658964

Regards, Adai

Inman, Brant A. M.D. wrote:
> Adai,
>
> Thanks for the functions. I tried using your functions and I get the
> same error message during the pooling part:
>
>> pool(micefit)
> Error in names(df) <- names(f) <- names : 'names' attribute [5] must
be
> the same length as the vector [0]
>
> Brant
> -----Original Message-----
> From: Adaikalavan Ramasamy [mailto:ramasamy_at_cancer.org.uk]
> Sent: Thursday, May 17, 2007 4:56 AM
> To: Inman, Brant A. M.D.
> Cc: r-help_at_stat.math.ethz.ch
> Subject: Re: [R] MICE for Cox model
>
> I encountered this problem about 18 months ago. I contacted Prof. Fox
> and Dr. Malewski (the R package maintainers for mice) but they
referred
> me to Prof. van Buuren. I wrote to Prof. van Buuren but am unable to
> find his reply (if he did reply).
>
> Here are the functions I used at that time, if you want to take it
with
> lots of salt. Let me know if you find anything fishy with it.
>
>
> coxph.mids <- function (formula, data, ...) {
>
> call <- match.call()
> if (!is.mids(data)) stop("The data must have class mids")
>
> analyses <- as.list(1:data$m)
>
> for (i in 1:data$m) {
> data.i <- complete(data, i)
> analyses[[i]] <- coxph(formula, data = data.i, ...)
> }
>
> object <- list(call = call, call1 = data$call,
> nmis = data$nmis, analyses = analyses)
>
> oldClass(object) <- if (.SV4.) "mira" else c("mira", "coxph")
> return(object)
> }
>
>
> And in the function 'pool', the small sample adjustment requires
> residual degrees of freedom (i.e. dfc). For a cox model, I believe
that
> this is simply the number of events minus the regression coefficients.

> There is support for this from middle of page 149 of the book by
Parmer
> & Machin (ISBN 0471936405). Please correct me if I am wrong.
>
> Here is the slightly modified version of pool :
>
>
> pool <- function (object, method = "smallsample") {
>
> call <- match.call()
> if (!is.mira(object)) stop("The object must have class 'mira'")
>
> if ((m <- length(object$analyses)) < 2)
> stop("At least two imputations are needed for pooling.\n")
>
> analyses <- object$analyses
>
> k <- length(coef(analyses[[1]]))
> names <- names(coef(analyses[[1]]))
> qhat <- matrix(NA, nrow = m, ncol = k, dimnames = list(1:m,
names))
> u <- array(NA, dim = c(m, k, k),
> dimnames = list(1:m, names, names))
>
> for (i in 1:m) {
> fit <- analyses[[i]]
> qhat[i, ] <- coef(fit)
> u[i, , ] <- vcov(fit)
> }
>
> qbar <- apply(qhat, 2, mean)
> ubar <- apply(u, c(2, 3), mean)
> e <- qhat - matrix(qbar, nrow = m, ncol = k, byrow = TRUE)
> b <- (t(e) %*% e)/(m - 1)
> t <- ubar + (1 + 1/m) * b
> r <- (1 + 1/m) * diag(b/ubar)
> f <- (1 + 1/m) * diag(b/t)
> df <- (m - 1) * (1 + 1/r)2
>
> if (method == "smallsample") {
>
> if( any( class(fit) == "coxph" ) ){
>
> ### this loop is the hack for survival analysis ###
>
> status <- fit$y[ , 2]
> n.events <- sum(status == max(status))
> p <- length( coefficients( fit ) )
> dfc <- n.events - p
>
> } else {
>
> dfc <- fit$df.residual
> }
>
> df <- dfc/((1 - (f/(m + 1)))/(1 - f) + dfc/df)
> }
>
> names(r) <- names(df) <- names(f) <- names
> fit <- list(call = call, call1 = object$call, call2 = object$call1,
> nmis = object$nmis, m = m, qhat = qhat, u = u,
> qbar = qbar, ubar = ubar, b = b, t = t, r = r, df = df,
> f = f)
> oldClass(fit) <- if (.SV4.) "mipo" else c("mipo", oldClass(object))
> return(fit)
> }
>
>
> print.miro only gives the coefficients. Often I need the standard
errors
> as well since I want to test if each regression coefficient from
> multiple imputation is zero or not. Since the function summary.mipo
does
> not exist, can I suggest the following
>
>
> summary.mipo <- function(object){
>
> if (!is.null(object$call1)){
> cat("Call: ")
> dput(object$call1)
> }
>
> est <- object$qbar
> se <- sqrt(diag(object$t))
> tval <- est/se
> df <- object$df
> pval <- 2 * pt(abs(tval), df, lower.tail = FALSE)
>
> coefmat <- cbind(est, se, tval, pval)
> colnames(coefmat) <- c("Estimate", "Std. Error",
> "t value", "Pr(>|t|)")
>
> cat("\nCoefficients:\n")
> printCoefmat( coefmat, P.values=T, has.Pvalue=T, signif.legend=T )
>
> cat("\nFraction of information about the coefficients
> missing due to nonresponse:",
"\n")
> print(object$f)
>
> ans <- list( coefficients=coefmat, df=df,
> call=object$call1, fracinfo.miss=object$f )
> invisible( ans )
>
> }
>
>
> Hope this helps.
>
> Regards, Adai
>
>
>
> Inman, Brant A. M.D. wrote:

>> R-helpers:
>>
>> I have a dataset that has 168 subjects and 12 variables.  Some of the
>> variables have missing data and I want to use the multiple imputation
>> capabilities of the "mice" package to address the missing data. Given
>> that mice only supports linear models and generalized linear models

> (via

>> the lm.mids and glm.mids functions) and that I need to fit Cox models,
> I
>> followed the previous suggestion of John Fox and have created my own
>> function "cox.mids" to use coxph to fit models to the imputed

> datasets.
>> (http://tolstoy.newcastle.edu.au/R/help/06/03/22295.html)
>>
>> The function I created is:
>>
>> ------------
>>
>> cox.mids <- function (formula, data, ...)
>> {
>>     call <- match.call()
>>     if (!is.mids(data)) 
>>         stop("The data must have class mids")
>>     analyses <- as.list(1:data$m)
>>     for (i in 1:data$m) {
>>         data.i <- complete(data, i)
>>         analyses[[i]] <- coxph(formula, data = data.i, ...)
>>     }
>>     object <- list(call = call, call1 = data$call, nmis = data$nmis, 
>>         analyses = analyses)
>>     oldClass(object) <- c("mira", "coxph")
>>     return(object)
>> }
>>
>> ------------
>>
>> The problem that I encounter occurs when I try to use the "pool"
>> function to pool the fitted models into one general model. Here is

> some
>> code that reproduces the error using the pbc dataset.
>>
>> ------------
>>
>> d <- pbc[,c('time','status','age','sex','hepmeg','platelet', 'trt',
>> 'trig')]
>> d[d==-9] <- NA 
>> d[,c(4,5,7)] <- lapply(d[,c(4,5,7)], FUN=as.factor)
>> str(d)
>>
>> imp <- mice(d, m=10, maxit=10, diagnostics=T, seed=500, 
>> 	defaultImputationMethod=c('norm', 'logreg', 'polyreg'))
>>
>> fit <- cox.mids(Surv(time,status) ~ age + sex + hepmeg + platelet +

> trt
>> + trig, imp) >> >> pool(fit) >> >> ------------ >> >> I have looked at the "pool" function but cannot figure out what I have >> done wrong. Would really appreciate any help with this. >> >> Thanks, >> >> Brant Inman >> Mayo Clinic >> >> ______________________________________________ >> R-help_at_stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide

> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>

>
>

>

R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 17 May 2007 - 22:36:18 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 18 May 2007 - 00:31:16 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.