Re: [R] Fitting weibull and exponential distributions to left censoring data

From: Göran Broström <goran.brostrom_at_gmail.com>
Date: Sat, 01 Nov 2008 21:35:09 +0100

On Fri, Oct 31, 2008 at 2:27 PM, Terry Therneau <therneau_at_mayo.edu> wrote:
> Use the survreg function.

The survreg function cannot fit left censored data (correct me if I am wrong!), neither can phreg or aftreg (package eha). On the other hand, if Borja instead wanted to fit left truncated data (it is a common mistake to confuse left truncation with left censoring), it is possible to use phreg or aftreg, but still not survreg (again, correct me if I am wrong).

If instead Borja really wants to fit left censored data, it is fairly simple with the aid of the function optim, for instance by calling this function:

left <- function(x, d){

    ## d[i] = FALSE: x[i] is left censored     ## d[i] = TRUE: x[i] is observed exactly     loglik <- function(param){# The loglihood function

        lambda <- exp(param[2])
        p <- exp(param[1])
        sum(ifelse(d,
                        dweibull(x, p, lambda, log = TRUE),
                        pweibull(x, p, lambda, log.p = TRUE)
                   )
            )

    }
    par <- c(0, 0)
    res <- optim(par, loglik, control = list(fnscale = -1), hessian = TRUE)
    list(log.shape = res$par[1],
         log.scale = res$par[2],
         shape = exp(res$par[1]),
         scale = exp(res$par[2]),
         var.log = solve(-res$hessian)
         )

}

Use like this:

> x <- rweibull(500, shape = 2, scale = 1)
> d <- x > median(x) # 50% left censoring, Type II.
> y <- ifelse(d, x, median(x))
> left(y, d)

$log.shape
[1] 0.707023

$log.scale
[1] -0.007239744

$shape
[1] 2.027945

$scale
[1] 0.9927864

$var.log

             [,1] [,2]
[1,] 0.0022849526 0.0005949114
[2,] 0.0005949114 0.0006508635

>

> There are many different ways to parameterize a Weibull. The survreg function
> imbeds it a general location-scale familiy, which is a different
> parameterization than the rweibull function.
>
>> y <- rweibull(1000, shape=2, scale=5)
>> survreg(Surv(y)~1, dist="weibull")
>

> Coefficients:
> (Intercept)
> 1.592543
>

> Scale= 0.5096278
>

> Loglik(model)= -2201.9 Loglik(intercept only)= -2201.9
>

> ----
>

> survreg's scale = 1/(rweibull shape)
> survreg's intercept = log(rweibull scale)
> For the log-likelihood all parameterizations lead to the same value.
>
> There is not "right" or "wrong" parameterization for a Weibull (IMHO),

Correct, but there are two points I would like to add to that: (i) It is a good idea to perform optimisation with a parametrization that give no range restrictions.

(ii) It is a good idea to transform back the results to the parametrization that is standard in R, for comparative reasons.

See for example the function 'left' above.

> but
> there certainly is a lot of room for confusion. This comes up enough that I
> have just added it as an example in the survreg help page, which will migrate to
> the general R distribution in due course.

>

> Terry Therneau
>

> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-- 
Göran Broström
______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 01 Nov 2008 - 20:38:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 02 Nov 2008 - 15:30:21 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive