[R] comparing lm(), survreg( ... , dist="gaussian") and survreg( ... , dist="lognormal")

From: Charles Annis, P.E. <Charles.Annis_at_statisticalengineering.com>
Date: Tue 03 May 2005 - 13:37:46 EST


Dear R-Helpers:

I have tried everything I can think of and hope not to appear too foolish when my error is pointed out to me.

I have some real data (18 points) that look linear on a log-log plot, so I used them to compare lm() and survreg(). There are no suspensions (censored observations).

library(survival)  # provides Surv() and survreg()

survreg.df <- data.frame(
  Cycles=c(2009000, 577000, 145000, 376000, 37000, 979000, 17420000, 71065000, 46397000,
           70168000, 69120000, 68798000, 72615000, 133051000, 38384000, 15204000, 1558000, 14181000),
  stress=c(90, 100, 110, 90, 100, 80, 70, 60, 56, 62, 62, 59, 56, 53, 59, 70, 90, 70),
  event=rep(1, 18))  # event=1 everywhere: all failures, no censoring

sN.lm<- lm(log(Cycles) ~ log10(stress), data=survreg.df)

and

                                             vvvvvvvvvvv
gaussian.survreg<- survreg(formula=Surv(time=log(Cycles), event) ~ log10(stress), dist="gaussian", data=survreg.df)

produce identical parameter estimates. The lm() residual standard error and the survreg() scale differ slightly, which is accounted for by scale being the MLE (divisor n = 18) and thus biased, while lm() divides by the 16 residual degrees of freedom. Multiplying scale by sqrt(18/16) produces agreement. Using predict() for the lm model and predict.survreg() for the survreg model, and correcting for the difference in stdev, produces identical plots of the fit and of the upper and lower confidence bounds. All of this is as it should be.
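A minimal sketch of that check, assuming the survival package is installed. The names n, p, sigma.lm and scale.corrected are introduced here only for illustration:

```r
library(survival)

survreg.df <- data.frame(
  Cycles = c(2009000, 577000, 145000, 376000, 37000, 979000, 17420000,
             71065000, 46397000, 70168000, 69120000, 68798000, 72615000,
             133051000, 38384000, 15204000, 1558000, 14181000),
  stress = c(90, 100, 110, 90, 100, 80, 70, 60, 56, 62, 62, 59, 56, 53,
             59, 70, 90, 70),
  event = rep(1, 18))

sN.lm <- lm(log(Cycles) ~ log10(stress), data = survreg.df)
gaussian.survreg <- survreg(Surv(log(Cycles), event) ~ log10(stress),
                            dist = "gaussian", data = survreg.df)

n <- nrow(survreg.df)        # 18 observations
p <- length(coef(sN.lm))     # 2 regression coefficients

# lm() reports sqrt(RSS / (n - p)); survreg() reports the MLE sqrt(RSS / n)
sigma.lm <- summary(sN.lm)$sigma
scale.corrected <- gaussian.survreg$scale * sqrt(n / (n - p))

# Side-by-side comparison of coefficients and of the two spread estimates
print(cbind(lm = coef(sN.lm), survreg = coef(gaussian.survreg)))
print(c(sigma.lm = sigma.lm, scale.corrected = scale.corrected))
```

The agreement is only to within the convergence tolerance of survreg(), since its estimates are found iteratively.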

And,

                                               vvvvvv
lognormal.survreg<- survreg(formula=Surv(time=(Cycles), event) ~ log10(stress), dist="lognormal", data=survreg.df)

produces summary() results that are identical to the earlier call to survreg(), except for the call, of course. The parameter estimates and SE are identical. Again this is as I would expect it.

But since this call uses Cycles rather than log(Cycles), predict.survreg() returns $fit in units of Cycles rather than log(Cycles). As expected, the fits are identical when plotted on a log-log grid and also agree with lm().
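The agreement of the two sets of fitted values can be sketched as follows, again assuming the survival package. The names fit.log and fit.cycles are illustrative only:

```r
library(survival)

survreg.df <- data.frame(
  Cycles = c(2009000, 577000, 145000, 376000, 37000, 979000, 17420000,
             71065000, 46397000, 70168000, 69120000, 68798000, 72615000,
             133051000, 38384000, 15204000, 1558000, 14181000),
  stress = c(90, 100, 110, 90, 100, 80, 70, 60, 56, 62, 62, 59, 56, 53,
             59, 70, 90, 70),
  event = rep(1, 18))

gaussian.survreg <- survreg(Surv(log(Cycles), event) ~ log10(stress),
                            dist = "gaussian", data = survreg.df)
lognormal.survreg <- survreg(Surv(Cycles, event) ~ log10(stress),
                             dist = "lognormal", data = survreg.df)

# Gaussian model: fitted values are in log(Cycles)
fit.log <- predict(gaussian.survreg, type = "response")
# Lognormal model: fitted values are in Cycles
fit.cycles <- predict(lognormal.survreg, type = "response")

# Back-transforming the Gaussian fit should reproduce the lognormal fit
print(all.equal(unname(exp(fit.log)), unname(fit.cycles), tolerance = 1e-4))
```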

Here is the fly in the ointment: the upper and lower confidence bounds based on the $se.fit for the dist="lognormal" model are quite obviously different from those of the other two methods, and although I have tried everything I could imagine, I cannot reconcile the differences.

I believe that the confidence bounds for both models should agree. After all, both calls to survreg() produce identical parameter estimates.
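One comparison that may help localize the discrepancy is sketched below. It rests on an assumption about predict.survreg() that is worth verifying, not on a confirmed diagnosis: that with type="response" a dist="lognormal" fit reports $se.fit on the Cycles scale, while type="lp" reports it on the log(Cycles) scale. The multiplier 2 and the names p.lp, p.resp, lower.log, upper.log, lower.resp and upper.resp are illustrative only:

```r
library(survival)

survreg.df <- data.frame(
  Cycles = c(2009000, 577000, 145000, 376000, 37000, 979000, 17420000,
             71065000, 46397000, 70168000, 69120000, 68798000, 72615000,
             133051000, 38384000, 15204000, 1558000, 14181000),
  stress = c(90, 100, 110, 90, 100, 80, 70, 60, 56, 62, 62, 59, 56, 53,
             59, 70, 90, 70),
  event = rep(1, 18))

lognormal.survreg <- survreg(Surv(Cycles, event) ~ log10(stress),
                             dist = "lognormal", data = survreg.df)

# Linear-predictor scale: fit and se.fit are in log(Cycles)
p.lp <- predict(lognormal.survreg, type = "lp", se.fit = TRUE)
# Response scale: fit and se.fit are in Cycles
p.resp <- predict(lognormal.survreg, type = "response", se.fit = TRUE)

# Bounds built in log(Cycles), then back-transformed (symmetric on a log axis)
lower.log <- exp(p.lp$fit - 2 * p.lp$se.fit)
upper.log <- exp(p.lp$fit + 2 * p.lp$se.fit)

# Bounds built directly in Cycles (symmetric on a linear axis)
lower.resp <- p.resp$fit - 2 * p.resp$se.fit
upper.resp <- p.resp$fit + 2 * p.resp$se.fit

# Compare the two sets of bounds, and the two standard errors relative to the fit
print(cbind(lower.log, lower.resp, upper.log, upper.resp))
print(cbind(se.lp = p.lp$se.fit, se.resp.over.fit = p.resp$se.fit / p.resp$fit))
```

If the last two columns agree, the response-scale standard error is just the log-scale one multiplied by the fitted value, and the two sets of bounds cannot coincide on a log-log plot even though the parameter estimates are identical.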

So I have missed something. Would some kind soul please point out my error?

Thanks.

Charles Annis, P.E.

Charles.Annis@StatisticalEngineering.com
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com
 



Received on Tue May 03 13:43:03 2005