**From:** Paul Johnson (*pauljohn@ku.edu*)

**Date:** Thu 13 May 2004 - 17:10:35 EST

**Next message:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Previous message:**Uwe Ligges: "Re: [R] Formula of power.t.test"**In reply to:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Next in thread:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Reply:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Reply:**Thomas Lumley: "Re: [R] Explaining Survival difference between Stata and R"

Message-id: <40A31F6B.8050102@ku.edu>

Dear Goran (and others)

I did not know about the "eha" package, but reading the docs, I see many

things I've been looking for, including the parametric hazard model with

Weibull baseline. Thanks for the tip, and the package.

I still don't quite understand your point about the reason that coxph

crashes. Why does the difference of scale between the variables cause

trouble? I understand that a redundant set of variables would cause

singularity, but do not see why differing scales for variables would

cause that. Does the explanation lie in the optimization algorithm

itself? I'll read the eha package coxreg source code. I bet that will

show me what's going on in your fix, anyway.

Still, I wish coxph just handled all that stuff.

Just for fun, here are the Stata results for the same small model that

you report. If I ask stata to use the efron method of breaking ties,

then the results do almost exactly match what you found.

stset yrs2, failure (ratify)

stcox haz_wst pol_free , robust nohr efron

Iteration 0: log pseudo-likelihood = -45.380139

Iteration 1: log pseudo-likelihood = -45.00981

Iteration 2: log pseudo-likelihood = -45.001196

Iteration 3: log pseudo-likelihood = -45.001193

Refining estimates:

Iteration 0: log pseudo-likelihood = -45.001193

Cox regression -- Efron method for ties

No. of subjects = 21 Number of obs =

21

No. of failures = 21

Time at risk = 78

Wald chi2(2) =

1.09

Log pseudo-likelihood = -45.001193 Prob > chi2 =

0.5785

------------------------------------------------------------------------------

| Robust

_t | Coef. Std. Err. z P>|z| [95% Conf.

Interval]

-------------+----------------------------------------------------------------

haz_wst | 8.48e-08 8.63e-08 0.98 0.326 -8.43e-08

2.54e-07

pol_free | .0089626 .1047656 0.09 0.932 -.1963741

.2142994

------------------------------------------------------------------------------

Göran Broström wrote:

*> Paul and Thomas,
*

*>
*

*> 'coxph' or the data (I got it from Paul) indeed has a problem:
*

*> ------------------------------------------------------------
*

*> Call:
*

*> coxph(formula = Surv(yrs2, ratify) ~ haz.wst + pol.free, data = dat)
*

*>
*

*>
*

*> coef exp(coef) se(coef) z p
*

*> haz.wst 8.53e-08 1 9.47e-08 0.901 0.37
*

*> pol.free NA NA 0.00e+00 NA NA
*

*>
*

*> Likelihood ratio test=0.76 on 1 df, p=0.385 n= 21
*

*> Warning message:
*

*> X matrix deemed to be singular; variable 2 in:
*

*> coxph(Surv(yrs2, ratify) ~ haz.wst + pol.free, data = dat)
*

*> ---------------------------------------------------------------
*

*>
*

*> Is it the data? Let's try 'coxreg' (eha):
*

*> ---------------------------------------------------------------
*

*> Call:
*

*> coxreg(formula = Surv(yrs2, ratify) ~ haz.wst + pol.free, data = dat)
*

*>
*

*> Covariate Mean Coef RR Wald p
*

*> haz.wst 2054901 0.000 1.000 0.372
*

*> pol.free 2.090 0.009 1.009 0.958
*

*>
*

*> Events 21
*

*> Total time at risk 78
*

*> Max. log. likelihood -45.001
*

*> LR test statistic 0.76
*

*> Degrees of freedom 2
*

*> Overall p-value 0.684583
*

*> ----------------------------------------------------------------
*

*>
*

*> This worked just fine (Paul, same results as in Stata?). But, we
*

*> seem to have a scaling problem; lok at the means of the covariates!
*

*> Some rescaling gives:
*

*> ----------------------------------------------------------------
*

*> Call:
*

*> coxph(formula = Surv(yrs2, ratify) ~ I(haz.wst * 1e-06) + pol.free,
*

*> data = dat)
*

*>
*

*>
*

*> coef exp(coef) se(coef) z p
*

*> I(haz.wst * 1e-06) 0.08479 1.09 0.095 0.8920 0.37
*

*> pol.free 0.00896 1.01 0.170 0.0526 0.96
*

*>
*

*> Likelihood ratio test=0.76 on 2 df, p=0.685 n= 21
*

*> ----------------------------------------------------------------
*

*> and now 'coxph' gets the same results as 'coxreg'. I don't know about coxph
*

*> for sure, but I do know that coxreg centers all covariates before the NR
*

*> procedure starts. Maybe we also should rescale to unit variance? And of
*

*> course scale back the coefficients and se:s at the end?
*

*>
*

*> BTW, Paul's data is heavily tied. Could be an idea to use a discrete time
*

*> version of the PH model: you can do that with 'mlreg' (eha):
*

*> ---------------------------------------------------------------
*

*> Call:
*

*> mlreg(formula = Surv(yrs2, ratify) ~ I(haz.wst * 1e-06) + pol.free,
*

*> data = dat)
*

*>
*

*> Covariate Mean Coef RR Wald p
*

*> I(haz.wst * 1e-06) 2.055 0.097 1.102 0.320
*

*> pol.free 2.090 0.003 1.003 0.985
*

*>
*

*> Events 21
*

*> Total time at risk 78
*

*> Max. log. likelihood -36.324
*

*> LR test statistic 0.93
*

*> Degrees of freedom 2
*

*> Overall p-value 0.627056
*

*> ----------------------------------------------------------------
*

*> Doesn't make much of a difference, though. Efron's method is quite
*

*> robust. (You might check if there are any differences with 'breslow')
*

*>
*

*> Best,
*

*>
*

*> Göran
*

*>
*

*> [...]
*

-- Paul E. Johnson email: pauljohn@ku.edu Dept. of Political Science http://lark.cc.ku.edu/~pauljohn 1541 Lilac Lane, Rm 504 University of Kansas Office: (785) 864-9086 Lawrence, Kansas 66044-3177 FAX: (785) 864-5700______________________________________________ R-help@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

**Next message:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Previous message:**Uwe Ligges: "Re: [R] Formula of power.t.test"**In reply to:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Next in thread:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Reply:**Göran Broström: "Re: [R] Explaining Survival difference between Stata and R"**Reply:**Thomas Lumley: "Re: [R] Explaining Survival difference between Stata and R"

*
This archive was generated by hypermail 2.1.3
: Mon 31 May 2004 - 23:05:09 EST
*