Re: [R] Discrepancy in the regression coefficients for Cox regression - PBC data set

From: Ravi Varadhan <RVaradhan_at_jhmi.edu>
Date: Fri, 21 Nov 2008 14:30:26 -0500

Peter,

I did check the data in the Appendix of F&H against the data in the "survival" package, and I couldn't find any differences in the "time" and "status" variables.
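
For anyone who wants to repeat that kind of check, here is a minimal sketch of a column-by-column comparison. The two data frames are made-up stand-ins (not the real F&H appendix or the package data), and `compare_cols` is a hypothetical helper written for this illustration only; it assumes both tables share an `id` column and contain no missing values in the compared columns.

```r
# Hypothetical stand-ins: 'book' would hold values transcribed from the
# F&H appendix, 'pkg' the corresponding columns of the packaged data.
book <- data.frame(id = 1:5,
                   time = c(400, 4500, 1012, 1925, 1504),
                   status = c(1, 0, 1, 1, 0))
pkg <- book
pkg$time[3] <- 1020   # deliberately introduce one mismatch for the demo

# Return, for each requested column, the ids whose values differ.
compare_cols <- function(a, b, cols, key = "id") {
  m <- merge(a, b, by = key, suffixes = c(".a", ".b"))
  out <- lapply(cols, function(v) {
    differs <- m[[paste0(v, ".a")]] != m[[paste0(v, ".b")]]
    m[[key]][differs]
  })
  names(out) <- cols
  out
}

mismatches <- compare_cols(book, pkg, c("time", "status"))
mismatches
```

An empty vector for a column means no differences were found in that column; with real data one would also want to compare the covariates, since agreement in "time" and "status" alone does not rule out differences elsewhere.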

Maybe Terry Therneau knows the answer?

Ravi.



Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvaradhan_at_jhmi.edu

Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Peter Dalgaard
Sent: Friday, November 21, 2008 1:58 PM
To: Ravi Varadhan
Cc: r-help_at_r-project.org
Subject: Re: [R] Discrepancy in the regression coefficients for Cox regression - PBC data set

Ravi Varadhan wrote:
> Hi David,
>
> I did look at Appendix D.3 of T&G, but I am not sure whether the data set
> analyzed in F&H and the one shipped with "survival" are different. They
> both have n=418 (312 from the RCT and 106 observational).

Well, as David implies, if the observation times are longer and a few more people died, that could easily explain the differences.
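
A quick way to look at this is to tabulate the events and the follow-up times in the packaged data and compare them with the counts given in the book. The sketch below is only an illustration and rests on assumptions: it uses the column names and status coding of recent releases of the survival package (`albumin` rather than `alb`, and `status` coded 0/1/2 for censored/transplant/dead), which may not match the release discussed in this thread.

```r
library(survival)

# Column names and status coding follow recent releases of the survival
# package (an assumption); older releases used different names.
d <- survival::pbc

# How many events of each type, and how long is the follow-up?
n_events <- table(d$status)
followup <- summary(d$time)

# Treat death (status == 2, assumed coding) as the event of interest.
fit <- coxph(Surv(time, status == 2) ~ log(bili) + log(albumin) + age +
               log(protime) + edema, data = d)

n_events
followup
round(coef(fit), 4)
fit$iter   # number of iterations the fitting routine used
```

If the number of deaths or the maximum follow-up time differs from what the book reports, the data sets are not the same, whatever the value of n. The `iter` component gives a rough check on convergence; a small iteration count makes a convergence problem an unlikely explanation.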

Someone borrowed our copy of F&H so I can't check, but presumably you have one (and it is your problem anyway...).

>
> There is a major difference in the coefficient for "edema": 0.66 vs
> 0.86. In any case, the point is not whether the differences in the
> coefficients affect interpretation of the model, but to understand why
> the results differ.
>
> Best,
> Ravi.
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius_at_comcast.net]
> Sent: Friday, November 21, 2008 12:34 PM
> To: Ravi Varadhan
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Discrepancy in the regression coefficients for Cox
> regression - PBC data set
>
> There is a discussion in Appendix D.3 of "Modeling Survival Data" by
> Therneau and Grambsch regarding the differences in the datasets,
> including the fact that "there was significantly more follow-up for
> many patients at the time this dataset was assembled". I do not see a
> material difference in the estimates.
>
> --
> David Winsemius, MD
> Heritage Labs
>
> On Nov 21, 2008, at 12:16 PM, Ravi Varadhan wrote:
>

>> Hi,
>>
>> When I run the following Cox proportional hazards model on the Mayo 
>> clinic's PBC data set (given in the "survival" package), the 
>> regression coefficients do not agree with the results presented in 
>> Table 4.6.3 (p. 195) of Fleming & Harrington's book.
>>
>> library(survival)
>> data(pbc)
>> # The formula refers to the columns directly, so pbc must be attached
>> attach(pbc)
>> ans.cox <- coxph(Surv(time, status) ~ log(bili) + log(alb) + age +
>>                  log(protime) + edema)
>> ans.cox
>>
>> Call:
>> coxph(formula = Surv(time, status) ~ log(bili) + log(alb) + age +
>>    log(protime) + edema)
>>
>>
>>                coef exp(coef) se(coef)     z       p
>> log(bili)     0.8975     2.453  0.08271 10.85 0.0e+00
>> log(alb)     -2.4524     0.086  0.65707 -3.73 1.9e-04
>> age           0.0382     1.039  0.00768  4.97 6.5e-07
>> log(protime)  2.3458    10.442  0.77425  3.03 2.4e-03
>> edema         0.6613     1.937  0.20595  3.21 1.3e-03
>>
>> Likelihood ratio test=234  on 5 df, p=0  n= 418
>>
>> These coefficients, however, differ materially (i.e., the differences
>> can't be attributed to round-off alone) from those reported in the
>> "Final model" column of Table 4.6.3 of Fleming and Harrington (p. 195).
>> The coefficients reported there are: 0.8707, -2.533, 0.0394, 2.380,
>> 0.8592. Note the big difference for the "edema" variable.
>>
>> It seems like the data set considered in the book and that available 
>> in "survival" package are the same (with n=418).
>>
>> I also re-ran the Cox PH model after correcting the 2 "data errors"
>> discussed on p. 188 of F&H, but I still could not match the results in
>> Table 4.6.3.
>>
>> Is it possible that the discrepancy is explained by differences in
>> convergence during maximization of the partial likelihood?
>>
>> Can anyone help me figure out why this discrepancy exists?
>>
>> Thanks very much,
>> Ravi.
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 21 Nov 2008 - 19:33:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 21 Nov 2008 - 20:30:33 GMT.
