# [R] Problem in anova with coxph object

From: Matthias Gondan <matthias-gondan_at_gmx.de>
Date: Tue, 08 Jan 2008 18:30:40 +0100

I noticed a problem with the anova command when it is applied to a single coxph object and the data contain missing observations.

This example code was run on R-2.6.1:

> library(survival)
> data(colon)
> colondeath = colon[colon$etype==2, ]
> m = coxph(Surv(time, status) ~ rx + sex + age + perfor, data=colondeath)
> m

```
Call:
coxph(formula = Surv(time, status) ~ rx + sex + age + perfor,
    data = colondeath)

               coef exp(coef) se(coef)      z      p
rxLev     -0.028895     0.972  0.11037 -0.262 0.7900
rxLev+5FU -0.374286     0.688  0.11885 -3.149 0.0016
sex       -0.000754     0.999  0.09431 -0.008 0.9900
age        0.002442     1.002  0.00405  0.603 0.5500
perfor     0.155695     1.168  0.26286  0.592 0.5500

Likelihood ratio test=12.8 on 5 df, p=0.0251 n= 929
```

> anova(m, test='Chisq')

```
Analysis of Deviance Table
Cox model: response is Surv(time, status)
Terms added sequentially (first to last)

        Df  Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                       929     5860.4
rx       2      12.1       927     5848.2 2.302e-03
sex      1 2.054e-05       926     5848.2       1.0
age      1       0.3       925     5847.9       0.6
perfor   1       0.3       924     5847.6       0.6
```

Now I include nodes, which has some missing values:

> m = coxph(Surv(time, status) ~ rx + sex + age + perfor + nodes, data=colondeath)
> m

```
Call:
coxph(formula = Surv(time, status) ~ rx + sex + age + perfor +
    nodes, data = colondeath)

              coef exp(coef) se(coef)      z       p
rxLev     -0.08245     0.921  0.11168 -0.738 0.46000
rxLev+5FU -0.40310     0.668  0.12054 -3.344 0.00083
sex       -0.02854     0.972  0.09573 -0.298 0.77000
age        0.00547     1.005  0.00405  1.350 0.18000
perfor     0.19040     1.210  0.26335  0.723 0.47000
nodes      0.09296     1.097  0.00889 10.460 0.00000
```

> anova(m, test='Chisq')

```
Analysis of Deviance Table
Cox model: response is Surv(time, status)
Terms added sequentially (first to last)

        Df  Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                       911     5700.6
rx       2       0.0       909     5848.2       1.0
sex      1 2.054e-05       908     5848.2       1.0
age      1       0.3       907     5847.9       0.6
perfor   1       0.3       906     5847.6       0.6
nodes    1     235.3       905     5612.3 4.253e-53
```

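The strange thing is that rx is not significant anymore: its deviance is reported as 0.0, and the rows for rx through perfor show the same residual deviances as in the first table, although the NULL row now has 911 instead of 929 residual degrees of freedom.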

In the documentation for anova.coxph, there is a warning that

> The comparison between two or more models by `anova` or will only be
> valid if they are fitted to the same dataset. This may be a problem if
> there are missing values.

However, I passed a single fitted object, whose terms are then analyzed sequentially. Is this a bug in R, or is it covered by the warning?
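As a cross-check, here is a minimal sketch (not part of the session above) that refits the first step of the sequence, rx only, on exactly the rows that the larger model can use. It assumes that the `colon` data are available directly after loading the survival package; the names `vars`, `cc`, `m_rx`, `dev_rx` and `p_rx` are chosen here only for illustration.

```r
library(survival)

# Deaths only, as in the session above (colon ships with the survival package)
colondeath <- colon[colon$etype == 2, ]

# Variables used by the larger model, including nodes
vars <- c("time", "status", "rx", "sex", "age", "perfor", "nodes")

# Keep only the rows without missing values in any of these variables
cc <- colondeath[complete.cases(colondeath[, vars]), ]

# Fit the first step of the sequence (rx only) on exactly those rows
m_rx <- coxph(Surv(time, status) ~ rx, data = cc)

# Likelihood ratio statistic for rx against the null model:
# loglik[1] is the null log-likelihood, loglik[2] the fitted one
dev_rx <- 2 * (m_rx$loglik[2] - m_rx$loglik[1])
p_rx <- pchisq(dev_rx, df = length(coef(m_rx)), lower.tail = FALSE)

c(n = nrow(cc), deviance = dev_rx, p = p_rx)
```

If the deviance for rx obtained this way is clearly larger than the 0.0 reported above, that would suggest the sequential table is mixing fits from different subsets of the data.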

Best wishes,

Matthias

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 08 Jan 2008 - 17:42:42 GMT