# [Rd] fitting problems in coxph.fit

From: Torsten Hothorn <Torsten.Hothorn_at_rzmail.uni-erlangen.de>
Date: Fri 17 Dec 2004 - 02:30:55 EST

Dear Thomas & Dear List,

the fitting function `coxph.fit' called by `coxph' may fail to estimate the regression coefficients when some values of the design matrix are very large. For example

library(survival)

### copied from `coxph.fit'
coxfit <- .C("coxfit2", iter=as.integer(maxiter),

```                   as.integer(n),
as.integer(nvar), stime,
sstat,
x= x[sorted,] ,
as.double(offset[sorted] - mean(offset)),
as.double(weights),
newstrat,
means= double(nvar),
coef= as.double(init),
u = double(nvar),
imat= double(nvar*nvar), loglik=double(2),
flag=integer(1),
double(2*n + 2*nvar*nvar + 3*nvar),
as.double(control\$eps),
as.double(control\$toler.chol),
sctest=as.double(method=="efron")
,PACKAGE="survival")

```

produces

R> coxfit\$coef
[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

because (?)

R> summary(x[,1])

Min. 1st Qu. Median Mean 3rd Qu. Max. 2.378e-01 8.758e+00 5.872e+01 1.640e+04 2.732e+02 4.000e+06

One the other hand

x[,1] <- x[,1]/max(x[,1])

coxfit <- .C("coxfit2", iter=as.integer(maxiter),

```                   as.integer(n),
as.integer(nvar), stime,
sstat,
x= x[sorted,] ,
as.double(offset[sorted] - mean(offset)),
as.double(weights),
newstrat,
means= double(nvar),
coef= as.double(init),
u = double(nvar),
imat= double(nvar*nvar), loglik=double(2),
flag=integer(1),
double(2*n + 2*nvar*nvar + 3*nvar),
as.double(control\$eps),
as.double(control\$toler.chol),
sctest=as.double(method=="efron")
,PACKAGE="survival")

```

looks much better

R> coxfit\$coef
[1] 0.25123203 0.00000000 -0.42595541 -0.04488913 -0.26061995
0.44426458
[7] -0.38954286 -0.43081374 0.79573107 0.48234405 0.94636357
-0.25193465
[13] 1.15619712 0.32651765 -1.06731019 -0.24249939

I nailed the problem down to lines 261ff of `coxfit2.c' where

```            zbeta = offset[person];
for (i=0; i<nvar; i++)
zbeta += newbeta[i]*covar[i][person];
risk = exp(zbeta ) * weights[person];

```

and `zbeta' may become very large and thus `exp' returns `nan'. Of course one could transform the independent variables prior to fitting but I'm not sure if this is the intented solution.

Best,

Torsten

R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri Dec 17 01:39:12 2004

This archive was generated by hypermail 2.1.8 : Fri 17 Dec 2004 - 02:19:40 EST