Re: [R] Predicted Cox survival curves - factor coding problems..

From: Prof Brian Ripley <>
Date: Mon, 07 May 2007 14:45:59 +0100 (BST)

On Mon, 7 May 2007, Terry Therneau wrote:

> The combination of survfit, coxph, and factors is getting confused. It is
> not smart enough to match a new data frame that contains a numeric for sitenew
> to a fit that contained that variable as a factor. (Perhaps it should be smart
> enough to at least die gracefully -- but it's not).

The 'standard' model-fitting functions in R do make an attempt to match the new data to that used for fitting, or die gracefully. Perhaps Thomas could look into adding this to survfit and coxph (see

> The simple solution is to not use factors.
> site1 <- 1*(coxsnps$sitenew==1)
> site2 <- 1*(coxsnps$sitenew==2)
> test1 <- coxph(Surv(time, censor) ~ snp1 + sex + site1 + site2 + gene +
> eth.self + strata(edu), data= coxsnps)
> output
> profile1 <- data.frame(snp1=c(0,1), sex=c(0,0), site1=c(0,0),
> site2=c(0,0), gene=c(0,0), eth.self=c(0,0))
> plot(survfit(test1, newdata=profile1))
> Note that you do not have to explicitly make "edu" a factor. Since it is
> included in a strata statement, the coxph routine must treat it as discrete
> groups.
> Terry Therneau
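
For readers who would rather keep the factor than replace it with indicator columns, the following is a minimal sketch of the other route: give the new data frame a factor with the same levels as the one used in the fit. It is not the original poster's model. It assumes the `lung` data set shipped with the survival package, a reasonably recent version of that package, and a hypothetical factor `sexf` standing in for `sitenew`.

```r
library(survival)

# Stand-in data: 'lung' ships with the survival package.
# 'sexf' is a hypothetical factor playing the role of 'sitenew'.
d <- lung
d$sexf <- factor(d$sex, levels = c(1, 2), labels = c("male", "female"))

# Fit with the covariate as a factor.
fit <- coxph(Surv(time, status) ~ age + sexf, data = d)

# Build newdata with a factor carrying the SAME levels as in the fit,
# rather than a bare numeric code.
nd <- data.frame(
  age  = c(60, 60),
  sexf = factor(c("male", "female"), levels = levels(d$sexf))
)

# One predicted survival curve per row of 'nd'.
sf <- survfit(fit, newdata = nd)
```

Calling `plot(sf, col = 1:2)` then draws the two predicted curves. Taking the levels from `levels(d$sexf)` rather than typing them out keeps the new data frame in step with the fitted model if the coding ever changes.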

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Mon 07 May 2007 - 13:57:49 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 07 May 2007 - 14:31:19 GMT.
