Re: [R] changes in coxph in "survival" from older version?

From: Frank Harrell <f.harrell_at_vanderbilt.edu>
Date: Thu, 19 May 2011 17:42:11 -0700 (PDT)

Hi Tao,

For your situation (and even with a MUCH larger number of events), multivariable modeling will be unreliable unless you use shrinkage, variable selection will select the wrong variables, and univariable screening leads to massive bias in later stages.

Terry converted me from SAS to S-Plus in 1991 when I visited Mayo Clinic and he showed me how natural it was in the language to put a loop around the kind of stepwise analyses requested by users. The bootstrap showed that the list of selected predictors varied almost at random from one resample to the next.

Another demonstration of this is to bootstrap the ranks of the predictors, ranked by any measure you want (adjusted chi-square, univariable chi-square, ROC area). The confidence intervals for the ranks will be extremely wide.

Frank
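
To make the rank-bootstrap idea concrete, here is a minimal sketch in R. It is not code from this thread: the data are simulated (60 subjects, 17 pure-noise predictors, roughly 20 events), the ranking measure is assumed to be the univariable likelihood-ratio chi-square from coxph, and the names uni_chisq, rank_mat and rank_ci are made up for illustration.

```r
library(survival)

set.seed(1)

# Simulated data (an assumption, not the poster's data): 17 noise predictors,
# exponential event and censoring times giving roughly one event per three subjects.
n <- 60
p <- 17
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
event_time  <- rexp(n, rate = 0.1)
censor_time <- rexp(n, rate = 0.2)
d <- data.frame(time   = pmin(event_time, censor_time),
                status = as.integer(event_time <= censor_time),
                x)
vars <- colnames(x)

# Univariable likelihood-ratio chi-square for each predictor
# (hypothetical helper; any other ranking measure could be swapped in).
uni_chisq <- function(data, vars) {
  sapply(vars, function(v) {
    fit <- coxph(as.formula(paste("Surv(time, status) ~", v)), data = data)
    2 * diff(fit$loglik)
  })
}

# Resample subjects with replacement, re-rank the predictors each time
# (rank 1 = largest chi-square), and collect the ranks in a B x p matrix.
B <- 200
rank_mat <- t(replicate(B, {
  boot <- d[sample(nrow(d), replace = TRUE), ]
  rank(-uni_chisq(boot, vars))
}))

# Percentile 95% bootstrap intervals for each predictor's rank,
# shown next to the rank observed in the original sample.
rank_ci  <- t(apply(rank_mat, 2, quantile, probs = c(0.025, 0.975)))
observed <- rank(-uni_chisq(d, vars))
print(cbind(observed, rank_ci))
```

Comparing the interval for the top-ranked predictor with the full range 1 to 17 gives a direct sense of how much the apparent "winner" depends on the particular sample.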

Shi, Tao wrote:
>
> Thank you, Frank and Terry, for all your answers! I'll upgrade my "survival"
> package for sure!
>
> It seems to me that you two are pointing to two different issues: 1) Is
> stepwise model selection a good approach (for any data)? 2) Does the data I
> have contain enough information to be worth modeling at all? For #1, I'm not
> in a good position to judge and need to read up on it. For #2, I'm still a
> bit confused about Terry's last comment. If we forget about multivariate
> model building and just look at the variables one by one and select the best
> predictor (let's say it's highly significant, e.g. p<0.0001), can the
> resulting univariate model still be wrong?
>
> What if I use this data as a validation set to validate an existing model?
> Would anything be different?
>
> Many thanks!
>
> ...Tao
>
>
>
>
> ----- Original Message ----

>> From: Frank Harrell <f.harrell_at_vanderbilt.edu>
>> To: r-help_at_r-project.org
>> Sent: Tue, May 17, 2011 10:51:02 AM
>> Subject: Re: [R] changes in coxph in "survival" from older version?
>> 
>> It's worse if the model does converge because then you don't have a warning
>> about the result being nonsense.
>> Frank
>> 
>> 
>> Terry Therneau-2 wrote:
>> > 
>> > -- begin included message ---
>> > I did realize that there are way more predictors in the model. My
>> > initial thinking was to use that as an initial model for stepwise model
>> > selection. Now I wonder if the model selection result is still valid
>> > if the initial model didn't even converge?
>> > --- end inclusion ---
>> > 
>> > You have 17 predictors with only 22 events. All methods of "variable
>> > selection" in such a scenario will give essentially random results.
>> > There is simply not enough information present to determine a best
>> > predictor or best subset of predictors.
>> > 
>> >  Terry Therneau
>> > 
>> >  ______________________________________________
>> > R-help_at_r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the  posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and  provide commented, minimal, self-contained, reproducible code.
>> > 
>> 
>> 
>> -----
>> Frank Harrell
>> Department of Biostatistics, Vanderbilt  University
>> --
>> View this message in context:  
>>http://r.789695.n4.nabble.com/changes-in-coxph-in-survival-from-older-version-tp3516101p3530024.html
>>
>> Sent  from the R help mailing list archive at Nabble.com.
>> 



Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/changes-in-coxph-in-survival-from-older-version-tp3516101p3537322.html
Sent from the R help mailing list archive at Nabble.com.

Received on Fri 20 May 2011 - 00:44:45 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 20 May 2011 - 02:00:08 GMT.
