From: Frank Harrell <f.harrell_at_vanderbilt.edu>

Date: Mon, 16 May 2011 06:01:16 -0700 (PDT)

>> I think you are doing this correctly except for one thing. The validation
>> and other inferential calculations should be done on the full model. Use
>> the approximate model to get a simpler nomogram, but not to get standard
>> errors. Since you are only dropping one variable, you might consider just
>> running the nomogram on the entire model.
>> Frank
>>
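[Editorial sketch, not part of the original exchange: Frank's suggestion to run the nomogram on the entire model could be carried out roughly as below, assuming full.model is the penalized lrm fit discussed later in the thread and that a datadist has already been set up. Object names are hypothetical.]

  full.model.nom <- nomogram(full.model, fun = plogis,
                             fun.at = c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95),
                             funlabel = "Predicted probability")
  plot(full.model.nom)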

>> KH wrote:
>>>
>>> Hi,
>>>
>>> I am trying to construct a logistic regression model from my data (104
>>> patients, 25 events). I built a full model consisting of five predictors,
>>> using penalization via the rms package (lrm, pentrace, etc.) because of the
>>> events-per-variable issue. Then I tried to approximate the full model by a
>>> step-down technique, predicting the full model's linear predictor L from
>>> all of the component variables using ordinary least squares (ols in the
>>> rms package), as follows. I would like to know whether I am doing this
>>> right.
>>>
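[Editorial sketch, not from the original posts: the penalized full model (full.model) used below is not shown in the thread. With rms it would typically be built along these lines; the data frame, outcome name, and penalty grid here are hypothetical.]

  library(rms)
  dd <- datadist(mydata); options(datadist = "dd")   # hypothetical data frame
  f <- lrm(outcome ~ stenosis + x1 + x2 + ClinicalScore + procedure,
           data = mydata, x = TRUE, y = TRUE)        # unpenalized fit first
  p <- pentrace(f, seq(0, 20, by = 0.5))             # scan candidate penalties
  full.model <- update(f, penalty = p$penalty)       # refit with the chosen penalty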

>>>> library(rms)
>>>> plogit <- predict(full.model)   # linear predictor of the penalized full model
>>>> full.ols <- ols(plogit ~ stenosis + x1 + x2 + ClinicalScore + procedure,
>>>>                 sigma = 1)
>>>> fastbw(full.ols, aics = 1e10)   # rank variables by loss of predictive information

>>>
>>>  Deleted       Chi-Sq d.f. P      Residual d.f. P      AIC    R2
>>>  stenosis        1.41 1    0.2354   1.41   1    0.2354  -0.59 0.991
>>>  x2             16.78 1    0.0000  18.19   2    0.0001  14.19 0.882
>>>  procedure      26.12 1    0.0000  44.31   3    0.0000  38.31 0.711
>>>  ClinicalScore  25.75 1    0.0000  70.06   4    0.0000  62.06 0.544
>>>  x1             83.42 1    0.0000 153.49   5    0.0000 143.49 0.000
>>>

>>> Then I fitted an approximation to the full model using the most important
>>> variables (keeping enough variables that the R^2 for predictions from the
>>> reduced model against the full model's predictions does not drop below
>>> 0.95), that is, dropping "stenosis".
>>>

>>>> full.ols.approx <- ols(plogit ~ x1 + x2 + ClinicalScore + procedure)
>>>> full.ols.approx$stats
>>>            n   Model L.R.        d.f.          R2           g       Sigma
>>>  104.0000000  487.9006640   4.0000000   0.9908257   1.3341718   0.1192622
>>>
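[Editorial sketch, not from the original posts: the approximation accuracy can also be checked directly as the squared correlation between the reduced-model predictions and the full model's linear predictor; it should agree with the R2 reported by ols() above.]

  cor(predict(full.ols.approx), plogit)^2   # ~0.99 here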

>>> This approximate model had an R^2 against the full model of 0.99.
>>> Therefore, I updated the original full logistic model, dropping "stenosis"
>>> as a predictor.
>>>
>>>> full.approx.lrm <- update(full.model, ~ . - stenosis)
>>>

>>>> validate(full.model, bw=F, B=1000)
>>>            index.orig training   test optimism index.corrected    n
>>>  Dxy           0.6425   0.7017 0.6131   0.0887          0.5539 1000
>>>  R2            0.3270   0.3716 0.3335   0.0382          0.2888 1000
>>>  Intercept     0.0000   0.0000 0.0821  -0.0821          0.0821 1000
>>>  Slope         1.0000   1.0000 1.0548  -0.0548          1.0548 1000
>>>  Emax          0.0000   0.0000 0.0263   0.0263          0.0263 1000
>>>

>>>> validate(full.approx.lrm, bw=F, B=1000)
>>>            index.orig training   test optimism index.corrected    n
>>>  Dxy           0.6446   0.6891 0.6265   0.0626          0.5820 1000
>>>  R2            0.3245   0.3592 0.3428   0.0164          0.3081 1000
>>>  Intercept     0.0000   0.0000 0.1281  -0.1281          0.1281 1000
>>>  Slope         1.0000   1.0000 1.1104  -0.1104          1.1104 1000
>>>  Emax          0.0000   0.0000 0.0444   0.0444          0.0444 1000
>>>
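[Editorial note, not from the original posts: the bias-corrected Dxy values above convert to an optimism-corrected c-index (area under the ROC curve) via c = (Dxy + 1)/2.]

  (0.5539 + 1) / 2   # full model:        corrected c of about 0.78
  (0.5820 + 1) / 2   # approximate model: corrected c of about 0.79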

>>> Validation showed this approximation was not bad.
>>> Then I made a nomogram:
>>>
>>>> full.approx.lrm.nom <- nomogram(full.approx.lrm,
>>>      fun.at = c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95), fun = plogis)
>>>> plot(full.approx.lrm.nom)
>>>

>>> And another nomogram from the ols approximation:
>>>
>>>> full.ols.approx.nom <- nomogram(full.ols.approx,
>>>      fun.at = c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95), fun = plogis)
>>>> plot(full.ols.approx.nom)
>>>

>>> These two nomograms are very similar, but slightly different.
>>>
>>> My questions are:
>>>
>>> 1. Am I doing this right?
>>>
>>> 2. Which nomogram is correct?
>>>
>>> I would appreciate your help.
>>>

>>> --
>>> KH
>>>


>> -----
>> Frank Harrell
>> Department of Biostatistics, Vanderbilt University



The choice is not clear, and requires some simulations to estimate the average
absolute error of the covariance matrix estimators.

Frank

細田弘吉 wrote:

> Thank you for your reply, Prof. Harrell.
>
> I agree with you. Dropping only one variable does not actually help a lot.
>
> I have one more question. During analysis of this model I found that the
> confidence intervals (CIs) of some coefficients obtained by bootstrapping
> (the bootcov function in the rms package) were narrower than the CIs from
> the usual variance-covariance matrix, while the CIs of other coefficients
> were wider. My data have no cluster structure. I am wondering which CIs are
> better. I would guess the bootstrap ones, but is that right?
>
> I would appreciate your help.
> --
> KH
>
> (11/05/16 12:25), Frank Harrell wrote:

>> I think you are doing this correctly except for one thing. [...]


Frank Harrell

Department of Biostatistics, Vanderbilt University
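[Editorial sketch, not from the original posts: a rough way to compare the two kinds of confidence intervals KH asks about, using the full penalized fit. It assumes full.model was fit with x=TRUE and y=TRUE (required by bootcov); this only illustrates the mechanics, not which interval is preferable.]

  ## Wald-type 0.95 CIs from the usual information-matrix covariance estimate
  b  <- coef(full.model)
  se <- sqrt(diag(vcov(full.model)))
  wald.ci <- cbind(lower = b - 1.96 * se, upper = b + 1.96 * se)

  ## the same intervals after replacing the covariance matrix with the
  ## bootstrap estimate from bootcov
  boot.fit <- bootcov(full.model, B = 1000)
  bse <- sqrt(diag(vcov(boot.fit)))
  boot.ci <- cbind(lower = b - 1.96 * bse, upper = b + 1.96 * bse)

  round(cbind(wald.ci, boot.ci), 3)   # compare interval widths coefficient by coefficient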

