Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
Date: Mon, 21 Jul 2008 13:41:10 -0500

Michal Figurski wrote:
> Hello all,
>
> I am trying to optimize my logistic regression model by using the
> bootstrap. I previously used SAS for this kind of task, but I am now
> switching to R.
>
> My data frame consists of 5 columns and has 109 rows. Each row is a
> single record composed of the following values: Subject_name, numeric1,
> numeric2, numeric3 and outcome (yes or no). All three numerics are used
> to predict outcome using LR.
>
> In SAS I wrote a macro that split the dataset, ran LR on one half of
> the data and made predictions on the second half. It then collected
> the equation coefficients from each bootstrap iteration. Later I just
> took the medians of these coefficients across all iterations and used
> them as the optimal model - it really worked well!

Why not use maximum likelihood estimation, i.e., the coefficients from the original fit? How does the bootstrap improve on that?
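
One way to look at this question empirically is to put the two sets of coefficients side by side. The following is only a minimal sketch: the data frame d is simulated to resemble the data described in the question (109 rows, three numeric predictors, a binary outcome), and the simulated values, the seed and the number of resamples B are assumptions made for illustration, not anything from the original data.

```r
# Simulated stand-in for the data described in the question; all values
# here are made up for illustration.
set.seed(1)
n <- 109
d <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
d$outcome <- rbinom(n, 1, plogis(-0.5 + d$numeric1 - 0.5 * d$numeric2))

# Maximum likelihood fit on all rows: the "original fit"
fit <- glm(outcome ~ numeric1 + numeric2 + numeric3,
           family = binomial, data = d)
ml_coef <- coef(fit)

# Refit on B bootstrap resamples of the rows and keep the coefficients
B <- 200
boot_coef <- t(replicate(B, {
  i <- sample(n, replace = TRUE)          # resample row indices
  coef(glm(outcome ~ numeric1 + numeric2 + numeric3,
           family = binomial, data = d[i, ]))
}))

# ML estimates next to the medians of the bootstrap coefficients
print(cbind(ML = ml_coef, boot_median = apply(boot_coef, 2, median)))
```

Here coef() returns the coefficients in the same form as printed by glm, and boot_coef holds one row per resample and one column per coefficient.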

>
> Now I want to do the same in R. I tried to use the 'validate' or
> 'calibrate' functions from package "Design", and I also experimented
> with function 'sm.binomial.bootstrap' from package "sm". I also tried
> the function 'boot' from package "boot", though without success - in my
> case it randomly selected _columns_ from my data frame, while I wanted
> it to select _rows_.

validate and calibrate in Design resample the rows (observations) of the data.
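
For the 'boot' function mentioned in the question, which rows or columns get resampled depends on how the statistic function applies the index vector it receives. The sketch below assumes a data frame like the one described (simulated here, so the values are made up) and a hypothetical helper named coef_stat; indexing with data[idx, ] selects rows.

```r
library(boot)  # recommended package distributed with R

# Simulated stand-in for the data described in the question
set.seed(2)
n <- 109
d <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
d$outcome <- rbinom(n, 1, plogis(-0.5 + d$numeric1 - 0.5 * d$numeric2))

# boot() calls the statistic with the original data and a vector of
# resampled indices; data[idx, ] applies them to the rows.
coef_stat <- function(data, idx) {
  coef(glm(outcome ~ numeric1 + numeric2 + numeric3,
           family = binomial, data = data[idx, ]))
}

b <- boot(d, coef_stat, R = 200)
print(b$t0)      # coefficients from the original data
print(dim(b$t))  # one row per resample, one column per coefficient
```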

Resampling is mainly used to get a nearly unbiased estimate of the model performance, i.e., to correct for overfitting.
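
The idea of correcting an apparent performance estimate for overfitting can be sketched in base R. This is an illustration of the general optimism bootstrap, not the code used by validate in Design; the simulated data, the choice of the area under the ROC curve as the index, the helper name auc and the 200 resamples are all assumptions made for the example.

```r
# Simulated stand-in for the data described in the question
set.seed(3)
n <- 109
d <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
d$outcome <- rbinom(n, 1, plogis(-0.5 + d$numeric1 - 0.5 * d$numeric2))
form <- outcome ~ numeric1 + numeric2 + numeric3

# Area under the ROC curve via the rank (Mann-Whitney) formula
auc <- function(p, y) {
  r <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Apparent performance: model fitted and evaluated on the same rows
full_fit <- glm(form, family = binomial, data = d)
apparent <- auc(fitted(full_fit), d$outcome)

# Optimism: performance on the bootstrap sample minus performance of
# that same bootstrap model on the original data
optimism <- replicate(200, {
  bs <- d[sample(n, replace = TRUE), ]
  f <- glm(form, family = binomial, data = bs)
  auc(fitted(f), bs$outcome) -
    auc(predict(f, newdata = d, type = "response"), d$outcome)
})

corrected <- apparent - mean(optimism)
print(c(apparent = apparent, corrected = corrected))
```

Note that the coefficients reported at the end are still those of full_fit, the fit on all rows; the resampling only adjusts the performance estimate.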

Frank Harrell

>
> Though the main point here is the optimized LR equation. I would
> appreciate any help on how to extract the LR equation coefficients from
> any of these bootstrap functions, in the same form as given by 'glm' or
> 'lrm'.
>
> Many thanks in advance!
>

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 21 Jul 2008 - 18:45:35 GMT

