[R] logistic regression: wls and unbalanced samples

From: Andre Guimaraes <alsguimaraes_at_gmail.com>
Date: Tue, 26 Apr 2011 19:22:07 -0300

Greetings from Rio de Janeiro, Brazil.

I am looking for advice / references on binary logistic regression with weighted least squares (using lrm & weights), in the following context:

  1. unbalanced sample (n0=10000, n1=700);
  2. sampling weights used to rebalance the sample (w0=1, w1=14.29); and
  3. after modelling, adjust the intercept in order to reflect the expected % of 1s in the population (e.g., circa 7%, as opposed to 50%).
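
For illustration only (this is not the poster's code), the arithmetic behind steps 2 and 3 can be sketched as follows. It assumes the usual prior correction, in which only the intercept is shifted by the difference between the log-odds of the target prevalence and the log-odds of the weighted sample share, and the slopes are left unchanged. The function names and the fitted intercept of 0.0 are hypothetical placeholders.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def balancing_weight(n0, n1):
    """Weight for class 1 that equalises the weighted class totals when w0 = 1."""
    return n0 / n1

def corrected_intercept(b0_weighted, weighted_share, population_share):
    """Shift an intercept fitted on the reweighted sample to a target prevalence.

    Assumption: only the intercept needs adjusting; slopes are left as fitted.
    """
    return b0_weighted - logit(weighted_share) + logit(population_share)

# Figures from the post: n0 = 10000, n1 = 700, target prevalence about 7%.
w1 = balancing_weight(10000, 700)                # about 14.29
weighted_share = 700 * w1 / (10000 + 700 * w1)   # about 0.5 after rebalancing

# Hypothetical fitted intercept of 0.0, so the result is the pure shift.
shift = corrected_intercept(0.0, weighted_share, 0.07)
print(round(w1, 2), round(weighted_share, 3), round(shift, 3))
```

With a weighted share of one half, the shift reduces to the log-odds of the target prevalence, so the corrected intercept is the weighted-fit intercept plus log(0.07/0.93).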

I have identified references that deal with the last point, but no conclusive article or book dealing with this specific use of weights in unbalanced samples.

The area under the ROC curve is about 0.70, and the estimated probabilities are close to the observed frequencies of 1s in different ranges, which looks satisfactory. The Hosmer-Lemeshow test is not significant, as expected.
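
As a rough sketch of the two checks described above (again, not the poster's code), the area under the ROC curve can be computed as the share of (1, 0) pairs that the model ranks correctly, and calibration can be inspected by comparing mean predicted probability with the observed share of 1s within probability ranges. The function names and the toy data are hypothetical, and the pairwise AUC loop is quadratic, so it is meant for illustration rather than large samples.

```python
def auc(y, p):
    """Area under the ROC curve: share of (1, 0) pairs ranked correctly; ties count 1/2."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def calibration_table(y, p, n_bins=10):
    """Rows of (lower, upper, n, mean predicted, observed share of 1s) per range."""
    rows = []
    for k in range(n_bins):
        lo, hi = k / n_bins, (k + 1) / n_bins
        last = k == n_bins - 1  # the last range also includes p == 1.0
        idx = [i for i, pi in enumerate(p) if lo <= pi < hi or (last and pi == 1.0)]
        if idx:  # skip empty ranges
            mean_p = sum(p[i] for i in idx) / len(idx)
            observed = sum(y[i] for i in idx) / len(idx)
            rows.append((lo, hi, len(idx), mean_p, observed))
    return rows

# Toy data, purely illustrative.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

print(auc(y, p))
for row in calibration_table(y, p, n_bins=2):
    print(row)
```

If the probabilities come from a fit on the reweighted sample, the calibration comparison is only meaningful after the intercept has been adjusted to the population prevalence.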

Can someone comment on the adopted strategy, or suggest some specific bibliography that might address the issue of weights and unbalanced samples in logistic regression?

Thanks in advance,

André Guimarães

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Wed 27 Apr 2011 - 01:16:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 27 Apr 2011 - 09:00:33 GMT.
