From: Peter Flom <flom_at_ndri.org>

Date: Tue 13 Jul 2004 - 05:17:06 EST

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!

http://www.R-project.org/posting-guide.html

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Tue Jul 13 05:28:06 2004

Date: Tue 13 Jul 2004 - 05:17:06 EST

There's also an article by Agresti and Coull

author = {Alan Agresti and B. A. Coull}, title = {Approximate is better than "exact" for interval estimation of binomial proportions},

journal = {American Statistician},

year = {1998},

volume = {52},

number = {},

pages = {119-126},

which may be of interest.

Peter

Peter L. Flom, PhD

Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research

National Development and Research Institutes
71 W. 23rd St

www.peterflom.com

New York, NY 10010

(212) 845-4485 (voice)

(917) 438-0894 (fax)

*>>> Spencer Graves <spencer.graves@pdf.com> 7/12/2004 3:07:00 PM >>>
*

According to Brown, Cai and DasGupta (cited below), the "exact" confidence intervals are hyperconservative, as they are designed to produce actual coverage probabilities at least the nominal. Thus for a

95% confidence interval, the actual coverage could be 98% or more,
depending on the true but unknown proportion; please check their
papers

for exact numbers. They report that the Wilson procedure performs
reasonably well, as does the asymptotic logit procedure. I can't say
without checking, but I would naively expect that confint.glm would
likely also be among the leaders.

By the way, confint.glm is independent of the parameterization, assuming 2*log(likelihood ratio) is approximately chi-square. It is therefore subject to intrinsic nonlinearity but is at least free of parameter effects (see, e.g., Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley)). To check this, consider the following:

fit10c <- glm(y~1, family=binomial(link=cloglog), data=DF10,
weights=size)

fit100c <- glm(y~1, family=binomial(link=cloglog), data=DF100,
weights=size)

(CI10c <- confint(fit10c))

(CI100c <- confint(fit100c))

*> 1-exp(-exp(CI10c))
*

2.5 % 97.5 %

0.005989334 0.371562793

*> 1-exp(-exp(CI100c))
*

2.5 % 97.5 %

0.05140762 0.16875918

These are precisely the number reported below with the default binomial link = logit.

hope this helps. spencer graves

Chuck Cleland wrote:

> Darren also might consider binconf() in library(Hmisc).

*>
**> > library(Hmisc)
**>
**> > binconf(1, 10, method="all")
**> PointEst Lower Upper
**> Exact 0.1 0.002528579 0.4450161
**> Wilson 0.1 0.005129329 0.4041500
**> Asymptotic 0.1 -0.085938510 0.2859385
**>
**> > binconf(10, 100, method="all")
**> PointEst Lower Upper
**> Exact 0.1 0.04900469 0.1762226
**> Wilson 0.1 0.05522914 0.1743657
**> Asymptotic 0.1 0.04120108 0.1587989
**>
**> Spencer Graves wrote:
**>
**>> Please see:
**>> Brown, Cai and DasGupta (2001) Statistical Science, 16:
*

101-133

>> and (2002) Annals of Statistics, 30: 160-2001

*>> They show that the actual coverage probability of the standard
*

>> approximate confidence intervals for a binomial proportion are quite

>> poor, while the standard asymptotic theory applied to logits

produces

>> rather better answers.

*>> I would expect "confint.glm" in library(MASS) to give decent
**>> results, possibly the best available without a very careful study of
*

>> this particular question. Consider the following:

*>> library(MASS)# needed for confint.glm
**>> library(boot)# needed for inv.logit
**>> DF10 <- data.frame(y=.1, size=10)
**>> DF100 <- data.frame(y=.1, size=100)
**>> fit10 <- glm(y~1, family=binomial, data=DF10, weights=size)
**>> fit100 <- glm(y~1, family=binomial, data=DF100, weights=size)
**>> inv.logit(coef(fit10))
**>>
**>> (CI10 <- confint(fit10))
**>> (CI100 <- confint(fit100))
**>>
**>> inv.logit(CI10)
**>> inv.logit(CI100)
**>>
**>> In R 1.9.1, Windows 2000, I got the following:
**>>
**>>> inv.logit(coef(fit10))
**>>
**>>
**>> (Intercept)
**>> 0.1
**>>
**>>>
**>>> (CI10 <- confint(fit10))
**>>
**>>
**>> Waiting for profiling to be done...
**>> 2.5 % 97.5 %
**>> -5.1122123 -0.5258854
**>>
**>>> (CI100 <- confint(fit100))
**>>
**>>
**>> Waiting for profiling to be done...
**>> 2.5 % 97.5 %
**>> -2.915193 -1.594401
**>>
**>>>
**>>> inv.logit(CI10)
**>>
**>>
**>> 2.5 % 97.5 %
**>> 0.005986688 0.371477058
**>>
**>>> inv.logit(CI100)
**>>
**>>
**>> 2.5 % 97.5 %
**>> 0.0514076 0.1687655
**>>
**>>>
**>>> (naiveCI10 <- .1+c(-2, 2)*sqrt(.1*.9/10))
**>>
**>>
**>> [1] -0.08973666 0.28973666
**>>
**>>> (naiveCI100 <- .1+c(-2, 2)*sqrt(.1*.9/100))
**>>
**>>
**>> [1] 0.04 0.16
**>
**>
*

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!

http://www.R-project.org/posting-guide.html

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Tue Jul 13 05:28:06 2004

*
This archive was generated by hypermail 2.1.8
: Wed 03 Nov 2004 - 22:54:52 EST
*