[R] SURVEY PREDICTED SEs: Problem

From: <mo2259_at_columbia.edu>
Date: Wed 26 Jul 2006 - 23:50:47 EST

Hello R-list,
I'm attempting to migrate from Stata to R for my complex survey work. It has been straight-forward so far except for the following problem:

I have some code below, but first I'll describe the problem.

When I compute predicted logits from a logistic regression, the standard errors of the predicted logits are way off (but the predicted logits are fine). Furthermore, the model logit coefficients have appropriate SEs. As a comparison, I ran the same model without the survey design; the predicted SEs come out fine.

Here is example code (first no survey design model and predictions; then survey design model and predictions):

> #MODEL COEF. ESTIMATES (NO SURVEY DESIGN)
> model.l.nosvy <- glm(qn58~t8l,data=all.stratum,family=binomial)
> summary(model.l.nosvy)

Call:
glm(formula = qn58 ~ t8l, family = binomial, data = all.stratum)

Deviance Residuals:

   Min 1Q Median 3Q Max
-1.310 -1.245 1.050 1.111 1.158

Coefficients:

             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.175890   0.006176   28.48   <2e-16 ***
t8l         -0.018643   0.001376  -13.55   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 145934  on 105857  degrees of freedom
Residual deviance: 145750  on 105856  degrees of freedom
AIC: 145754

Number of Fisher Scoring iterations: 3


>
> #PREDICTED SEs
> phat.l.se.logit.nosvy <- predict(model.l.nosvy,se=TRUE)
> as.matrix(table(phat.l.se.logit.nosvy$se.fit))
[,1] 0.00632408017609573 14456 0.00633130215261306 15188 0.00741988836010757 12896 0.00743834214717549 10392 0.00923404822144662 13207 0.00925875968615561 15864 0.0114294663004145 12235 0.0114574202170594 11620
>
> #MODEL COEF. ESTIMATES (SURVEY DESIGN)
> model.l <- svyglm(qn58~t8l,design=all.svy,family=binomial)
> summary(model.l)
Call: svyglm(qn58 ~ t8l, design = all.svy, family = binomial) Survey design: svydesign(id = ~psu, strata = ~stratum, weights = ~weight, data = all.stratum, nest = T) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.016004 0.023267 -0.688 0.492 t8l -0.024496 0.004941 -4.958 1.13e-06 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 0.934964) Number of Fisher Scoring iterations: 2
>
> #PREDICTED SEs
> phat.l.logit.se <- predict(model.l,se=TRUE)
> as.matrix(table(phat.l.logit.se$se.fit))
[,1] 2.04867522818685 15188 2.05533753780321 14456 2.39885304369985 10392 2.41588959524594 12896 2.98273190185571 15864 3.00556161422958 13207 3.69102305734136 11620 3.71685978156846 12235 #THESE SEs are too large. ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Wed Jul 26 23:56:24 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 27 Jul 2006 - 02:16:31 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.