Re: [R] Multistage Sampling

From: <markleeds_at_verizon.net>
Date: Sat 08 Jul 2006 - 05:17:40 EST


>From: Mark Hempelmann <e.rehak@t-online.de>
>Date: Fri Jul 07 14:05:29 CDT 2006
>To: r-help@stat.math.ethz.ch
>Subject: [R] Multistage Sampling

i also find it a truly amazing group. the general kindness and generosity of everyone is beyond belief. it will be a long time coming, but i hope i can help some day too. unfortunately, i can't help you with your question either.

mark

>Dear WizaRds, dear Thomas,
>
> First of all, I want to tell you how grateful I am for all your
>support. I hope I will be able to help others along one day the same way
>you do. Thank you so much. I am struggling with a multistage sampling
>design:
>
>library(survey)
>multi3 <- data.frame(
>  cluster = c(1,1,1,1, 2,2,2, 3,3),
>  id      = c(1,2,3,4, 1,2,3, 1,2),
>  nl      = c(4,4,4,4, 3,3,3, 2,2),
>  Nl      = c(100,100,100,100, 50,50,50, 75,75),
>  M       = rep(23, 9),
>  y       = c(23,33,77,25, 35,74,27, 37,72)
>)
>
>dmulti3 <- svydesign(id=~cluster+id, fpc=~M+Nl, data=multi3)
>svymean (~y, dmulti3)
>    mean     SE
>y 45.796 5.5483
>
>svytotal(~y, dmulti3)
>  total    SE
>y 78999 13643
>
>and I estimate the population size as N = M/m * sum(Nl) =
>23/3*(100+50+75) = 1725. With this, my variance estimator is:
>y1<-mean(multi3$y[1:4]) # 39.5
>y2<-mean(multi3$y[5:7]) # 45.33
>y3<-mean(multi3$y[8:9]) # 54.5
>
>yT1<-100*y1 # 3950 total cluster 1
>yT2<-50*y2 # 2266.67 total cluster 2
>yT3<-75*y3 # 4087.5 total cluster 3
>ybarT<-1/3*sum(yT1,yT2,yT3) # 3434.722
>s1 <- var(multi3$y[1:4]) # 643.67 var cluster 1
>s2 <- var(multi3$y[5:7]) # 632.33 var cluster 2
>s3 <- var(multi3$y[8:9]) # 612.5 var cluster 3
>
>var.yT <- 23^2*( 20/23*1/6*sum(
>(yT1-ybarT)^2,(yT2-ybarT)^2,(yT3-ybarT)^2 ) +
>1/69 * sum(100*96*s1, 50*47*s2, 75*73*s3) ) # 242 101 517
>
>but
>var.yT/1725^2 = 81.36157
>SE = 9.02006,
>but it should be SE=13643/1725=7.90899
>
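A possible explanation for the gap, offered as a sketch rather than a confirmed answer: the usual two-stage variance estimator for a total divides each within-cluster term by the cluster sample size nl, and the hand formula above leaves that division out. The base R check below uses only the numbers from the message. The variable names (between, within, var_quoted) are illustrative and do not come from the survey package.

```r
# Two-stage sample from the quoted message: m = 3 of M = 23 clusters
y  <- list(c(23, 33, 77, 25), c(35, 74, 27), c(37, 72))
Nl <- c(100, 50, 75)   # cluster population sizes
nl <- lengths(y)       # cluster sample sizes: 4, 3, 2
M  <- 23               # clusters in the population
m  <- length(y)        # clusters in the sample

yT <- Nl * sapply(y, mean)   # estimated cluster totals
s2 <- sapply(y, var)         # within-cluster sample variances

total_hat <- M / m * sum(yT)  # about 78999

# Between-cluster term: M^2 * (1 - m/M) * s_u^2 / m
between <- M^2 * (1 - m / M) * var(yT) / m

# Within-cluster term, including the division by nl
within <- M / m * sum(Nl^2 * (1 - nl / Nl) * s2 / nl)

se_total <- sqrt(between + within)  # about 13643, the SE printed by svytotal

# The hand formula from the message, i.e. the same term without "/ nl"
within_quoted <- M / m * sum(Nl * (Nl - nl) * s2)
var_quoted    <- between + within_quoted   # roughly 242.1 million
se_quoted     <- sqrt(var_quoted) / 1725   # about 9.02

print(c(total = total_hat, se = se_total, se_quoted = se_quoted))
```

With the division by nl in place, the hand calculation lands on the svytotal SE, so the two approaches appear to agree for the total.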
>Is this calculation correct? I remember svytotal using a different
>variance estimator compared to svymean, and that svytotal gives the
>unbiased estimation. To solve the problem, I went ahead and tried to
>calibrate the design object, telling Survey the population total N=1725:
>
>dmulti3.cal <- calibrate(dmulti3, ~1, pop=1725)
>svymean (~y, dmulti3.cal)
>    mean     SE
>y 45.796 5.5483
>
>svytotal(~y, dmulti3.cal)
>  total     SE
>y 78999 9570.7
>
>This indeed gives me the computed svymean SE, but alas, I still don't
>know why my variance is so different. I think it might have something to do
>with a differently computed N and the fact that your estimator formula
>is a different one. Since I calculated the Taylor series solution, I
>suppose there must be another approach? The calibration help page tells
>me to enter a list of population total vectors for each cluster, which
>would result in:
>
>dmulti3.cal <- calibrate(dmulti3, ~1, pop=c(100,50,75))
>Error in regcalibrate.survey.design2(design, formula, population,
>aggregate.stage = aggregate.stage, :
>Population and sample totals are not the same length.
>
>I am very grateful for your help and wish you all the best
>Yours
>mark
>
>______________________________________________
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
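
The printed results in the quoted message can also be cross-checked with plain arithmetic. The sketch below uses only the rounded figures shown there. It suggests, as an interpretation rather than something stated in the thread, that svymean behaves like a ratio of the estimated total to the estimated population size, which would explain why its SE is not simply the svytotal SE divided by 1725.

```r
# Figures copied from the printed output in the quoted message
N_hat     <- 23 / 3 * (100 + 50 + 75)  # estimated population size, 1725
total_hat <- 78999                     # svytotal estimate
se_total  <- 13643                     # svytotal SE, uncalibrated design
se_cal    <- 9570.7                    # svytotal SE after calibrating to N = 1725

mean_from_total <- total_hat / N_hat   # about 45.796, the svymean estimate
se_naive        <- se_total / N_hat    # about 7.909, treats N as known and fixed
se_from_cal     <- se_cal / N_hat      # about 5.548, close to the svymean SE

print(c(mean = mean_from_total, se_naive = se_naive, se_cal = se_from_cal))
```

The calibrated SE divided by 1725 is close to the 5.5483 reported by svymean, while the uncalibrated SE divided by 1725 gives the 7.909 the message expected.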



Received on Sat Jul 08 05:22:03 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 08 Jul 2006 - 06:16:01 EST.
