From: John Fox <jfox_at_mcmaster.ca>

Date: Tue, 11 Mar 2008 07:39:14 -0400

John Fox, Professor

Department of Sociology

McMaster University

Hamilton, Ontario, Canada L8S 4M4

905-525-9140x23604

http://socserv.mcmaster.ca/jfox

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Tue 11 Mar 2008 - 11:45:35 GMT


Dear JRG, Rolf, Ben, and Peter,

"Frequency" weights, possibly even non-integer weights, are useful for surveys where observations are sampled with unequal probabilities of selection. The approach in SPSS gives correct point estimates in this situation but incorrect standard errors. The survey package, for example, provides a better solution.

Regards,

John
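A minimal sketch of the contrast described above, assuming the survey package is installed. The data frame `d`, its columns, and the weights `w` are made up for illustration, and the design used here (no clustering, no stratification) is only an assumption, not something specified in this thread.

```r
# Hypothetical data: w holds sampling weights (inverse selection probabilities).
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)
d$w <- runif(50, min = 1, max = 5)   # non-integer weights are allowed

# Weighted least squares: the point estimates are fine, but the standard
# errors treat w as precision weights rather than sampling weights.
fit_lm <- lm(y ~ x, data = d, weights = w)

if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  # Assumed design: independent sampling (ids = ~1) with unequal weights.
  des <- svydesign(ids = ~1, weights = ~w, data = d)
  fit_svy <- svyglm(y ~ x, design = des)

  # Same point estimates from both fits.
  print(cbind(lm = coef(fit_lm), svyglm = coef(fit_svy)))
  # Standard errors generally differ: svyglm uses design-based estimates.
  print(cbind(lm = sqrt(diag(vcov(fit_lm))),
              svyglm = sqrt(diag(vcov(fit_svy)))))
}
```

For a real survey, the `ids`, `strata`, and `fpc` arguments of `svydesign()` would need to reflect the actual sampling design.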


> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of JRG
> Sent: March-10-08 10:27 PM
> To: Rolf Turner; r-help_at_r-project.org; Ben Domingue
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Mimicking SPSS weighted least squares
>
> On 11 Mar 2008 at 14:09, Rolf Turner wrote:
>
> > It would appear that the SPSS procedure would then give exactly the same
> > point estimates of the parameters, and change the inference structure by
> > changing the ``denominator degrees of freedom'' from n-p to sum(w) - p.
>
> Well, if that IS what SPSS does, then it sounds like what Stata calls
> frequency weights, the general idea being that each "observation" in fact
> represents some non-negative number (w) of actual observations that have
> identical values. Not much more than a glorified version of a frequency
> distribution table.
>
> I don't see anything fundamentally wrong with frequency weights, given
> an appropriate situation.
>
> ---JRG
>
> John R. Gleason
>
> > This seems to me to make little sense ... But then, it ***is***
> > SPSS. :-)
> >
> > cheers,
> >
> > Rolf
> >
> > On 11/03/2008, at 11:35 AM, Peter Dalgaard wrote:
> >
> > > Rolf Turner wrote:
> > >> On 11/03/2008, at 4:04 AM, Ben Domingue wrote:
> > >>
> > >>> Howdy,
> > >>> In SPSS, there are 2 ways to weight a least squares regression:
> > >>> 1. You can do it from the regression menu.
> > >>> 2. You can set a global weight switch from the data menu.
> > >>> These two options have not, in my experience, been equivalent.
> > >>> Now, when I run lm in R with the weights= switch set accordingly, I
> > >>> get the same set of results you would see with option #1 in SPSS.
> > >>> Does anybody know how to duplicate option #2 from SPSS in R?
> > >>
> > >> I think it's up to you to find out what ``option #2 from SPSS''
> > >> actually *does*. If you know that, then you can (with a modicum of
> > >> effort) duplicate that option in R. The help file for lm() tells you
> > >> that R uses the weights by minimizing sum(w*e^2) where w = weights and
> > >> e = ``errors'' or residuals.
> > >
> > > I believe case weighting in SPSS effectively replicates the
> > > relevant row (not sure if anything sensible comes out if weights
> > > are non-integer). So
> > >
> > > lm(...., data=mydata[rep(1:nrow(mydata),w),])
> > >
> > > or thereabouts should do it. Might not be too efficient though.
> > >
> > > --
> > > O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
> > > c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> > > (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> > > ~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk) FAX: (+45) 35327907
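The row-replication idea and the degrees-of-freedom remark quoted above can be checked with a small base R sketch. The data, the integer weights `w`, and the object names below are hypothetical, and whether SPSS case weighting behaves exactly this way is the thread's conjecture rather than something verified here.

```r
# Hypothetical data: each row stands for w identical cases (integer weights).
set.seed(42)
mydata <- data.frame(x = rnorm(20))
mydata$y <- 3 - 1.5 * mydata$x + rnorm(20)
w <- sample(1:4, 20, replace = TRUE)

# Weighted least squares via weights=, minimizing sum(w * e^2).
fit_w <- lm(y ~ x, data = mydata, weights = w)

# Row replication as suggested in the thread: row i appears w[i] times.
expanded <- mydata[rep(seq_len(nrow(mydata)), times = w), ]
fit_rep <- lm(y ~ x, data = expanded)

# Same coefficients, different residual degrees of freedom
# (n - p for the weighted fit, sum(w) - p for the replicated fit).
print(all.equal(coef(fit_w), coef(fit_rep)))
print(c(weighted = df.residual(fit_w), replicated = df.residual(fit_rep)))

# Rescaling the weighted fit's covariance by the df ratio gives the
# replicated fit's standard errors without building the expanded data.
p <- length(coef(fit_w))
se_freq <- sqrt(diag(vcov(fit_w)) * (nrow(mydata) - p) / (sum(w) - p))
print(all.equal(se_freq, sqrt(diag(vcov(fit_rep)))))
```

The rescaling step avoids the inefficiency of expanding the data, and unlike `rep()` it is also defined for non-integer weights, though the interpretation of such weights as frequencies is then less clear.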


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Tue 11 Mar 2008 - 16:30:21 GMT.
