From: John Fox <jfox_at_mcmaster.ca>

Date: Sat 24 Sep 2005 - 23:04:14 EST

John Fox

Department of Sociology

McMaster University

Hamilton, Ontario

Canada L8S 4M4

905-525-9140x23604

http://socserv.mcmaster.ca/jfox

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Sat Sep 24 23:15:38 2005

Dear Peter, Doug, and Felipe,

My effects package (on CRAN, also see the article at http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf) will compute and graph adjusted effects of various kinds for linear and generalized linear models -- generalizing so-called "least-squares means" (or "population marginal means" or "adjusted means").
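To make "adjusted mean" concrete, here is a minimal sketch in Python rather than R; it is not the effects package's own code. It assumes a purely additive model of lung function on smoking, sex, and age (the example Peter uses in the message quoted below). The coefficient values, the mean age, and the 30% proportion of males are all hypothetical. The sketch evaluates the fitted value at each smoking level with age held at its mean and the sexes weighted either equally or by an assumed sample proportion.

```python
# Hypothetical coefficients of an additive model (illustrative values only):
#   lung = b0 + b_smoker*smoker + b_male*male + b_age*age
coef = {"intercept": 4.0, "smoker": -0.4, "male": 0.6, "age": -0.03}


def adjusted_mean(smoker, prop_male, mean_age):
    """Fitted value for one smoking level, other predictors held fixed."""
    return (coef["intercept"]
            + coef["smoker"] * smoker
            + coef["male"] * prop_male
            + coef["age"] * mean_age)


mean_age = 45.0  # hypothetical sample mean of age

# Equal weighting of the sexes (half male, half female).
equal = [adjusted_mean(s, 0.5, mean_age) for s in (0, 1)]

# Weighting by a hypothetical observed proportion of males (30%).
observed = [adjusted_mean(s, 0.3, mean_age) for s in (0, 1)]

# Smoker minus non-smoker under each weighting.
diff_equal = equal[1] - equal[0]
diff_observed = observed[1] - observed[0]

print("equal weights:   ", equal, diff_equal)
print("observed weights:", observed, diff_observed)
```

Under these assumptions the difference between the two adjusted means equals the smoking coefficient whichever weighting is used; only the level of the means shifts with the choice of weights.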

A couple of comments:

By default, the all.effects() function in the effects package computes effects for high-order terms in the model, absorbing terms marginal to them. You can ask the effect() function to compute an effect for a term that's marginal to a higher-order term, and it will do so with a warning, but this is rarely sensible.

Peter's mention of floating variances (or quasi-variances) in this context is interesting, but what I would most like to see, I think, are the quasi-variances for the adjusted effects, that is, for terms merged with their lower-order relatives. These, for example, are unaffected by contrast coding. How to define reasonable quasi-variances in this context has been puzzling me for a while.
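As background to that question, here is a minimal Python sketch of ordinary quasi-variances for a single three-level factor under treatment contrasts. It does not address quasi-variances for adjusted effects, and it is not the algorithm of any particular package. The variances and covariance are hypothetical: they are the values a one-way layout with group sizes 5, 20, and 50 and unit error variance would give. With three levels, the conditions q_i + q_j = Var(contrast between i and j) can be solved exactly.

```python
# Hypothetical sampling variances and covariance of the two treatment-contrast
# coefficients (levels 2 and 3 versus reference level 1) of a three-level
# factor. Illustrative values: one-way layout, n = 5, 20, 50, error variance 1.
var_b2 = 1 / 5 + 1 / 20
var_b3 = 1 / 5 + 1 / 50
cov_b23 = 1 / 5

# Variances of the three pairwise contrasts, from the coefficient covariances.
v12 = var_b2
v13 = var_b3
v23 = var_b2 + var_b3 - 2 * cov_b23

# Quasi-variances q1, q2, q3 chosen so that q_i + q_j = v_ij for every pair.
q1 = (v12 + v13 - v23) / 2
q2 = (v12 + v23 - v13) / 2
q3 = (v13 + v23 - v12) / 2

print("contrast variances:", v12, v13, v23)
print("quasi-variances:   ", q1, q2, q3)
```

In this example the reference level is the group with the least data, so the standard errors of the two coefficients are dominated by that group, whereas each quasi-variance reflects only its own group's size.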

Regards,

John

> -----Original Message-----

> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Peter Dalgaard
> Sent: Friday, September 23, 2005 10:23 AM
> To: Douglas Bates
> Cc: Felipe; R-help@stat.math.ethz.ch
> Subject: Re: [R] Are least-squares means useful or appropriate?
>
> Douglas Bates <dmbates@gmail.com> writes:
>
> > On 9/20/05, Felipe <felipe@unileon.es> wrote:
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > Hi.
> > > My question was just theoric. I was wondering if someone who were
> > > using SAS and R could give me their opinion on the topic. I was
> > > trying to use least-squares means for comparison in R, but then I
> > > found some indications against them, and I wanted to know if they
> > > had good basis (as I told earlier, they were not much detailed).
> > > Greetings.
> > >
> > > Felipe
> >
> > As Deepayan said in his reply, the concept of least squares means is
> > associated with SAS and is not generally part of the theory of linear
> > models in statistics. My vague understanding of these (I too am not a
> > SAS user) is that they are an attempt to estimate the "mean" response
> > for a particular level of a factor in a model in which that factor has
> > a non-ignorable interaction with another factor. There is no clearly
> > acceptable definition of such a thing.
>
> (PD goes and fetches the SAS manual....)
>
> Well, yes. it'll do that too, although only if you ask for
> the lsmeans of A when an interaction like A*B is present in
> the model. This is related to the tests of main effects when
> an interaction is present using type III sums of squares,
> which has been beaten to death repeatedly on the list. In
> both cases, there seems to be an implicit assumption that
> categorical variables by nature comes from an underlying
> fully balanced design.
>
> If the interaction is absent from the model, the lsmeans are
> somewhat more sensible in that they at least reproduce the
> parameter estimates as contrasts between different groups.
> All continuous variables in the design will be set to their
> mean, but values for categorical design variables are
> weighted inversely as the number of groups. So if you're
> doing an lsmeans of lung function by smoking adjusted for age
> and sex you get estimates for the mean of a population of
> which everyone has the same age and half are male and half
> are female. This makes some sense, but if you do it for sex
> adjusting for smoking and age, you are not only forcing the
> sexes to smoke equally much, but actually adjusting to
> smoking rates of 50%, which could be quite far from reality.
>
> The whole operation really seems to revolve around 2 things:
>
> (1) pairwise comparisons between factor levels. This can alternatively
>     be done fairly easily using parameter estimates for the relevant
>     variable and associated covariances. You don't really need all the
>     mumbo-jumbo of adjusting to particular values of other variables.
>
> (2) plotting effects of a factor with error bars as if they were
>     simple group means. This has some merit since the standard
>     parametrizations are misleading at times (e.g. if you choose the
>     group with the least data as the reference level, std. err. for
>     the other groups will seem high). However, it seems to me that
>     concepts like floating variances (see float() in the Epi package)
>     are more to the point.
>
> > R is an interactive language where it is a simple matter to fit a
> > series of models and base your analysis on a model that is
> > appropriate. An approach of "give me the answer to any possible
> > question about this model, whether or not it make sense" is
> > unnecessary.
> >
> > In many ways statistical theory and practice has not caught up with
> > statistical computing. There are concepts that are regarded as part
> > of established statistical theory when they are, in fact,
> > approximations or compromises motivated by the fact that you can't
> > compute the answer you want - except now you can compute it. However,
> > that won't stop people who were trained in the old system from
> > assuming that things *must* be done in that way.
> >
> > In short, I agree with Deepayan - the best thing to do is to ask
> > someone who uses SAS and least squares means to explain to you what
> > they are.
>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
This archive was generated by hypermail 2.1.8: Sun 23 Oct 2005 - 17:42:46 EST