Re: [R] Are least-squares means useful or appropriate?

From: Douglas Bates <>
Date: Sat 24 Sep 2005 - 00:00:26 EST

On 9/20/05, Felipe <> wrote:
> Hi.
> My question was just theoretical. I was wondering if someone who was using
> SAS and R could give me their opinion on the topic. I was trying to use
> least-squares means for comparison in R, but then I found some
> indications against them, and I wanted to know if they had a good basis
> (as I said earlier, they were not very detailed).
> Greetings.
> Felipe

As Deepayan said in his reply, the concept of least squares means is associated with SAS and is not generally part of the theory of linear models in statistics. My vague understanding of these (I too am not a SAS user) is that they are an attempt to estimate the "mean" response for a particular level of a factor in a model in which that factor has a non-ignorable interaction with another factor. There is no clearly acceptable definition of such a thing.
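A small numerical sketch may make the difficulty concrete. It assumes one common reading of a "least squares mean" in a two-factor model with interaction: the equally weighted average of the fitted cell means over the levels of the other factor. The data, the factor names A and B, and the helper names `raw_mean` and `equal_weight_mean` are all made up for illustration, and plain Python is used only to keep the arithmetic self-contained; this is not SAS's implementation.

```python
from collections import defaultdict
from statistics import mean

# Made-up, unbalanced two-factor data: (level of A, level of B, response).
observations = [
    ("a1", "b1", 10.0), ("a1", "b1", 12.0),
    ("a1", "b2", 20.0),
    ("a2", "b1", 14.0),
    ("a2", "b2", 30.0), ("a2", "b2", 32.0), ("a2", "b2", 34.0),
]

# Cell means: with a full A*B interaction these are the fitted values.
cells = defaultdict(list)
for a, b, y in observations:
    cells[(a, b)].append(y)
cell_means = {key: mean(ys) for key, ys in cells.items()}


def raw_mean(level):
    """Observed mean for one level of A; weights B by the cell counts."""
    return mean([y for a, _, y in observations if a == level])


def equal_weight_mean(level):
    """Assumed 'LS mean': average of the cell means, each B level weighted equally."""
    return mean([m for (a, _), m in cell_means.items() if a == level])


for level in ("a1", "a2"):
    print(level, raw_mean(level), equal_weight_mean(level))
```

With these numbers the difference between a2 and a1 is 3 at b1 and 12 at b2, so any single "mean for a1" depends entirely on how the levels of B are weighted. The two weightings above give different answers, and neither is obviously the right one.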

To understand why there should be an attempt to answer a question that doesn't make sense, remember the history of SAS, which was developed in the era of punched cards and magnetic tape. Beneath the surface of SAS with its GUI, etc. is the fundamental assumption that your data are on a reel of magnetic tape over in the "Computer Center" that houses an IBM System/360 computer and that the way you are going to use this program is by keypunching a deck of punched cards, putting some mysterious JCL (the IBM Job Control Language which no one understood and you learned only by imitation) cards at the beginning and end, and submitting them at the I/O Window. The next day you will go to the computer center to pick up your output only to discover that you had a JCL error. You will spend most of the morning tracking down the one person on campus who can tell you that "ERROR IEH92345" was caused by the blank between the "DD" and the "*" in the card that reads //SYSIN DD * so you change that and submit again. After two or three days of this you get the JCL right but discover that you have a syntax error in your SAS code. Another two or three cycles finally get you to the point where you have a card deck that runs and produces output. At that point you don't really care if the output makes sense or not - all you want is some numbers for the report that is now a week overdue. You also want all the numbers that you might possibly need, which is why SAS PROCs always have the potential to produce tons of output if you ask for it.

R is an interactive language where it is a simple matter to fit a series of models and base your analysis on a model that is appropriate. An approach of "give me the answer to any possible question about this model, whether or not it makes sense" is unnecessary.

In many ways statistical theory and practice have not caught up with statistical computing. There are concepts that are regarded as part of established statistical theory when they are, in fact, approximations or compromises motivated by the fact that you couldn't compute the answer you wanted - except now you can compute it. However, that won't stop people who were trained in the old system from assuming that things *must* be done in that way.

In short, I agree with Deepayan - the best thing to do is to ask someone who uses SAS and least squares means to explain to you what they are.
