From: Cleber N. Borges <cborges_at_iqm.unicamp.br>

Date: Thu 26 Jan 2006 - 03:32:58 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jan 26 03:53:52 2006


I was quite interested in this thread, since I am a chemistry student and I work with mixture designs, whose models have no intercept.

I paid particular attention to the following statement:

' Thus SST, the "corrected" total sum of squares, should be used when
you have a model with an intercept term but the uncorrected total sum
of squares should be used when you do not have an intercept term. '
(Douglas Bates)
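The two definitions contrasted in that statement can be written out as follows. This is only a sketch: the labels R^2_c and R^2_u and the index u are notation assumed here, not taken from the thread.

```latex
% Corrected form: null model is the intercept-only fit (predicts \bar{y})
R^2_{c} = 1 - \frac{\sum_{u=1}^{N} (y_u - \hat{y}_u)^2}{\sum_{u=1}^{N} (y_u - \bar{y})^2}

% Uncorrected form: null model predicts zero for every response
R^2_{u} = 1 - \frac{\sum_{u=1}^{N} (y_u - \hat{y}_u)^2}{\sum_{u=1}^{N} y_u^2}
```

The numerator (SSE) is the same in both; only the total sum of squares in the denominator changes.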

As a reference I have the book:

"Experiments with Mixtures: Designs, Models, and the Analysis of Mixture
Data", second edition, by John A. Cornell
(Professor of Statistics at the University of Florida)

On p. 42 of this book, section 2.7 (THE ANALYSIS OF VARIANCE TABLE), I find the model below:

y(x) = 11.7x1 + 9.4x2 + 16.4x3 + 19.0x1x2 + 11.4x1x3 - 9.6x2x3

with the following ANOVA table:

Source of variation   D.F.   SS                                        MS
Regression            p-1    SSR = \sum( y_{pred} - y_{mean} )^2       SSR/(p-1)
Residual              N-p    SSE = \sum( y_{exp} - y_{pred} )^2        SSE/(N-p)
Total                 N-1    SST = \sum( y_{exp} - y_{mean} )^2

pred = predicted

exp = experimental

The same table appears in many other books.

I always see the ANOVA table for mixture systems built with SST, the
"corrected" total sum of squares (N-1 degrees of freedom).
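One observation that may bear on this, offered as a sketch rather than a settled answer: in a mixture model the proportions satisfy x1 + x2 + x3 = 1, so the constant vector lies in the column space of the design matrix even though no intercept is written explicitly. Under that assumption the least-squares residuals sum to zero, and the corrected decomposition used in the table holds exactly (the index u is notation assumed here):

```latex
\sum_{u=1}^{N} (y_u - \bar{y})^2
  = \sum_{u=1}^{N} (\hat{y}_u - \bar{y})^2
  + \sum_{u=1}^{N} (y_u - \hat{y}_u)^2
\qquad\text{i.e.}\qquad
\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE}
```

For a general no-intercept model, where the constant is not in the column space, this identity need not hold.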

I would like to ask:

- Which approach (point of view) is more appropriate?
- Could someone point me to a reference on this subject?

Thanks a lot.

Regards

Cleber N. Borges

############################ Dados

x1   x2   x3   y
1    0    0    11
1    0    0    12.4
0.5  0.5  0    15
0.5  0.5  0    14.8
0.5  0.5  0    16.1
0    1    0    8.8
0    1    0    10
0    0.5  0.5  10
0    0.5  0.5  9.7
0    0.5  0.5  11.8
0    0    1    16.8
0    0    1    16
0.5  0    0.5  17.7
0.5  0    0.5  16.4
0.5  0    0.5  16.6

############################## Model

d.lm <- lm( y ~ -1 + x1*x2*x3 - x1:x2:x3, data = Dados )

### Anova like in the book

d.aov <- aov( y ~ x1*x2*x3 - x1:x2:x3, data = Dados )
#### SSR (fitted Model) = 128.296
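
To compare the two conventions on these data, here is a minimal sketch. It assumes the table above is entered as the data frame Dados; the names sse, sst_corrected, sst_uncorrected, ssr_corrected, r2_corrected and r2_uncorrected are introduced here for illustration only.

```r
# Data from the table above (x1 + x2 + x3 = 1 in every row)
Dados <- data.frame(
  x1 = c(1, 1, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 0.5),
  x2 = c(0, 0, 0.5, 0.5, 0.5, 1, 1, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0),
  x3 = c(0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 0.5, 1, 1, 0.5, 0.5, 0.5),
  y  = c(11, 12.4, 15, 14.8, 16.1, 8.8, 10, 10, 9.7, 11.8,
         16.8, 16, 17.7, 16.4, 16.6)
)

# No-intercept quadratic mixture model, as in the post
d.lm <- lm(y ~ -1 + x1*x2*x3 - x1:x2:x3, data = Dados)

# Residual sum of squares (identical under both conventions)
sse <- sum(residuals(d.lm)^2)

# "Corrected" total SS: deviations from the mean, N-1 d.f.
sst_corrected <- sum((Dados$y - mean(Dados$y))^2)

# Uncorrected total SS: deviations from zero, N d.f.
sst_uncorrected <- sum(Dados$y^2)

# Regression SS about the mean, as in the book's ANOVA table
ssr_corrected <- sum((fitted(d.lm) - mean(Dados$y))^2)

r2_corrected   <- 1 - sse / sst_corrected
r2_uncorrected <- 1 - sse / sst_uncorrected

# summary.lm() uses the uncorrected form when the formula has no intercept
r2_reported <- summary(d.lm)$r.squared
```

Here ssr_corrected should reproduce the 128.296 quoted above, and r2_reported should agree with r2_uncorrected rather than r2_corrected, because the formula contains -1.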

Douglas Bates wrote:

>On 1/24/06, Lance Westerhoff <lance@quantumbioinc.com> wrote:

>
>
>>Hi-
>>
>>On Jan 24, 2006, at 12:08 PM, Peter Dalgaard wrote:
>>
>>
>>
>>>Lance Westerhoff <lance@quantumbioinc.com> writes:
>>>
>>>
>>>
>>>>Hello All-
>>>>
>>>>I found an inconsistency between the R-squared reported in Excel vs.
>>>>that in R, and I am wondering which (if any) may be correct and if
>>>>this is a known issue. While it certainly wouldn't surprise me if
>>>>Excel is just flat out wrong, I just want to make sure since the R-
>>>>squared reported in R seems surprisingly high. Please let me know if
>>>>this is the wrong list. Thanks!
>>>>
>>>>
>>>Excel is flat out wrong. As the name implies, R-squared values cannot
>>>be less than zero (adjusted R-squared can, but I wouldn't think
>>>that is what Excel does).
>>>
>>>
>>I had thought the same thing, but then I came across the following
>>site which states: "Note that it is possible to get a negative R-
>>square for equations that do not contain a constant term. If R-square
>>is defined as the proportion of variance explained by the fit, and if
>>the fit is actually worse than just fitting a horizontal line, then R-
>>square is negative. In this case, R-square cannot be interpreted as
>>the square of a correlation." Since
>>
>>R^2 = 1 - (SSE/SST)
>>
>>I guess you can have SSE > SST which would result in a R^2 of less
>>then 1.0. However, it still seems very strange which made me wonder
>>what is going on in Excel needless to say!
>>
>>http://www.mathworks.com/access/helpdesk/help/toolbox/curvefit/
>>ch_fitt9.html
>>
>>
>
>This seems to be a case of using the wrong formula. R^2 should
>measure the amount of variation for which the given model accounts
>relative to the amount of variation for which the *appropriate* null
>model does not account. If you have a constant or intercept term in a
>linear model then the null model for comparison is one with the
>intercept only. If you have a linear model without an intercept term
>then the appropriate null model for comparison is the model that
>predicts all the responses as zero. Thus SST, the "corrected" total
>sum of squares, should be used when you have a model with an intercept
>term but the uncorrected total sum of squares should be used when you
>do not have an intercept term.
>
>It is disappointing to see the MathWorks propagating such an
>elementary misconception.
>
>______________________________________________
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>.
>
>
>


This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:42:10 EST