Re: [R] adjusted R^2 vs. ordinary R^2

From: Lucke, Joseph F <LUCKE_at_uthscsa.edu>
Date: Mon 20 Jun 2005 - 23:42:37 EST


James  

The main reason for the adjusted R^2 (Fisher) is that it is less biased than the ordinary R^2. The ordinary R^2 has a positive bias that is a function of the true Rho^2, the number of predictors p, and the sample size n. The maximum bias occurs at Rho^2 = 0, where the expected R^2 is p/(n-1). The adjusted R^2 has a slightly negative bias (the maximum being on the order of -1/(2n), at Rho^2 = .5), which is not a function of p.
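The bias at Rho^2 = 0 can be checked with a small simulation. What follows is only a sketch under stated assumptions: it takes the adjustment to be the usual 1 - (1 - R^2)(n-1)/(n-p-1), and it assumes normal errors and a model with an intercept, so that R^2 at Rho^2 = 0 follows a Beta(p/2, (n-p-1)/2) distribution. The function names are made up for illustration.

```python
import random


def adjusted_r2(r2, n, p):
    """Usual adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjustment to be defined")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def simulate_null_means(n, p, draws=20000, seed=0):
    """Average R^2 and adjusted R^2 over draws taken at Rho^2 = 0.

    Assumption: with normal errors and an intercept in the model, R^2 at
    Rho^2 = 0 is distributed as Beta(p/2, (n - p - 1)/2).
    """
    rng = random.Random(seed)
    r2_values = [rng.betavariate(p / 2, (n - p - 1) / 2) for _ in range(draws)]
    mean_r2 = sum(r2_values) / draws
    mean_adj = sum(adjusted_r2(r2, n, p) for r2 in r2_values) / draws
    return mean_r2, mean_adj


mean_r2, mean_adj = simulate_null_means(n=100, p=4)
print(f"mean R^2     = {mean_r2:.4f}  (p/(n-1) = {4 / 99:.4f})")
print(f"mean adj R^2 = {mean_adj:.4f}  (should be close to 0)")
```

Because the adjustment is linear in R^2, plugging the expected R^2 of p/(n-1) into it gives exactly 0, which is why the adjusted value shows no bias at Rho^2 = 0 under these assumptions.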

In your example, the R^2 for the first equation will be 1 even if Rho^2 = 0, since four predictors plus an intercept fit five points exactly, but the expected R^2 will be at most about Rho^2 + .04 for the second, since p/(n-1) = 4/99. (I am interpreting "parameters" as "predictors", which is strictly speaking not correct, as the regression intercept and error variance are also parameters.) The adjR^2 will have a maximum expected bias on the order of -.1 in the first and -.005 in the second.
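The figures for the two cases follow directly from the expressions p/(n-1) and -1/(2n) given earlier. A minimal sketch of that arithmetic, with p = 4 predictors and n = 5 or n = 100 (the helper names are made up for illustration):

```python
def max_bias_r2(p, n):
    """Largest positive bias of ordinary R^2, reached at Rho^2 = 0."""
    return p / (n - 1)


def approx_max_bias_adj_r2(n):
    """Order-of-magnitude figure -1/(2n) quoted for the adjusted R^2."""
    return -1.0 / (2 * n)


for n in (5, 100):
    print(n, round(max_bias_r2(4, n), 4), approx_max_bias_adj_r2(n))
```

For n = 5 this gives 1.0 and -0.1; for n = 100 it gives about 0.0404 and -0.005.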

Any between-regression comparisons using R^2 will founder on differences in bias induced by differences in Rho^2, p, and n. Any between-regression comparisons using adjR^2 will founder on differences in bias induced by differences in Rho^2 and n. However, the maximum possible difference in bias for adjR^2 may not be large.

Note also:

1. The standard errors of the estimators should also be taken into account in such comparisons.

2. There is an unbiased estimator of Rho^2 (Olkin and Pratt).

3. There is another adjR^2 that has slightly better MSE than the Fisher adjR^2.

4. There is a difference in results if the predictors are considered fixed rather than multivariate normal (Barten).

Joe  

References  

Barten AP. Note on unbiased estimation of the squared multiple correlation coefficient. Statistica Neerlandica, 1962, 16, 151-163  

Fisher RA. The influence of rainfall on the yield of wheat at Rothamsted. Philosophical Transactions of the Royal Society of London, Series B, 1924, 213, 89-142.

Lucke JF and Embretson SE. Biases and mean squared errors of estimators of multinormal squared multiple correlation. Journal of Educational Statistics, 1984, 9(3), 183-192.

Olkin I and Pratt JW. Unbiased estimation of certain correlation coefficients. Annals of Mathematical Statistics, 1958, 29, 201-211.


From: r-help-bounces@stat.math.ethz.ch on behalf of James Salsman
Sent: Fri 6/17/2005 4:16 PM
To: r-help@stat.math.ethz.ch
Subject: [R] adjusted R^2 vs. ordinary R^2

I thought the point of adjusting the R^2 for degrees of freedom is to allow comparisons about goodness of fit between similar models with different numbers of data points. Someone has suggested to me off-list that this might not be the case.

Is an ADJUSTED R^2 for a four-parameter, five-point model reliably comparable to the adjusted R^2 of a four-parameter, 100-point model? If such values can't be reliably compared with one another, then what is the reasoning behind adjusting R^2 for degrees of freedom?

What are the good published authorities on this topic?

Sincerely,
James Salsman



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html




Received on Tue Jun 21 00:10:54 2005
