Re: [R] Calculation of r squared from a linear regression

From: kMan <>
Date: Tue, 15 Jun 2010 01:24:44 -0600

Dear Sandra,

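R^2 is just the proportion of error removed when you move from a simpler model (model C, the compact model) to a fuller one (model A, the augmented model).

PRE (proportional reduction in error) = R^2 = (SSEc - SSEa)/SSEc,
where SSEc is the SSE of model C and SSEa is the SSE of model A.
This is sometimes written as SSR/SSEc, where SSR = SSEc - SSEa is the sum of squares reduced.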
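
As a minimal sketch of that ratio, the PRE can be computed from any two nested lm() fits. The helper name pre() and the object pre_xy are my own for illustration, not something base R provides:

```r
# Hypothetical helper: proportional reduction in error going from
# a compact model (model_c) to an augmented model (model_a).
pre <- function(model_c, model_a) {
  sse_c <- sum(resid(model_c)^2)  # error left by the compact model
  sse_a <- sum(resid(model_a)^2)  # error left by the augmented model
  (sse_c - sse_a) / sse_c
}

x <- c(1, 2, 3, 4)
y <- c(1.6, 4.4, 5.5, 8.3)

# Compact model: intercept only (predicts mean(y) for every case).
# Augmented model: intercept plus a slope for x.
pre_xy <- pre(lm(y ~ 1), lm(y ~ x))
```

Here pre_xy should match the Multiple R-squared that summary(lm(y ~ x)) reports.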

Given your example with some extensions:

x<- c(1,2,3,4)
y<- c(1.6,4.4,5.5,8.3)

# The model is fit as before with all parameters.
fit1<-lm(y~x) # includes intercept term
summary(fit1) # PRE = 0.9749
fit1.SSE<-sum(resid(fit1)^2) # SSE=0.578

fit2<-lm(y~x-1) # excludes intercept, as in the original example (forces the intercept to zero)
summary(fit2) # PRE = 0.9946
fit2.SSE<-sum(resid(fit2)^2) # SSE=0.6596667

# In order to understand the comparison taking place in fit1:
y.demean <- y - mean(y) # deviations of y from its mean
SSEc <-sum(y.demean^2) # SSE of a model predicting only the mean (= 23.05)
SSEa <-fit1.SSE

fit1.PRE <-(SSEc-SSEa)/SSEc   #   = 0.9749, as reported by summary(fit1)

# For fit2 (no intercept), the comparison model predicts zero for every case:
SSEc.noint <-sum(y^2) # =121.06
fit2.PRE<-(SSEc.noint-fit2.SSE)/SSEc.noint # = 0.994551 or 0.9946 as before
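
If you want to check this without reading numbers off the printed summary, here is a rough sketch that compares the hand calculations with the r.squared component of summary(). It also checks the standardized-slope calculation, using the slope from the with-intercept fit. The names r2.int, r2.noint and r2.std are just mine:

```r
x <- c(1, 2, 3, 4)
y <- c(1.6, 4.4, 5.5, 8.3)

fit1 <- lm(y ~ x)      # with intercept
fit2 <- lm(y ~ x - 1)  # intercept forced to zero

# With an intercept, the baseline is the mean-only model.
r2.int <- 1 - sum(resid(fit1)^2) / sum((y - mean(y))^2)
# Without an intercept, the baseline is a model that always predicts 0.
r2.noint <- 1 - sum(resid(fit2)^2) / sum(y^2)

all.equal(r2.int, summary(fit1)$r.squared)
all.equal(r2.noint, summary(fit2)$r.squared)

# Standardized slope times cor(y, x) is cor(y, x)^2, which is the
# with-intercept R^2, not the no-intercept one.
r2.std <- unname(coef(fit1)["x"]) * sd(x) / sd(y) * cor(y, x)
all.equal(r2.std, r2.int)
```

So a manual calculation based on cor(y, x) is implicitly using the mean-only baseline, while summary() on a no-intercept fit uses the zero baseline. The two need not agree.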

Hope this helps.

On 2010-06-11 2:16, Sandra Hawthorne wrote:
> Hi,
> I'm trying to verify the calculation of coefficient of determination (r squared) for linear regression. I've done the calculation manually with a simple test case and using the definition of r squared outlined in summary(lm) help. There seems to be a discrepancy between what R produced and the manual calculation. Does anyone know why this is so? What does the multiple r squared reported in summary(lm) represent?
> # The test case:
> x<- c(1,2,3,4)
> y<- c(1.6,4.4,5.5,8.3)
> dummy<- data.frame(x, y)
> fm1<- lm(y ~ x-1, data = dummy)
> summary(fm1)
> betax<- fm1$coeff[x] * sd(x) / sd(y)
> # cd is coefficient of determination
> cd<- betax * cor(y, x)

Received on Tue 15 Jun 2010 - 07:28:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 15 Jun 2010 - 07:30:34 GMT.
