Re: [R] R2 always increases as variables are added?

From: 李俊杰 <klijunjie_at_gmail.com>
Date: Mon, 21 May 2007 08:23:56 +0800

Hi, Mark

What I want to do, exactly, is compare a model with an intercept and one without an intercept on their adjusted R-squared, since maximizing adjusted R-squared is a variable selection strategy. I know there are other ways to do variable selection, but I really have to try this one.

Thanks.
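
One way to put the two models on the same footing is sketched below. This is only a sketch under stated assumptions: the helper adj_r2_centered() is hypothetical (not part of base R), the data are simulated, and the centered total sum of squares is chosen here as the common baseline for both models. According to the summary.lm documentation, R measures a no-intercept fit against sum(y^2) rather than sum((y - mean(y))^2), so the two adj.r.squared values it reports are not directly comparable.

```r
# Hypothetical helper (not in base R): adjusted R-squared measured against
# the centered total sum of squares, with or without an intercept in the model.
adj_r2_centered <- function(fit) {
  y   <- model.response(model.frame(fit))
  rss <- sum(residuals(fit)^2)          # residual sum of squares
  tss <- sum((y - mean(y))^2)           # centered total sum of squares
  n   <- length(y)
  1 - (rss / df.residual(fit)) / (tss / (n - 1))
}

set.seed(1)                              # simulated data, for illustration only
x <- matrix(rnorm(20), ncol = 2)
y <- drop(x %*% c(1, 2)) + rnorm(10)

fit_int   <- lm(y ~ x)                   # with intercept
fit_noint <- lm(y ~ x - 1)               # without intercept

# Same baseline for both models, so the two numbers can be compared
adj_r2_centered(fit_int)
adj_r2_centered(fit_noint)

# What summary() reports; the no-intercept value uses a different baseline
summary(fit_int)$adj.r.squared
summary(fit_noint)$adj.r.squared
```

With a shared baseline, ranking models by this adjusted R-squared is the same as ranking them by the residual variance estimate RSS/(n - p), so comparing summary(fit)$sigma across the two fits should lead to the same choice. Note that the value for the no-intercept model can be negative under this definition.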

2007/5/21, Leeds, Mark (IED) <Mark.Leeds_at_morganstanley.com>:
>
> Hi: You can put in the -1 and then create your own vector of 1's, which
> will be a "variable", but I'm not sure I understand what you want,
> and I don't think others do either, because
> I didn't see other responses. I don't mean to be offensive or rude, but
> can you explain what you want to do more clearly? If you do that, I'm sure
> you will get more responses.
>
>
>
> -----Original Message-----
> From: r-help-bounces_at_stat.math.ethz.ch [mailto:
> r-help-bounces_at_stat.math.ethz.ch] On Behalf Of 李俊杰
> Sent: Saturday, May 19, 2007 2:54 AM
> To: Paul Lynch
> Cc: r-help_at_stat.math.ethz.ch
> Subject: Re: [R] R2 always increases as variables are added?
>
> I know that "-1" removes the intercept term. But my question
> is why the intercept term CANNOT be treated as a variable when we place a
> column consisting of 1s in the predictor matrix.
>
> If I insist on comparing a model with an intercept and one
> without an intercept on adjusted R-squared, I now think the strategy is always to
> use another definition of R-squared or adjusted R-squared, in which
> r-square=sum((y.hat)^2)/sum((y)^2).
>
> Am I on the right track?
>
> Thanks
>
> Li Junjie
>
>
> 2007/5/19, Paul Lynch <plynchnlm_at_gmail.com>:
> >
> > In case you weren't aware, the meaning of the "-1" in y ~ x - 1 is to
> > remove the intercept term that would otherwise be implied.
> > --Paul
> >
> > On 5/17/07, 李俊杰 <klijunjie_at_gmail.com> wrote:
> > > Hi, everybody,
> > >
> > > 3 questions about R-square:
> > > ---------(1)----------- Does R2 always increase as variables are added?
> > > ---------(2)----------- Can R2 be greater than 1?
> > > ---------(3)----------- How is R2 in summary(lm(y~x-1))$r.squared
> > > calculated? It is different from
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > >
> > > I will illustrate these problems with the following code:
> > > ---------(1)----------- R2 doesn't always increase as variables are added
> > >
> > > > x=matrix(rnorm(20),ncol=2)
> > > > y=rnorm(10)
> > > >
> > > > lm=lm(y~1)
> > > > y.hat=rep(1*lm$coefficients,length(y))
> > > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > [1] 2.646815e-33
> > > >
> > > > lm=lm(y~x-1)
> > > > y.hat=x%*%lm$coefficients
> > > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > [1] 0.4443356
> > > >
> > > > ################ This is the biggest model, but its R2 is not the biggest. Why?
> > > > lm=lm(y~x)
> > > > y.hat=cbind(rep(1,length(y)),x)%*%lm$coefficients
> > > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > [1] 0.2704789
> > >
> > >
> > > ---------(2)----------- R2 can be greater than 1
> > >
> > > > x=rnorm(10)
> > > > y=runif(10)
> > > > lm=lm(y~x-1)
> > > > y.hat=x*lm$coefficients
> > > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > [1] 3.513865
> > >
> > >
> > > ---------(3)----------- How is R2 in summary(lm(y~x-1))$r.squared
> > > calculated? It is different from
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > > x=matrix(rnorm(20),ncol=2)
> > > > xx=cbind(rep(1,10),x)
> > > > y=x%*%c(1,2)+rnorm(10)
> > > > ### r2 calculated by lm(y~x)
> > > > lm=lm(y~x)
> > > > summary(lm)$r.squared
> > > [1] 0.9231062
> > > > ### r2 calculated by lm(y~xx-1)
> > > > lm=lm(y~xx-1)
> > > > summary(lm)$r.squared
> > > [1] 0.9365253
> > > > ### r2 calculated by me
> > > > y.hat=xx%*%lm$coefficients
> > > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > [1] 0.9231062
> > >
> > >
> > > Thanks a lot for any clue :)
> > >
> > >
> > >
> > >
> > > --
> > > Junjie Li, klijunjie_at_gmail.com
> > > Undergraduate in DEP of Tsinghua University,
> > >
> > >
> > > ______________________________________________
> > > R-help_at_stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> > --
> > Paul Lynch
> > Aquilent, Inc.
> > National Library of Medicine (Contractor)
> >
>
>
>
> --
> Junjie Li, klijunjie_at_gmail.com
> Undergraduate in DEP of Tsinghua University,
>
>
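
On question (3) in the quoted message, the following sketch may help. It relies on the definition given in the summary.lm documentation: R-squared is 1 - RSS/TSS, where TSS is centered at mean(y) if the model has an intercept and at zero otherwise. The data are simulated and the names fit_a, fit_b and fit_c are only illustrative. The point is that lm() decides "intercept or not" from the formula, so a hand-made column of ones in xx is not recognized as an intercept.

```r
set.seed(42)                             # simulated data, for illustration only
x  <- matrix(rnorm(20), ncol = 2)
xx <- cbind(1, x)                        # explicit column of ones
y  <- drop(x %*% c(1, 2)) + rnorm(10)

fit_a <- lm(y ~ x)        # intercept added by the formula
fit_b <- lm(y ~ xx - 1)   # same column space, but the formula says "no intercept"
fit_c <- lm(y ~ x - 1)    # genuinely no constant column

# fit_a and fit_b are the same least-squares fit
all.equal(unname(fitted(fit_a)), unname(fitted(fit_b)))

# summary.lm() chooses the baseline from the terms object
attr(terms(fit_a), "intercept")   # 1: baseline is sum((y - mean(y))^2)
attr(terms(fit_b), "intercept")   # 0: baseline is sum(y^2)

rss <- sum(residuals(fit_b)^2)
r2_centered   <- 1 - rss / sum((y - mean(y))^2)
r2_uncentered <- 1 - rss / sum(y^2)

all.equal(r2_centered,   summary(fit_a)$r.squared)
all.equal(r2_uncentered, summary(fit_b)$r.squared)

# Questions (1) and (2): without a constant column, the hand formula
# ESS/TSS is no longer equal to 1 - RSS/TSS, so it is not bounded by 1
ess_over_tss      <- sum((fitted(fit_c) - mean(y))^2) / sum((y - mean(y))^2)
one_minus_rss_tss <- 1 - sum(residuals(fit_c)^2) / sum((y - mean(y))^2)
c(ess_over_tss, one_minus_rss_tss)

# RSS cannot increase when columns are added to a nested model
sum(residuals(fit_a)^2) <= sum(residuals(fit_c)^2)
```

Because sum(y^2) is never smaller than sum((y - mean(y))^2), the uncentered value is never smaller than the centered one for the same fit, which is consistent with the 0.9365253 versus 0.9231062 reported above. If a single definition is used throughout, for example 1 - RSS/TSS with the centered TSS, then R-squared cannot decrease as columns are added to a nested model and cannot exceed 1.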

-- 
Junjie Li,                  klijunjie_at_gmail.com
Undergraduate in DEP of Tsinghua University,



Received on Mon 21 May 2007 - 00:33:33 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 21 May 2007 - 01:31:19 GMT.
