Re: [R] Estimate of variance and prediction for multiple linear regression

From: cc super <supercc110_at_gmail.com>
Date: Wed, 23 Jun 2010 16:30:12 -0700

What if newdata has a different number of rows from the data used to fit the regression model?

Let's say

pdat <- data.frame(x1 = rnorm(5, 2), x2 = rnorm(5))
predict(lin, pdat)

It comes up with a warning and the result is not correct.

Thanks!
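
A minimal sketch of one common cause of this symptom, which may or may not be what happened here. predict() matches the variables in newdata by name, so a 5-row newdata works when the names match the terms in the formula. If the formula refers to columns as dat$x1 (a hypothetical variant, not the code posted in this thread), predict() cannot find those terms in newdata, falls back to the original 10 values, and warns that the row counts differ. The seed and the object names dat, good, and bad are assumptions made for illustration.

```r
set.seed(123)  # arbitrary seed (an assumption) so the run is repeatable

dat  <- data.frame(y = rnorm(10, mean = 5),
                   x1 = rnorm(10, mean = 2),
                   x2 = rnorm(10))
pdat <- data.frame(x1 = rnorm(5, 2), x2 = rnorm(5))

# Names in the formula match the columns of pdat: 5 rows in, 5 predictions out.
good   <- lm(y ~ x1 + x2, data = dat)
p_good <- predict(good, pdat)
length(p_good)

# Hypothetical variant: the terms are literally `dat$x1` and `dat$x2`,
# which do not exist in pdat, so the original 10 values are used instead.
bad   <- lm(dat$y ~ dat$x1 + dat$x2)
p_bad <- suppressWarnings(predict(bad, pdat))  # warns without suppressWarnings()
length(p_bad)
```

In the second case the returned values are just the fitted values for the original data, which is why the result looks wrong.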

2010/6/23 Gavin Simpson <gavin.simpson_at_ucl.ac.uk>

> On Tue, 2010-06-22 at 23:11 -0700, cc super wrote:
> > Hi, everyone,
> >
> > Night. I have three questions about multiple linear regression in R.
> >
> > Q1:
> >
> > y=rnorm(10,mean=5)
> > x1=rnorm(10,mean=2)
> > x2=rnorm(10)
> > lin=lm(y~x1+x2)
> > summary(lin)
> >
> > ## In the summary, 'Residual standard error: 1.017 on 7 degrees of freedom',
> > 1.017 is the estimate of the constant variance?
>
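> Almost: it is the estimate of sigma, the residual standard deviation. The
> estimate of the constant variance is its square, sigma^2.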
>
> Just a note, in order for the above code to yield the same results as
> you quote, you need a call to set.seed() to fix the pseudo random number
> generator.
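
A short sketch of the two points above, assuming an arbitrary seed of 123 (the seed behind the quoted 1.017 is not known, so the numbers will differ). It fixes the random number generator, extracts the residual standard error from the summary, and checks it against the usual formula sqrt(RSS / residual degrees of freedom).

```r
set.seed(123)  # any fixed value works; 123 is an assumption, not the original seed

y   <- rnorm(10, mean = 5)
x1  <- rnorm(10, mean = 2)
x2  <- rnorm(10)
lin <- lm(y ~ x1 + x2)

# "Residual standard error" reported by summary(): the estimate of sigma
sigma_hat <- summary(lin)$sigma

# Estimate of the error variance is the square of that value
var_hat <- sigma_hat^2

# The same quantity by hand: residual sum of squares over residual df (10 - 3 = 7)
rss          <- sum(resid(lin)^2)
sigma_manual <- sqrt(rss / df.residual(lin))

c(sigma_hat = sigma_hat, sigma_manual = sigma_manual, var_hat = var_hat)
```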
>
> > Q2:
> >
> > beta0=lin$coefficients[1]
> > beta1=lin$coefficients[2]
> > beta2=lin$coefficients[3]
> >
> > y_hat=beta0+beta1*x1+beta2*x2
> >
> > ## Is there any built-in function in R to obtain y_hat directly?
>
> fitted(lin)
>
> Note that there are quite a few standard extractor functions like fitted
> available for modelling functions in R. coef() for example should be
> used to extract the coefficients, resid() will extract residuals etc.
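
As a quick check of the extractor functions mentioned above, the following sketch refits the model (with an assumed seed of 123) and confirms that fitted() reproduces the hand-computed y_hat from Q2 and that resid() equals y minus the fitted values.

```r
set.seed(123)  # arbitrary seed, an assumption for repeatability

y   <- rnorm(10, mean = 5)
x1  <- rnorm(10, mean = 2)
x2  <- rnorm(10)
lin <- lm(y ~ x1 + x2)

# coef() instead of lin$coefficients
b <- coef(lin)

# Hand-computed fitted values, as in Q2
y_hat_manual <- b[1] + b[2] * x1 + b[3] * x2

# fitted() and resid() give the same results directly
same_fit   <- all.equal(unname(fitted(lin)), unname(y_hat_manual))
same_resid <- all.equal(unname(resid(lin)), unname(y - fitted(lin)))

same_fit
same_resid
```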
>
> > Q3:
> >
> > If I want to apply this regression result to another dataset, that is, new
> > x1 and x2. Is the built-in function in Q2 still available?
>
> It is called predict() (although if you called predict(lin) above
> instead of fitted(lin) it would have produced the same answer; the
> fitted values for the observations).
>
> One gotcha that catches people out is that in the new dataset, the
> variables (used in the model) must have the same names as the variables
> used to fit it. So we could do:
>
> pdat <- data.frame(x1 = rnorm(10, 2), x2 = rnorm(10))
> predict(lin, pdat)

>
> to get predictions at the new values of x1 and x2.
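
A self-contained sketch of the predict() call above, again with an assumed seed. It also checks the predictions against the design matrix times the coefficients, and shows the interval argument of predict() for lm fits, which returns prediction intervals alongside the point predictions. The names p_manual and pint are illustrative only.

```r
set.seed(123)  # arbitrary seed, an assumption for repeatability

y   <- rnorm(10, mean = 5)
x1  <- rnorm(10, mean = 2)
x2  <- rnorm(10)
lin <- lm(y ~ x1 + x2)

# New data: column names must match the variables in the formula
pdat <- data.frame(x1 = rnorm(10, 2), x2 = rnorm(10))

# Point predictions at the new x1 and x2
p <- predict(lin, pdat)

# Same thing by hand: [1, x1, x2] %*% coefficients
p_manual <- drop(cbind(1, pdat$x1, pdat$x2) %*% coef(lin))

# 95% prediction intervals: a matrix with columns fit, lwr, upr
pint <- predict(lin, pdat, interval = "prediction", level = 0.95)
head(pint)
```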
>
> > Thank you in advance!
>
> HTH
>
> G
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 24 Jun 2010 - 01:14:21 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 Jun 2010 - 07:20:43 GMT.
