From: Henric Nilsson (Public) <nilsson.henric_at_gmail.com>

Date: Sun, 13 Jan 2008 17:17:40 +0100

>> Jarek:

*>>
*

*>> Although it is not universally agreed on, I believe the first step in any
*

*>> data analysis is to PLOT YOUR DATA.
*

*>>
*

*>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
*

*>> plot(b ~ a, data=dd)
*

*>> simple.model <- lm(b~a,data=dd)
*

*>> abline(simple.model)
*

*>>
*

*>> Why to you think you need a cubic model to describe 6 observations?
*

*>>
*

*>> Your model is overparameterized - it has two more parameters than the number
*

*>> of observations can reasonably justify, something that would be obvious from
*

*>> your plot.
*

*>>
*

*>> The summary of the simple.linear model shows both the intercept and the
*

*>> slope are statistically meaningful. (That's what the asterisks mean.)
*

*>>
*

*>> Call:
*

*>> lm(formula = b ~ a, data = dd)
*

*>>
*

*>> Residuals:
*

*>> 1 2 3 4 5 6
*

*>> -0.23810 0.39048 0.01905 -0.35238 0.27619 -0.09524
*

*>>
*

*>> Coefficients:
*

*>> Estimate Std. Error t value Pr(>|t|)
*

*>> (Intercept) 1.86667 0.30132 6.195 0.00345 **
*

*>> a 1.37143 0.07737 17.725 5.95e-05 ***
*

*>> ---
*

*>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
*

*>>
*

*>> Residual standard error: 0.3237 on 4 degrees of freedom
*

*>> Multiple R-Squared: 0.9874, Adjusted R-squared: 0.9843
*

*>> F-statistic: 314.2 on 1 and 4 DF, p-value: 5.952e-05
*

*>>
*

*>> I think you should invest a small amount of your time, and an even smaller
*

*>> amount of your money to purchase and read - cover-to-cover - one of the
*

*>> several very good books on elementary statistics and R. My recommendation
*

*>> is _Introductory Statistics with R_ by Peter Dalgaard (Paperback - Jan 9,
*

*>> 2004). Amazon.com carries it.
*

*>>
*

*>> Best wishes.
*

*>>
*

*>>
*

*>>
*

*>> Charles Annis, P.E.
*

*>>
*

*>> Charles.Annis_at_StatisticalEngineering.com
*

*>> phone: 561-352-9699
*

*>> eFax: 614-455-3265
*

*>> http://www.StatisticalEngineering.com
*

*>>
*

*>>
*

*>> -----Original Message-----
*

*>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On
*

*>> Behalf Of Jarek Jasiewicz
*

*>> Sent: Saturday, January 12, 2008 2:06 PM
*

*>> To: Charles.Annis_at_statisticalengineering.com
*

*>> Cc: R-help_at_r-project.org
*

*>> Subject: Re: [R] glm expand model to more values
*

*>>
*

*>> Charles Annis, P.E. wrote:
*

*>>
*

*>>> How many parameters are you trying to estimate? How many observations do
*

*>>> you have?
*

*>>>
*

*>>> What is wrong is that half of your parameter estimates are statistically
*

*>>> meaningless:
*

*>>>
*

*>>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
*

*>>>
*

*>>> overparameterized.model <- glm(b~poly(a,3),data=dd)
*

*>>>
*

*>>> summary(overparameterized.model)
*

*>>>
*

*>>>
*

*>>> Coefficients:
*

*>>> Estimate Std. Error t value Pr(>|t|)
*

*>>>
*

*>>> (Intercept) 6.6667 0.1725 38.644 0.000669 ***
*

*>>>
*

*>>> poly(a, 3)1 5.7371 0.4226 13.576 0.005382 **
*

*>>>
*

*>>> poly(a, 3)2 -0.1091 0.4226 -0.258 0.820395
*

*>>>
*

*>>> poly(a, 3)3 0.2236 0.4226 0.529 0.649562
*

*>>>
*

*>>>
*

*>>>
*

*>>>
*

*>>> Charles Annis, P.E.
*

*>>>
*

*>>> Charles.Annis_at_StatisticalEngineering.com
*

*>>> phone: 561-352-9699
*

*>>> eFax: 614-455-3265
*

*>>> http://www.StatisticalEngineering.com
*

*>>>
*

*>>>
*

*>>> -----Original Message-----
*

*>>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org]
*

*>>>
*

*>> On
*

*>>
*

*>>> Behalf Of Jarek Jasiewicz
*

*>>> Sent: Saturday, January 12, 2008 11:50 AM
*

*>>> To: R-help_at_r-project.org
*

*>>> Subject: [R] glm expand model to more values
*

*>>>
*

*>>> Hi
*

*>>>
*

*>>> I have the problem with fitting curve to data with lm and glm. When I
*

*>>> use polynominal dependiency, fitted values from model are OK, but I
*

*>>> cannot recive proper values when I use coefficents to caltulate this.
*

*>>> Let me present simple example:
*

*>>>
*

*>>> I have simple data.frame: (dd)
*

*>>> a: 1 2 3 4 5 6
*

*>>> b: 3 5 6 7 9 10
*

*>>>
*

*>>> I try to fit it to model:
*

*>>>
*

*>>> model=glm(b~poly(a,3),data=dd)
*

*>>> I have following data fitted to model (as I expected)
*

*>>> > fitted(model)
*

*>>> 1 2 3 4 5 6
*

*>>> 3.095238 4.738095 6.095238 7.333333 8.619048 10.119048
*

*>>>
*

*>>> and coef(model)
*

*>>> (Intercept) poly(a, 3)1 poly(a, 3)2 poly(a, 3)3
*

*>>> 6.6666667 5.7370973 -0.1091089 0.2236068
*

*>>>
*

*>>> so when I try to expand the model to other data (simple extrapolation),
*

*>>> let say: s=seq(1:10,by=1)
*

*>>>
*

*>>> I do:
*

*>>> extra=sapply(s,function(x) coef(model) %*% x^(0:3))
*

*>>> and here is result:
*

*>>> [1] 12.51826 19.49328 28.93336 42.18015 60.57528 85.46040 118.17714
*

*>>> [8] 160.06715 212.47207 276.73354
*

*>>>
*

*>>> the data form expanding coefs are completly differnd from fitted
*

*>>>
*

*>>> What's going wrong?
*

*>>>
*

*>>> Jarek
*

*>>>
*

*>>> ______________________________________________
*

*>>> R-help_at_r-project.org mailing list
*

*>>> https://stat.ethz.ch/mailman/listinfo/r-help
*

*>>> PLEASE do read the posting guide
*

*>>>
*

*>> http://www.R-project.org/posting-guide.html
*

*>>
*

*>>> and provide commented, minimal, self-contained, reproducible code.
*

*>>>
*

*>>>
*

*>>>
*

*>> sorry but I cannot understand. What does it means data are statistically
*

*>> meanningless?
*

*>>
*

*>> It is examle with very simple data which I use according to simpleR
*

*>> manual example to check why I cannot recive expected result. I need
*

*>> simple model y~x^3+x^2....+z to extrapolate data
*

*>> Jarek
*

*>>
*

*>> ______________________________________________
*

*>> R-help_at_r-project.org mailing list
*

*>> https://stat.ethz.ch/mailman/listinfo/r-help
*

*>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
*

*>> and provide commented, minimal, self-contained, reproducible code.
*

*>>
*

*>>
*

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 13 Jan 2008 - 16:21:52 GMT

Date: Sun, 13 Jan 2008 17:17:40 +0100

Jarek Jasiewicz wrote:

> Charles Annis, P.E. wrote:

>> Jarek:

> I understand that data are not well example. But I try to find rather > general solution. > Original data are list 98 dataframes and are calculated by over 100 > lines R script I thought that it is too much to attach them, so I typed > few digits to ilustrate problem. > > The question was asked wrong. It shoud be: > > if formulas: > pol3_model=lm(b~poly(a,3)) > p3_model=lm(b~a+I(a^2)+I(a^3)) > > are the same? according R documetation - Yes

They're not using the same parameterization. You need to read `?poly'.

> both gives the same fitted() values, but completly different coef()

`poly' returns *orthogonal* polynomials unless told otherwise -- see the `raw' argument. Try setting `raw = TRUE', but for numerical reasons you may prefer the default.

**HTH,
**

Henric

> > ______________________________________________ > R-help_at_r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 13 Jan 2008 - 16:21:52 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Sun 13 Jan 2008 - 18:30:07 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*