Re: [R] glm expand model to more values

From: Henric Nilsson (Public) <nilsson.henric_at_gmail.com>
Date: Sun, 13 Jan 2008 17:17:40 +0100

Jarek Jasiewicz wrote:

> Charles Annis, P.E. wrote:

>> Jarek:
>>
>> Although it is not universally agreed on, I believe the first step in any
>> data analysis is to PLOT YOUR DATA.
>>
>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
>> plot(b ~ a, data=dd)
>> simple.model <- lm(b~a,data=dd)
>> abline(simple.model)
>>
>> Why to you think you need a cubic model to describe 6 observations?
>>
>> Your model is overparameterized - it has two more parameters than the number
>> of observations can reasonably justify, something that would be obvious from
>> your plot.
>>
>> The summary of the simple.linear model shows both the intercept and the
>> slope are statistically meaningful. (That's what the asterisks mean.)
>>
>> Call:
>> lm(formula = b ~ a, data = dd)
>>
>> Residuals:
>> 1 2 3 4 5 6
>> -0.23810 0.39048 0.01905 -0.35238 0.27619 -0.09524
>>
>> Coefficients:
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 1.86667 0.30132 6.195 0.00345 **
>> a 1.37143 0.07737 17.725 5.95e-05 ***
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> Residual standard error: 0.3237 on 4 degrees of freedom
>> Multiple R-Squared: 0.9874, Adjusted R-squared: 0.9843
>> F-statistic: 314.2 on 1 and 4 DF, p-value: 5.952e-05
>>
>> I think you should invest a small amount of your time, and an even smaller
>> amount of your money to purchase and read - cover-to-cover - one of the
>> several very good books on elementary statistics and R. My recommendation
>> is _Introductory Statistics with R_ by Peter Dalgaard (Paperback - Jan 9,
>> 2004). Amazon.com carries it.
>>
>> Best wishes.
>>
>>
>>
>> Charles Annis, P.E.
>>
>> Charles.Annis_at_StatisticalEngineering.com
>> phone: 561-352-9699
>> eFax: 614-455-3265
>> http://www.StatisticalEngineering.com
>>
>>
>> -----Original Message-----
>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On
>> Behalf Of Jarek Jasiewicz
>> Sent: Saturday, January 12, 2008 2:06 PM
>> To: Charles.Annis_at_statisticalengineering.com
>> Cc: R-help_at_r-project.org
>> Subject: Re: [R] glm expand model to more values
>>
>> Charles Annis, P.E. wrote:
>>
>>> How many parameters are you trying to estimate? How many observations do
>>> you have?
>>>
>>> What is wrong is that half of your parameter estimates are statistically
>>> meaningless:
>>>
>>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
>>>
>>> overparameterized.model <- glm(b~poly(a,3),data=dd)
>>>
>>> summary(overparameterized.model)
>>>
>>>
>>> Coefficients:
>>> Estimate Std. Error t value Pr(>|t|)
>>>
>>> (Intercept) 6.6667 0.1725 38.644 0.000669 ***
>>>
>>> poly(a, 3)1 5.7371 0.4226 13.576 0.005382 **
>>>
>>> poly(a, 3)2 -0.1091 0.4226 -0.258 0.820395
>>>
>>> poly(a, 3)3 0.2236 0.4226 0.529 0.649562
>>>
>>>
>>>
>>>
>>> Charles Annis, P.E.
>>>
>>> Charles.Annis_at_StatisticalEngineering.com
>>> phone: 561-352-9699
>>> eFax: 614-455-3265
>>> http://www.StatisticalEngineering.com
>>>
>>>
>>> -----Original Message-----
>>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org]
>>>
>> On
>>
>>> Behalf Of Jarek Jasiewicz
>>> Sent: Saturday, January 12, 2008 11:50 AM
>>> To: R-help_at_r-project.org
>>> Subject: [R] glm expand model to more values
>>>
>>> Hi
>>>
>>> I have the problem with fitting curve to data with lm and glm. When I
>>> use polynominal dependiency, fitted values from model are OK, but I
>>> cannot recive proper values when I use coefficents to caltulate this.
>>> Let me present simple example:
>>>
>>> I have simple data.frame: (dd)
>>> a: 1 2 3 4 5 6
>>> b: 3 5 6 7 9 10
>>>
>>> I try to fit it to model:
>>>
>>> model=glm(b~poly(a,3),data=dd)
>>> I have following data fitted to model (as I expected)
>>> > fitted(model)
>>> 1 2 3 4 5 6
>>> 3.095238 4.738095 6.095238 7.333333 8.619048 10.119048
>>>
>>> and coef(model)
>>> (Intercept) poly(a, 3)1 poly(a, 3)2 poly(a, 3)3
>>> 6.6666667 5.7370973 -0.1091089 0.2236068
>>>
>>> so when I try to expand the model to other data (simple extrapolation),
>>> let say: s=seq(1:10,by=1)
>>>
>>> I do:
>>> extra=sapply(s,function(x) coef(model) %*% x^(0:3))
>>> and here is result:
>>> [1] 12.51826 19.49328 28.93336 42.18015 60.57528 85.46040 118.17714
>>> [8] 160.06715 212.47207 276.73354
>>>
>>> the data form expanding coefs are completly differnd from fitted
>>>
>>> What's going wrong?
>>>
>>> Jarek
>>>
>>> ______________________________________________
>>> R-help_at_r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>>
>> http://www.R-project.org/posting-guide.html
>>
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>> sorry but I cannot understand. What does it means data are statistically
>> meanningless?
>>
>> It is examle with very simple data which I use according to simpleR
>> manual example to check why I cannot recive expected result. I need
>> simple model y~x^3+x^2....+z to extrapolate data
>> Jarek
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> I understand that data are not well example. But I try to find rather 
> general solution.
> Original data are list 98 dataframes and are calculated by over 100 
> lines R script I thought that it is too much to attach them, so I typed 
> few digits to ilustrate problem.
> 
> The question was asked wrong. It shoud be:
> 
> if formulas:
> pol3_model=lm(b~poly(a,3))
> p3_model=lm(b~a+I(a^2)+I(a^3))
> 
>  are the same? according R documetation - Yes

They're not using the same parameterization. You need to read `?poly'.

> both gives the same fitted() values, but completly different coef()

`poly' returns *orthogonal* polynomials unless told otherwise -- see the `raw' argument. Try setting `raw = TRUE', but for numerical reasons you may prefer the default.

HTH,
Henric

> 
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 13 Jan 2008 - 16:21:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 13 Jan 2008 - 18:30:07 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive