Re: [R] Root mean square on binned GAM results

From: Joris Meys <jorismeys_at_gmail.com>
Date: Sat, 19 Jun 2010 03:31:14 +0200

Don't know about the correlations (never used them in a GAM context actually...), but you can "bin" the mean with:
> x <- 1:100
> tapply(x,cut(x,5),mean)

(0.901,20.7] (20.7,40.6] (40.6,60.4] (60.4,80.3]  (80.3,100]
        10.5        30.5        50.5        70.5        90.5
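
The same tapply idea can be pointed at the binned RMSD from the question. What follows is only a rough sketch, assuming RMSD means the square root of the mean of the squared differences, and that bins are consecutive runs of 5 values. The helpers bin_means() and binned_rmsd() are hypothetical names made up for this example, and the data are invented:

```r
# Hypothetical helper: mean of each consecutive chunk of `size` elements.
bin_means <- function(v, size = 5) {
  # Group index 1,1,1,1,1,2,2,...; the last chunk may be shorter.
  grp <- ceiling(seq_along(v) / size)
  as.vector(tapply(v, grp, mean))
}

# Hypothetical helper: RMSD between the binned observed and model values.
binned_rmsd <- function(o, m, size = 5) {
  stopifnot(length(o) == length(m))
  # Square the differences first, then average, then take the root.
  sqrt(mean((bin_means(o, size) - bin_means(m, size))^2))
}

# Invented data, for illustration only.
o <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
m <- c(2, 3, 4, 5, 6, 6, 7, 8, 9, 10)

bin_means(o)       # bin means of o: 3 and 8
binned_rmsd(o, m)  # sqrt(0.5), roughly 0.707
```

To fix the number of bins instead of the bin size, one option is to cut the index, e.g. cut(seq_along(v), nbins), with nbins possibly taken from nclass.Sturges(v) in grDevices.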

Cheers
Joris

On Sat, Jun 19, 2010 at 1:54 AM, David Jarvis <thangalin_at_gmail.com> wrote:
> Hi,
>
> Standard correlations (Pearson's, Spearman's, Kendall's Tau) do not
> accurately reflect how closely the model (GAM) fits the data. I was told
> that the accuracy of the correlation can be improved using a root mean
> square deviation (RMSD) calculation on binned data.
>
> For example, let 'o' be the real, observed data and 'm' be the model data. I
> believe I can calculate the root mean squared deviation as:
>
> sqrt( mean( ( o - m ) ^ 2 ) )
>
> However, this does not bin the data into mean sets. What I would like to do
> is:
>
> oangry <- c( mean(o[1:5]), mean(o[6:10]), ... )
> mangry <- c( mean(m[1:5]), mean(m[6:10]), ... )
>
> Then:
>
> sqrt( mean( ( oangry - mangry ) ^ 2 ) )
>
> That calculation I would like to simplify into (or similar to):
>
> sqrt( mean( ( bin( o, 5 ) - bin( m, 5 ) ) ^ 2 ) )
>
> I have read the help for ?cut, ?table, ?hist, and ?split, but am stumped for
> which one to use in this case--if any.
>
> How do you calculate c( mean(o[1:5]), mean(o[6:10]), ... ) for an arbitrary
> length vector using an appropriate number of bins (fixed at 5, or perhaps
> calculated using Sturges' formula)?
>
> I have also posted a more detailed version of this question on
> StackOverflow:
>
> http://stackoverflow.com/questions/3073365/root-mean-square-deviation-on-binned-gam-results-using-r
>
> Many thanks.
>
> Dave
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
Joris.Meys_at_Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

Received on Sat 19 Jun 2010 - 01:32:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 19 Jun 2010 - 02:50:34 GMT.
