Re: [Rd] application to mentor syrfr package development for Google Summer of Code 2010

From: James Salsman <>
Date: Sun, 07 Mar 2010 21:00:27 -0800


If I understand your concern, you want to lay the foundation for derivatives so that you can implement the search strategies described in Schmidt and Lipson (2010). Is that right? It is not clear to me how well this generalized approach will work in practice, but there is no reason not to proceed in parallel to establish a framework under which you could implement the metrics proposed by Schmidt and Lipson in the contemplated syrfr package.

I have expanded the test I proposed with two more questions, specifically:

5. Critique

6. Use anova to compare the goodness-of-fit of an SSfpl nls fit with a linear model of your choice. How can you characterize the degree-of-freedom-adjusted goodness of fit of nonlinear models?
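
For illustration only, here is a minimal sketch of how question 6 might be approached. It is not a model answer. It assumes the built-in ChickWeight data and the self-starting fit shown in the ?SSfpl help page. The straight-line model is fitted with nls() rather than lm() so that both objects have class "nls"; its starting values are rough guesses. Because a straight line is not a strict special case of the four-parameter logistic, the F test here should be read as an approximation.

```r
# Data and self-starting fit follow the example in the ?SSfpl help page.
Chick.1 <- ChickWeight[ChickWeight$Chick == 1, ]

# Four-parameter logistic fit via the self-starting model SSfpl.
fpl_fit <- nls(weight ~ SSfpl(Time, A, B, xmid, scal), data = Chick.1)

# Straight-line model, fitted with nls() so both fits are "nls" objects.
# The starting values are rough guesses (an assumption, not from the post).
lin_fit <- nls(weight ~ a + b * Time, data = Chick.1,
               start = list(a = 40, b = 8))

# Pairwise F test on the residual sums of squares (dispatches to anova.nls).
cmp <- anova(lin_fit, fpl_fit)
print(cmp)

# Residual standard error, sqrt(RSS / residual df): a simple
# degrees-of-freedom-adjusted summary that can be compared across fits.
rse <- c(linear = summary(lin_fit)$sigma, fpl = summary(fpl_fit)$sigma)
print(rse)
```

A smaller residual standard error for one fit means less unexplained scatter per residual degree of freedom, which is one way to compare models that have different numbers of parameters.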

I believe pairwise anova.nls is the optimal comparison for nonlinear models, but there are several good choices for approximations, including the residual standard error, which I believe can be adjusted for degrees of freedom, as can the F statistic which TableCurve uses; see:

Best regards,
James Salsman

On Sun, Mar 7, 2010 at 7:35 PM, Chidambaram Annamalai <> wrote:
> It's been a while since I proposed syrfr and I have been constantly in
> contact with the many people in the R community and I wasn't able to find a
> mentor for the project. I later got interested in the Automatic
> Differentiation proposal (adinr) and, on consulting with a few others within
> the R community, I mailed John Nash (who proposed adinr in the first place)
> if he'd be willing to take me up on the project. I got a positive reply only
> a few hours ago and it was my mistake to have not removed the syrfr proposal
> in time from the wiki, as being listed under proposals looking for mentors.
> While I appreciate your interest in the syrfr proposal I am afraid my
> allegiances have shifted towards the adinr proposal, as I got convinced that
> it might interest a larger group of people and it has wider scope in
> general.
> I apologize for having caused this trouble.
> Best Regards,
> Chillu
> On Mon, Mar 8, 2010 at 6:41 AM, James Salsman <>
> wrote:
>> Per
>> -- and
>> -- I am applying to mentor the "Symbolic Regression for R" (syrfr)
>> package for the Google Summer of Code 2010.
>> I propose the following test which an applicant would have to pass in
>> order to qualify for the topic:
>> 1. Describe each of the following terms as they relate to statistical
>> regression: categorical, periodic, modular, continuous, bimodal,
>> log-normal, logistic, Gompertz, and nonlinear.
>> 2. Explain which parts of were adopted in
>> SigmaPlot and which weren't.
>> 3. Use the 'outliers' package to improve a regression fit maintaining
>> the correct extrapolation confidence intervals as are between those
>> with and without outlier exclusions in proportion to the confidence
>> that the outliers were reasonably excluded.  (Show your R transcript.)
>> 4. Explain the relationship between degrees of freedom and correlated
>> independent variables.
>> Best regards,
>> James Salsman
Received on Mon 08 Mar 2010 - 05:09:09 GMT

