Re: [Rd] loess prediction algorithm

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 26 Jul 2007 09:11:28 +0100 (BST)

The R interface is just a wrapper for those Netlib C/Fortran functions. I don't think anyone is going to be able (or willing) to read and explain those for you.

You do need to understand the loess.control parameters, and I believe they are explained in the White Book. But perhaps you should use the simplest options in R as a baseline.

I don't believe your sketchy description of tricube weights is correct: the White Book has the details.

The default degree is 2, not linear fits.

On Wed, 25 Jul 2007, apjaworski_at_mmm.com wrote:

>
> Hello,
>
> I need help with the details of loess prediction algorithm. I would like
> to get it implemented as a part of a measurement system programmed in
> LabView. My job is provide a detailed description of the algorithm. This
> is a simple one-dimensional problem - smoothing an (x, y) data set.
>
> I found quite a detailed description of the fitting procedure in the "white
> book". It is also described in great detail at the NIST site in the
> Engineering Statistics Handbook. It provides an example of Loess
> computations. I managed to reproduce their example exactly in R. At each
> data point I compute a weighted local linear fit using the number of points
> based of span. Then I predict the values from these local fits. This
> matches R "loess" predictions exactly.
>
> The problem starts when I try to predict at x values not in the data set.
> The "white book" does not talk about predictions at all. In the NIST
> handbook in the "Final note on Loess Computations" they mention this type
> of predictions but just say that the same steps are used for predictions as
> for fitting.
>
> When I try to use "the same steps" I get predictions that are quite
> different that the predictions obtained by fitting R loess model to a data
> set and running predict(<model object>, newdata=<grid of x values>). They
> match quite well at the lowest and highest ends of the x grid but in the
> middle are different and much less smooth. I can provide details but
> basically what I do to create the predictions at x0 is this:
> 1. I append c(x0, NA) to the data frame of (x, y) data.
> 2. I calculate abs(xi-x0), i.e., absolute deviations of the x values in
> the data set and a given x0 value.
> 3. I sort the data set according to these deviations. This way the first
> row has the (x0, NA) value.
> 4. I drop the first row.
> 5. I divide all the deviations by the m-th one, where m is the number of
> points used in local fitting - m = floor(n*span) where n is the number of
> points.
> 6. I calculate the "tricube" weights and assign 0's to the negative ones.
> This eliminates all the points except the m points of interest.
> 7. I fit a linear weighted regression using lm.
> 8. I predict y(x0) from this linear model.
> This is basically the same procedure I use to predict at the x values from
> the data set, except for point 4.
>
> I got the R sources for loess but it looks to me like most of the work is
> done in a bunch of Fortran modules. They are very difficult to read and
> understand, especially since they handle multiple x values. A couple of
> things that worry me are parameters in loess.control such as surface and
> cell. They seem to have something to do with predictions but I do not
> account for them in my simple procedure above.
>
> Could anyone shed a light on this problem? Any comment will be
> appreciated.
>
> I apologize in advance if this should have been posted in r-help. I
> figured that I have a better chance asking people who read the r-devel
> group, since they are likely to know more details about inner workings of
> R.
>
> Thanks in advance,
>
> Andy
>
> __________________________________
> Andy Jaworski
> 518-1-01
> Process Laboratory
> 3M Corporate Research Laboratory
> -----
> E-mail: apjaworski_at_mmm.com
> Tel: (651) 733-6092
> Fax: (651) 736-3122
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 26 Jul 2007 - 08:15:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 26 Jul 2007 - 15:07:03 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.