# density(kernel = "cosine") .. the `wrong cosine' ..

Subject: density(kernel = "cosine") .. the `wrong cosine' ..
From: Martin Maechler (maechler@stat.math.ethz.ch)
Date: Wed 01 Dec 1999 - 19:56:59 EST

Message-ID: <14404.61675.357953.94472@gargle.gargle.HOWL>

I'm in teaching mode, kernel densities.

{History: density() was newly introduced in version 0.15, 19 Dec 1996;
most probably by Ross or Robert
}

When I was telling the students about different kernels (and why their
choice is not so important, and "equivalent bandwidths" etc., etc.),
I wondered about the "Cosine" in my teaching notes which
is defined there as

k(x) = pi/4 * cos(pi/2 * x) * I{ |x| <= 1 }
i.e. in R
Kcos <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)

Now, R instead has the following (with bandwidth h <- bw/1.135724, which
makes the bandwidth Gaussian-equivalent;
here simply h == 1/pi, to be comparable with the above):

Kcosine <- function(x) ifelse(abs(x) < 1, (1+cos(x*pi))/2 , 0)
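
As a quick numerical check (a sketch, not taken from the R sources): the constant 1.135724 appears to be sqrt(pi^2/3 - 2), the standard deviation of the raised cosine (1 + cos(x))/(2*pi) on [-pi, pi], which would explain why dividing bw by it gives a Gaussian-equivalent bandwidth. With h == 1/pi, the kernel above should then have standard deviation 1.135724/pi. The name `sd_K` is introduced here for illustration only.

```r
# R's current kernel, rescaled to support [-1, 1] (i.e. h == 1/pi)
Kcosine <- function(x) ifelse(abs(x) < 1, (1 + cos(x * pi)) / 2, 0)

# standard deviation of the kernel, by numerical integration
sd_K <- sqrt(integrate(function(x) x^2 * Kcosine(x), -1, 1)$value)

# the three numbers should agree to about six digits
print(c(numeric = sd_K,
        closed  = sqrt(pi^2 / 3 - 2) / pi,
        from_bw = 1.135724 / pi))
```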

I've looked in Dave Scott's book and in Haerdle's "Smoothing... in S"
(Silverman doesn't mention any cosine kernel),
and both define the cosine kernel as I have it in my notes.

With the above R code, look at

x <- seq(-1.2,1.2,len=501)
matplot(x, cbind(Kcos(x),Kcosine(x)), type='l', lty=1)
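
Before comparing shapes, it may help to confirm that both functions are proper kernels on the same support. The following is a minimal sketch that repeats the two definitions from above so it runs on its own; `area_cos` and `area_cosine` are names made up for this check.

```r
Kcos    <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)
Kcosine <- function(x) ifelse(abs(x) <  1, (1 + cos(x * pi)) / 2, 0)

# both should integrate to 1 over their common support [-1, 1]
area_cos    <- integrate(Kcos,    -1, 1)$value
area_cosine <- integrate(Kcosine, -1, 1)$value
print(c(Kcos = area_cos, Kcosine = area_cosine))

# heights at the mode: pi/4 for the literature version, 1 for R's version
print(c(Kcos = Kcos(0), Kcosine = Kcosine(0)))
```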

The big difference :

- R's version is smooth (differentiable at the border of support)
- Scott's (not really "his", of course!) version is not differentiable
but looks much closer to the Epanechnikov kernel and is hence almost
as `good' (less than half a percent of MSE loss w.r.t Epanechnikov).
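
Both points can be checked numerically. The sketch below assumes that "MSE loss" refers to the usual asymptotic efficiency relative to Epanechnikov, computed from C(K) = sqrt(int x^2 K) * int K^2; that formula is my reading, since the definition is not spelled out here. The helpers `slope_in` and `CK` are hypothetical names.

```r
Kcos    <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)
Kcosine <- function(x) ifelse(abs(x) <  1, (1 + cos(x * pi)) / 2, 0)
Kepa    <- function(x) ifelse(abs(x) <= 1, 3/4 * (1 - x^2), 0)

# one-sided slope just inside the right border of the support;
# outside the support the slope is 0 for both kernels
eps <- 1e-6
slope_in <- function(K) (K(1) - K(1 - eps)) / eps
print(c(Kcos = slope_in(Kcos), Kcosine = slope_in(Kcosine)))
# Kcos arrives at the border with slope near -pi^2/8 (a kink),
# Kcosine with slope near 0 (smooth join)

# assumed efficiency constant: smaller C(K) is better
CK <- function(K) {
  m2 <- integrate(function(x) x^2 * K(x), -1, 1)$value  # second moment
  r  <- integrate(function(x) K(x)^2,     -1, 1)$value  # roughness int K^2
  sqrt(m2) * r
}

# efficiency relative to Epanechnikov (1 = as good as Epanechnikov)
eff <- c(Kcos = CK(Kepa) / CK(Kcos), Kcosine = CK(Kepa) / CK(Kcosine))
print(eff)
```

Under this definition the literature cosine should come out within half a percent of Epanechnikov, with R's smooth version somewhat below it.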

Problem:

- An average user knowing some statistics literature will most probably
assume that a "cosine" kernel means the one in the literature,
*NOT* the one we have in R now.

Proposition / Possibilities / RFC [= Request For Comments] :

- We CHANGE the behavior of density(* , kernel="cosine")
to use the cosine from the literature.

- provide the current "cosine" as kernel = "smoothcosine"
{I'd like to keep the possibility of 1-initial-letter abbreviation}
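
The one-letter abbreviation would keep working with the proposed name as long as every kernel name starts with a different letter. A minimal sketch using match.arg() follows; `pick_kernel` and its list of choices are hypothetical, assumed for illustration, and are not the actual density() source.

```r
# hypothetical argument handling; not the actual density() code
pick_kernel <- function(kernel = c("gaussian", "rectangular", "triangular",
                                   "cosine", "smoothcosine")) {
  match.arg(kernel)   # accepts any unique prefix of one of the choices
}

print(pick_kernel("c"))   # "cosine"
print(pick_kernel("s"))   # "smoothcosine"
print(pick_kernel())      # "gaussian", the first choice, is the default
```

Adding "epanechnikov" and "biweight" to the choices would not disturb this, since "e" and "b" are not yet used as initial letters in this assumed list.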

Enhancement (easy, I'll do that):

- We further provide both
Epanechnikov and "quartic" aka "biweight" additionally
in any case.
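
For reference, the two additional kernels in their standard textbook form on [-1, 1], written in the same style as Kcos above. This is a sketch of the kernel shapes only (the names `Kepa`, `Kbiw` and `areas` are mine), not of how they would be scaled by bw inside density().

```r
# Epanechnikov and biweight ("quartic") kernels on support [-1, 1]
Kepa <- function(x) ifelse(abs(x) <= 1, 3/4   * (1 - x^2),   0)
Kbiw <- function(x) ifelse(abs(x) <= 1, 15/16 * (1 - x^2)^2, 0)

# both should integrate to 1
areas <- c(Kepa = integrate(Kepa, -1, 1)$value,
           Kbiw = integrate(Kbiw, -1, 1)$value)
print(areas)
```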

Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO D10 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><
