Re: [Rd] gaussian family change suggestion

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue 11 Apr 2006 - 14:59:53 GMT

On Tue, 11 Apr 2006, Simon Wood wrote:

>>> Currently the `gaussian' family's initialization code signals an error if
>>> any response data are zero or negative and a log link is used. Given that
>>> zero or negative response data are perfectly legitimate under the GLM
>>> fitted using `gaussian("log")', this seems a bit unsatisfactory. Might
>>> it be worth changing it?
>>
>> Well, that's not really true: it says it is unable to find starting
>> values, and asks you to provide some. I don't really see why your choices
>> are satisfactory: what happens if all the data are negative for example?
>
> - then the suggested code traps it.

Yes, but why (it can happen too)?

>> We could try even harder, but code that is almost never used tends to get
>> broken whilst no one is testing it. So if you want to pursue it I think
>> we need a comprehensive test suite.
>
> - well, I guess, but I wouldn't have thought that zeros in data modelled
> using `gaussian("log")' is such a rare occurance is it? [Or did you mean
> that `gaussian("log")' is almost never used, and should hence be kept
> simple].

I meant initialization code for this case would rarely get used, as I suppose those people who do this are used to providing starting values (or there are few or no such people).

>
> I suppose there are arguments both ways...
>
> best,
> Simon
>
>>
>>> The current offending code from `gaussian' is:
>>>
>>> initialize = expression({
>>> n <- rep.int(1, nobs)
>>> if (is.null(etastart) && is.null(start) && is.null(mustart) &&
>>> ((family$link == "inverse" && any(y == 0)) ||
>>> (family$link == "log" && any(y <= 0)))) stop(
>>> "cannot find valid starting values: please specify some")
>>> mustart <- y
>>> })
>>>
>>> A possible replacement would be ...
>>>
>>> initialize = expression({
>>> n <- rep.int(1, nobs)
>>> if (is.null(etastart) && is.null(start) && is.null(mustart) &&
>>> ((family$link == "inverse" && all(y == 0)) ||
>>> (family$link == "log" && all(y <= 0)))) stop(
>>> "cannot find valid starting values: please specify some")
>>> mustart <- y
>>> if (family$link=="log") {
>>> iy <- y<=0
>>> if (sum(iy)) mustart[iy] <- min(y[!iy])*.5
>>> } else if (family$link=="inverse") {
>>> iy <- y==0
>>> if (sum(iy)) mustart[iy] <- min(abs(y[!iy]))*.5
>>> }
>>> })
>>>
>>> best,
>>> Simon
>>>
>>>
>>>> - Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY
>>>> - +44 (0)1225 386603 www.maths.bath.ac.uk/~sw283/
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>
>> --
>> Brian D. Ripley, ripley@stats.ox.ac.uk
>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel: +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK Fax: +44 1865 272595
>>
>
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed Apr 12 01:29:31 2006

This archive was generated by hypermail 2.1.8 : Tue 11 Apr 2006 - 18:17:01 GMT