From: Achim Zeileis <Achim.Zeileis_at_wu-wien.ac.at>

Date: Mon, 18 Feb 2008 14:40:07 +0100 (CET)


R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 18 Feb 2008 - 13:44:42 GMT


On Mon, 18 Feb 2008, Sarah J Thomas wrote:

> Hello all:
>
> I have a question regarding the fitted.values returned from the
> zeroinfl() function. The values seem to be nearly identical to those
> fitted.values returned by the ordinary glm(). Why is this, shouldn't
> they be more "zero-inflated"?
>
> I construct a zero-inflated series of counts, called Y, like so:

To make this reproducible, I set the random seed to

set.seed(123)

in advance and then ran your source code

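library(pscl)                     # provides zeroinfl()

b <- c(1.5, -2)                   # coefficients of the count (Poisson) model
g <- c(-3, 1)                     # coefficients of the zero-inflation model
x <- runif(100)                   # x is the covariate
X <- cbind(1, x)
p <- exp(X %*% g) / (1 + exp(X %*% g))   # logit link for the zero-inflation probability
m <- exp(X %*% b)                 # log-link for the mean process of the Poisson
Y <- rep(0, 100)
u <- runif(100)
for (i in 1:100) {
  if (u[i] < p[i]) {
    Y[i] <- 0                     # zero from the inflation process
  } else {
    Y[i] <- rpois(1, m[i])        # count from the Poisson process
  }
}

# now let's compare the fitted.values from zeroinfl()
# and from glm()
z1 <- glm(Y ~ x, family = poisson)
z2 <- zeroinfl(Y ~ x | x)         # poisson is the default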

[snip]

> You can see that they are almost identical... and the fitted.values from
> zeroinfl don't seem to be zero-inflated at all! What is going on?

Well, let's see how zero inflated your observations are:

R> sum(u < p)

[1] 2

Wow, two (!) observations that have been zero-inflated. Let's see what the probability of observing a zero from the Poisson component alone would have been for these two

R> dpois(0, m[u < p])

[1] 0.3147816 0.1409670

which is not so low, in particular for the first one.
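To see why so few observations end up zero-inflated, it may help to evaluate the population quantities implied by the simulation directly. The following is a minimal sketch in base R, assuming the coefficients b and g from the quoted code; the three covariate values in x are illustrative choices, not taken from the original data. It uses the standard zero-inflated Poisson identities P(Y = 0) = p + (1 - p) * exp(-m) and E(Y) = (1 - p) * m.

```r
b <- c(1.5, -2)                  # count-model coefficients from the post
g <- c(-3, 1)                    # zero-inflation coefficients from the post
x <- c(0, 0.5, 1)                # illustrative covariate values (assumption)

p <- plogis(g[1] + g[2] * x)     # inflation probability, same as exp(eta)/(1 + exp(eta))
m <- exp(b[1] + b[2] * x)        # Poisson mean

pois_zero <- dpois(0, m)                  # P(Y = 0) under a plain Poisson, i.e. exp(-m)
zip_zero  <- p + (1 - p) * pois_zero      # P(Y = 0) under the zero-inflated Poisson
zip_mean  <- (1 - p) * m                  # E(Y) under the zero-inflated Poisson

round(cbind(x, p, m, pois_zero, zip_zero, zip_mean), 3)
```

With these coefficients p stays below 0.12 over the whole range of x, so the zero-inflated mean (1 - p) * m differs from the Poisson mean m by at most about 12 percent, which is consistent with the two sets of fitted values looking nearly identical.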

Overall, you've got

R> sum(Y < 1)

[1] 23

zeros in that data set and the expected number of zeros in a Poisson GLM is

R> sum(dpois(0, fitted(z1)))

[1] 23.35615

So you have observed *fewer* zeros than expected by a Poisson GLM. Surely, this is not the kind of data that zero-inflated models have been developed for.
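For contrast, here is a hedged sketch of a data-generating process in which the zero inflation is substantial, using the same check of observed zeros against the zeros expected by a Poisson GLM. It keeps the count coefficients b from the quoted code; the inflation coefficients g_strong, the sample size n, and the seed are hypothetical choices made only for illustration. Only base R is used, so the zero-inflated model itself is not fitted here.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility (assumption)
n <- 1000                        # larger sample than the original 100 (assumption)
x <- runif(n)                    # covariate

b        <- c(1.5, -2)           # count-model coefficients from the post
g_strong <- c(0, 1)              # hypothetical inflation coefficients: p is at least 0.5

p <- plogis(g_strong[1] + g_strong[2] * x)   # inflation probability
m <- exp(b[1] + b[2] * x)                    # Poisson mean

inflated <- runif(n) < p                     # TRUE marks a zero from the inflation process
Y <- ifelse(inflated, 0, rpois(n, m))        # otherwise draw from the Poisson

fit <- glm(Y ~ x, family = poisson)          # plain Poisson GLM, ignoring the inflation

observed_zeros <- sum(Y == 0)
expected_zeros <- sum(dpois(0, fitted(fit))) # zeros the Poisson GLM expects

c(observed = observed_zeros, expected = expected_zeros)
```

In a setup like this the observed number of zeros should clearly exceed the number expected under the Poisson GLM, which is the situation zero-inflated models are meant for.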

> Ultimately I want these fitted.values for a goodness of fit type of test
> to see if the zeroinfl model is needed or not for a given data series.
> With these fitted.values as they are, I am rejecting assumption of a
> zero-inflated model even when the data really are zero-inflated.

Maybe you ought to think about useful data-generating processes first before designing tests or criticizing software...

Z


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 18 Feb 2008 - 14:00:15 GMT.
