From: Ben Bolker <bbolker_at_gmail.com>

Date: Tue, 21 Feb 2012 16:57:23 -0500

d <- data.frame(y=rpois(10,1),x=rep(c(1,NA),5),f=rep(1:2,each=5))

## subset AND NA omission

g <- glm(y~x,data=d,subset=(f==2),family=poisson)

poisson()$simulate(g,1) ## works

poisson()$simulate(g,1) ## fails

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Tue 21 Feb 2012 - 22:00:24 GMT

Date: Tue, 21 Feb 2012 16:57:23 -0500

I'm wondering whether anyone has any insight into why the 'simulate' methods for the built-in glm() families (binomial, Poisson, Gamma ...) extract the prior weights using object$prior.weights rather than weights(object,"prior") ?

At first I thought this was so that things work correctly when e.g. subset= and na.action=na.exclude are used. However, the current versions of the functions don't seem to work particularly well when these options are used. A Poisson example, along with a modified function that is na.exclude-safe, is shown below ...

My reason for wanting this is that I'm working on a package that uses glm-like objects that don't have list-like structure, so can't contain $prior.weights elements. If weights were accessed using accessors, then we could use the existing simulate() functions ...

If R-core agreed that this is a shortcoming of the current implementation, all of the simulate() methods would need to be rewritten to account for this (as far as I can see there are four, for binomial/Poisson/Gamma/inverse Gaussian -- the base method in simulate.lm works OK (although it generates unnecessary random deviates corresponding to the NA values in the fitted output).

Actually, an even better design (in my opinion) might be to use predict(object,type="response",...) internally rather than fitted() -- this would, for free, allow the use case

simulate(object,newdata=...)

which would simulate values from the model for a novel set of predictor variables ... (coincidentally, this would also allow our package to use the existing framework more easily).

Are there any circumstances under which predict(object,type="response") is/could be *different* from fitted(object) for a 'glm' object ... ?

cheers

Ben Bolker

d <- data.frame(y=rpois(10,1),x=rep(c(1,NA),5),f=rep(1:2,each=5))

## subset AND NA omission

g <- glm(y~x,data=d,subset=(f==2),family=poisson)

length(fitted(g)) ## 2 length(weights(g)) ## 2 length(g$prior.weights) ## 2

poisson()$simulate(g,1) ## works

## now try na.exclude

g <- glm(y~x,data=d,subset=(f==2),na.action=na.exclude,

family=poisson)

length(fitted(g)) ## 5 length(weights(g)) ## 5 length(g$prior.weights) ## 2

poisson()$simulate(g,1) ## fails

## here's a new, 'safe' version of the Poisson simulate function ...

simnew <- function (object, nsim) {

wts <- weights(object)

if (any(na.omit(wts) != 1))

warning("ignoring prior weights")

ftd <- na.omit(fitted(object))

napredict(na.action(object),

rpois(nsim * length(ftd), ftd)) }

simnew(g,1)

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Tue 21 Feb 2012 - 22:00:24 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

*
Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.
Archive generated by hypermail 2.2.0, at Mon 27 Feb 2012 - 13:50:21 GMT.
*

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel.
Please read the posting
guide before posting to the list.
*