Re: [Rd] prior.weights and weights()

From: Ben Bolker <bbolker_at_gmail.com>
Date: Mon, 27 Feb 2012 13:10:43 +0000

Ben Bolker <bbolker <at> gmail.com> writes:

  Bump? (Quoting with "> " removed to make Gmane happy)

  I've since realized that it will be harder to use the built-in $simulate methods in my application than I thought, but I'm still curious about this issue (and might still be able to use them with a slight hack-around if simulate() methods internally used predict(object,type="response",...)) but I'm still curious about this issue.

  And I still think I have shown below that $simulate() doesn't work properly with na.action=na.exclude ...

  Ben Bolker



I'm wondering whether anyone has any insight into why the 'simulate' methods for the built-in glm() families (binomial, Poisson, Gamma ...) extract the prior weights using object$prior.weights rather than weights(object,"prior") ?

  At first I thought this was so that things work correctly when e.g. subset= and na.action=na.exclude are used. However, the current versions of the functions don't seem to work particularly well when these options are used. A Poisson example, along with a modified function that is na.exclude-safe, is shown below ...

  My reason for wanting this is that I'm working on a package that uses glm-like objects that don't have list-like structure, so can't contain $prior.weights elements. If weights were accessed using accessors, then we could use the existing simulate() functions ...

  If R-core agreed that this is a shortcoming of the current implementation, all of the simulate() methods would need to be rewritten to account for this (as far as I can see there are four, for binomial/Poisson/Gamma/inverse Gaussian -- the base method in simulate.lm works OK (although it generates unnecessary random deviates corresponding to the NA values in the fitted output).

  Actually, an even better design (in my opinion) might be to use predict(object,type="response",...) internally rather than fitted() -- this would, for free, allow the use case

simulate(object,newdata=...)

  which would simulate values from the model for a novel set of predictor variables ... (coincidentally, this would also allow our package to use the existing framework more easily).

  Are there any circumstances under which predict(object,type="response") is/could be *different* from fitted(object) for a 'glm' object ... ?

  cheers
    Ben Bolker



d <- data.frame(y=rpois(10,1),x=rep(c(1,NA),5),f=rep(1:2,each=5))
## subset AND NA omission

g <- glm(y~x,data=d,subset=(f==2),family=poisson)
length(fitted(g)) ## 2
length(weights(g)) ## 2
length(g$prior.weights) ## 2

poisson()$simulate(g,1) ## works

## now try na.exclude
g <- glm(y~x,data=d,subset=(f==2),na.action=na.exclude,

         family=poisson)

length(fitted(g)) ## 5
length(weights(g)) ## 5
length(g$prior.weights) ## 2

poisson()$simulate(g,1) ## fails

## here's a new, 'safe' version of the Poisson simulate function ...
simnew <- function (object, nsim) {
  wts <- weights(object)
  if (any(na.omit(wts) != 1))
    warning("ignoring prior weights")
  ftd <- na.omit(fitted(object))
  napredict(na.action(object),

            rpois(nsim * length(ftd), ftd)) }

simnew(g,1)



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon 27 Feb 2012 - 13:14:55 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 27 Feb 2012 - 15:30:21 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive