From: Achim Zeileis <Achim.Zeileis_at_uibk.ac.at>

Date: Sun, 13 Mar 2011 19:28:05 +0100

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 13 Mar 2011 - 18:34:26 GMT

Date: Sun, 13 Mar 2011 19:28:05 +0100

Ben,

thanks for your analysis...I also just sent a message with some similar (and some different) ideas.

On Sun, 13 Mar 2011, Ben Bolker wrote:

> The problem seems to be that the algorithm for coming up with a

*> starting guess for the phi (dispersion) parameter is getting a negative
**> number.
*

Yes, as I had written earlier today for the regression on an intercept only.

> It's not all that easy to figure this out ...

Indeed. I've already added a better warning and an ad hoc workaround to the devel version of betareg.

thx,

Z

> The data set is a

*> little bit nasty (lots of points stacked on the equivalent of (0,0)),
**> but not pathological.
**> I'm cc'ing the maintainer of the package --
**>
**> adding the lines
**>
**> if (sum(phi_y)<0) {
**> stop("bad estimated start value for phi: consider setting start
**> values manually\n",
**> "(see ?betareg.control)\n",
**> "Estimated starting values for
**> mean:\n",paste(beta,collapse=","))
**> }
**>
**> at line 162 of betareg.R in the current CRAN version provides
**> a more informative error message in this case.
**>
**> See below for solutions.
**> =============
**>
**>
**> results <- read.csv2("betareg_tmp.csv")
**> results$drugcat <- cut(results$drug,c(0,0.005,0.06,0.17))
**> table(results$drug)
**> table(results$alcoh)
**> table(results$cond)
**> ## shows that fairly large fractions of the data are
**> ## in the lowest category:
**> ## 165/209=0.001 'drug'
**> ## 54/209=0.001 'alcoh'
**> ## 38/209=0.001 'cond'
**> ## so this will be a fairly challenging problem in any case
**>
**> library(ggplot2)
**>
**> ggplot(results,aes(x=alcoh,y=cond))+stat_sum(aes(size=..n..),alpha=0.7)+
**> facet_wrap(~drugcat)+theme_bw()
**>
**>
**> library(betareg)
**> ## set phi link to logarithmic
**> ## basic problem (digging through betareg.fit etc.) is
**> ## that initial estimate of phi, based on
**> ## linear model of logit(cond) ~ alcoh + cond, is NEGATIVE ...
**> ## doesn't seem to be any way to override this starting value
**> ## brute force
**>
**> try(gyl<-betareg(cond ~ alcoh + drug, data=results,
**> link.phi="log"))
**>
**> ## pick through, debugging ... find starting values used
**>
**> svec <- c(-1.6299469,0.8048446,1.7071124,0)
**> gyl<-betareg(cond ~ alcoh + drug, data=results,
**> link.phi="log",
**> control=betareg.control(start=svec))
**>
**> ## would work fine with more generic starting values
**>
**> svec2 <- c(qlogis(mean(results$cond)),0,0,0)
**> gyl2<-betareg(cond ~ alcoh + drug, data=results,
**> link.phi="log",
**> control=betareg.control(start=svec2))
**>
**> ## before I got that to work, I also tried this (which
**> ## will be slower and less efficient but is a useful
**> ## alternative
**>
**> library(bbmle)
**>
**> ## define a variant parameterization of the beta distribution with
**> ## m=a/(a+b), phi=(a+b)
**> dbeta2 <- function(x,m,phi,log=FALSE) {
**>
**> a <- m*phi
**> b <- phi*(1-m)
**> dbeta(x,shape1=a,shape2=b,log=log)
**> }
**>
**> m1 <- mle2(cond~dbeta2(m=plogis(mu),phi=exp(logphi)),
**> parameters=list(mu~alcoh+drug),
**> data=results,
**> start=list(mu=qlogis(mean(results$cond)),logphi=0))
**> summary(m1)
**> p1 <- profile(m1)
**> plot(p1,show.points=TRUE)
**> confint(p1)
**> confint(m1,method="quad") ## not much difference
**>
**> coef(m1)
**> coef(gyl)
**>
**> On 11-03-13 12:59 PM, Vlatka Matkovic Puljic wrote:
**>> Sorry, here is my data (attached).
**>>
**>> 2011/3/12 Ben Bolker <bbolker_at_gmail.com <mailto:bbolker_at_gmail.com>>
**>>
**>> Vlatka Matkovic Puljic <v.matkovic.puljic <at> gmail.com
**>> <http://gmail.com>> writes:
**>>
**>> >
**>> > That was also my first thought.
**>> > But I guess it has something to do with W and phihat
**>> > (which I'm struggling to check
**>>
**>> Again, it would help to post a reproducible example ...
**>> hard to debug/diagnose by remote control. If you can't
**>> possibly post the data to the list, or put them on a web
**>> site somewhere, or randomize them a bit so you're not
**>> giving anything away, or find a simulated example that
**>> shows the same problem, you could as a last resort send them
**>> to me.
**>>
**>> Ben Bolker
**>>
**>> ______________________________________________
**>> R-help_at_r-project.org <mailto:R-help_at_r-project.org> mailing list
**>> https://stat.ethz.ch/mailman/listinfo/r-help
**>> PLEASE do read the posting guide
**>> http://www.R-project.org/posting-guide.html
**>> and provide commented, minimal, self-contained, reproducible code.
**>>
**>>
**>>
**>>
**>> --
**>> **************************
**>> Vlatka Matkovic Puljic
**>> +32/ 485/ 453340
**>>
**>
**>
*

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 13 Mar 2011 - 18:34:26 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Mon 14 Mar 2011 - 09:10:21 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*