# Re: [R] practical help ... solving a system...

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Tue 21 Jun 2005 - 02:02:50 EST

The problem is that "size" in "dbinom" must be an integer:

> dbinom(x=2, size=4.5, prob=.5)

 NaN
Warning message:
NaNs produced in: dbinom(x, size, prob, log)

Consider the following:
> library(MASS)
> set.seed(123)
> x <- rbinom(100, 10, 0.4)
> nx <- max(x)
> fit <- fitdistr(x, dbinom, start=list(prob=0.5),
+ size = nx)
Warning message:
one-diml optimization by Nelder-Mead is unreliable: use optimize in: optim(start, mylogfn, x = x, hessian = TRUE, ...)
> attributes(fit)

\$names
 "estimate" "sd"

\$class
 "fitdistr"
#### From this, we learn that fit\$est is the MLE for prob given size #### Try different a range of values for "size"
> lglk <- rep(NA, 12)

> names(lglk) <- nx:(nx+11)
> for(i in 1:12){

```+ fit <- fitdistr(x, dbinom, start=list(prob=0.5),
+ size = nx+i-1)
+ lglk[i] <- sum(dbinom(x, size=nx+i-1,
+ prob=fit\$estimate, log=TRUE))
+ }
```

There were 15 warnings (use warnings() to see them)
> qchisq(0.95, 1)

 3.841459
> 2*(max(lglk)-lglk)

8 9 10 11 12 13 1.314439853 0.000000000 0.004998298 0.434260975 1.000371029 1.594030787

14 15 16 17 18 19 2.170873826 2.713847684 3.217152046 3.680715184 4.106493564 4.497508846

# 2*log(likelihood ratio) is approximately chi-square, # so a 95% confidence interval for "size" is 8:17.

If I needed this for a scientific research paper, I'd do more research on how to estimate "size" for the binomial -- if I really thought it should be estimated. If I needed an answer today, I'd probably go with what I just did here.

```	  hope this helps,
spencer graves

##################
```

Carsten Steinhoff wrote:
> Hello Spencer,

>
> thank you for your reply. As I've seen you have optimized only one
value of
> the binomial distribution by using Fitdistr.
> My problem is that I have to find both size AND prob.
> The error message always is: error in using optim. non-finite finite
> difference value 
``` >
```

> So I thought to do a MLE "by hand" ...
``` >
```

> Do you have an idea what do modify on the error above?
>
> Greetings, Carsten

Spencer Graves wrote:

```>       "PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html".  If you had provided more
> information about what you tried, it might be easier for someone else to
> offer effective help.  In particular, what did you try with fitdistr,
> and why do you think it didn't work (e.g., what error message did you get)?
>
>       The following is a minor modification of examples in the
> "fitdistr" help page:
>
>  > library(MASS)
>  > set.seed(123)
>  > x <- rbinom(100, 10, 0.4)
>  > (fit <- fitdistr(x, dbinom, start=list(prob=0.5), size=10))
>       prob
>   0.39804687
>  (0.01547943)
> Warning message:
> one-diml optimization by Nelder-Mead is unreliable: use optimize in:
> optim(start, mylogfn, x = x, hessian = TRUE, ...)
>  >
>       The difference between prob=0.4 that I simulated and the
> 0.39804687 estimated is shockingly small.  (No, I didn't run 100
> simulations with different seeds and pick the best one.)
>
>       If I were concerned about the warning, I could list "fitdist" and
> read the code.  I might even copy it into a script file and walk through
> it line by line.  When checked the code for "fitdistr" just now, I
> learned that it calls "optim", and "optim" provides other estimation
> methods, e.g., BFGR and CG.  The following are the results from trying
> those two:
>
> (fit.BFGS <- fitdistr(x, dbinom, start=list(prob=0.5),
> + size=10, method="BFGS"))
>       prob
>   0.39800028
>  (0.01547882)
> Warning messages:
> 1: NaNs produced in: dbinom(x, size, prob, log)
> 2: NaNs produced in: dbinom(x, size, prob, log)
> 3: NaNs produced in: dbinom(x, size, prob, log)
> 4: NaNs produced in: dbinom(x, size, prob, log)
> 5: NaNs produced in: dbinom(x, size, prob, log)
>  > (fit.CG <- fitdistr(x, dbinom, start=list(prob=0.5),
> + size=10, method="CG"))
>       prob
>   0.39800001
>  (0.01547881)
> Warning messages:
> 1: NaNs produced in: dbinom(x, size, prob, log)
> 2: NaNs produced in: dbinom(x, size, prob, log)
> 3: NaNs produced in: dbinom(x, size, prob, log)
> 4: NaNs produced in: dbinom(x, size, prob, log)
> 5: NaNs produced in: dbinom(x, size, prob, log)
>
>       If I'm concerned about these warnings, I can replace "dbinom" by a
> local copy that prints the value of "prob" each time it's called:
>
> dbinom. <- function (x, size, prob, log = FALSE){
>     cat(prob, "")
> .Internal(dbinom(x, size, prob, log))
> }
> (fit.BFGS <- fitdistr(x, dbinom., start=list(prob=0.5),
>     size=10, method="BFGS"))
> 0.5 0.501 0.499 -407.5005 -81.10011 -15.82002 -2.764004 -0.1528009
> 0.3694398 0.3704398 0.3684398 0.3996073 0.4006073 0.3986073 0.3980445
> 0.3990445 0.3970445 0.2133659 0.3611088 0.3906574 0.3965671 0.3977490
> 0.3979854 0.3989854 0.3969854 0.3980003 0.45997 0.4103942 0.4004791
> 0.3984960 0.3980994 0.3980201 0.3980043 0.3980011 0.3980004 0.3980003
> 0.3980003 0.3980003 0.3980003 0.3980003 0.4000003 0.3980003 0.3980003
> 0.3960003       prob
>   0.39800028
>  (0.01547882)
> Warning messages:
> 1: NaNs produced in: dbinom(x, size, prob, log)
> 2: NaNs produced in: dbinom(x, size, prob, log)
> 3: NaNs produced in: dbinom(x, size, prob, log)
> 4: NaNs produced in: dbinom(x, size, prob, log)
> 5: NaNs produced in: dbinom(x, size, prob, log)
>  >
>       It's no wonder dbinom produced NaNs:  It does that when prob < 0.
>
>       hope this helps.
>       spencer graves
>
> p.s.  MASS = "Modern Applied Statistics with S" by Venables and Ripley..
>  Have you seen this book?  I've learned a lot from it.
>
> Carsten Steinhoff wrote:
>
>> Hello,
>>
>> I want to estimate the parameters of a binomial distributed rv using MLE.
>> Other distributions will follow.
>> The equation system to solve is not very complex, but I've never done
>> such
>> work in R and don't have any idea how to start...
>>
>> The system is:
>>
>> (1)     n*P = X
>>
>> (2)    [sum {from j=0 to J-1} Y{j} /(n-j)] = -n * ln (1-X / n)
>>
>>
>> where    * only X is given (empirical mean)
>>              * J is maximum observed
>>              * Y is the number of observations above j
>>              * n and P are the parameters of the binomial distribution to
>> find....
>>
>> Who could help me with an example-solution for my "first" distribution
>> ? I
>> also need a hint how to make the sum-element.
>>
>> Maybe there's another - more simple - way to estimate the parameters...
>> first I tried via "FITDISTR" but without success.
>>
>> Thanx a lot for your help.
>>
>> Carsten
>>
>>     [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help