# Re: [R] Markov transition matrices, missing transitions for certain years

From: Rolf Turner <rolf.turner_at_xtra.co.nz>
Date: Tue, 19 Apr 2011 22:37:07 +1200

Make two assumptions:

(1) The initial state probability distribution (``ispd'') is *NOT* a function of the
transition probability matrix (``tpm'').

(2) The boxes are stochastically independent of each other.

Both of these assumptions may be dubious. The second assumption is the crucial one, and I would guess it to be *highly* dubious. However, without it you simply can't get anywhere.

Subject to these assumptions the maximum likelihood estimates of the entries of the tpm may be found as follows:

Count the number of times that any box is in state "i" at time "t" and in state "j" at time "t+1". Count over all boxes and all times t = 1, 2, ..., m-1,
where you have observations over m years. (You have to stop at m-1 in order to be able to have observations at time t+1.)

Let this count be c_ij. Let c_i. be the sum over j of c_ij.

Let the tpm be P = [p_ij].

Then the maximum likelihood estimate of p_ij is equal to c_ij/c_i.
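
For readers who want the missing step, here is a sketch of why the estimate takes this form. It assumes assumptions (1) and (2) above, writes K for the number of states, and introduces a Lagrange multiplier \lambda_i that is my notation rather than anything in the original argument.

```latex
% Conditional on the initial states, independence across boxes gives
\[
  L(P) \;\propto\; \prod_{i=1}^{K} \prod_{j=1}^{K} p_{ij}^{\,c_{ij}},
  \qquad \text{subject to } \sum_{j=1}^{K} p_{ij} = 1 \text{ for each } i .
\]
% Maximise the log-likelihood row by row with a multiplier \lambda_i:
\[
  \frac{\partial}{\partial p_{ij}}
  \Bigl[ \sum_{j} c_{ij} \log p_{ij} - \lambda_i \Bigl( \sum_{j} p_{ij} - 1 \Bigr) \Bigr]
  = \frac{c_{ij}}{p_{ij}} - \lambda_i = 0
  \;\Longrightarrow\; p_{ij} = \frac{c_{ij}}{\lambda_i} .
\]
% Summing over j and using the constraint gives \lambda_i = c_{i\cdot}, hence
\[
  \hat{p}_{ij} = \frac{c_{ij}}{c_{i\cdot}} .
\]
```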

[The only time that things can go wrong here is if state "i" never appears in any box, at any time t < m. In such a case the p_ij (j = 1, 2, 3, ..., K, where
K is the number of states or species) are simply not estimable from the available data. We never observe state i making a transition to *any* state,
so we cannot estimate the probabilities of such transitions.]
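
To make the non-estimable case concrete, here is a minimal sketch in R. The count matrix is hypothetical (made up for illustration, K = 3), and the names `counts`, `row_totals`, `estimable` and `P_hat` are mine.

```r
# Hypothetical transition counts for K = 3 states; state 2 is never observed
# at any time t < m, so its row of counts is all zero.
counts <- matrix(c(3, 1, 2,
                   0, 0, 0,
                   1, 0, 4), nrow = 3, byrow = TRUE)

row_totals <- rowSums(counts)      # the c_i.
P_hat <- counts / row_totals       # divides row i by c_i.; row 2 becomes 0/0 = NaN

# Flag the rows that cannot be estimated and mark them explicitly as NA.
estimable <- row_totals > 0
P_hat[!estimable, ] <- NA

print(P_hat)
```

Marking such rows as NA, rather than leaving NaN or filling in zeros, keeps it visible that those probabilities are unknown and not estimated to be zero.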

Writing R code to effect this estimation procedure is easy and is left as an exercise for the reader. :-)
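
For anyone who would rather not do the exercise, one possible solution is sketched below. It uses the four sample box histories from the quoted message, assumes the states are coded 1, ..., K, and relies on the two assumptions above; the names `from`, `to`, `counts` and `P_hat` are my own.

```r
# Sample box histories from the quoted question (rows = boxes, columns = years).
boxes <- rbind(b1 = c(1, 1, 4, 2),
               b2 = c(1, 4, 4, 3),
               b3 = c(4, 4, 1, 2),
               b4 = c(3, 1, 1, 1))
colnames(boxes) <- c("y97", "y98", "y99", "y00")

K <- 4             # number of states (species)
m <- ncol(boxes)   # number of years observed

# State at time t and at time t+1, pooled over all boxes and t = 1, ..., m-1.
# Fixing the factor levels keeps all K states in the table even if unobserved.
from <- factor(as.vector(boxes[, -m]), levels = 1:K)
to   <- factor(as.vector(boxes[, -1]), levels = 1:K)

# c_ij: rows index the state at time t, columns the state at time t+1.
counts <- table(from = from, to = to)

# p_ij estimated by c_ij / c_i. ; rows with c_i. = 0 are not estimable (NA).
row_totals <- rowSums(counts)
P_hat <- unclass(counts) / row_totals
P_hat[row_totals == 0, ] <- NA

print(counts)
print(round(P_hat, 3))
```

In this small sample, state 2 shows up only in the final year, so row 2 of `P_hat` comes out as NA, which is exactly the non-estimable case described above.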

cheers,

Rolf Turner

On 19/04/11 12:47, Abby_UNR wrote:
> Hi all,
> I am working with nest box occupancy data for birds and would like to
> construct a Markov transition matrix, to derive transition probabilities for
> ALL years of the study (not separate sets of transition probabilities for
> each time step). The actual dataset I'm working with is 125 boxes over 14
> years that can be occupied by 7 different species, though I have provided a
> slimmed down portion for this question...
> -
> A box can be in 1 of 4 "states" (i.e. bird species): 1,2,3,4
> Included here are 4 "box histories" over 4 years (y97, y98, y99, y00)
>
> These are the box histories
>> b1<- c(1,1,4,2)
>> b2<- c(1,4,4,3)
>> b3<- c(4,4,1,2)
>> b4<- c(3,1,1,1)
>> boxes<- data.frame(rbind(b1,b2,b3,b4))
>> colnames(boxes)<- c("y97","y98","y99","y00")
>> boxes
>    y97 y98 y99 y00
> b1   1   1   4   2
> b2   1   4   4   3
> b3   4   4   1   2
> b4   3   1   1   1
> My problem is that there are 16 possible transitions, but not all possible
> transitions occur at each time step. Therefore, I don't think I could do
> something easy like create a table for each time step and add them together,
> for example:
>
>> t1.boxes<- table(boxes$y98, boxes$y97)
>> t1.boxes
>
>     1 3 4
>   1 1 1 0
>   4 1 0 1
>> t2.boxes<- table(boxes$y99, boxes$y98)
>> t2.boxes
>
>     1 4
>   1 1 1
>   4 1 1
> t1.boxes and t2.boxes could not be added together to calculate the frequency
> of each transition occurring because they are of different dimensions. I'm
> not quite sure how to deal with this, I have attempted to write a function
> (shown below), though I'm not sure if it is needed, I am a bit new to the
> programming world. If I could get some help either with the function or a
> way around it that would be most appreciated! Thank you!
>
> --------------
> Function requires the commands already listed above:
>
> FMAT<- matrix(0, nrow=4, ncol=4, byrow=TRUE)
> #This is the matrix that will store the frequency of each possible
> #transition occurring over the 4 years
>
> nboxes<- 4
> nyears<- 4
>
> for(row in 1:nboxes)
> {
>   for(col in 1:(nyears-1))
>   {
>     FMAT[boxes[row,col+1], boxes[row,col]]<- boxes[boxes[row, col+1],
>     boxes[row,col]]
>     #This is the line of code I have been struggling with and am unsure about. I
>     #have tried
>     #various versions of this and keep getting an assortment of error messages.
>   }
> }
>
> FMAT
>
> }
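
As for the two specific difficulties in the quoted message (tables of different dimensions, and the assignment inside the loop), here is a hedged sketch of one way around each. It keeps the questioner's orientation (rows = state in the later year, columns = state in the earlier year); the helper `step_table` and the name `FMAT_tab` are hypothetical names of mine.

```r
# Sample data from the quoted message.
boxes <- data.frame(rbind(b1 = c(1, 1, 4, 2),
                          b2 = c(1, 4, 4, 3),
                          b3 = c(4, 4, 1, 2),
                          b4 = c(3, 1, 1, 1)))
colnames(boxes) <- c("y97", "y98", "y99", "y00")
states <- 1:4

# Fixing the factor levels gives every per-step table the same 4 x 4 shape,
# so the tables can be added even when some transitions never occur.
step_table <- function(next_year, this_year) {
  table(factor(boxes[[next_year]], levels = states),
        factor(boxes[[this_year]], levels = states))
}
FMAT_tab <- step_table("y98", "y97") +
            step_table("y99", "y98") +
            step_table("y00", "y99")

# Loop version: add one to the matching cell for each observed transition.
FMAT <- matrix(0, nrow = 4, ncol = 4)
for (row in 1:nrow(boxes)) {
  for (col in 1:(ncol(boxes) - 1)) {
    to   <- boxes[row, col + 1]   # state in the later year
    from <- boxes[row, col]       # state in the earlier year
    FMAT[to, from] <- FMAT[to, from] + 1
  }
}

print(FMAT_tab)
print(FMAT)
```

Both approaches give the same counts. Because the columns here index the earlier state, this matrix is the transpose of the c_ij described at the top, so the estimates are obtained by dividing each column (not each row) by its sum.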

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 19 Apr 2011 - 10:39:26 GMT

