# Re: [R] Making a markov transition matrix

From: <Bill.Venables_at_csiro.au>
Date: Sun 22 Jan 2006 - 22:51:09 EST

That solution for the case 'with gaps' merely omits transitions where the transition information is not for a single time step. (Mine can be modified for this as well - see below.)

But if you know that a firm went from state i in year y to state j in year y+3, say, without knowing the intermediate states, that must tell you something about the 1-step transition matrix as well. How do you use this information?

That's a much more difficult problem, but you can do it using maximum likelihood, for example: you work out how to calculate the likelihood function, and then optimise it. This is getting a bit away from the original 'programming trick' question, but it is an interesting problem that occurs more often than I had realised. I'd be interested in knowing if anyone had done anything slick in this area.
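As a rough illustration of the maximum likelihood idea, here is a minimal sketch, not a tested method. Under the Markov assumption, a transition from state i to state j observed across k years has probability (P^k)[i, j], so a firm's observed sequence contributes the product of such terms over its consecutive observations. The data frame `obs`, the helpers `make.P`, `mat.pow` and `negloglik`, and the softmax parametrisation of the rows are all assumptions made for this example.

```r
# Hypothetical data: each row is one observed transition from state `from`
# to state `to` across `lag` years (intermediate states unknown when lag > 1).
obs <- data.frame(from = c(1, 1, 2, 2, 1, 2),
                  to   = c(1, 2, 2, 1, 2, 2),
                  lag  = c(1, 1, 1, 3, 2, 1))
K <- 2  # number of states, assumed known

# Map an unconstrained vector to a K x K matrix whose rows sum to 1
# (row-wise softmax; an assumed parametrisation, one of several possible).
make.P <- function(theta, K) {
  E <- exp(matrix(theta, K, K))
  E / rowSums(E)
}

# k-th power of a square matrix by repeated multiplication (k >= 1).
mat.pow <- function(P, k) {
  out <- diag(nrow(P))
  for (i in seq_len(k)) out <- out %*% P
  out
}

# Negative log-likelihood: sum of -log (P^k)[from, to] over all rows of obs.
negloglik <- function(theta, obs, K) {
  P <- make.P(theta, K)
  ll <- 0
  for (k in unique(obs$lag)) {
    Pk <- mat.pow(P, k)
    sel <- obs$lag == k
    ll <- ll + sum(log(Pk[cbind(obs$from[sel], obs$to[sel])]))
  }
  -ll
}

fit <- optim(rep(0, K * K), negloglik, obs = obs, K = K, method = "BFGS")
P.hat <- make.P(fit$par, K)  # estimated 1-step transition matrix
P.hat
```

The softmax form keeps every row a valid probability distribution without constrained optimisation, at the cost of one redundant parameter per row. If the estimate sits on the boundary (a transition probability of exactly zero), the parameters drift towards infinity and the optimiser only approaches it.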

Bill Venables.

-----Original Message-----
From: Ajay Narottam Shah [mailto:ajayshah@mayin.org]
Sent: Sunday, 22 January 2006 5:15 PM
To: R-help
Cc: jholtman@gmail.com; Venables, Bill (CMIS, Cleveland)
Subject: Re: [R] Making a markov transition matrix

On Sun, Jan 22, 2006 at 01:47:00PM +1100, Bill.Venables@csiro.au wrote:
> If this is a real problem, here is a slightly tidier version of the
> function I gave on R-help:
>
> transitionM <- function(name, year, state) {
>   raw <- data.frame(name = name, state = state)[order(name, year), ]
>   raw01 <- subset(data.frame(raw[-nrow(raw), ], raw[-1, ]),
>                   name == name.1)
>   with(raw01, table(state, state.1))
> }

To modify this solution for the 'with gaps' case, omitting multiple step transitions, you need to include the year in the 'raw' data frame and then just change the subset condition to

name == name.1 & year == year.1 - 1
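Spelled out, the change described above might look like the following sketch. The function name `transitionM.gaps` and the toy data are made up for illustration; the body is the quoted `transitionM` with `year` added to the data frame and the stricter subset condition.

```r
# Variant of transitionM that keeps `year` and counts only transitions
# between consecutive years within the same firm.
transitionM.gaps <- function(name, year, state) {
  raw <- data.frame(name = name, year = year, state = state)[order(name, year), ]
  # Pair each row with the next row; the second copy's columns get suffix .1
  raw01 <- subset(data.frame(raw[-nrow(raw), ], raw[-1, ]),
                  name == name.1 & year == year.1 - 1)
  with(raw01, table(state, state.1))
}

# Hypothetical example: firm "a" has no observation for year 3.
tab <- transitionM.gaps(name  = c("a", "a", "a", "b", "b"),
                        year  = c(1, 2, 4, 1, 2),
                        state = factor(c("x", "y", "x", "y", "y")))
tab
```

Here the year 2 to year 4 pair for firm "a" is dropped, so only two transitions are counted. Passing `state` as a factor keeps the table square even when some state is never observed as a destination.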

>
> Notice that this does assume there are 'no gaps' in the time series
> within firms, but it does not require that each firm have responses for
> the same set of years.
>
> Estimating the transition probability matrix when there are gaps within
> firms is a more interesting problem, both statistically and, when you
> figure that out, computationally.

With help from Gabor, here's my best effort. It should work even if there are gaps in the time series within firms, and it allows different firms to have responses in different years. It is wrapped up as a function which eats a data frame. Somebody should put this function into Hmisc or gtools or something of the sort.

```
# Problem statement:
#
# You are holding a dataset where firms are observed for a fixed
# (and small) set of years. The data is in "long" format - one
# record for one firm for one point in time. A state variable is
# observed (a factor).
# You wish to make a markov transition matrix about the time-series
# evolution of that state variable.

set.seed(1001)

# Raw data in long format --
raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
                  year=c(83,   84,  85,  86,  83,  84,  85,  86),
                  state=sample(1:3, 8, replace=TRUE)
                  )

transition.probabilities <- function(D, timevar="year",
                                     idvar="name", statevar="state") {
  # Self-merge: match each record (year t) with the same firm's record
  # from year t-1, found via nextt = year + 1. Records with no
  # observation in the previous year drop out of the merge.
  merged <- merge(D, cbind(nextt=D[,timevar] + 1, D),
                  by.x = c(timevar, idvar), by.y = c("nextt", idvar))
  # Cross-tabulate the two state columns (state.x is year t, state.y is
  # year t-1) and transpose so rows are the earlier state.
  t(table(merged[, grep(statevar, names(merged), value = TRUE)]))
}

transition.probabilities(raw, timevar="year", idvar="name", statevar="state")
```
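Despite its name, the function above returns a table of transition counts. One way to turn such counts into estimated probabilities is to divide each row by its total, which `prop.table` does. The sketch below uses a small hand-made count matrix (hypothetical numbers) so that it stands on its own.

```r
# Hypothetical transition counts: rows are the state in year t-1,
# columns are the state in year t.
counts <- matrix(c(3, 1,
                   2, 2), nrow = 2, byrow = TRUE,
                 dimnames = list(from = c("1", "2"), to = c("1", "2")))

# Divide each row by its row total to get estimated transition probabilities.
P <- prop.table(counts, margin = 1)
P
```

A state that never appears as a starting state has a row total of zero, and its row comes out as NaN, so such rows need separate handling.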

--
Ajay Shah
http://www.mayin.org/ajayshah
ajayshah@mayin.org
http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help