Application to Markov Chains

Many natural phenomena progress through various stages and can be in a variety of states at each stage. For example, the weather in a given city progresses day by day and, on any given day, may be sunny or rainy. Here the states are “sun” and “rain,” and the weather progresses from one state to another in daily stages. Another example might be a football team: The stages of its evolution are the games it plays, and the possible states are “win,” “draw,” and “loss.”

The general setup is as follows: A “system” evolves through a series of “stages,” and at any stage it can be in any one of a finite number of “states.” At any given stage, the state to which it will go at the next stage depends on the past and present history of the system—that is, on the sequence of states it has occupied to date. In many important cases, however, the state at the next stage depends only on the present state and not on how that state was reached; such a system is called a Markov chain.

(Note: The name honors Andrei Andreyevich Markov (1856–1922) who was a professor at the university in St. Petersburg, Russia.)

Even in the case of a Markov chain, the state the system will occupy at any stage is determined only in terms of probabilities. In other words, chance plays a role. For example, if a football team wins a particular game, we do not know whether it will win, draw, or lose the next game. On the other hand, we may know that the team tends to persist in winning streaks; for example, if it wins one game it may win the next game a certain fraction of the time, lose another fraction of the time, and draw the remaining fraction of the time. These fractions are called the probabilities of these various possibilities. Similarly, if the team loses, it may lose the next game with probability $\frac{1}{2}$ (that is, half the time), and win or draw the next game with certain known probabilities. The probabilities of the various outcomes after a drawn game will also be known.

We shall treat probabilities informally here: The probability that a given event will occur is the long-run proportion of the time that the event does indeed occur. Hence, all probabilities are numbers between $0$ and $1$. A probability of $0$ means the event is impossible and never occurs; events with probability $1$ are certain to occur.

If a Markov chain is in a particular state, the probabilities that it goes to the various states at the next stage of its evolution are called the transition probabilities for the chain, and they are assumed to be known quantities. To motivate the general conditions that follow, consider the following simple example. Here the system is a man, the stages are his successive lunches, and the states are the two restaurants he chooses.

Example 007199 incorporates most of the essential features of all Markov chains. The general model is as follows: The system evolves through various stages and at each stage can be in exactly one of $n$ distinct states. It progresses through a sequence of states as time goes on. If a Markov chain is in state $j$ at a particular stage of its development, the probability $p_{ij}$ that it goes to state $i$ at the next stage is called the transition probability. The $n \times n$ matrix $P = \left[ p_{ij} \right]$ is called the transition matrix for the Markov chain. The situation is depicted graphically in the diagram.
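Written out in full, the transition matrix has the form below; column $j$ lists the probabilities of moving from state $j$ to states $1, 2, \dots, n$ at the next stage:
\begin{equation*}
P = \begin{bmatrix}
p_{11} & p_{12} & \cdots & p_{1n} \\
p_{21} & p_{22} & \cdots & p_{2n} \\
\vdots & \vdots &        & \vdots \\
p_{n1} & p_{n2} & \cdots & p_{nn}
\end{bmatrix}
\end{equation*}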

We make one important assumption about the transition matrix $P$: It does not depend on which stage the process is in. This assumption means that the transition probabilities are independent of time—that is, they do not change as time goes on. It is this assumption that distinguishes Markov chains in the literature of this subject.

Consider the $j$th column of the transition matrix $P$. \begin{equation*} \begin{bmatrix} p_{1j} \\ p_{2j} \\ \vdots \\ p_{nj} \end{bmatrix} \end{equation*} If the system is in state $j$ at some stage of its evolution, the transition probabilities $p_{1j}, p_{2j}, \dots, p_{nj}$ represent the fractions of the time that the system will move to state 1, state 2, …, state $n$, respectively, at the next stage. We assume that it has to go to some state at each transition, so the sum of these probabilities is $1$: \begin{equation*} p_{1j} + p_{2j} + \cdots + p_{nj} = 1 \quad \mbox{for each } j \end{equation*} Thus, the columns of $P$ all sum to $1$ and the entries of $P$ lie between $0$ and $1$. Hence $P$ is called a stochastic matrix.
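For readers who wish to experiment numerically, the following minimal sketch checks these two conditions for a small hypothetical $2$-state transition matrix (the numerical values are illustrative assumptions, not data from Example 007199):

```python
import numpy as np

# Hypothetical 2-state transition matrix (illustrative values only).
# Entry P[i, j] is the probability of moving from state j to state i.
P = np.array([[0.9, 0.3],
              [0.1, 0.7]])

# A stochastic matrix has entries between 0 and 1 and columns that each sum to 1.
entries_ok = np.all((P >= 0) & (P <= 1))
columns_ok = np.allclose(P.sum(axis=0), 1.0)

print("entries between 0 and 1:", entries_ok)  # True
print("columns sum to 1:", columns_ok)         # True
```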

As in Example 007199, we introduce the following notation: Let $s_{i}^{(m)}$ denote the probability that the system is in state $i$ after $m$ transitions. The matrices \begin{equation*}{\bf s}_m = \begin{bmatrix} s_{1}^{(m)} \\ s_{2}^{(m)} \\ \vdots \\ s_{n}^{(m)} \end{bmatrix} \quad m = 0, 1, 2, \dots \end{equation*} are called the state vectors for the Markov chain. Note that the sum of the entries of ${\bf s}_m$ must equal $1$ because the system must be in some state after $m$ transitions. The matrix ${\bf s}_0$ is called the initial state vector for the Markov chain and is given as part of the data of the particular chain. For example, if the chain has only two states, then an initial vector ${\bf s}_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ means that it started in state 1. If it started in state 2, the initial vector would be ${\bf s}_0 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. If ${\bf s}_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$, it is equally likely that the system started in state 1 or in state 2.

Theorem 007270. If $P$ is the transition matrix of a Markov chain and ${\bf s}_m$ is the state vector at stage $m$, then ${\bf s}_{m+1} = P{\bf s}_m$ for each $m = 0, 1, 2, \dots$.

Heuristic Proof
Suppose that the Markov chain has been run $N$ times, each time starting with the same initial state vector. Recall that $p_{ij}$ is the proportion of the time the system goes from state $j$ at some stage to state $i$ at the next stage, whereas $s_{i}^{(m)}$ is the proportion of the time it is in state $i$ at stage $m$. Hence \begin{equation*} s_{i}^{(m+1)}N \end{equation*} is (approximately) the number of times the system is in state $i$ at stage $m+1$. We are going to calculate this number another way. The system got to state $i$ at stage $m+1$ through some other state (say state $j$) at stage $m$. The number of times it was in state $j$ at that stage is (approximately) $s_{j}^{(m)}N$, so the number of times it got to state $i$ via state $j$ is $p_{ij}s_{j}^{(m)}N$. Summing over $j$ gives the number of times the system is in state $i$ (at stage $m+1$). This is the number we calculated before, so \begin{equation*} s_{i}^{(m+1)}N = p_{i1}s_{1}^{(m)}N + p_{i2}s_{2}^{(m)}N + \cdots + p_{in}s_{n}^{(m)}N \end{equation*} Dividing by $N$ gives $s_{i}^{(m+1)} = p_{i1}s_{1}^{(m)} + p_{i2}s_{2}^{(m)} + \cdots + p_{in}s_{n}^{(m)}$ for each $i$, and this can be expressed as the matrix equation ${\bf s}_{m+1} = P{\bf s}_m$.
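The componentwise formula and the matrix form say the same thing. As a sanity check, the sketch below computes $s_{i}^{(m+1)}$ both as the sum $p_{i1}s_{1}^{(m)} + \cdots + p_{in}s_{n}^{(m)}$ and as the product $P{\bf s}_m$, using the same hypothetical matrix as above and an illustrative state vector:

```python
import numpy as np

# Hypothetical transition matrix and current state vector (illustrative values only).
P = np.array([[0.9, 0.3],
              [0.1, 0.7]])
s_m = np.array([0.6, 0.4])

# Componentwise: s_i^(m+1) = p_i1 * s_1^(m) + ... + p_in * s_n^(m)
componentwise = np.array([sum(P[i, j] * s_m[j] for j in range(P.shape[1]))
                          for i in range(P.shape[0])])

# Matrix form: s_{m+1} = P s_m
matrix_form = P @ s_m

print(componentwise)                            # [0.66 0.34]
print(np.allclose(componentwise, matrix_form))  # True
```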

If the initial probability vector ${\bf s}_0$ and the transition matrix $P$ are given, Theorem 007270 gives ${\bf s}_1, {\bf s}_2, {\bf s}_3, \dots$, one after the other, as follows: \begin{equation*} \begin{array}{c}{\bf s}_{1} = P{\bf s}_0 \\{\bf s}_{2} = P{\bf s}_1 \\{\bf s}_{3} = P{\bf s}_2 \\ \vdots \end{array} \end{equation*} Hence, the state vector ${\bf s}_m$ is completely determined for each $m$ by $P$ and ${\bf s}_0$.
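To make the iteration concrete, here is a minimal sketch that starts a hypothetical two-state chain in state 1 and applies ${\bf s}_{m+1} = P{\bf s}_m$ repeatedly (the matrix entries are illustrative assumptions, not the data of Example 007199):

```python
import numpy as np

# Hypothetical 2-state transition matrix (illustrative values only).
P = np.array([[0.9, 0.3],
              [0.1, 0.7]])

# Initial state vector: the chain starts in state 1.
s = np.array([1.0, 0.0])

# Compute s_1, s_2, ..., s_10 by repeated multiplication s_{m+1} = P s_m.
for m in range(1, 11):
    s = P @ s
    print(f"s_{m} = {s}")  # every state vector still sums to 1
```

For this particular matrix the printed vectors settle down quickly, which anticipates the discussion of steady-state vectors below.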

Steady State Vector

Another phenomenon that was observed in Example 007199 can be expressed in general terms. The state vectors ${\bf s}_m$ were calculated in that example and were found to “approach” a limiting vector ${\bf s}$. This means that the first component of ${\bf s}_m$ becomes and remains very close to the first component of ${\bf s}$ as $m$ becomes large, whereas the second component gets close to the second component of ${\bf s}$ as $m$ increases. When this is the case, we say that ${\bf s}_m$ converges to ${\bf s}$. For large $m$, then, there is very little error in taking ${\bf s}_m = {\bf s}$, so the long-term probability that the system is in state 1 is the first entry of ${\bf s}$, whereas the probability that it is in state 2 is the second entry. In Example 007199, enough state vectors were computed for the limiting vector ${\bf s}$ to be apparent. However, there is a better way to do this that works in most cases.

Suppose $P$ is the transition matrix of a Markov chain, and assume that the state vectors ${\bf s}_m$ converge to a limiting vector ${\bf s}$. Then ${\bf s}_m$ is very close to ${\bf s}$ for sufficiently large $m$, so ${\bf s}_{m+1}$ is also very close to ${\bf s}$. Thus, the equation ${\bf s}_{m+1} = P{\bf s}_m$ from Theorem 007270 is closely approximated by \begin{equation*}{\bf s} = P{\bf s} \end{equation*} so it is not surprising that ${\bf s}$ should be a solution to this matrix equation. Moreover, it is easily solved because it can be written as a system of homogeneous linear equations \begin{equation*} (I - P){\bf s} ={\bf 0} \end{equation*} with the entries of ${\bf s}$ as variables.

In Example 007199, the general solution to $(I - P){\bf s} = {\bf 0}$ is a one-parameter family ${\bf s} = t\,{\bf v}$, where $t$ is a parameter and ${\bf v}$ is a fixed nonzero solution. But if we insist that the entries of ${\bf s}$ sum to $1$ (as must be true of all state vectors), the value of $t$ is determined and we obtain the same limiting vector ${\bf s}$ found before.
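One way to carry out this computation numerically is to append the condition that the entries sum to $1$ to the homogeneous system $(I - P){\bf s} = {\bf 0}$ and solve the combined system; the sketch below does this with a least-squares solve, again using the hypothetical matrix from the earlier sketches:

```python
import numpy as np

# Hypothetical 2-state transition matrix (illustrative values only).
P = np.array([[0.9, 0.3],
              [0.1, 0.7]])
n = P.shape[0]

# Stack the homogeneous equations (I - P)s = 0 with the normalization sum(s) = 1,
# then solve the overdetermined system in the least-squares sense.
A = np.vstack([np.eye(n) - P, np.ones((1, n))])
b = np.append(np.zeros(n), 1.0)
s, *_ = np.linalg.lstsq(A, b, rcond=None)

print("steady-state vector:", s)  # approximately [0.75 0.25]
print("check P s = s:", P @ s)    # agrees with s up to rounding
```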

All this is predicated on the existence of a limiting vector for the sequence of state vectors of the Markov chain, and such a vector may not always exist. However, it does exist in one commonly occurring situation. A stochastic matrix $P$ is called regular if some power $P^m$ of $P$ has every entry greater than zero. The matrix $P$ of Example 007199 is regular (in this case, each entry of $P$ is positive), and the general theorem (Theorem 007400) is as follows: If $P$ is the transition matrix of a Markov chain and $P$ is regular, then there is a unique column ${\bf s}$ such that (1) $P{\bf s} = {\bf s}$ and (2) the entries of ${\bf s}$ are positive and sum to $1$; moreover, the state vectors ${\bf s}_m$ converge to ${\bf s}$ no matter which initial state vector ${\bf s}_0$ is used.

This theorem will not be proved here. The interested reader can find an elementary proof in J. Kemeny, H. Mirkil, J. Snell, and G. Thompson, Finite Mathematical Structures (Englewood Cliffs, N.J.: Prentice-Hall, 1958).
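Regularity can also be tested numerically by computing successive powers of $P$ and checking whether some power has all entries strictly positive. A minimal sketch follows; the cut-off on the number of powers tried is an arbitrary illustrative choice, and both matrices below are hypothetical examples:

```python
import numpy as np

def is_regular(P, max_power=50):
    """Return True if some power P^m (1 <= m <= max_power) has all entries > 0.

    The cut-off max_power is an illustrative choice; a regular stochastic
    matrix reveals a strictly positive power after finitely many steps.
    """
    power = np.array(P, dtype=float)
    for _ in range(max_power):
        if np.all(power > 0):
            return True
        power = power @ P
    return False

# A matrix with a zero entry can still be regular if some later power is positive.
P = np.array([[0.0, 0.5],
              [1.0, 0.5]])  # hypothetical example; P^2 has all positive entries
print(is_regular(P))        # True

# A matrix with an absorbing state (a 1 on its main diagonal) is never regular.
Q = np.array([[1.0, 0.3],
              [0.0, 0.7]])
print(is_regular(Q))        # False
```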

If $P$ is the regular transition matrix of a Markov chain, the column ${\bf s}$ satisfying conditions 1 and 2 of Theorem 007400 is called the steady-state vector for the Markov chain. The entries of ${\bf s}$ are the long-term probabilities that the chain will be in each of the various states.

Practice Problems

Problems prob:RegOrNot1-prob:RegOrNot2

Which of the following stochastic matrices is regular?

Regular Not regular
Regular Not regular

Problems prob:SteadyState&ProbVec1-prob:SteadyState&ProbVec5

In each case find the steady-state vector and, assuming that it starts in state 1, find the probability that it is in state 2 after transitions.

Steady state vector , probability
Steady state vector , probability
Steady state vector = , probability

Problems prob:fox1-prob:fox2

A fox hunts in three territories , , and . He never hunts in the same territory on two successive days. If he hunts in , then he hunts in the next day. If he hunts in or , he is twice as likely to hunt in the next day as in the other territory.

What proportion of his time does he spend in , in , and in ?
If he hunts in on Monday ( on Monday), what is the probability that he will hunt in on Thursday?

Problems prob:SocialClass1-prob:SocialClass2

Assume that there are three social classes—upper, middle, and lower—and that social mobility behaves as follows:

(a)
Of the children of upper-class parents, remain upper-class, whereas become middle-class and become lower-class.
(b)
Of the children of middle-class parents, remain middle-class, whereas the others are evenly split between the upper class and the lower class.
(c)
For the children of lower-class parents, remain lower-class, whereas become middle-class and upper-class.
Find the probability that the grandchild of lower-class parents becomes upper-class.
Find the long-term breakdown of society into classes. middle, upper, lower
The prime minister says she will call an election. This gossip is passed from person to person with a probability that the information is passed incorrectly at any stage. Assume that when a person hears the gossip he or she passes it to one person who does not know. Find the long-term probability that a person will hear that there is going to be an election.
John makes it to work on time one Monday out of four. On other work days his behaviour is as follows: If he is late one day, he is twice as likely to come to work on time the next day as to be late. If he is on time one day, he is as likely to be late as not the next day. Find the probability of his being late and that of his being on time on Wednesdays. Probability of being late , Probability of being on time
Suppose you have 1 cent and match coins with a friend. At each match you either win or lose 1 cent with equal probability. If you go broke or ever get 4 cents, you quit. Assume your friend never quits. If the states are 0, 1, 2, 3, and 4 representing your wealth, show that the corresponding transition matrix is not regular. Find the probability that you will go broke after matches.

Problems prob:mouse1-prob:mouse2

A mouse is put into a maze of compartments, as in the diagram. Assume that he always leaves any compartment he enters and that he is equally likely to take any tunnel entry.

If he starts in compartment 1, find the probability that he is in compartment 1 again after moves.
Find the compartment in which he spends most of his time if he is left for a long time. He spends most of his time in compartment . The steady state vector is
If a stochastic matrix has a $1$ on its main diagonal, show that it cannot be regular. Assume it is not $1 \times 1$.
If ${\bf s}_m$ is the stage-$m$ state vector for a Markov chain, show that ${\bf s}_{m+k} = P^{k}{\bf s}_m$ holds for all $m \geq 1$ and $k \geq 1$ (where $P$ is the transition matrix).
A stochastic matrix is doubly stochastic if all the row sums also equal $1$. Find the steady-state vector for a doubly stochastic matrix.
Consider the stochastic matrix
\begin{equation*}
P = \begin{bmatrix} 1-p & q \\ p & 1-q \end{bmatrix}
\end{equation*}
where $0 < p < 1$ and $0 < q < 1$.

(a)
Show that ${\bf s} = \frac{1}{p+q}\begin{bmatrix} q \\ p \end{bmatrix}$ is the steady-state vector for $P$.
(b)
Show that $P^{m}$ converges to the matrix $\begin{bmatrix}{\bf s} & {\bf s}\end{bmatrix}$ by first verifying inductively that \begin{equation*} P^{m} = \frac{1}{p+q}\begin{bmatrix} q & q \\ p & p \end{bmatrix} + \frac{(1-p-q)^{m}}{p+q}\begin{bmatrix} p & -q \\ -p & q \end{bmatrix} \end{equation*} for $m = 1, 2, \dots$ (It can be shown that the sequence of powers $P^{m}$ of any regular transition matrix converges to the matrix each of whose columns equals the steady-state vector for $P$.)
Since $0 < p < 1$ and $0 < q < 1$ we get $0 < p + q < 2$, whence $-1 < 1 - p - q < 1$. Finally, $|1 - p - q| < 1$, so $(1-p-q)^{m}$ converges to zero as $m$ increases.

Text Source

This application was adapted from Section 2.9 of Keith Nicholson’s Linear Algebra with Applications. (CC-BY-NC-SA)

W. Keith Nicholson, Linear Algebra with Applications, Lyryx 2018, Open Edition, p. 134
