
Markov Chains

Andreas Klappenecker

Texas A&M University

© 2018 by Andreas Klappenecker. All rights reserved.

1 / 58
Stochastic Processes

A stochastic process X = {X(t) : t ∈ T} is a collection of random variables. The index t usually represents time.

We call X(t) the state of the process at time t.

If T is countably infinite, then we call X a discrete time process. We will mainly choose T to be the set of nonnegative integers.

2 / 58
Markov Chains
Definition
A discrete time process X = {X_0, X_1, X_2, X_3, ...} is called a Markov chain if and
only if the state at time t depends only on the state at time t − 1. More
precisely, the transition probabilities satisfy

Pr[X_t = a_t | X_{t−1} = a_{t−1}, ..., X_0 = a_0] = Pr[X_t = a_t | X_{t−1} = a_{t−1}]

for all values a_0, a_1, ..., a_t and all t ≥ 1.

In other words, Markov chains are "memoryless" discrete time processes. This
means that the current state (at time t − 1) is sufficient to determine the
probability of the next state (at time t). All knowledge of the past states is
captured by the current state.

3 / 58
Homogeneous Markov Chains
Definition
A Markov chain is called homogeneous if and only if the transition
probabilities are independent of the time t, that is, there exist
constants P_{i,j} such that

P_{i,j} = Pr[X_t = j | X_{t−1} = i]

holds for all times t.


Assumption
We will assume that Markov chains are homogeneous unless stated
otherwise.

4 / 58
Discrete State Space

Definition
We say that a Markov chain has a discrete state space if and
only if the set of values of the random variables is countably infinite:

{v_0, v_1, v_2, ...}.

For ease of presentation we will assume that the discrete state
space is given by the set of nonnegative integers

{0, 1, 2, ...}.

5 / 58
Finite State Space

Definition
We say that a Markov chain is finite if and only if the set of values
of the random variables is a finite set

{v_0, v_1, v_2, ..., v_{n−1}}.

For ease of presentation we will assume that finite Markov chains
have values in

{0, 1, 2, ..., n − 1}.

6 / 58
Transition Probabilities

The transition probabilities

P_{i,j} = Pr[X_t = j | X_{t−1} = i]

determine the Markov chain. The transition matrix

                [ P_{0,0}  P_{0,1}  ...  P_{0,j}  ... ]
                [ P_{1,0}  P_{1,1}  ...  P_{1,j}  ... ]
P = (P_{i,j}) = [   ...      ...    ...    ...        ]
                [ P_{i,0}  P_{i,1}  ...  P_{i,j}  ... ]
                [   ...      ...    ...    ...        ]

comprises all transition probabilities.
7 / 58
Multiple Step Transition Probabilities

For any m ≥ 0, we define the m-step transition probability

P^m_{i,j} = Pr[X_{t+m} = j | X_t = i].

This is the probability that the chain moves from state i to state j
in exactly m steps.

If P = (P_{i,j}) denotes the transition matrix, then the m-step
transition matrix is given by

(P^m_{i,j}) = P^m.

8 / 58
Small Example

Example

    [ 0    1/4  0    3/4 ]          [ 0.00172635  0.00268246  0.992286  0.00330525 ]
P = [ 1/2  0    1/3  1/6 ]   P^20 = [ 0.00139476  0.00216748  0.993767  0.00267057 ]
    [ 0    0    1    0   ]          [ 0           0           1         0          ]
    [ 0    1/2  1/4  1/4 ]          [ 0.00132339  0.00205646  0.994086  0.00253401 ]

9 / 58
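The matrix power on this slide can be reproduced numerically. A minimal sketch, assuming NumPy (which the slides themselves do not use):

```python
import numpy as np

# Transition matrix from the example slide.
P = np.array([
    [0,   1/4, 0,   3/4],
    [1/2, 0,   1/3, 1/6],
    [0,   0,   1,   0  ],
    [0,   1/2, 1/4, 1/4],
])

# The 20-step transition matrix is simply the 20th matrix power.
P20 = np.linalg.matrix_power(P, 20)
print(np.round(P20, 6))  # rows sum to 1; row for state 2 stays (0, 0, 1, 0)
```

Note that state 2 is absorbing, so its row of P^20 is exactly (0, 0, 1, 0), and most of the probability mass from every other state has drifted into state 2.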
Graphical Representation

A Markov chain with state space V and transition matrix P can be represented
by a labeled directed graph G = (V, E), where edges are given by transitions
with nonzero probability:

E = {(u, v) | P_{u,v} > 0}.

The edge (u, v) is labeled by the probability P_{u,v}.

Self-loops are allowed in these directed graphs, since we might have P_{u,u} > 0.

10 / 58
Example of a Graphical Representation

[Figure: the labeled directed graph on states 0, 1, 2, 3, with edge labels
1/4, 3/4, 1/2, 1/3, 1/6, 1, 1/2, 1/4, 1/4.]

    [ 0    1/4  0    3/4 ]
P = [ 1/2  0    1/3  1/6 ]
    [ 0    0    1    0   ]
    [ 0    1/2  1/4  1/4 ]
11 / 58
Irreducible Markov Chains

12 / 58
Accessible States

We say that a state j is accessible from state i if and only if there
exists some integer n ≥ 0 such that

P^n_{i,j} > 0.

If two states i and j are accessible from each other, then we say
that they communicate, and we write i ↔ j.

In the graph representation of the chain, we have i ↔ j if and only
if there are directed paths from i to j and from j to i.

13 / 58
Irreducible Markov Chains

Proposition
The communication relation is an equivalence relation.

By definition, the communication relation is reflexive and symmetric.
Transitivity follows by composing paths.
Definition
A Markov chain is called irreducible if and only if all states belong
to one communication class. A Markov chain is called reducible if
and only if there are two or more communication classes.

14 / 58
Irreducible Markov Chains

Proposition
A finite Markov chain is irreducible if and only if its graph
representation is a strongly connected graph.

15 / 58
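For a finite chain, the proposition above gives a concrete test: build the directed graph of nonzero transitions and check strong connectivity. One standard trick, sketched here with NumPy (an assumption, not from the slides), is that a directed graph on n vertices with boolean adjacency matrix A is strongly connected if and only if (I + A)^(n−1) has no zero entry.

```python
import numpy as np

def is_irreducible(P):
    """Check irreducibility of a finite chain: P is irreducible iff the
    directed graph of its nonzero transitions is strongly connected, i.e.
    (I + A)^(n-1) has all entries positive, A the boolean adjacency matrix."""
    A = (np.asarray(P) > 0).astype(int)
    n = A.shape[0]
    R = np.linalg.matrix_power(np.eye(n, dtype=int) + A, n - 1)
    return bool((R > 0).all())

# The reducible example: state 2 is absorbing, so nothing else is
# accessible from it.
P = np.array([[0, 1/4, 0, 3/4],
              [1/2, 0, 1/3, 1/6],
              [0, 0, 1, 0],
              [0, 1/2, 1/4, 1/4]])
print(is_irreducible(P))  # False
```

Replacing the third row by (0, 0, 0, 1), as in the next exercise, makes the chain irreducible and the function returns True.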
Exercise

[Figure: the labeled directed graph of this chain on states 0, 1, 2, 3.]

    [ 0    1/4  0    3/4 ]
P = [ 1/2  0    1/3  1/6 ]
    [ 0    0    1    0   ]
    [ 0    1/2  1/4  1/4 ]

Question
Is this Markov chain irreducible?

Answer
No, since no other state can be reached from 2.
16 / 58
Exercise

[Figure: the labeled directed graph of this chain on states 0, 1, 2, 3.]

    [ 0    1/4  0    3/4 ]
P = [ 1/2  0    1/3  1/6 ]
    [ 0    0    0    1   ]
    [ 0    1/2  1/4  1/4 ]

Question
Is this Markov chain irreducible?

Answer
Yes.
17 / 58
Periodic and Aperiodic Markov Chains

18 / 58
Period

Definition
The period d(k) of a state k of a homogeneous Markov chain with
transition matrix P is given by

d(k) = gcd{m ≥ 1 : P^m_{k,k} > 0}.

If d(k) = 1, then we call the state k aperiodic.

A Markov chain is aperiodic if and only if all its states are
aperiodic.

19 / 58
Exercise

[Figure: the labeled directed graph of this chain on states 0, 1, 2, 3.]

    [ 0    1/4  0    3/4 ]
P = [ 1/2  0    1/3  1/6 ]
    [ 0    0    0    1   ]
    [ 0    1/2  1/4  1/4 ]

Question
What is the period of each state?

Answer
d(0) = 1, d(1) = 1, d(2) = 1, d(3) = 1, so the chain is aperiodic.

20 / 58
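The periods in this exercise can be computed directly from the definition: collect the return times m with (P^m)_{k,k} > 0 and take their gcd. A sketch assuming NumPy; the scan cutoff max_m is a heuristic that is ample for a chain this small:

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, k, max_m=50):
    """Period of state k: gcd of all m >= 1 with (P^m)[k,k] > 0,
    scanning m only up to max_m (a heuristic cutoff for small chains)."""
    returns = []
    Pm = np.eye(len(P))
    for m in range(1, max_m + 1):
        Pm = Pm @ P
        if Pm[k, k] > 0:
            returns.append(m)
    return reduce(gcd, returns) if returns else 0

P = np.array([[0, 1/4, 0, 3/4],
              [1/2, 0, 1/3, 1/6],
              [0, 0, 0, 1],
              [0, 1/2, 1/4, 1/4]])
print([period(P, k) for k in range(4)])  # [1, 1, 1, 1] -> aperiodic
```

For instance, state 0 returns in 2 steps (0 → 1 → 0) and in 3 steps (0 → 3 → 1 → 0), and gcd(2, 3) = 1.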
Aperiodic Markov Chains

Aperiodicity leads to the following useful result.

Proposition
Suppose that we have an aperiodic Markov chain with finite state
space and transition matrix P. Then there exists a positive integer
N such that

(P^m)_{i,i} > 0

for all states i and all m ≥ N.

Before we prove this result, let us explore the claim in an exercise.

21 / 58
Exercise

[Figure: the labeled directed graph of this chain on states 0, 1, 2, 3.]

    [ 0    1    0    0   ]
P = [ 0    0    1    0   ]
    [ 0    0    0    1   ]
    [ 3/4  0    0    1/4 ]

Question
What is the smallest number of steps N_i such that P^m_{i,i} > 0 for all
m ≥ N_i, for each i ∈ {0, 1, 2, 3}?

Answer
N_0 = 4, N_1 = 4, N_2 = 4, and N_3 = 1: states 0, 1, 2 first return after
4 steps (and the self-loop at 3 then allows every longer return time),
while state 3 has a self-loop, so P^m_{3,3} > 0 already for every m ≥ 1.
22 / 58
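The values N_i in this exercise can also be estimated numerically by scanning the diagonal entries of the matrix powers. A sketch assuming NumPy; the scan horizon is a heuristic cutoff, not part of the slides, and only indicates N_i for small examples like this one:

```python
import numpy as np

def smallest_N(P, i, horizon=30):
    """Estimate the smallest N_i with (P^m)[i,i] > 0 for all m >= N_i,
    checking m only up to `horizon` (heuristic, fine for tiny chains)."""
    positive = []
    Pm = np.eye(len(P))
    for m in range(1, horizon + 1):
        Pm = Pm @ P
        positive.append(Pm[i, i] > 0)
    # one past the last m whose diagonal entry is still zero
    zeros = [m for m, ok in enumerate(positive, start=1) if not ok]
    return (zeros[-1] + 1) if zeros else 1

P = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [3/4, 0, 0, 1/4]])
print([smallest_N(P, i) for i in range(4)])
```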
Aperiodic Markov Chains

Now back to the general statement.


Proposition
Suppose that we have an aperiodic Markov chain with finite state
space and transition matrix P. Then there exists a positive integer
N such that

(P^m)_{i,i} > 0

for all states i and all m ≥ N.

Let us now prove this claim.

23 / 58
Proof.

We will use the following fact from number theory.


Lemma
If a subset A of the set of nonnegative integers is
1. closed under addition, A + A ⊆ A, and
2. satisfies gcd{a | a ∈ A} = 1,
then it contains all but finitely many nonnegative integers, so there
exists a positive integer n such that {n, n + 1, n + 2, ...} ⊆ A.

24 / 58
Proof of the Lemma.
Suppose that A = {a_1, a_2, a_3, ...}. Since gcd A = 1, there must exist some
positive integer k such that

gcd(a_1, a_2, ..., a_k) = 1.

Thus, there exist integers n_1, n_2, ..., n_k such that

n_1 a_1 + n_2 a_2 + ··· + n_k a_k = 1.

We can split this sum into a positive part P and a negative part N such that

P − N = 1.

As sums of elements in A, both P and N are contained in A.

25 / 58
Proof of the Lemma (Continued)
Suppose that n is a positive integer such that n ≥ N(N − 1). We can express n
in the form

n = aN + r

for some integer a and a nonnegative integer r in the range 0 ≤ r ≤ N − 1.

We must have a ≥ N − 1. Indeed, if a were less than N − 1, then we would have
n = aN + r < N(N − 1), contradicting our choice of n.

Since P − N = 1, we can express n in the form

n = aN + r = aN + r(P − N) = (a − r)N + rP.

Since a ≥ N − 1 ≥ r, the factor (a − r) is nonnegative. As N and P are in A, we
must have n = (a − r)N + rP ∈ A.

We can conclude that all sufficiently large integers n are contained in A.

26 / 58
Proof of the Proposition.
For each state i, consider the set A_i of possible return times

A_i = {m ≥ 1 | P^m_{i,i} > 0}.

Since the Markov chain is aperiodic, the state i is aperiodic, so gcd A_i = 1.

If m, m' are elements of A_i, then

Pr[X_m = i | X_0 = i] > 0 and Pr[X_{m+m'} = i | X_m = i] > 0.

Therefore,

Pr[X_{m+m'} = i | X_0 = i] ≥ Pr[X_{m+m'} = i | X_m = i] Pr[X_m = i | X_0 = i] > 0.

So m + m' is an element of A_i. Therefore, A_i + A_i ⊆ A_i.

By the lemma, A_i contains all but finitely many nonnegative integers, which
proves the claim. ∎
27 / 58
Aperiodic Irreducible Markov Chains

Proposition
Let X be an irreducible and aperiodic Markov chain with finite
state space and transition matrix P. Then there exists an M < ∞
such that (P^m)_{i,j} > 0 for all states i and j and all m ≥ M.

In other words, in an irreducible, aperiodic, and finite Markov chain,
one can reach each state from each other state in an arbitrary
number of steps, with only finitely many exceptional step counts.

28 / 58
Proof.
Since the Markov chain is aperiodic, there exists a positive integer N such that
(P^n)_{i,i} > 0 for all states i and all n ≥ N.

Since P is irreducible, there exists a positive integer n_{i,j} such that
P^{n_{i,j}}_{i,j} > 0. After m ≥ N + n_{i,j} steps, we have

Pr[X_m = j | X_0 = i] ≥ Pr[X_m = j | X_{m−n_{i,j}} = i] · Pr[X_{m−n_{i,j}} = i | X_0 = i] > 0,

since the first factor equals P^{n_{i,j}}_{i,j} > 0 and the second equals
P^{m−n_{i,j}}_{i,i} > 0 (note that m − n_{i,j} ≥ N).

In other words, we have P^m_{i,j} > 0, as claimed.

29 / 58
Stationary Distributions

30 / 58
Stationary Distribution

Definition
Suppose that X is a finite Markov chain with transition matrix P.
A row vector v = (p_0, p_1, ..., p_{n−1}) is called a stationary
distribution for P if and only if
1. the p_k are nonnegative real numbers such that p_0 + p_1 + ··· + p_{n−1} = 1, and
2. vP = v.
31 / 58
Examples

Example
Every probability distribution on the states is a stationary
probability distribution when P is the identity matrix.

Example
If P = [ 1/2   1/2  ], then v = (1/6, 5/6) satisfies vP = v.
       [ 1/10  9/10 ]
32 / 58
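A stationary distribution of a small chain can be found numerically by solving vP = v together with the normalization constraint as one linear system. A minimal sketch, assuming NumPy (not part of the slides):

```python
import numpy as np

def stationary(P):
    """Solve vP = v with sum(v) = 1: stack the transposed balance
    equations (P^T - I) v^T = 0 with a row of ones, solve by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v

# The 2x2 example from the slide.
P = np.array([[1/2, 1/2], [1/10, 9/10]])
print(stationary(P))  # approximately (1/6, 5/6)
```

This recovers the v = (1/6, 5/6) given in the example: (1/6)·(1/2) + (5/6)·(1/10) = 1/12 + 1/12 = 1/6.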
Existence and Uniqueness of Stationary Distribution

Proposition
Any aperiodic and irreducible finite Markov chain has precisely one
stationary distribution.

33 / 58
Total Variation Distance

Definition
If p = (p_0, p_1, ..., p_{n−1}) and q = (q_0, q_1, ..., q_{n−1}) are probability
distributions on a finite state space, then

d_TV(p, q) = (1/2) Σ_{k=0}^{n−1} |p_k − q_k|

is called the total variation distance between p and q.

In general, 0 ≤ d_TV(p, q) ≤ 1. If p = q, then d_TV(p, q) = 0.

34 / 58
Convergence in Total Variation

Definition
If p^(m) = (p_0^(m), p_1^(m), ..., p_{n−1}^(m)) is a probability distribution for each
m ≥ 1 and p = (p_0, p_1, ..., p_{n−1}) is a probability distribution, then
we say that p^(m) converges to p in total variation if and only if

lim_{m→∞} d_TV(p^(m), p) = 0.

35 / 58
Convergence Theorem

Proposition
Let X be a finite irreducible aperiodic Markov chain with transition
matrix P. If p^(0) is some initial probability distribution on the states
and p is a stationary distribution, then p^(m) = p^(0) P^m converges in
total variation to the stationary distribution:

lim_{m→∞} d_TV(p^(m), p) = 0.

36 / 58
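The convergence can be watched directly on the 2×2 example from the earlier slide. The following sketch (assuming NumPy, and using the stationary distribution (1/6, 5/6) from that example) tracks d_TV(p^(m), p) as m grows:

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance: half the L1 distance between distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

P = np.array([[1/2, 1/2], [1/10, 9/10]])
pi = np.array([1/6, 5/6])     # stationary distribution of P
p = np.array([1.0, 0.0])      # an arbitrary initial distribution

for m in range(5):
    print(m, tv_distance(p, pi))   # the distances shrink geometrically
    p = p @ P                      # one step: p^(m+1) = p^(m) P
```

For this chain the distance contracts by the factor 2/5 each step (the second eigenvalue of P), so the sequence starts 5/6, 1/3, 2/15, ...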
Reversible Markov Chains

37 / 58
Definition
Suppose that X is a Markov chain with finite state space S and
transition matrix P. A probability distribution π on S is called
reversible for the chain if and only if

π_i P_{i,j} = π_j P_{j,i}

holds for all states i and j in S.

A Markov chain is called reversible if and only if there exists a
reversible distribution for it.

38 / 58
Reversible Markov Chains

Proposition
Suppose that X is a Markov chain with finite state space and
transition matrix P. If π is a reversible distribution for the Markov
chain, then it is also a stationary distribution for it.

39 / 58
Proof.
We need to show that πP = π. In other words, we need to show that

π_j = Σ_{k∈S} π_k P_{k,j}

holds for all states j.

This is straightforward, since

π_j = π_j Σ_{k∈S} P_{j,k} = Σ_{k∈S} π_j P_{j,k} = Σ_{k∈S} π_k P_{k,j},

where the first equality uses that the rows of P sum to 1, and the last
equality uses the reversibility condition π_j P_{j,k} = π_k P_{k,j}.

40 / 58
Random Walks

41 / 58
Random Walks

Definition
A random walk on an undirected graph G = (V, E) is given by
the transition matrix P with

P_{u,v} = 1/d(u) if (u, v) ∈ E, and P_{u,v} = 0 otherwise,

where d(u) denotes the degree of the vertex u.
42 / 58
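The definition translates directly into code. A minimal sketch, assuming NumPy; the example graph (a triangle with a pendant vertex) is hypothetical, chosen only for illustration:

```python
import numpy as np

def random_walk_matrix(n, edges):
    """Transition matrix of the random walk on an undirected graph:
    P[u,v] = 1/deg(u) for each edge {u,v}, and 0 otherwise.
    Assumes every vertex has at least one incident edge."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0          # symmetric adjacency matrix
    deg = A.sum(axis=1)                  # vertex degrees
    return A / deg[:, None]              # divide each row by its degree

# Pendant vertex 0 attached to the triangle {1, 2, 3}.
P = random_walk_matrix(4, [(0, 1), (1, 2), (1, 3), (2, 3)])
print(P[1])  # vertex 1 has degree 3: each neighbor gets probability 1/3
```

Each row sums to 1 by construction, so P is a valid transition matrix.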
Properties

Proposition
For a random walk on an undirected graph with transition matrix P,
we have
1. P is irreducible if and only if G is connected,
2. P is aperiodic if and only if G is not bipartite.

Proof.
If P is irreducible, then the graphical representation is a strongly
connected directed graph, so the underlying undirected graph is
connected. The converse is clear.

43 / 58
Proof. (Continued)
The Markov chain corresponding to a random walk on an
undirected graph has either period 1 or 2. It has period 2 if and
only if G is bipartite. In other words, P is aperiodic if and only if G
is not bipartite.

44 / 58
Reversible Markov Chain

Proposition
A random walk on a graph with vertex set V = {v_1, v_2, ..., v_n} is a
Markov chain with reversible distribution

π = ( d(v_1)/d, d(v_2)/d, ..., d(v_n)/d ),

where d = Σ_{v∈V} d(v) is the total degree of the graph.
45 / 58
Proof.
Suppose that u and v are adjacent vertices. Then

π_u P_{u,v} = (d(u)/d) · (1/d(u)) = 1/d = (d(v)/d) · (1/d(v)) = π_v P_{v,u}.

If u and v are non-adjacent vertices, then

π_u P_{u,v} = 0 = π_v P_{v,u},

since P_{u,v} = 0 = P_{v,u}.

46 / 58
Example

[Figure: a graph on vertices v_1, ..., v_8.]

|V| = 8 and |E| = 12

Σ_{k=1}^{8} d(v_k) = 2|E| = 24

π = ( 2/24, 3/24, 5/24, 3/24, 2/24, 3/24, 3/24, 3/24 )
47 / 58
Markov Chain Monte Carlo Algorithms

48 / 58
Markov Chain Monte Carlo Algorithms

The Idea
Given a probability distribution π on a set S, we want to be able to
sample from this probability distribution.
In MCMC, we define a Markov chain that has π as a stationary
distribution. We run the chain for some iterations and then sample
from it.

Why?
Sometimes it is easier to construct the Markov chain than the
probability distribution π.

49 / 58
Hardcore Model
Definition
Let G = (V, E) be a graph. The hardcore model of G randomly
assigns either 0 or 1 to each vertex such that no neighboring
vertices both have the value 1.
Assignments of the values 0 or 1 to the vertices are called
configurations, so a configuration is a map in {0, 1}^V.
A configuration is called feasible if and only if no adjacent vertices
both have the value 1.
In the hardcore model, the feasible configurations are chosen
uniformly at random.

50 / 58
Hardcore Model

Question
For a given graph G, how can you directly choose a feasible
configuration uniformly at random?

An equivalent question is:

Question
For a given graph G, how can you directly choose independent sets
of G uniformly at random?

51 / 58
Grid Graph Example

Observation
In an n × n grid graph, there are 2^(n²) configurations.

Observation
There are at least 2^(n²/2) feasible configurations in the grid graph.

Indeed, set every other node in the grid graph to 0. For example, label the
vertices by {(x, y) | 0 ≤ x < n, 0 ≤ y < n} and set all vertices with x + y ≡ 0
(mod 2) to 0. The values of the remaining n²/2 vertices can then be chosen
arbitrarily, giving at least 2^(n²/2) feasible configurations.

Direct sampling from the feasible configurations seems difficult.

52 / 58
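For very small grids the feasible configurations can still be counted by brute force, which makes the gap between 2^(n²) configurations and the feasible ones concrete. A sketch using only the standard library; the exponential scan is viable only for tiny n:

```python
from itertools import product

def feasible_count(n):
    """Brute-force count of hardcore-feasible configurations (equivalently,
    independent sets) of the n x n grid graph. Exponential in n^2."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    # Grid edges: cell pairs at Manhattan distance 1.
    edges = [(i, j) for i, (x, y) in enumerate(cells)
             for j, (a, b) in enumerate(cells)
             if i < j and abs(x - a) + abs(y - b) == 1]
    count = 0
    for cfg in product([0, 1], repeat=n * n):
        if all(not (cfg[i] and cfg[j]) for i, j in edges):
            count += 1
    return count

print(feasible_count(2), 2 ** (2 * 2 // 2))  # 7 4
```

For n = 2 there are 7 feasible configurations out of 2⁴ = 16, comfortably above the lower bound 2^(n²/2) = 4 from the observation.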
Hardcore Model Markov Chain

Given a graph G = (V, E), let F denote its set of feasible configurations.
We can define a Markov chain with state space F and the following
transitions:

1. Let X_n be the current feasible configuration. Pick a vertex
   v ∈ V uniformly at random.
2. For all vertices w ∈ V \ {v}, the value of the configuration
   does not change: X_{n+1}(w) = X_n(w).
3. Toss a fair coin. If the coin shows heads and all neighbors of
   v have the value 0, then X_{n+1}(v) = 1; otherwise
   X_{n+1}(v) = 0.
53 / 58
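The three transition steps above can be sketched in a few lines. A minimal implementation, assuming an adjacency-list representation `adj` (a hypothetical name, not from the slides) and using only the standard library:

```python
import random

def hardcore_step(config, adj):
    """One step of the hardcore-model chain: pick a uniform vertex v; on
    heads set it to 1 if all its neighbors are 0, otherwise set it to 0.
    `config` maps vertex -> {0,1}; `adj` maps vertex -> list of neighbors."""
    v = random.choice(list(adj))        # step 1: uniform random vertex
    nxt = dict(config)                  # step 2: all other vertices unchanged
    heads = random.random() < 0.5       # step 3: fair coin
    if heads and all(config[w] == 0 for w in adj[v]):
        nxt[v] = 1
    else:
        nxt[v] = 0
    return nxt

# Path graph 0 - 1 - 2, starting from the always-feasible all-zeros state.
adj = {0: [1], 1: [0, 2], 2: [1]}
config = {v: 0 for v in adj}
for _ in range(1000):
    config = hardcore_step(config, adj)
print(config)  # some feasible configuration: no two adjacent 1s
```

Note that each step preserves feasibility: a vertex is only set to 1 when all of its neighbors carry the value 0.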
Hardcore Model
Proposition
The hardcore model Markov chain is irreducible.

Proof.
Given an arbitrary feasible configuration with m ones, it is possible
to reach the all-zeros configuration in m steps.
Similarly, it is possible to go from the all-zeros configuration to an
arbitrary feasible configuration with positive probability in a finite
number of steps.
Therefore, it is possible to go from an arbitrary feasible
configuration to any other in a finite number of steps with positive
probability.
54 / 58
Hardcore Model

Proposition
The hardcore model Markov chain is aperiodic.

Proof.
For each state, there is a small but nonzero probability that the
Markov chain stays in the same state. Thus, each state is
aperiodic. Therefore, the Markov chain is aperiodic.

55 / 58
Hardcore Model

Proposition
Let π denote the uniform distribution on the set of feasible
configurations F. Let P denote the transition matrix. Then

π_f P_{f,g} = π_g P_{g,f}

for all feasible configurations f and g.

56 / 58
Proof.
Since π_f = π_g = 1/|F|, it suffices to show that P_{f,g} = P_{g,f}.
1. This is trivial if f = g.
2. If f and g differ in more than one vertex, then
   P_{f,g} = 0 = P_{g,f}.
3. If f and g differ only at a vertex v and G has k vertices, then

   P_{f,g} = (1/2) · (1/k) = P_{g,f}.

57 / 58
Hardcore Model

Corollary
The stationary distribution of the hardcore model Markov chain is
the uniform distribution on the set of feasible configurations.

58 / 58
