# Ingredients of a Markov Chain

## The State Space $\mathcal{S}$

$\mathcal{S}$ is a **set of "states"**, e.g. $\mathcal{S} =\left\{ 0,1,2,\ldots,9\right\}$ for the [[Snakes and Ladders Example]].

## The Stochastic Process $S_1,S_2,\ldots$

This is a discrete time stochastic process, i.e. a sequence of $\mathcal{S}$-**valued random variables** $S_1,S_2,\ldots$. For example, in the [[Snakes and Ladders Example]], we calculated that $\mathbb{P}(S_{3}=7)=\frac{23}{216}$.

## Probability Vectors $\mathbb{P}(S_{t}=x)$

For each time $t$, we get a collection of probabilities, which we often think of as a row vector. For example, in the [[Snakes and Ladders Example]], we have vectors like:

$\mathbb{P}(S_2 = x) = \left[0,0, \frac{6}{36},\frac{2}{36},0,\frac{3}{36},\frac{4}{36},\frac{4}{36},\frac{8}{36},0,\frac{13}{36} \right]$

## The Transition Matrix $P_{s,s^\prime}$ aka $p(s^\prime | s)$

The **transition probability matrix**[^2] gives the probability of going from one state to another in a single step:

$P_{ss^{\prime}}=\mathbb{P}\left(S_{t+1}=s^{\prime}\,\middle|\,S_{t}=s\right)$

The $s$-th row of the matrix gives the probabilities of ending up in each state when starting from state $s$. Therefore, by the law of total probability, each row of the transition matrix sums to 1:

$\sum_{s^\prime} P_{s s^\prime} = 1$

The transition matrix acts on the probability vectors to give the [[How do I use a Probability Transition Matrix?|update rule]] for probabilities in a Markov chain; a small code sketch of this update appears at the end of this note.

## The Markov Property

The fact that "what happens in the future depends only on the present state (not on how we got here)" is called the **Markov property**, aka "the future is independent of the past given the present". Mathematically, this is written as a conditional probability:

$\mathbb{P}(S_{t+1}=s^{\prime} \mid S_{t}=s)=\mathbb{P}(S_{t+1}=s^{\prime} \mid S_{t}=s, S_{t-1}=s_{t-1}^{\text{past}},\ldots)$

# Markov Chain Definition

A Markov chain[^1] is a discrete time stochastic process (i.e. a sequence of random variables) $S_1,S_2,\ldots$ that has a transition matrix $P_{s,s^\prime}$ and obeys the Markov property.

# Other Definitions About Markov Chains

## Terminal States

A terminal state is a state where, once you are there, you stay there forever, i.e.

$\mathbb{P}(S_{t+1} = s^\ast \mid S_{t} = s^\ast) = 100\%$

## Episodes

An "episode" refers to running the Markov chain until a terminal state is reached. Since we know what happens at a terminal state (you stay there forever!), we can stop the chain there and be finished. This is a good word to know because it helps describe things (e.g. "The average length of an episode is...", "I simulated 100 episodes...", etc.). The last code sketch at the end of this note simulates 100 episodes of a toy chain.

[^1]: Technically, this is a discrete time Markov chain; there is also a notion of a continuous time Markov chain $\{S_t\}_{t\in \mathbb{R}^+}$.
[^2]: Here we assume the transition matrix does not depend on time. More generally, one can allow $P$ to depend on the time $t$.
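To make the update rule concrete, here is a minimal sketch in Python/NumPy. The 3-state chain and its numbers are made up for illustration (they are *not* the [[Snakes and Ladders Example]] board): it checks that each row of $P$ sums to 1, then pushes a probability row vector forward one step at a time.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative numbers only).
P = np.array([
    [0.5, 0.5, 0.0],   # row 0: probabilities of going 0 -> 0, 1, 2
    [0.0, 0.5, 0.5],   # row 1: probabilities of going 1 -> 0, 1, 2
    [0.0, 0.0, 1.0],   # row 2: state 2 maps to itself with probability 1
])

# Each row sums to 1 by the law of total probability.
assert np.allclose(P.sum(axis=1), 1.0)

# Start in state 0 with certainty, stored as a row vector.
p = np.array([1.0, 0.0, 0.0])

# Update rule: the row vector of P(S_{t+1} = x) is p @ P.
for t in range(1, 4):
    p = p @ P
    print(f"P(S_{t} = x) =", p)
```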
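The Markov property is also visible when simulating: to draw $S_{t+1}$ you only need the current state $s$ (which picks out row $s$ of $P$), never the earlier history. A minimal sketch, reusing the same made-up 3-state chain:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Same hypothetical 3-state chain as above.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
])

def step(s):
    # Markov property in code: the next state is sampled from row s of P
    # using only the current state s, not how we got there.
    return rng.choice(len(P), p=P[s])

s, path = 0, [0]
for _ in range(10):
    s = step(s)
    path.append(int(s))
print("one sample path:", path)
```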
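Finally, terminal states and episodes: in the same made-up chain, state 2 is terminal (it maps to itself with probability 1), so we can run the chain until it lands there and then stop. A sketch estimating the average episode length from 100 simulated episodes:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],   # state 2 is terminal: P(stay) = 100%
])
TERMINAL = 2

def episode_length(start=0):
    # Run one episode: step the chain until the terminal state is
    # reached, and return how many steps that took.
    s, steps = start, 0
    while s != TERMINAL:
        s = rng.choice(len(P), p=P[s])
        steps += 1
    return steps

lengths = [episode_length() for _ in range(100)]
print("average episode length over 100 episodes:", np.mean(lengths))
```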