# 0x041 Stochastics

## 1. Markov Process

### 1.1. Markov Chain

Definition (Markov chain) The process $$X$$ is a Markov chain if it has the Markov property:

$P(X_n = s | X_0 = x_0, X_1 = x_1, ..., X_{n-1} = x_{n-1}) = P(X_n = s | X_{n-1} = x_{n-1})$

Definition (homogeneous) A Markov chain $$X$$ is called homogeneous if

$P(X_{n+1} = j | X_n = i) = P(X_1 = j | X_0 = i)$

The transition matrix $$P=(p_{ij})$$ is the $$|S| \times |S|$$ matrix of transition probabilities $$p_{ij} = P(X_{n+1} = j | X_n = i)$$

Theorem (Chapman-Kolmogorov) For a homogeneous chain, the n-step transition matrix $$P(m, m+n) = (p_{ij}(m, m+n))$$ satisfies $$P(m, m+n) = P^n$$, where $$p_{ij}(m, m+n) = P(X_{m+n} = j | X_m = i)$$

Evolution of the chain is determined by the transition matrix $$P$$ and the initial mass function $$\mu$$; many questions about the chain are thus largely reducible to the study of matrices.
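The Chapman-Kolmogorov theorem can be illustrated numerically: the n-step transition probabilities are just entries of the matrix power $$P^n$$. A minimal pure-Python sketch, using a made-up two-state chain:

```python
# Minimal sketch of the Chapman-Kolmogorov theorem: the n-step transition
# probabilities are entries of the matrix power P^n. The two-state chain
# below is a made-up example.

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, n):
    """n-step transition matrix P^n (n >= 1)."""
    result = P
    for _ in range(n - 1):
        result = mat_mul(result, P)
    return result

P = [[0.9, 0.1],   # transition probabilities from state 0
     [0.5, 0.5]]   # transition probabilities from state 1

P3 = mat_pow(P, 3)
print(P3[0])  # P(X_3 = j | X_0 = 0) for j = 0, 1; each row still sums to 1
```

Starting from state 0 with probability 1, the distribution at time 3 is just the first row of $$P^3$$.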

#### 1.1.1. Notations

##### 1.1.1.1. State Notations

Let $$P_i, E_i$$ denote probability and expectation conditional on $$X_0 = i$$:

$P_i(A) = P(A | X_0 = i), E_i(X) = E(X | X_0 = i)$

Definition (recurrent state, transient state) State $$i$$ is called recurrent if

$P_i(X_n = i \text{ for some } n \geq 1) = 1$

That is, the probability of eventual return to $$i$$, starting from $$i$$, is 1. Otherwise it is called transient.

Definition (positive) For a recurrent state $$i$$, $$i$$ is called positive if the mean recurrence time $$\mu_i < \infty$$, otherwise it is called null

$\mu_i = E_i(T_i) = \sum_n n f_{ii}(n)$

where $$f_{ij}(n) = P_i(X_1 \neq j, X_2 \neq j, ..., X_{n-1} \neq j, X_n = j)$$ is the first-passage probability (the probability that the first visit to $$j$$, starting from $$i$$, occurs at time $$n$$), and $$T_i$$ is the time of first return to $$i$$

Definition (periodic) The period $$d(i)$$ of a state $$i$$ is defined by $$d(i) = \gcd \{ n: p_{ii}(n) > 0 \}$$; $$i$$ is periodic if $$d(i) > 1$$, otherwise aperiodic

Definition (ergodic) A state is called ergodic if it is recurrent, positive and aperiodic

ergodicity

Ergodicity is a more general concept: it states that the time average along a single trajectory almost surely equals the ensemble average

$\langle J \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T J(u(t)) \, dt$
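The time-average statement can be sketched numerically. Below, a made-up two-state chain (hypothetical transition probabilities) has stationary probability $$\pi_0 = 5/6$$ for state 0, and the long-run fraction of time spent in state 0 approaches it:

```python
# Numerical sketch of ergodicity for a made-up two-state chain: the long-run
# fraction of time spent in state 0 approaches the stationary probability
# pi_0 = 5/6 (transition probabilities below are hypothetical).
import random

GO_TO_0 = {0: 0.9, 1: 0.5}   # probability of moving to state 0 from each state
rng = random.Random(1)
state, time_in_0, steps = 0, 0, 200_000
for _ in range(steps):
    time_in_0 += state == 0
    state = 0 if rng.random() < GO_TO_0[state] else 1
print(time_in_0 / steps)   # close to 5/6 ~ 0.8333
```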

##### 1.1.1.2. Chain Notations

Definition (communication) $$i$$ is said to communicate with $$j$$, written $$i \to j$$, if the chain visits state $$j$$ with strictly positive probability starting from $$i$$; $$i$$ and $$j$$ intercommunicate if $$i \to j$$ and $$j \to i$$

Definition (closed, irreducible) A set $$C$$ of states is called

• closed: if $$p_{i,j} = 0$$ for all $$i \in C, j \notin C$$
• irreducible: if $$i,j$$ intercommunicate for all $$i,j \in C$$

Theorem (decomposition) The state space $$S$$ can be partitioned uniquely as

$S = T \cup C_1 \cup C_2 \cup ...$

where $$T$$ is the set of transient states, and the $$C_i$$ are irreducible closed sets of recurrent states
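The decomposition can be computed from reachability alone. A toy sketch on a hypothetical 4-state chain in which $$\{0,1\}$$ is a class of transient states (it leaks into $$\{2,3\}$$) and $$\{2,3\}$$ is a closed irreducible class:

```python
# Toy illustration of the decomposition theorem on a hypothetical 4-state
# chain: {0, 1} are transient (they leak into {2, 3}), while {2, 3} is a
# closed irreducible class of recurrent states.
edges = {0: {0, 1, 2}, 1: {0}, 2: {3}, 3: {2}}  # states reachable in one step

def reachable(i):
    """All states reachable from i with positive probability."""
    seen, stack = {i}, [i]
    while stack:
        for j in edges[stack.pop()]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

# i and j intercommunicate iff each is reachable from the other
classes = {frozenset(j for j in edges
                     if i in reachable(j) and j in reachable(i))
           for i in edges}
print(sorted(map(sorted, classes)))  # [[0, 1], [2, 3]]
```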

#### 1.1.2. Random Walk

The simple random walk on the integers has state space $$S = \{ 0, 1, -1, 2, -2, ... \}$$ and transition probabilities:

$p_{ij} = \begin{cases} p & \text{ if } j = i+1 \\ q = 1-p & \text{ if } j = i-1 \\ 0 & \text{otherwise} \end{cases}$

Gambler's ruin

A gambler starts with $$k$$ units and plays until the fortune reaches either $$0$$ or $$N$$. Each turn the fortune increases by 1 unit with probability $$p$$ and decreases by 1 unit with probability $$q = 1-p$$. Let $$p_k$$ be the probability of ultimate ruin starting from $$k$$; conditioning on the first step via the law of total probability (i.e. $$P(A) = \sum_i P(A|B_i)P(B_i)$$) gives

$p_k = p \cdot p_{k+1} + q \cdot p_{k-1}$

Solving this recurrence with boundary conditions $$p_0 = 1, p_N = 0$$ gives, for $$p \neq 1/2$$, the solution

$p_k = \frac{(q/p)^k - (q/p)^N}{1 - (q/p)^N}$

when $$p=1/2$$, this simplifies to $$p_k = 1 - k/N$$
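The closed form can be sanity-checked against the recurrence and boundary conditions. A pure-Python sketch with hypothetical parameters $$p = 0.6$$, $$N = 10$$:

```python
# Sanity check (hypothetical parameters p = 0.6, N = 10): the closed-form
# ruin probability satisfies the boundary conditions p_0 = 1, p_N = 0 and
# the recurrence p_k = p * p_{k+1} + q * p_{k-1}.
p, N = 0.6, 10
q = 1 - p
r = q / p

def ruin(k):
    """Probability of ultimate ruin starting from k units (valid for p != 1/2)."""
    return (r ** k - r ** N) / (1 - r ** N)

assert ruin(0) == 1.0 and ruin(N) == 0.0            # boundary conditions
for k in range(1, N):
    assert abs(ruin(k) - (p * ruin(k + 1) + q * ruin(k - 1))) < 1e-12
print(round(ruin(3), 4))
```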

The mean time $$D_k$$ to reach an absorbing barrier satisfies a similar recurrence, with boundary conditions $$D_0 = D_N = 0$$

$D_k = p (1 + D_{k+1}) + q(1+D_{k-1})$

Solving this gives the solution

$D_k = \begin{cases} \frac{1}{q-p}[k- N (\frac{1 - (q/p)^k}{1 - (q/p)^N})] & \text{ if } p \neq 1/2 \\ k(N-k) & \text{ if } p = 1/2 \end{cases}$
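Similarly, for the fair game $$p = 1/2$$ one can check that $$D_k = k(N-k)$$ satisfies the recurrence with $$D_0 = D_N = 0$$ (the value $$N = 10$$ below is arbitrary):

```python
# Check (arbitrary N = 10, fair game p = 1/2) that D_k = k(N - k) satisfies
# D_k = p(1 + D_{k+1}) + q(1 + D_{k-1}) with absorbing boundaries D_0 = D_N = 0.
N = 10

def D(k):
    """Mean time to absorption starting from k units, for p = 1/2."""
    return k * (N - k)

assert D(0) == 0 and D(N) == 0
for k in range(1, N):
    assert D(k) == 0.5 * (1 + D(k + 1)) + 0.5 * (1 + D(k - 1))
print(D(5))  # 25
```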

One useful technique for simple random walk is to count the number of sample paths

Let $$S_n = a + \sum_{i=1}^n X_i$$ be the position after $$n$$ steps, starting from $$S_0 = a$$. It is easy to see that (provided $$n + b - a$$ is even)

$P(S_n = b) = \binom{n}{\frac{1}{2} (n+b-a)} p^{\frac{1}{2}(n+b-a)}q^{\frac{1}{2}(n-b+a)}$

where the first factor $$N_n(a,b)$$ is the number of paths from $$(0, a)$$ to $$(n,b)$$

$N_n(a,b) = \binom{n}{\frac{1}{2} (n+b-a)}$

Theorem (reflection) Among all paths from $$(0,a)$$ to $$(n,b)$$ with $$a, b > 0$$, the number of paths which contain some point $$(k,0)$$ can be written as

$N_n^0(a,b) = N_n(-a,b)$

This can be seen by reflecting, along the x-axis, the segment of the path up to its first visit to 0
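Both the counting formula and the reflection principle can be verified by brute-force enumeration for small $$n$$ (the parameters below are arbitrary):

```python
# Brute-force check of the path-counting formula and the reflection principle
# for a small, arbitrary choice of n, a, b (pure Python, stdlib only).
from itertools import product
from math import comb

def count_paths(n, a, b, touching_zero=False):
    """Number of n-step +-1 paths from a to b (optionally only those visiting 0)."""
    total = 0
    for steps in product((1, -1), repeat=n):
        pos, touched = a, a == 0
        for s in steps:
            pos += s
            touched = touched or pos == 0
        if pos == b and (touched or not touching_zero):
            total += 1
    return total

n, a, b = 8, 2, 4
assert count_paths(n, a, b) == comb(n, (n + b - a) // 2)                  # N_n(a, b)
assert count_paths(n, a, b, touching_zero=True) == count_paths(n, -a, b)  # reflection
print(count_paths(n, a, b), count_paths(n, a, b, touching_zero=True))
```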

### 1.2. Continuous-time Markov Process

Now generalize from discrete to continuous time: $$X = \{ X(t) : t \geq 0 \}$$, where the state space is still countable

Definition (Markov property) The process $$X$$ satisfies Markov property if

$P(X(t_n) = j | X(t_1) = i_1, ..., X(t_{n-1}) = i_{n-1}) = P(X(t_n) = j | X(t_{n-1}) = i_{n-1})$

for any increasing sequence of times $$t_1 < t_2 < ... < t_n$$

Definition (transition probability)

$p_{ij}(s,t) = P(X(t) = j | X(s) = i)$

It is homogeneous if $$p_{ij}(s, t) = p_{ij}(0, t-s)$$ for all $$i,j,s,t$$, in which case we simply write $$p_{ij}(t-s)$$ for $$p_{ij}(s,t)$$

## 2. Stationary Process

Definition (strongly stationary) The process $$X = \{ X(t) | t \geq 0 \}$$, taking values in $$R$$, is called strongly stationary if the families

$\{ X(t_1), X(t_2), ..., X(t_n) \}, \{ X(t_1+h), X(t_2+h), ..., X(t_n+h) \}$

have the same joint distribution, for all given $$t_1, ..., t_n$$ and $$h > 0$$

Instead of requiring equality of the finite-dimensional joint distributions (fdds), one can require only that the means and covariances match, which gives weak stationarity

Definition (weakly stationary) The process $$X = \{ X(t) | t \geq 0 \}$$ is called weakly stationary if for all $$t_1, t_2$$, $$h > 0$$

$EX(t_1) = EX(t_2)$
$cov(X(t_1), X(t_2)) = cov(X(t_1+h), X(t_2+h))$

## 3. Renewals

Definition (renewal process) A renewal process $$N= \{ N(t): t \geq 0 \}$$ is a process for which

$N(t) = \max \{ n | T_n \leq t \}$

where $$T_0=0$$ and

$T_n = \sum_{i=1}^n X_i$

and $$X_i$$ are iid non-negative random variables

### 3.1. Poisson Process

If the $$X_i$$ are exponentially distributed, then $$N$$ is a Poisson process.

Definition (Poisson process) A Poisson process with intensity $$\lambda$$ is a process $$N = \{ N(t) : t \geq 0 \}$$ taking values in $$S = \{ 0, 1, 2, ... \}$$ such that

$P(N(t+h) = n+m | N(t) = n) = \begin{cases} \lambda h + o(h) & \text{if } m =1 \\ o(h) & \text{if } m > 1 \\ 1 - \lambda h + o(h) & \text{ if }m=0 \\ \end{cases}$

Additionally $$N(0) = 0, N(s) \leq N(t)$$ for $$s < t$$ and $$N(t) - N(s)$$ is independent of the times of emissions during $$[0, s]$$

Bernoulli process and Poisson process

It can also be interpreted as a generalization of the Bernoulli process to continuous time

Definition (Bernoulli process) Let $$S = \{ 0, 1, 2 ... \}$$ and define the Markov chain $$Y$$ by

$P(Y_{n+1} = s+1 | Y_n = s) = p, P(Y_{n+1} = s | Y_n = s ) = 1-p$

Theorem (distribution of Poisson process) The random variable $$N(t)$$ has the Poisson distribution with parameter $$\lambda t$$

$P(N(t) = j) = \frac{e^{-\lambda t}(\lambda t)^j}{j!}$

An alternative formulation is to use the arrival times: $$T_0, T_1, ...$$ where $$T_n$$ is the time of the n-th arrival

$T_0 = 0, T_n = \inf \{ t : N(t) = n \}$

The interarrival times are random variables $$U_0, U_1, ...,$$

$U_n = T_{n+1} - T_n$

The random variables $$U_0, U_1, ...$$ are independent, each having the exponential distribution with parameter $$\lambda$$
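This equivalence suggests a simple simulation: build the counting process from iid exponential interarrival times and compare the empirical mean of $$N(t)$$ with the Poisson mean $$\lambda t$$. The rate, horizon, and trial count below are made up:

```python
# Monte Carlo sketch: build N(t) from iid Exp(lambda) interarrival times and
# compare the empirical mean with the Poisson mean lambda * t. The rate,
# horizon, and number of trials are made-up parameters.
import random

def poisson_count(rate, t, rng):
    """Number of arrivals in [0, t] when interarrival times are Exp(rate)."""
    arrivals, clock = 0, 0.0
    while True:
        clock += rng.expovariate(rate)   # next interarrival time
        if clock > t:
            return arrivals
        arrivals += 1

rng = random.Random(0)
rate, t, trials = 2.0, 5.0, 10_000
mean = sum(poisson_count(rate, t, rng) for _ in range(trials)) / trials
print(mean)  # should be close to rate * t = 10
```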

## 5. Martingales

Definition (martingale) A sequence $$Y = \{ Y_n : n \geq 0 \}$$ is a martingale with respect to the sequence $$X = \{ X_n : n \geq 0 \}$$ if, for all $$n \geq 0$$

$E|Y_n| < \infty$
$E(Y_{n+1} | X_0, X_1, ..., X_n) = Y_n$

Note that a Markov chain need not be a martingale, and vice versa

simple martingale

Let $$X_1, X_2, ...$$ be independent variables with zero means. The sequence of partial sums $$S_n = X_1 + X_2 + ... + X_n$$ is a martingale with respect to $$X$$:

$E(S_{n+1} | X_1, ..., X_n) = E(S_n + X_{n+1} | X_1, ..., X_n) = S_n + 0$
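For fair $$\pm 1$$ steps this conditional expectation can even be checked exhaustively, since conditioning on $$X_1, ..., X_n$$ just means averaging over the two possible values of $$X_{n+1}$$:

```python
# Exhaustive check of the martingale property for partial sums of fair
# +-1 steps: E(S_{n+1} | X_1, ..., X_n) = S_n for every possible history.
from itertools import product

n = 5
for history in product((1, -1), repeat=n):
    s_n = sum(history)
    # given the history, average over X_{n+1} in {+1, -1}
    cond_exp = 0.5 * (s_n + 1) + 0.5 * (s_n - 1)
    assert cond_exp == s_n
print("martingale property holds for all", 2 ** n, "histories")
```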

## 6. Diffusion Process

A diffusion process is indexed by continuous time and takes values in the real line.

### 6.1. Brownian Motion, Wiener Process

Brownian motion can be seen as a scaling limit of random walks

Definition (Brownian Motion, Wiener process) A Wiener process $$W = \{ W(t) : t \geq 0 \}$$ satisfies

• $$W(0) = 0$$ and $$W$$ has independent increments
• $$W(s+t) - W(s)$$ is distributed as $$N(0, \sigma^2 t)$$ for all $$s,t \geq 0$$
• the sample paths of $$W$$ are almost surely continuous

It can be shown to have the following properties:

• the sample paths are continuous functions of $$t$$
• almost all sample paths are nowhere differentiable functions of $$t$$
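The random-walk limit can be sketched numerically: scale a $$\pm 1$$ walk of $$n$$ steps by $$1/\sqrt{n}$$ and check that the endpoint at $$t = 1$$ has variance close to $$\sigma^2 t = 1$$ (step and sample counts below are arbitrary):

```python
# Sketch of Brownian motion as a random-walk limit: a +-1 walk with n steps,
# scaled by 1/sqrt(n), approximates W(1) ~ N(0, 1) for sigma = 1. Step and
# sample counts here are arbitrary.
import random

def walk_endpoint(n, rng):
    """Endpoint at t = 1 of a scaled +-1 random walk with n steps."""
    return sum(rng.choice((1, -1)) for _ in range(n)) / n ** 0.5

rng = random.Random(42)
samples = [walk_endpoint(500, rng) for _ in range(2000)]
var = sum(x * x for x in samples) / len(samples)
print(var)  # should be close to Var(W(1)) = 1
```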

## 7. Reference

• CMU 36410 Introduction to Probability Modeling notes
• Probability and Random Processes