\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\Z}{\mathbb{Z}}\) \(\newcommand{\bs}{\boldsymbol}\) \( \newcommand{\cov}{\text{cov}} \) \( \newcommand{\var}{\text{var}} \) \( \newcommand{\sd}{\text{sd}} \)

9. Stochastic Processes

Introduction

Suppose that \( (\Omega, \mathscr{F}, \P) \) is a probability space, so that \( \Omega \) is the sample space, \( \mathscr{F} \) is the \( \sigma \)-algebra of events, and \( \P \) is the probability measure on \( \mathscr{F} \). Suppose also that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces; that is, each is a set with a \( \sigma \)-algebra of admissible subsets.

A random process or stochastic process on \( (\Omega, \mathscr{F}, \P) \) with state space \( S \) and index set \( T \) is a collection of random variables \( \bs{X} = \{X_t: t \in T\} \) such that \( X_t \) takes values in \( S \) for each \( t \in T \).

Sometimes it will be more notationally convenient to write \( X(t) \) instead of \( X_t \). Note that we can think of a random process as a random function \( t \mapsto X_t \) from \( T \) into \( S \). Since \( X_t \) is itself a (measurable) function from \( \Omega \) into \( S \), it follows that ultimately, a stochastic process is a function \( X: \Omega \times T \to S \).
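The two ways of reading \( X: \Omega \times T \to S \) can be made concrete with a small sketch. The setup is an assumption chosen for illustration, not part of the text above: an outcome \( \omega \) is a tuple of 10 fair coin flips, \( T = \{0, 1, \ldots, 10\} \), and \( X_t(\omega) \) is the number of heads among the first \( t \) flips. The names `sample_outcome` and `X` are hypothetical.

```python
import random

def sample_outcome(n, rng):
    """Draw an outcome omega: a tuple of n fair coin flips (0 or 1)."""
    return tuple(rng.randint(0, 1) for _ in range(n))

def X(omega, t):
    """The process as a function of (omega, t): heads among the first t flips."""
    return sum(omega[:t])

rng = random.Random(0)
omega = sample_outcome(10, rng)

# Fixing omega gives a sample path t -> X_t(omega), a function from T into S.
path = [X(omega, t) for t in range(len(omega) + 1)]

# Fixing t gives a random variable omega -> X_t(omega); here we sample it at t = 5.
samples_at_t5 = [X(sample_outcome(10, rng), 5) for _ in range(1000)]
```

Holding \( \omega \) fixed and varying \( t \) traces out one path; holding \( t \) fixed and varying \( \omega \) gives the distribution of a single random variable.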

But enough abstraction for a while. Several random processes are associated with Bernoulli trials:

Random processes associated with the Poisson model include

Similarly, random processes associated with renewal theory include

Markov chains form a very important family of random processes as do Brownian motion and related processes. We will study these in subsequent chapters.

In addition to being measurable spaces, the state space \( S \) and the index set \( T \) will often have other interesting mathematical structures as well, and often our interest is in the interplay between these structures and the probability laws that govern the process. Often \( T = \N \) or \( T = [0, \infty) \) and \( T \) is thought of as a time space. In this special case, \( X_t \) is the state of the random process at time \( t \). This is the setting for most of the processes mentioned above, but not all. The counting Poisson process on a general measure space is a process indexed by the measurable subsets of the space. In the study of Markov random fields and interacting particle systems, to give other examples, the index set is often a higher-dimensional integer lattice.

Equivalent Processes

Our next goal is to study different ways that two stochastic processes, with the same state and index spaces, can be equivalent.

First, we often feel that we understand a random process \( \bs{X} = \{X_t: t \in T\} \) well if we know the finite dimensional distributions, that is, if we know the distribution of \( \left(X_{t_1}, X_{t_2}, \ldots, X_{t_n}\right) \) for every choice of \( n \in \N_+ \) and \( (t_1, t_2, \ldots, t_n) \in T^n \). Thus, we can compute \( \P\left[\left(X_{t_1}, X_{t_2}, \ldots, X_{t_n}\right) \in A\right] \) for every \( n \in \N_+ \), \( (t_1, t_2, \ldots, t_n) \in T^n \), and (measurable) \( A \subseteq S^n \). Using various rules of probability, we can compute the probabilities of many events involving infinitely many values of the index parameter \( t \) as well. With this idea in mind, we have the following definition:

Random processes \( \bs{X} = \{X_t: t \in T\} \) and \( \bs{Y} = \{Y_t: t \in T\} \) with state space \( S \) and index set \( T \) are equivalent in distribution if they have the same finite dimensional distributions. This defines an equivalence relation on the collection of stochastic processes with a given state space \( S \) and index set \( T \). That is,

  1. \( \bs{X} \) is equivalent in distribution to \( \bs{X} \) (the reflexive property)
  2. If \( \bs{X} \) is equivalent in distribution to \( \bs{Y} \) then \( \bs{Y} \) is equivalent in distribution to \( \bs{X} \) (the symmetric property)
  3. If \( \bs{X} \) is equivalent in distribution to \( \bs{Y} \) and \( \bs{Y} \) is equivalent in distribution to \( \bs{Z} \) then \( \bs{X} \) is equivalent in distribution to \( \bs{Z} \) (the transitive property)

Thus, equivalence in distribution partitions the collection of all random processes with a given state space and index set into mutually disjoint equivalence classes. But of course, we already know that two random variables can have the same distribution but be very different as variables (functions on the sample space). Clearly, the same statement applies to random processes.

Suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of Bernoulli trials with success parameter \( p = \frac{1}{2} \). Let \( Y_n = 1 - X_n \) for \( n \in \N_+ \). Then \( \bs{Y} = (Y_1, Y_2, \ldots) \) is equivalent in distribution to \( \bs{X} \) but \[ \P(X_n \ne Y_n \text{ for every } n \in \N_+) = 1 \]

Proof:

\( \bs{Y} \) is also a Bernoulli trials process with success parameter \( \frac{1}{2} \), so \( \bs{X} \) and \( \bs{Y} \) are equivalent in distribution. Also, of course, the state space is \( \{0, 1\} \) and \( Y_n = 1 \) if and only if \( X_n = 0 \).
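The example can be checked empirically with a short simulation. This is only a sketch: the helper `bernoulli_trials`, the seed, the number of runs, and the choice to look at the first three coordinates are all assumptions made for illustration.

```python
import random
from collections import Counter

def bernoulli_trials(n, p, rng):
    """Return n independent indicator variables with success probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

rng = random.Random(1)
runs = 20000
counts_x = Counter()
counts_y = Counter()
always_different = True

for _ in range(runs):
    x = bernoulli_trials(3, 0.5, rng)
    y = [1 - xi for xi in x]          # Y_n = 1 - X_n
    counts_x[tuple(x)] += 1
    counts_y[tuple(y)] += 1
    always_different = always_different and all(a != b for a, b in zip(x, y))

# Empirical joint distributions of (X_1, X_2, X_3) and (Y_1, Y_2, Y_3).
freq_x = {k: v / runs for k, v in counts_x.items()}
freq_y = {k: v / runs for k, v in counts_y.items()}
```

Both `freq_x` and `freq_y` should put mass close to \( \frac{1}{8} \) on each of the eight patterns, while `always_different` records that the two processes never agree in any coordinate.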

Motivated by this example, let's look at another, stronger way that random processes can be equivalent. First recall that random variables \( X \) and \( Y \) on \( (\Omega, \mathscr{F}, \P) \), with values in \( S \), are equivalent if \( \P(X = Y) = 1 \).

Suppose that \( \bs{X} = \{X_t: t \in T\} \) and \( \bs{Y} = \{Y_t: t \in T\} \) are random processes with state space \( S \) and index set \( T \). Then \( \bs{Y} \) is a version of \( \bs{X} \) if \( Y_t \) is equivalent to \( X_t \) (so that \( \P(X_t = Y_t) = 1 \)) for every \( t \in T \). This also defines an equivalence relation on the collection of stochastic processes with a given state space \( S \) and index set \( T \). That is,

  1. \( \bs{X} \) is a version of \( \bs{X} \) (the reflexive property)
  2. If \( \bs{X} \) is a version of \( \bs{Y} \) then \( \bs{Y} \) is a version of \( \bs{X} \) (the symmetric property)
  3. If \( \bs{X} \) is a version of \( \bs{Y} \) and \( \bs{Y} \) is a version of \( \bs{Z} \) then \( \bs{X} \) is a version of \( \bs{Z} \) (the transitive property)

So the version of relation also partitions the collection of stochastic processes with a given state space and index set into mutually disjoint equivalence classes.

Suppose again that \( \bs{X} = \{X_t: t \in T\} \) and \( \bs{Y} = \{Y_t: t \in T\} \) are random processes with state space \( S \) and index set \( T \). If \( \bs{Y} \) is a version of \( \bs{X} \) then \( \bs{Y} \) and \( \bs{X} \) are equivalent in distribution.

Proof:

Suppose that \( (t_1, t_2, \ldots, t_n) \in T^n \) and that \( A \) is a measurable subset of \( S^n \). Recall that the intersection of a finite (or even countably infinite) collection of events with probability 1 still has probability 1. Hence \begin{align} \P\left[\left(X_{t_1}, X_{t_2}, \ldots, X_{t_n}\right) \in A\right] & = \P\left[\left(X_{t_1}, X_{t_2}, \ldots, X_{t_n}\right) \in A, \, X_{t_1} = Y_{t_1}, X_{t_2} = Y_{t_2}, \ldots, X_{t_n} = Y_{t_n} \right] \\ & = \P\left[\left(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}\right) \in A, \, X_{t_1} = Y_{t_1}, X_{t_2} = Y_{t_2}, \ldots, X_{t_n} = Y_{t_n} \right] = \P\left[\left(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}\right) \in A\right] \end{align}

As noted in the proof, a countable intersection of events with probability 1 still has probability 1. Hence if \( T \) is countable and the random process \( \bs{X} \) is a version of \( \bs{Y} \) then \[ \P(X_t = Y_t \text{ for all } t \in T) = 1 \] so \( \bs{X} \) and \( \bs{Y} \) really are essentially the same random process. But when \( T \) is uncountable the result in the displayed equation may not be true, and \( \bs{X} \) and \( \bs{Y} \) may be very different as random functions on \( T \). Here is a simple example:

Suppose that \( \Omega = T = [0, \infty) \), \( \mathscr{F} = \mathscr{T} \) is the \( \sigma \)-algebra of Borel measurable subsets of \( [0, \infty) \), and \( \P \) is any continuous probability measure on \( (\Omega, \mathscr{F}) \). Let \( S = \{0, 1\} \) (with all subsets measurable, of course). For \( t \in T \) and \( \omega \in \Omega \), define \( X_t(\omega) = \bs{1}_t(\omega) \) (so that \( X_t(\omega) = 1 \) if \( \omega = t \) and \( X_t(\omega) = 0 \) otherwise) and \( Y_t(\omega) = 0 \). Then \( \bs{X} = \{X_t: t \in T\} \) is a version of \( \bs{Y} = \{Y_t: t \in T\} \), but \( t \mapsto X_t \) is discontinuous with probability 1 while \( t \mapsto Y_t \) is continuous with probability 1.

Proof:

For \( t \in [0, \infty) \), \( \P(X_t \ne Y_t) = \P\{t\} = 0 \) since \( \P \) is a continuous measure. But \( \P(t \mapsto X_t \text{ is discontinuous}) = \P(t \mapsto Y_t \text{ is continuous}) = \P(\Omega) = 1 \).
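A rough numerical illustration of this example follows. It assumes, purely for concreteness, that \( \P \) is the exponential distribution with rate 1 (one of many continuous measures on \( [0, \infty) \)); the function names, the seed, and the fixed time \( t = 1 \) are likewise illustrative choices.

```python
import random

def X(t, omega):
    """Indicator of the single point t, evaluated at omega."""
    return 1 if omega == t else 0

def Y(t, omega):
    """The identically zero process."""
    return 0

rng = random.Random(2)
# Assumption: P is the exponential distribution with rate 1.
omegas = [rng.expovariate(1.0) for _ in range(10000)]

# For a fixed t, X_t and Y_t agree on every sampled outcome (P(X_t != Y_t) = 0).
t_fixed = 1.0
disagreements_at_fixed_t = sum(X(t_fixed, w) != Y(t_fixed, w) for w in omegas)

# For a fixed omega, the two paths differ at the single time t = omega.
paths_differ = all(X(w, w) != Y(w, w) for w in omegas)
```

The contrast is between fixing \( t \) first (no sampled outcome hits it) and fixing \( \omega \) first (the path of \( \bs{X} \) always has a jump at \( t = \omega \)).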

Motivated by this example, we have our strongest form of equivalence:

Suppose that \( \bs{X} = \{X_t: t \in T\} \) and \( \bs{Y} = \{Y_t: t \in T\} \) are random processes with state space \( S \) and index set \( T \). Then \( \bs{X} \) is indistinguishable from \( \bs{Y} \) if \( \P(X_t = Y_t \text{ for all } t \in T) = 1 \). This also defines an equivalence relation on the collection of stochastic processes with a given state space \( S \) and index set \( T \). That is,

  1. \( \bs{X} \) is indistinguishable from \( \bs{X} \) (the reflexive property)
  2. If \( \bs{X} \) is indistinguishable from \( \bs{Y} \) then \( \bs{Y} \) is indistinguishable from \( \bs{X} \) (the symmetric property)
  3. If \( \bs{X} \) is indistinguishable from \( \bs{Y} \) and \( \bs{Y} \) is indistinguishable from \( \bs{Z} \) then \( \bs{X} \) is indistinguishable from \( \bs{Z} \) (the transitive property)

So the indistinguishable from relation also partitions the collection of stochastic processes with a given state space and index set into mutually disjoint equivalence classes. Trivially, if \( \bs{X} \) is indistinguishable from \( \bs{Y} \), then \( \bs{X} \) is a version of \( \bs{Y} \). As noted above, when \( T \) is countable, the converse is also true, but not, as our previous example shows, when \( T \) is uncountable.

So to summarize, indistinguishable from implies version of implies equivalent in distribution, but none of the converse implications hold in general.

The Kolmogorov Construction

How can we construct random processes with given distributional properties? More specifically, how can we construct random processes with specified finite dimensional distributions? Many of the random processes that we will study are ultimately constructed from simple sequences of random variables. All of the random processes associated with Bernoulli trials, and all of the random processes from renewal theory (including the Poisson model) are ultimately constructed from a sequence of independent, identically distributed (IID) variables: the Bernoulli trials sequence in the first case, and the sequence of interarrival times in the second case. It's relatively easy to construct a probability space that supports a sequence of IID variables with a specified distribution. A bit more generally, it's relatively easy to construct a probability space for a sequence of random variables \( (X_0, X_1, \ldots) \) in which the distribution of \( X_0 \) is specified and, for each \( n \in \N \), the conditional distribution of \( X_{n+1} \) given \( (X_0, \ldots, X_n) \) is specified. This applies in particular to Markov chains.
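The sequential construction described above can be sketched in a few lines for a finite state space. Everything named here is hypothetical: `sample_from` and `sample_path` are illustrative helpers, and the two-state transition probabilities are made up for the example.

```python
import random

def sample_from(pmf, rng):
    """Sample a value from a dict {value: probability}."""
    u = rng.random()
    total = 0.0
    for value, prob in pmf.items():
        total += prob
        if u < total:
            return value
    return value  # guard against floating-point round-off

def sample_path(initial, conditional, n, rng):
    """Sample (X_0, ..., X_n) from the law of X_0 and a rule that returns the
    conditional law of X_{k+1} given the history (X_0, ..., X_k)."""
    path = [sample_from(initial, rng)]
    for _ in range(n):
        path.append(sample_from(conditional(tuple(path)), rng))
    return path

# Hypothetical two-state Markov chain: the conditional law uses only the last state.
initial = {0: 0.5, 1: 0.5}
transition = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}

def markov_rule(history):
    return transition[history[-1]]

rng = random.Random(3)
path = sample_path(initial, markov_rule, 20, rng)
```

Passing a rule that ignores the history entirely would give an IID sequence, so the same helper covers both cases mentioned in the paragraph.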

However, some random processes, particularly Brownian motion, cannot be constructed easily from a simple sequence of random variables, and so the question of existence is critical. The Kolmogorov existence theorem (named for Andrei Kolmogorov) states that if we specify the finite dimensional distributions in a consistent way, then there exists a stochastic process \( \{X_t: t \in T\} \) defined on a suitable probability space that has the given finite dimensional distributions. The consistency condition is a bit clunky to state in full generality, but the basic idea is very easy to understand. In the simplest case, suppose that \( s \) and \( t \) are distinct elements in \( T \) and that we specify the distribution (probability measure) \( P_s \) of \( X_s \), \( P_t \) of \( X_t \), \( P_{s,t} \) of \( (X_s, X_t) \), and \( P_{t,s} \) of \( (X_t, X_s) \). Then clearly we must specify these so that \[ P_s(A) = P_{s,t}(A \times S), \quad P_t(B) = P_{s,t}(S \times B) \] for all (measurable) \( A, \, B \subseteq S \). Clearly we also must have \( P_{s,t}(C) = P_{t,s}(C^\prime) \) for all measurable \( C \subseteq S \times S \), where \( C^\prime = \{(y, x): (x, y) \in C\} \).
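For a two-point state space the simplest-case conditions can be verified by direct arithmetic. The joint probabilities below are invented for illustration, and exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction as F

S = (0, 1)
# Hypothetical joint law P_{s,t} of (X_s, X_t) on S x S, as {(x, y): probability}.
P_st = {(0, 0): F(1, 2), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(1, 8)}

# P_{t,s} must be the law of (X_t, X_s): swap the coordinates.
P_ts = {(y, x): p for (x, y), p in P_st.items()}

# Marginals forced by the consistency condition.
P_s = {x: sum(P_st[(x, y)] for y in S) for x in S}   # P_s(A) = P_{s,t}(A x S)
P_t = {y: sum(P_st[(x, y)] for x in S) for y in S}   # P_t(B) = P_{s,t}(S x B)

def measure(pmf, event):
    """Probability of a finite event under a probability mass function."""
    return sum(pmf[point] for point in event)

# Check P_{s,t}(C) = P_{t,s}(C') for one event C.
C = {(0, 1), (1, 1)}
C_prime = {(y, x) for (x, y) in C}
```

Here `P_s[0]` works out to \( \frac{3}{4} \) and `P_t[1]` to \( \frac{3}{8} \), and `measure(P_st, C)` equals `measure(P_ts, C_prime)`. Any other choice of \( P_s \), \( P_t \), or \( P_{t,s} \) would be incompatible with the given \( P_{s,t} \).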

Let's now state the Kolmogorov theorem more precisely. If you are a new student of probability, or are not interested in the technical details, you might want to skip the rest of this section. If you are interested, be sure to review the section on measure theory in the chapter on Foundations and the section on measure theory in this chapter. We assume again that our index set is \( T \) and that our state space is \( S \) with \( \sigma \)-algebra \( \mathscr{S} \). To state the consistency conditions, we need some notation. For \( n \in \N_+ \), let \( T^{(n)} \) denote the set of \( n \)-tuples of distinct elements of \( T \), and let \( \bs{T} = \bigcup_{n=1}^\infty T^{(n)} \) denote the set of all finite sequences of distinct elements of \( T \). If \( n \in \N_+ \), \( \bs{t} = (t_1, t_2, \ldots, t_n) \in T^{(n)} \) and \( \pi \) is a permutation of \( \{1, 2, \ldots, n\} \), let \( \bs{t} \pi \) denote the element of \( T^{(n)} \) with coordinates \( (\bs{t} \pi)_i = t_{\pi(i)} \). That is, we permute the coordinates of \( \bs{t} \) according to \( \pi \). If \( C \subseteq S^n \) is measurable, let \[ \pi C = \left\{(x_1, x_2, \ldots, x_n) \in S^n: \left(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)}\right) \in C\right\} \] Finally, if \( n \gt 1 \), let \( \bs{t}_- \) denote the vector \( (t_1, t_2, \ldots, t_{n-1}) \in T^{(n-1)} \).

Now suppose that \( P_\bs{t} \) is a probability measure on \( S^n \) for each \( n \in \N_+ \) and \( \bs{t} \in T^{(n)} \). The idea, of course, is that we want the collection \( \mathscr{P} = \{P_\bs{t}: \bs{t} \in \bs{T}\} \) to be the finite dimensional distributions of a random process with index set \( T \) and state space \( S \). Here is the critical definition:

The collection \( \mathscr{P} \) is consistent if

  1. \( P_{\bs{t} \pi}(C) = P_\bs{t}(\pi C) \) for every \( n \in \N_+ \), \( \bs{t} \in T^{(n)} \), permutation \( \pi \) of \( \{1, 2, \ldots, n\} \), and measurable \( C \subseteq S^n \).
  2. \( P_{\bs{t}_-}(C) = P_\bs{t}(C \times S) \) for every \( n > 1 \), \( \bs{t} \in T^{(n)} \), and measurable \( C \subseteq S^{n-1} \).
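To make the notation \( \bs{t} \pi \), \( \pi C \), and \( \bs{t}_- \) concrete, here is a small sketch that checks both conditions on a finite example. The setup is assumed for illustration: \( T = \{0, 1, 2\} \), \( S = \{0, 1\} \), and the family \( \mathscr{P} \) is derived from a single made-up joint law, so the check is expected to pass. Permutations are 0-based in the code, and on a finite state space it suffices by additivity to test singleton events.

```python
from fractions import Fraction as F
from itertools import permutations, product

S = (0, 1)
T = (0, 1, 2)
# Hypothetical law of (X_0, X_1, X_2): weights proportional to 1 + x0 + 2*x1*x2.
weights = {x: 1 + x[0] + 2 * x[1] * x[2] for x in product(S, repeat=3)}
total = sum(weights.values())
law = {x: F(w, total) for x, w in weights.items()}

def P(t, C):
    """P_t(C): probability that (X_{t_1}, ..., X_{t_n}) lies in C."""
    return sum(p for x, p in law.items() if tuple(x[i] for i in t) in C)

def permute_set(pi, C, n):
    """pi C = {x in S^n : (x_{pi(1)}, ..., x_{pi(n)}) in C}."""
    return {x for x in product(S, repeat=n)
            if tuple(x[pi[i]] for i in range(n)) in C}

def is_consistent():
    for n in range(1, len(T) + 1):
        for t in permutations(T, n):              # n-tuples of distinct indices
            for point in product(S, repeat=n):    # condition 1 on singletons
                C = {point}
                for pi in permutations(range(n)):
                    t_pi = tuple(t[pi[i]] for i in range(n))
                    if P(t_pi, C) != P(t, permute_set(pi, C, n)):
                        return False
            if n > 1:                             # condition 2 on singletons
                for point in product(S, repeat=n - 1):
                    C_times_S = {point + (y,) for y in S}
                    if P(t[:-1], {point}) != P(t, C_times_S):
                        return False
    return True
```

A family that was specified piecemeal, rather than derived from one joint law, could fail either test, and that failure is exactly what the definition rules out.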

With the proper definition of consistency, we state the Kolmogorov existence theorem:

If \( \mathscr{P} \) is a consistent collection of probability distributions, then there exists a probability space \( (\Omega, \mathscr{F}, \P) \) and a stochastic process \( \bs{X} = \{X_t: t \in T\} \) on this probability space such that \( \mathscr{P} \) is the collection of finite dimensional distributions of \( \bs{X} \).

Proof sketch:

Think of the entire stochastic process as a random experiment. Then a possible outcome is a function \( \omega: T \to S \). Thus, our sample space is the collection \( \Omega \) of all such functions. The \( \sigma \)-algebra \( \mathscr{F} \) on \( \Omega \) is the \( \sigma \)-algebra generated by all sets of the form \[ A = \{\omega \in \Omega: \omega(t) \in B_t \text{ for all } t \in T\} \] where \( B_t \in \mathscr{S} \) for all \( t \in T \) and \( B_t = S \) for all but finitely many \( t \in T \). We know how our desired probability measure \( \P \) should work on the sets that generate \( \mathscr{F} \). Specifically, suppose that \( A \) is a set of the type in the displayed equation, and \( B_t = S \) except for \( \bs{t} = (t_1, t_2, \ldots, t_n) \in T^{(n)} \). Then we want \[ \P(A) = P_\bs{t}(B_{t_1} \times B_{t_2} \times \cdots \times B_{t_n}) \] Basic existence and uniqueness theorems in measure theory that we discussed earlier, and the consistency of \( \mathscr{P} \), guarantee that \( \P \) can be extended to a probability measure on all of \( \mathscr{F} \). Finally, for \( t \in T \) we define \( X_t: \Omega \to S \) by \( X_t(\omega) = \omega(t) \) for \( \omega \in \Omega \). Thus, we have a stochastic process \( \bs{X} = \{X_t: t \in T\} \) with state space \( S \), defined on the probability space \( (\Omega, \mathscr{F}, \P) \), with \( \mathscr{P} \) as the collection of finite dimensional distributions.
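The ingredients of the construction (outcomes as functions, coordinate variables, and generating sets with \( B_t = S \) for all but finitely many \( t \)) can be mimicked in a toy setting. This sketch assumes a finite index set with three labels and a uniform measure on \( \Omega \); both are illustrative choices, and the finite case needs none of the extension theorems used in the general proof.

```python
from fractions import Fraction as F
from itertools import product

S = (0, 1)
T = ("a", "b", "c")   # hypothetical finite index set

# Sample space: all functions omega: T -> S, stored as dicts.
Omega = [dict(zip(T, values)) for values in product(S, repeat=len(T))]

def X(t):
    """Coordinate variable X_t, defined by X_t(omega) = omega(t)."""
    return lambda omega: omega[t]

def cylinder(B):
    """Outcomes with omega(t) in B[t] for each t listed in B.
    Indices not listed are unrestricted, which corresponds to B_t = S."""
    return [omega for omega in Omega
            if all(omega[t] in B_t for t, B_t in B.items())]

def prob(event):
    """Assumption: P is uniform on Omega, so the coordinates are IID fair bits."""
    return F(len(event), len(Omega))

# The set {omega: omega(a) = 1}, written with an explicit trivial constraint on c.
A = cylinder({"a": {1}, "c": {0, 1}})
```

Here `prob(A)` is \( \frac{1}{2} \), matching the one-dimensional distribution of \( X_a \) under the assumed measure.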