\( \newcommand{\P}{\mathbb{P}} \) \( \newcommand{\R}{\mathbb{R}} \) \( \newcommand{\N}{\mathbb{N}} \) \( \newcommand{\Z}{\mathbb{Z}} \) \( \newcommand{\bs}{\boldsymbol} \) \( \newcommand{\length}{\text{length}} \)

8. Existence and Uniqueness

Suppose that \( S \) is a set and \( \mathscr{S} \) a \( \sigma \)-algebra of subsets of \( S \), so that \( (S, \mathscr{S}) \) is a measurable space. In many cases, it is impossible to define a positive measure \(\mu\) on \(\mathscr{S}\) explicitly, by giving a formula for computing \(\mu(A)\) for each \(A \in \mathscr{S}\). Rather, we often know how the measure \(\mu\) should work on some class of sets \(\mathscr{B}\) that generates \( \mathscr{S} \). We would then like to know that \(\mu\) can be extended to a positive measure on \(\mathscr{S}\), and that this extension is unique. The purpose of this section is to discuss the basic results on this topic. To understand this section you will need to review the sections on measure theory and special set structures in the chapter on Foundations, and the section on positive measures in this chapter. If you are not interested in questions of existence and uniqueness of positive measures, you can safely skip this section.

Basic Theory

Positive Measures on Algebras

Recall first that an algebra \(\mathscr{A}\) of subsets of \(S\) is a collection of subsets that contains \(S\) and is closed under complements and finite unions (and hence also finite intersections). Here is our first definition:

A positive measure on \(\mathscr{A}\) is a function \( \mu: \mathscr{A} \to [0, \infty] \) that satisfies the following properties:

  1. \( \mu(\emptyset) = 0 \)
  2. If \( \{A_i: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr{A} \) and if \( \bigcup_{i \in I} A_i \in \mathscr{A} \) then \[ \mu\left(\bigcup_{i \in I} A_i\right) = \sum_{i \in I} \mu(A_i) \]

Clearly the definition of a positive measure on an algebra is very similar to the definition for a \( \sigma \)-algebra. If the collection of sets in (b) is finite, then \( \bigcup_{i \in I} A_i \) must be in the algebra \( \mathscr{A} \). Thus, \( \mu \) is finitely additive. If the collection is countably infinite, then there is no guarantee that the union is in \( \mathscr{A} \). If it is however, then \( \mu \) must be additive over this collection. Given the similarity, it is not surprising that \( \mu \) shares many of the basic properties of a positive measure on a \( \sigma \)-algebra, with proofs that are almost identical.

If \( A, \; B \in \mathscr{A} \), then \( \mu(B) = \mu(A \cap B) + \mu(B \setminus A) \).

Proof:

Note that \( B = (A \cap B) \cup (B \setminus A) \), and the sets in the union are in the algebra \( \mathscr{A} \) and are disjoint.

If \( A, \; B \in \mathscr{A} \) and \( A \subseteq B \) then

  1. \( \mu(B) = \mu(A) + \mu(B \setminus A) \)
  2. \( \mu(A) \le \mu(B) \)
Proof:

Part (a) follows from the previous theorem, since \( A \cap B = A \). Part (b) follows from part (a).

Thus \( \mu \) is increasing, relative to the subset partial order \( \subseteq \) on \( \mathscr{A} \) and the ordinary order \( \le \) on \( [0, \infty] \). Note also that if \( A, \; B \in \mathscr{A} \) and \( \mu(B) \lt \infty \) then \( \mu(B \setminus A) = \mu(B) - \mu(A \cap B) \). In the special case that \( A \subseteq B \), this becomes \( \mu(B \setminus A) = \mu(B) - \mu(A) \). If \( \mu(S) \lt \infty \) then \( \mu(A^c) = \mu(S) - \mu(A) \). These are the familiar difference and complement rules.
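The rules above can be checked numerically in a small case. The following is a minimal sketch, not part of the original development: it assumes a hypothetical six-point set with the algebra of all subsets, and an illustrative weighted counting measure (the names `WEIGHTS` and `mu` are made up for the example).

```python
from fractions import Fraction

# Hypothetical finite example: S = {0, ..., 5}, the algebra is all subsets of S,
# and mu is a weighted counting measure with illustrative weights.
S = frozenset(range(6))
WEIGHTS = {
    0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2),
    3: Fraction(2), 4: Fraction(1, 4), 5: Fraction(3),
}

def mu(A):
    """Measure of A: the sum of the weights of its points."""
    return sum((WEIGHTS[x] for x in A), Fraction(0))

A = frozenset({0, 1, 2})
B = frozenset({1, 2, 3, 4})

# Additivity over the partition of B induced by A.
split_ok = mu(B) == mu(A & B) + mu(B - A)
# Difference rule (valid here because mu(B) is finite).
difference_ok = mu(B - A) == mu(B) - mu(A & B)
# Complement rule (valid here because mu(S) is finite).
complement_ok = mu(S - A) == mu(S) - mu(A)
# Increasing property, applied to the subset A & B of B.
increasing_ok = mu(A & B) <= mu(B)

print(split_ok, difference_ok, complement_ok, increasing_ok)
```

Exact rational arithmetic is used so that the equalities can be tested without rounding error.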

The following result is the subadditive property for a positive measure \( \mu \) on an algebra \( \mathscr{A} \).

Suppose that \( A_i \in \mathscr{A} \) for \( i \) in a countable index set \( I \) and \( \bigcup_{i \in I} A_i \in \mathscr{A} \). Then \[ \mu\left(\bigcup_{i \in I} A_i \right) \le \sum_{i \in I} \mu(A_i) \]

Proof:

The proof is just like before. Assume that \( I = \N_+ \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus (A_1 \cup \ldots \cup A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Then \( \{B_i: i \in I\} \) is a disjoint collection of sets in \( \mathscr{A} \) with the same union as \( \{A_i: i \in I\} \). Also \( B_i \subseteq A_i \) for each \( i \) so \( \mu(B_i) \le \mu(A_i) \). Hence if the union is in \( \mathscr{A} \) then \[ \mu\left(\bigcup_{i \in I} A_i \right) = \mu\left(\bigcup_{i \in I} B_i \right) = \sum_{i \in I} \mu(B_i) \le \sum_{i \in I} \mu(A_i) \]
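The disjointification step in the proof can be sketched in code. This is only an illustration on a hypothetical finite list of sets, with counting measure (Python's `len`) standing in for \( \mu \); the helper name `disjointify` is an assumption of the example.

```python
def disjointify(sets):
    """Return B_1 = A_1 and B_i = A_i minus (A_1 union ... union A_{i-1})."""
    seen = set()
    result = []
    for A in sets:
        result.append(set(A) - seen)  # keep only the points not covered so far
        seen |= set(A)
    return result

# Illustrative overlapping sets; counting measure plays the role of mu.
A_list = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
B_list = disjointify(A_list)
mu = len

union_A = set().union(*A_list)
union_B = set().union(*B_list)

measure_of_union = mu(union_A)
sum_of_B = sum(mu(B) for B in B_list)
sum_of_A = sum(mu(A) for A in A_list)

print(B_list)
print(measure_of_union, sum_of_B, sum_of_A)
```

The sets in `B_list` are disjoint with the same union as the original sets, so the measure of the union equals `sum_of_B`, which in turn is bounded by `sum_of_A`.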

For a finite union of sets with finite measure, the inclusion-exclusion formula holds, and the proof is just like the one for a probability measure.

Suppose that \(A_i \in \mathscr{A}\) for each \(i \in I\) where \(\#(I) = n\), and that \( \mu(A_i) \lt \infty \) for \( i \in I \). Then \[\mu \left( \bigcup_{i \in I} A_i \right) = \sum_{k = 1}^n (-1)^{k - 1} \sum_{J \subseteq I, \; \#(J) = k} \mu \left( \bigcap_{j \in J} A_j \right)\]
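As a sanity check, the inclusion-exclusion formula can be evaluated directly for a few small sets. The sketch below assumes counting measure on hypothetical finite sets; the function name `inclusion_exclusion` is illustrative.

```python
from itertools import combinations

def inclusion_exclusion(sets, mu):
    """Evaluate the right side of the inclusion-exclusion formula."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        # Sum over all sub-collections J with exactly k sets.
        for J in combinations(sets, k):
            total += sign * mu(set.intersection(*J))
    return total

# Illustrative sets; counting measure (len) plays the role of mu.
A_list = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
left_side = len(set().union(*A_list))
right_side = inclusion_exclusion(A_list, len)

print(left_side, right_side)
```

The number of terms grows like \( 2^n \), so the formula is practical only for a small number of sets.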

The continuity theorems hold for a positive measure \( \mu \) on an algebra \( \mathscr{A} \), just as for a positive measure on a \( \sigma \)-algebra, assuming that the appropriate union and intersection are in the algebra. The proofs are just as before.

Suppose that \( (A_1, A_2, \ldots) \) is a sequence of sets in \( \mathscr{A} \).

  1. If the sequence is increasing, so that \( A_n \subseteq A_{n+1} \) for each \( n \in \N_+ \), and \( \bigcup_{i = 1}^\infty A_i \in \mathscr{A} \), then \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).
  2. If the sequence is decreasing, so that \( A_{n+1} \subseteq A_n \) for each \( n \in \N_+ \), and \( \mu(A_1) \lt \infty \) and \( \bigcap_{i=1}^\infty A_i \in \mathscr{A} \), then \( \mu\left(\bigcap_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).
Proof:

For part (a), note that if \( \mu(A_k) = \infty \) for some \( k \) then \( \mu(A_n) = \infty \) for \( n \ge k \) and \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \infty \) if this union is in \( \mathscr{A} \). Thus, suppose that \( \mu(A_i) \lt \infty \) for each \( i \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus A_{i-1} \) for \( i \in \{2, 3, \ldots\} \). Then \( (B_1, B_2, \ldots) \) is a disjoint sequence in \( \mathscr{A} \) with the same union as \( (A_1, A_2, \ldots) \). Also, \( \mu(B_1) = \mu(A_1) \) and \( \mu(B_i) = \mu(A_i) - \mu(A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Hence if the union is in \( \mathscr{A} \), \[ \mu\left(\bigcup_{i=1}^\infty A_i \right) = \mu \left(\bigcup_{i=1}^\infty B_i \right) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) \] But \( \sum_{i=1}^n \mu(B_i) = \mu(A_1) + \sum_{i=2}^n [\mu(A_i) - \mu(A_{i-1})] = \mu(A_n) \).

For part (b), note that \( A_1 \setminus A_n \in \mathscr{A} \) and this sequence is increasing. Moreover, \( \bigcup_{n=1}^\infty (A_1 \setminus A_n) = \left(\bigcap_{n=1}^\infty A_n \right)^c \cap A_1 \). Hence if \( \bigcap_{n=1}^\infty A_n \in \mathscr{A} \) then \( \bigcup_{n=1}^\infty (A_1 \setminus A_n) \in \mathscr{A} \). Thus using the continuity result for increasing sets, \begin{align} \mu \left(\bigcap_{i=1}^\infty A_i \right) & = \mu\left[A_1 \setminus \bigcup_{i=1}^\infty (A_1 \setminus A_i) \right] = \mu(A_1) - \mu\left[\bigcup_{i=1}^\infty (A_1 \setminus A_i)\right]\\ & = \mu(A_1) - \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} [\mu(A_1) - \mu(A_n)] = \lim_{n \to \infty} \mu(A_n) \end{align}

Recall that if the sequence \( (A_1, A_2, \ldots) \) is increasing, then we define \( \lim_{n \to \infty} A_n = \bigcup_{n=1}^\infty A_n \), and if the sequence is decreasing then we define \( \lim_{n \to \infty} A_n = \bigcap_{n=1}^\infty A_n \). Thus the conclusion of both parts of the continuity theorem is \[ \mu\left(\lim_{n \to \infty} A_n\right) = \lim_{n \to \infty} \mu(A_n) \] Finite additivity and continuity for increasing sets imply countable additivity:

If \( \mu: \mathscr{A} \to [0, \infty] \) satisfies the properties below then \( \mu \) is a positive measure on \( \mathscr{A} \).

  1. \( \mu(\emptyset) = 0 \)
  2. \( \mu\left(\bigcup_{i \in I} A_i\right) = \sum_{i \in I} \mu(A_i) \) if \( \{A_i: i \in I\} \) is a finite disjoint collection of sets in \( \mathscr{A} \)
  3. \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \) if \( A_n, \; n \in \N_+ \) is an increasing sequence of sets in \( \mathscr{A} \) and \( \bigcup_{i=1}^\infty A_i \in \mathscr{A} \).
Proof:

All that is left to prove is additivity over a countably infinite collection of sets in \( \mathscr{A} \) when the union is also in \( \mathscr{A} \). Thus suppose that \( A_n \in \mathscr{A} \) for \( n \in \N_+ \), and that the sets are disjoint and \( \bigcup_{n=1}^\infty A_n \in \mathscr{A} \). Let \( B_n = \bigcup_{i=1}^n A_i \) for \( n \in \N_+ \). Then \( B_n \in \mathscr{A} \) for \( n \in \N_+ \), and this sequence is increasing and has the same union as \( A_n, \; n \in \N_+ \). Hence using the finite additivity and the continuity property we have \[ \mu\left(\bigcup_{n = 1}^\infty A_n\right) = \mu\left(\bigcup_{n=1}^\infty B_n\right) = \lim_{n \to \infty} \mu(B_n) = \lim_{n \to \infty} \sum_{i=1}^n \mu(A_i) = \sum_{i=1}^\infty \mu(A_i) \]

Many of the basic theorems in measure theory require that the measure not be too far removed from being finite. This leads to the following definition, which is just like the one for a positive measure on a \( \sigma \)-algebra.

A measure \( \mu \) on an algebra \( \mathscr{A} \) of subsets of \( S \) is said to be \( \sigma \)-finite if there exists a sequence of sets \( A_n \in \mathscr{A}, \; n \in \N_+ \) such that \( \bigcup_{n=1}^\infty A_n = S \) and \( \mu(A_n) \lt \infty \) for each \( n \in \N_+ \). The sequence is called a \( \sigma \)-finite sequence.

Suppose that \( \mu \) is a \( \sigma \)-finite measure on an algebra \( \mathscr{A} \) of subsets of \( S \).

  1. There exists an increasing \( \sigma \)-finite sequence.
  2. There exists a disjoint \( \sigma \)-finite sequence.
Proof:

We use the same tricks that we have used before. Let \( A_n \in \mathscr{A}, \; n \in \N_+ \) be a sequence that satisfies the \( \sigma \)-finite definition. That is, \( \mu(A_n) \lt \infty \) and \( S = \bigcup_{n=1}^\infty A_n \).

  1. Let \( B_n = \bigcup_{i = 1}^n A_i \). Then \( B_n \in \mathscr{A} \) for \( n \in \N_+ \) and this sequence is increasing. Moreover, \( \mu(B_n) \le \sum_{i=1}^n \mu(A_i) \lt \infty \) for \( n \in \N_+ \) and \( \bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = S \).
  2. Let \( C_1 = A_1 \) and let \( C_n = A_n \setminus \bigcup_{i=1}^{n-1} A_i \) for \( n \in \{2, 3, \ldots\} \). Then \( C_n \in \mathscr{A} \) for each \( n \in \N_+ \) and this sequence is disjoint. Moreover, \( C_n \subseteq A_n \) so \( \mu(C_n) \le \mu(A_n) \lt \infty \) and \( \bigcup_{n=1}^\infty C_n = \bigcup_{n=1}^\infty A_n = S \).

Extension and Uniqueness Theorems

The fundamental theorem on measures states that a positive, \( \sigma \)-finite measure \( \mu \) on an algebra \( \mathscr{A} \) can be uniquely extended to \( \sigma(\mathscr{A}) \). The extension part is sometimes referred to as the Carathéodory extension theorem, and is named for the Greek mathematician Constantin Carathéodory.

If \( \mu \) is a positive, \( \sigma \)-finite measure on an algebra \(\mathscr{A}\), then \( \mu \) can be extended to a positive measure on \( \mathscr{S} = \sigma(\mathscr{A}) \).

Proof:

The proof is complicated, but here is a broad outline. First, for \( A \subseteq S \), we define a cover of \( A \) to be a countable collection \( \{A_i: i \in I\} \) of sets such that \( A_i \in \mathscr{A} \) for each \( i \in I \) and \( A \subseteq \bigcup_{i \in I} A_i \). Next, we define a new set function \( \mu^* \), the outer measure, on all subsets of \( S \): \[ \mu^*(A) = \inf \left\{ \sum_{i \in I} \mu(A_i): \{A_i: i \in I\} \text{ is a cover of } A \right\}, \quad A \subseteq S \] Outer measure satisfies the following properties.

  1. \( \mu^*(A) \ge 0 \) for \( A \subseteq S \), so \( \mu^* \) is nonnegative.
  2. \( \mu^*(A) = \mu(A) \) for \( A \in \mathscr{A} \), so \( \mu^* \) extends \( \mu \).
  3. If \( A \subseteq B \) then \( \mu^*(A) \le \mu^*(B) \), so \( \mu^* \) is increasing.
  4. If \( A_i \subseteq S \) for each \( i \) in a countable index set \( I \) then \( \mu^*\left(\bigcup_{i \in I} A_i\right) \le \sum_{i \in I} \mu^*(A_i) \), so \( \mu^* \) is countably subadditive.

Next, \( A \subseteq S \) is said to be measurable if \[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \setminus A), \quad B \subseteq S \] Thus, \( A \) is measurable if \( \mu^* \) is additive with respect to the partition of \( B \) induced by \( \{A, A^c\} \), for every \( B \subseteq S \). We let \( \mathscr{M} \) denote the collection of measurable subsets of \( S \). The proof is finished by showing that \( \mathscr{A} \subseteq \mathscr{M} \), \( \mathscr{M} \) is a \( \sigma \)-algebra of subsets of \( S \), and \( \mu^* \) is a positive measure on \( \mathscr{M} \). It follows that \( \sigma(\mathscr{A}) = \mathscr{S} \subseteq \mathscr{M} \) and hence \( \mu^* \) is a measure on \( \mathscr{S} \) that extends \( \mu \).
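The outline can be traced by brute force in a toy setting. The sketch below is an illustration under strong simplifying assumptions, not the general construction: \( S \) is a hypothetical five-point set, the algebra consists of all unions of three blocks of a partition, and the block measures are made-up positive numbers. Because the algebra is finite, the union of any cover is itself a set in the algebra with measure no larger than the sum over the cover, so the infimum over covers reduces to a minimum over single sets of the algebra that contain the given set.

```python
from itertools import chain, combinations

# Hypothetical partition of S = {0, 1, 2, 3, 4} and illustrative block measures.
BLOCKS = [frozenset({0, 1}), frozenset({2}), frozenset({3, 4})]
BLOCK_MEASURE = {BLOCKS[0]: 1, BLOCKS[1]: 2, BLOCKS[2]: 3}
S = frozenset().union(*BLOCKS)

def powerset(items):
    """All sub-collections of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# The algebra: every union of blocks, stored together with its measure.
ALGEBRA = {}
for chosen in powerset(BLOCKS):
    ALGEBRA[frozenset().union(*chosen)] = sum(BLOCK_MEASURE[b] for b in chosen)

def outer(E):
    """Outer measure of E: the smallest measure of an algebra set containing E."""
    E = frozenset(E)
    return min(m for A, m in ALGEBRA.items() if E <= A)

def is_measurable(E):
    """Check the splitting condition against every test set B."""
    E = frozenset(E)
    return all(
        outer(B) == outer(B & E) + outer(B - E)
        for B in map(frozenset, powerset(S))
    )

measurable = {frozenset(E) for E in powerset(S) if is_measurable(E)}

print(outer({0}), outer({0, 4}))
print(measurable == set(ALGEBRA))
```

In this example the measurable sets turn out to be exactly the sets of the algebra, which is also the \( \sigma \)-algebra it generates since the algebra is finite. A set such as \( \{0\} \) fails the splitting condition with the test set \( \{0, 1\} \).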

The following theorem is the basic uniqueness result, and thus serves as the complement to the basic extension result. The proof, like others that we have seen, uses Dynkin's \( \pi \)-\( \lambda \) theorem, named for Eugene Dynkin. The theorem also requires another variation of the term \( \sigma \)-finite. Suppose that \( \mu \) is a measure on a \( \sigma \)-algebra \( \mathscr{S} \) of subsets of \( S \) and \( \mathscr{B} \subseteq \mathscr{S} \). Then \( \mu \) is \( \sigma \)-finite on \( \mathscr{B} \) if there exists a countable collection \( \{B_i: i \in I\} \subseteq \mathscr{B} \) such that \( \mu(B_i) \lt \infty \) for \( i \in I \) and \( \bigcup_{i \in I} B_i = S \).

Suppose that \( \mathscr{B} \) is a \( \pi \)-system and that \( \mathscr{S} = \sigma(\mathscr{B}) \). If \( \mu_1 \) and \( \mu_2 \) are positive measures on \( \mathscr{S} \) and are \( \sigma \)-finite on \( \mathscr{B} \), and if \( \mu_1(A) = \mu_2(A) \) for all \( A \in \mathscr{B} \), then \( \mu_1(A) = \mu_2(A) \) for all \( A \in \mathscr{S} \).

Proof:

Suppose that \( B \in \mathscr{B} \) and that \( \mu_1(B) = \mu_2(B) \lt \infty \). Let \( \mathscr{L}_B = \{A \in \mathscr{S}: \mu_1(A \cap B) = \mu_2(A \cap B) \} \). Then \( S \in \mathscr{L}_B \) since \( \mu_1(B) = \mu_2(B) \). If \( A \in \mathscr{L}_B \) then \( \mu_1(A \cap B) = \mu_2(A \cap B) \) so \( \mu_1(A^c \cap B) = \mu_1(B) - \mu_1(A \cap B) = \mu_2(B) - \mu_2(A \cap B) = \mu_2(A^c \cap B) \) and hence \( A^c \in \mathscr{L}_B \). Finally, suppose that \( \{A_j: j \in J\} \) is a countable, disjoint collection of sets in \( \mathscr{L}_B \). Then \( \mu_1(A_j \cap B) = \mu_2(A_j \cap B) \) for each \( j \in J \) and hence \begin{align} \mu_1\left[ \left(\bigcup_{j \in J} A_j \right) \cap B \right] & = \mu_1 \left(\bigcup_{j \in J} (A_j \cap B) \right) = \sum_{j \in J} \mu_1(A_j \cap B) \\ & = \sum_{j \in J} \mu_2(A_j \cap B) = \mu_2\left(\bigcup_{j \in J} (A_j \cap B) \right) = \mu_2 \left[ \left(\bigcup_{j \in J} A_j \right) \cap B \right] \end{align} Therefore \( \bigcup_{j \in J} A_j \in \mathscr{L}_B \), and so \( \mathscr{L}_B \) is a \( \lambda \)-system. Since \( \mathscr{B} \) is a \( \pi \)-system, \( A \cap B \in \mathscr{B} \) for every \( A \in \mathscr{B} \), so \( \mu_1(A \cap B) = \mu_2(A \cap B) \) by assumption. Thus \( \mathscr{B} \subseteq \mathscr{L}_B \) and therefore by the \( \pi \)-\( \lambda \) theorem, \( \mathscr{S} = \sigma(\mathscr{B}) \subseteq \mathscr{L}_B \).

Next, by assumption there exists \( B_i \in \mathscr{B} \) with \( \mu_1(B_i) = \mu_2(B_i) \lt \infty \) for each \( i \in \N_+ \) and \( S = \bigcup_{i=1}^\infty B_i \). If \( A \in \mathscr{S} \) then the inclusion-exclusion rule can be applied to \[ \mu_k\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] = \mu_k\left[\bigcup_{i=1}^n (A \cap B_i) \right] \] where \( k \in \{1, 2\} \) and \( n \in \N_+ \). But the inclusion-exclusion formula only has terms of the form \( \mu_k \left[ \bigcap_{j \in J} (A \cap B_j) \right] = \mu_k \left[ A \cap \left(\bigcap_{j \in J} B_j\right) \right] \) where \( J \subseteq \{1, 2, \ldots, n\} \). But \( \bigcap_{j \in J} B_j \in \mathscr{B} \) since \( \mathscr{B} \) is a \( \pi \)-system, so by the previous paragraph, \( \mu_1 \left[ \bigcap_{j \in J} (A \cap B_j) \right] = \mu_2 \left[ \bigcap_{j \in J} (A \cap B_j) \right] \). It then follows that for each \( n \in \N_+ \) \[ \mu_1\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] = \mu_2\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] \] Finally, letting \( n \to \infty \) and using the continuity theorem for increasing sets gives \( \mu_1(A) = \mu_2(A) \).

An algebra \( \mathscr{A} \) of subsets of \( S \) is trivially a \( \pi \)-system. Hence, if \( \mu_1 \) and \( \mu_2 \) are positive measures on \( \mathscr{S} = \sigma(\mathscr{A}) \) and are \( \sigma \)-finite on \( \mathscr{A} \), and if \( \mu_1(A) = \mu_2(A) \) for \( A \in \mathscr{A} \), then \( \mu_1(A) = \mu_2(A) \) for \( A \in \mathscr{S} \). This completes the second part of the fundamental theorem.
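The requirement that the sets of finite measure in \( \mathscr{B} \) cover \( S \) cannot simply be dropped. Here is a minimal sketch of what can go wrong, using a hypothetical two-point space and made-up weights: the \( \pi \)-system \( \{\emptyset, \{0\}\} \) generates all subsets of \( \{0, 1\} \), and the two measures agree on it, but no collection of sets from the \( \pi \)-system has union \( S \), and the measures differ on \( \{1\} \).

```python
from itertools import chain, combinations

S = frozenset({0, 1})
# A pi-system that generates every subset of S but does not cover S.
PI_SYSTEM = [frozenset(), frozenset({0})]

def make_measure(weights):
    """Weighted counting measure with the given (illustrative) point weights."""
    return lambda A: sum(weights[x] for x in A)

mu1 = make_measure({0: 1, 1: 1})
mu2 = make_measure({0: 1, 1: 2})

def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    tuples = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(t) for t in tuples]

closed_under_intersection = all(A & B in PI_SYSTEM for A in PI_SYSTEM for B in PI_SYSTEM)
agree_on_pi_system = all(mu1(B) == mu2(B) for B in PI_SYSTEM)
covers_S = frozenset().union(*PI_SYSTEM) == S
agree_everywhere = all(mu1(A) == mu2(A) for A in subsets(S))

print(closed_under_intersection, agree_on_pi_system, covers_S, agree_everywhere)
```

Both measures are finite here, so the failure comes entirely from the covering condition in the definition of \( \sigma \)-finite on \( \mathscr{B} \).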

Of course, the results of this subsection hold for probability measures. Formally, a probability measure \( \P \) on an algebra \( \mathscr{A} \) of subsets of \( S \) is a positive measure on \( \mathscr{A} \) with the additional requirement that \( \P(S) = 1 \). Probability measures are trivially \( \sigma \)-finite, so a probability measure \( \P \) on an algebra \( \mathscr{A} \) can be uniquely extended to \( \mathscr{S} = \sigma(\mathscr{A}) \).

However, usually we start with a collection that is more primitive than an algebra. The next result combines the definition with the main theorem associated with the definition. For a proof see the section on special set structures in the chapter on Foundations.

Suppose that \( \mathscr{B} \) is a nonempty collection of subsets of \( S \) and let \[ \mathscr{A} = \left\{\bigcup_{i \in I} B_i: \{B_i: i \in I\} \text{ is a finite, disjoint collection of sets in } \mathscr{B}\right\} \] If the following conditions are satisfied, then \( \mathscr{B} \) is a semi-algebra of subsets of \( S \), and then \( \mathscr{A} \) is the algebra generated by \(\mathscr{B}\).

  1. If \( B_1, \, B_2 \in \mathscr{B} \) then \( B_1 \cap B_2 \in \mathscr{B} \).
  2. If \( B \in \mathscr{B} \) then \( B^c \in \mathscr{A} \).

Suppose now that we know how a measure \( \mu \) should work on a semi-algebra \( \mathscr{B} \) that generates an algebra \( \mathscr{A} \) and then a \( \sigma \)-algebra \( \mathscr{S} = \sigma(\mathscr{A}) = \sigma(\mathscr{B}) \). That is, we know \( \mu(B) \in [0, \infty] \) for each \( B \in \mathscr{B} \). Because of the additivity property, there is no question as to how we should extend \( \mu \) to \(\mathscr{A}\). We must have \[ \mu(A) = \sum_{i \in I} \mu(B_i)\] if \(A = \bigcup_{i \in I} B_i\) for some finite, disjoint collection \( \{B_i: i \in I\} \) of sets in \( \mathscr{B} \) (so that \( A \in \mathscr{A} \)). However, we cannot assign the values \( \mu(B) \) for \( B \in \mathscr{B} \) arbitrarily. The following extension theorem states that, subject just to some essential consistency conditions, the extension of \( \mu \) from the semi-algebra \( \mathscr{B} \) to the algebra \( \mathscr{A} \) does in fact produce a measure on \( \mathscr{A} \). The consistency conditions are that \( \mu \) be finitely additive and countably subadditive on \( \mathscr{B} \).

Suppose that \( \mathscr{B} \) is a semi-algebra of subsets of \( S \) and that \( \mathscr{A} \) is the algebra of subsets of \( S \) generated by \(\mathscr{B}\). A function \( \mu: \mathscr{B} \to [0, \infty] \) can be uniquely extended to a measure on \( \mathscr{A} \) if and only if \( \mu \) satisfies the following properties:

  1. If \( \emptyset \in \mathscr{B} \) then \( \mu(\emptyset) = 0 \).
  2. If \( \{B_i: i \in I\} \) is a finite, disjoint collection of sets in \( \mathscr{B} \) and \( B = \bigcup_{i \in I} B_i \in \mathscr{B} \) then \( \mu(B) = \sum_{i \in I} \mu(B_i) \).
  3. If \( B \in \mathscr{B} \) and \( B \subseteq \bigcup_{i \in I} B_i \) where \( \{B_i: i \in I\} \) is a countable collection of sets in \( \mathscr{B} \) then \( \mu(B) \le \sum_{i \in I} \mu(B_i) \).

If the measure \( \mu \) on the algebra \( \mathscr{A} \) is \( \sigma \)-finite, then the extension theorem and the uniqueness theorem above apply, so \( \mu \) can be extended uniquely to a measure on the \( \sigma \)-algebra \( \mathscr{S} = \sigma(\mathscr{A}) = \sigma(\mathscr{B}) \). This chain of extensions, starting with a semi-algebra \( \mathscr{B} \), is often how measures are constructed.

Examples and Applications

Lebesgue Measure on \( \R \)

We start with our most important and essential application. Recall that the Borel \( \sigma \)-algebra, named for Émile Borel, is the \( \sigma \)-algebra \( \mathscr{R} \) generated by the standard Euclidean topology on \( \R \). Equivalently, \( \mathscr{R} = \sigma(\mathscr{I}) \) where \( \mathscr{I} \) is the collection of intervals of \( \R \) (of all types—bounded and unbounded, with any type of closure, and including single points and the empty set). Next recall how the length of an interval is defined. For \( a, \, b \in \R \) with \( a \le b \), each of the intervals \( (a, b) \), \( [a, b) \), \( (a, b] \), and \( [a, b] \) has length \( b - a \). For \( a \in \R \), each of the intervals \( (a, \infty) \), \( [a, \infty) \), \( (-\infty, a) \), \( (-\infty, a] \) has length \( \infty \), as does \( \R \) itself. The standard measure on \( \mathscr{R} \) generalizes the length measurement for intervals.

There exists a unique measure \( \lambda \) on \( \mathscr{R} \) such that \( \lambda(I) = \length(I) \) for \( I \in \mathscr{I} \).

Proof:

Recall that \( \mathscr{I} \) is a semi-algebra: The intersection of two intervals is another interval, and the complement of an interval is either another interval or the union of two disjoint intervals. Define \( \lambda \) on \( \mathscr{I} \) by \( \lambda(I) = \length(I) \) for \( I \in \mathscr{I} \). Then \( \lambda \) satisfies the consistency conditions above and hence \( \lambda \) can be extended to a measure on the algebra \( \mathscr{J} \) generated by \( \mathscr{I} \), namely the collection of finite, disjoint unions of intervals. The measure \( \lambda \) on \( \mathscr{J} \) is clearly \( \sigma \)-finite, since \( \R \) can be written as a countably infinite union of bounded intervals. Hence the standard extension and uniqueness theorems apply, so \( \lambda \) can be extended uniquely to a measure on \( \mathscr{R} = \sigma(\mathscr{I}) \).
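The first step of the construction, from intervals to finite unions of intervals, can be sketched computationally. To keep the example short it assumes, as a simplification, that every interval has the half-open form \( (a, b] \) and is represented by the pair `(a, b)`; the function names are illustrative.

```python
def disjoint_form(intervals):
    """Rewrite a finite union of intervals (a, b] as a disjoint union."""
    merged = []
    # Drop empty intervals and process the rest from left to right.
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            # Overlaps or abuts the previous piece: (c, a] and (a, b] merge to (c, b].
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def length(intervals):
    """Total length: the sum of b - a over the disjoint form."""
    return sum(b - a for a, b in disjoint_form(intervals))

overlapping = [(0, 2), (1, 3), (5, 6)]
print(disjoint_form(overlapping))
print(length(overlapping))
# A different representation of the same set gives the same value.
print(length([(0, 1), (1, 3), (5, 6)]))
```

The last line illustrates the consistency that the extension relies on: the value assigned to a set in the algebra does not depend on how the set is written as a finite disjoint union of intervals.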

The measure \( \lambda \) is known as Lebesgue measure in honor of Henri Lebesgue. Since \( \lambda \) is \( \sigma \)-finite, the \( \sigma \)-algebra of Borel sets \( \mathscr{R} \) can be completed with respect to \( \lambda \). We denote the completed \( \sigma \)-algebra by \( \mathscr{R}^* \); this is the Lebesgue \( \sigma \)-algebra. Recall that completed means that if \( A \in \mathscr{R}^* \), \( \lambda(A) = 0 \) and \( B \subseteq A \), then \( B \in \mathscr{R}^* \) (and then \( \lambda(B) = 0 \)). The Lebesgue measure \( \lambda \) on \( \R \), with either the Borel \( \sigma \)-algebra \( \mathscr{R} \), or its completion \( \mathscr{R}^* \) is the standard measure that is used for the real numbers.

Lebesgue-Stieltjes Measures on \( \R \)

The construction of Lebesgue measure on \( \R \) can be generalized. Here is the definition that we will need.

A function \( F: \R \to \R \) that satisfies the following properties is a distribution function on \( \R \).

  1. \( F \) is increasing: if \( x \le y \) then \( F(x) \le F(y) \).
  2. \( F \) is continuous from the right: \( \lim_{t \downarrow x} F(t) = F(x) \) for all \( x \in \R \).

Since \( F \) is increasing, the limit from the left at \( x \in \R \) exists in \( \R \) and is denoted \( F(x^-) = \lim_{t \uparrow x} F(t) \). Similarly \(F(\infty) = \lim_{x \to \infty} F(x) \) exists, as a real number or \( \infty \), and \(F(-\infty) = \lim_{x \to -\infty} F(x) \) exists, as a real number or \( -\infty \).

If \( F \) is a distribution function on \( \R \), then there exists a unique measure \( \mu \) on \( \mathscr{R} \) that satisfies \[ \mu(a, b] = F(b) - F(a), \quad -\infty \le a \le b \le \infty \]

The measure \( \mu \) is called the Lebesgue-Stieltjes measure associated with \( F \), named for Henri Lebesgue and Thomas Joannes Stieltjes. Distribution functions and the measures associated with them are studied in more detail in the chapter on Distributions. When \( F(-\infty) = 0 \) and \( F(\infty) = 1 \), so that \( \mu(\R) = 1 \), the associated measure \( \P \) is a probability measure, and the function \( F \) is the probability distribution function of \( \P \). Probability distribution functions are also studied in much more detail (but with less technicality) in the chapter on Distributions.

Note that the identity function \( x \mapsto x \) for \( x \in \R \) is a distribution function, and the measure associated with this function is ordinary Lebesgue measure on \( \R \) constructed above.
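The defining formula \( \mu(a, b] = F(b) - F(a) \) is easy to experiment with. The sketch below uses three illustrative distribution functions chosen for the example (the identity, an exponential probability distribution function, and a unit step at 0); the function names are assumptions, and only half-open intervals are evaluated.

```python
import math

def ls_measure(F, a, b):
    """Measure of the interval (a, b] under the Lebesgue-Stieltjes measure of F."""
    return F(b) - F(a)

def identity(x):
    """Distribution function of Lebesgue measure."""
    return x

def exponential_cdf(x):
    """Illustrative probability distribution function: F(x) = 1 - exp(-x) for x > 0."""
    return 1 - math.exp(-x) if x > 0 else 0.0

def unit_step(x):
    """Right-continuous step at 0, which puts all of the mass at the point 0."""
    return 1.0 if x >= 0 else 0.0

# The identity gives ordinary length.
print(ls_measure(identity, 1, 4))
# The exponential example has total mass 1 on (-inf, inf].
print(ls_measure(exponential_cdf, -math.inf, math.inf))
# The step function: every interval (-1/n, 0] has measure 1, while (0, 1] has measure 0.
print([ls_measure(unit_step, -1 / n, 0) for n in (1, 10, 100)])
print(ls_measure(unit_step, 0, 1))
```

The step example shows how a jump of \( F \) produces a point mass: the intervals \( (-1/n, 0] \) decrease to \( \{0\} \), so by continuity for decreasing sets \( \mu\{0\} = F(0) - F(0^-) = 1 \).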

Product Spaces

Suppose that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces. For the product set \( S \times T \), recall that the product \( \sigma \)-algebra is \[ \mathscr{S} \otimes \mathscr{T} = \sigma\{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \] the \( \sigma \)-algebra generated by the Cartesian products of measurable sets.

Suppose that \( (S, \mathscr S, \mu) \) and \( (T, \mathscr T, \nu) \) are \( \sigma \)-finite measure spaces. Then there exists a unique \( \sigma \)-finite measure \( \mu \otimes \nu \) on \((S \times T, \mathscr{S} \otimes \mathscr{T}) \) such that \[ (\mu \otimes \nu)(A \times B) = \mu(A) \nu(B); \quad A \in \mathscr{S}, \; B \in \mathscr{T} \] The measure space \( (S \times T, \mathscr{S} \otimes \mathscr{T}, \mu \otimes \nu) \) is the product measure space associated with \( (S, \mathscr{S}, \mu) \) and \( (T, \mathscr{T}, \nu) \).

Proof:

Recall that the collection \( \mathscr{B} = \{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \) is a semi-algebra: the intersection of two product sets is another product set, and the complement of a product set is the union of two disjoint product sets. We define \( \rho: \mathscr{B} \to [0, \infty] \) by \( \rho(A \times B) = \mu(A) \nu(B) \). The consistency conditions above hold, so \( \rho \) can be extended to a measure on the algebra \( \mathscr{A} \) generated by \( \mathscr{B} \); \( \mathscr{A} \) is the collection of all finite, disjoint unions of products of measurable sets. We will now show that the extended measure \( \rho \) is \( \sigma \)-finite on \( \mathscr{A} \). Since \( \mu \) is \( \sigma \)-finite, by the result above, there exists an increasing sequence \( (A_1, A_2, \ldots) \) of sets in \( \mathscr{S} \) with \( \mu(A_i) \lt \infty \) and \( \bigcup_{i = 1}^\infty A_i = S \). Similarly, there exists an increasing sequence \( (B_1, B_2, \ldots) \) of sets in \( \mathscr{T} \) with \( \nu(B_j) \lt \infty \) and \( \bigcup_{j = 1}^\infty B_j = T \). Then \( \rho(A_i \times B_j) = \mu(A_i) \nu(B_j) \lt \infty \), and since the sets are increasing, \( \bigcup_{(i, j) \in \N_+ \times \N_+} A_i \times B_j = S \times T \). The standard extension and uniqueness theorems now apply, so \( \rho \) can be extended uniquely to a measure on \( \sigma(\mathscr{A}) = \mathscr{S} \otimes \mathscr{T} \).

Recall that for \( C \subseteq S \times T \), the cross section of \( C \) in the first coordinate at \( x \in S \) is \( C_x = \{ y \in T: (x, y) \in C\} \). Similarly, the cross section of \( C \) in the second coordinate at \( y \in T \) is \( C^y = \{ x \in S: (x, y) \in C\} \). We know that the cross sections of a measurable set are measurable. The following result shows that the measures of the cross sections of a measurable set form measurable functions.

Suppose again that \( (S, \mathscr S, \mu) \) and \( (T, \mathscr T, \nu) \) are \( \sigma \)-finite measure spaces. If \( C \in \mathscr{S} \otimes \mathscr{T} \) then

  1. \( x \mapsto \nu(C_x) \) is a measurable function from \( S \) to \( [0, \infty] \).
  2. \( y \mapsto \mu(C^y) \) is a measurable function from \( T \) to \( [0, \infty] \).
Proof:

We prove part (a), since of course the proof for part (b) is symmetric. Suppose first that the measure spaces are finite. Let \( \mathscr{R} = \{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \) denote the set of measurable rectangles. Let \( \mathscr{C} = \{C \in \mathscr{S} \otimes \mathscr{T}: x \mapsto \nu(C_x) \text{ is measurable}\}\). If \( A \times B \in \mathscr{R} \), then \( A \times B \in \mathscr{C} \), since \( \nu[(A \times B)_x] = \nu(B) \bs{1}_A(x) \). Next, suppose \( C \in \mathscr{C} \). Then \( (C^c)_x = (C_x)^c \), so \( \nu[(C^c)_x] = \nu(T) - \nu(C_x) \) and this is a measurable function of \( x \in S \). Hence \( C^c \in \mathscr{C} \). Next, suppose that \( \{C_i: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr{C} \) and let \( C = \bigcup_{i \in I} C_i \). Then \( \{(C_i)_x: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr{T} \), and \( C_x = \bigcup_{i \in I} (C_i)_x \). Hence \( \nu(C_x) = \sum_{i \in I} \nu[(C_i)_x] \), and this is a measurable function of \( x \in S \). Hence \( C \in \mathscr{C} \). It follows that \( \mathscr{C} \) is a \( \lambda \)-system that contains \( \mathscr{R} \), which in turn is a \( \pi \)-system. It follows from Dynkin's \(\pi\)-\(\lambda \) theorem that \( \mathscr{S} \otimes \mathscr{T} = \sigma(\mathscr{R}) \subseteq \mathscr{C} \). Thus \( \mathscr{C} = \mathscr{S} \otimes \mathscr{T} \).

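Consider now the general case where the measure spaces are \( \sigma \)-finite. There exist increasing sequences of sets \( A_n \in \mathscr{S} \) and \( B_n \in \mathscr{T} \) for \( n \in \N_+ \) with \( \mu(A_n) \lt \infty \), \( \nu(B_n) \lt \infty \), \( \bigcup_{n=1}^\infty A_n = S \), and \( \bigcup_{n=1}^\infty B_n = T \). Let \( C_n = A_n \times B_n \), so that \( C_n \in \mathscr{S} \otimes \mathscr{T} \) is increasing in \( n \in \N_+ \), with \( (\mu \otimes \nu)(C_n) \lt \infty \) and \( \bigcup_{n=1}^\infty C_n = S \times T \). If \( C \in \mathscr{S} \otimes \mathscr{T} \), then \( C \cap C_n \) is increasing in \( n \in \N_+ \), and \( C = \bigcup_{n=1}^\infty (C \cap C_n) \). Hence, for \( x \in S \), \( (C \cap C_n)_x \) is increasing in \( n \in \N_+ \) and \( C_x = \bigcup_{n=1}^\infty (C \cap C_n)_x \). Therefore \( \nu(C_x) = \lim_{n \to \infty} \nu[(C \cap C_n)_x] \) by the continuity theorem for increasing sets. But \( x \mapsto \nu[(C \cap C_n)_x] \) is a measurable function of \( x \in S \) for each \( n \in \N_+ \) by the previous argument, applied to the finite measure spaces obtained by restricting \( \mu \) to \( A_n \) and \( \nu \) to \( B_n \). Hence \( x \mapsto \nu(C_x) \), as a limit of measurable functions, is a measurable function of \( x \in S \).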

In the next chapter, where we study integration with respect to a measure, we will see that for \( C \in \mathscr{S} \otimes \mathscr{T} \), the product measure \( (\mu \otimes \nu)(C) \) can be computed by integrating \( \nu(C_x) \) over \( x \in S \) with respect to \( \mu \) or by integrating \( \mu(C^y) \) over \( y \in T \) with respect to \( \nu \). These results, generalizing the defining property of the product measure on product sets given above, are special cases of Fubini's theorem, named for the Italian mathematician Guido Fubini.
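In the discrete case the integrals reduce to sums, so the product measure and the cross-section formulas can be checked directly. The sketch below assumes two hypothetical finite sets with made-up point weights, with every subset measurable; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Illustrative point weights defining mu on S = {"a", "b"} and nu on T = {0, 1, 2}.
MU = {"a": Fraction(1, 2), "b": Fraction(3)}
NU = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}

def mu(A):
    return sum((MU[x] for x in A), Fraction(0))

def nu(B):
    return sum((NU[y] for y in B), Fraction(0))

def product_measure(C):
    """Product measure of C, a set of pairs (x, y)."""
    return sum((MU[x] * NU[y] for x, y in C), Fraction(0))

def section_x(C, x):
    """Cross section of C in the first coordinate at x."""
    return {y for (u, y) in C if u == x}

def section_y(C, y):
    """Cross section of C in the second coordinate at y."""
    return {x for (x, v) in C if v == y}

# A product set: the product measure factors.
A, B = {"a", "b"}, {0, 1}
rectangle = set(product(A, B))
print(product_measure(rectangle), mu(A) * nu(B))

# A set that is not a product set: sum the measures of the cross sections.
C = {("a", 0), ("a", 2), ("b", 1)}
by_x = sum(MU[x] * nu(section_x(C, x)) for x in MU)
by_y = sum(NU[y] * mu(section_y(C, y)) for y in NU)
print(product_measure(C), by_x, by_y)
```

Both iterated sums agree with the product measure of `C`, which is the discrete form of the statement in the paragraph above.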

Except for more complicated notation, these results extend in a perfectly straightforward way to the product of a finite number of \( \sigma \)-finite measure spaces.

Suppose that \( n \in \N_+ \) and that \( (S_i, \mathscr S_i, \mu_i) \) is a \( \sigma \)-finite measure space for \( i \in \{1, 2, \ldots, n\} \). Let \( S = \prod_{i=1}^n S_i \) and let \( \mathscr S \) denote the corresponding product \( \sigma \)-algebra. There exists a unique \( \sigma \)-finite measure \( \mu \) on \( (S, \mathscr{S}) \) satisfying \[ \mu\left(\prod_{i=1}^n A_i\right) = \prod_{i=1}^n \mu_i(A_i), \quad A_i \in \mathscr{S}_i \text{ for } i \in \{1, 2, \ldots, n\} \] The measure space \( (S, \mathscr S, \mu) \) is the product measure space associated with the given measure spaces.

Lebesgue Measure on \( \R^n \)

For \( \R^n \), the \( n \)-dimensional Euclidean space, there are several ways to think of the Borel \( \sigma \)-algebra \( \mathscr{R}_n \). Of course, by definition, \( \mathscr{R}_n \) is the \( \sigma \)-algebra generated by the standard Euclidean topology on \( \R^n \), which in turn is the \( n \)-fold product of the topology on \( \R \). Also, \( \mathscr{R}_n \) is the \( n \)-fold power of \( \mathscr{R} \), the Borel \( \sigma \)-algebra of \( \R \). That is, \( \mathscr{R}_n = \mathscr{R} \otimes \mathscr{R} \otimes \cdots \otimes \mathscr{R} \) (\( n \) times). It is also the \( \sigma \)-algebra generated by the products of intervals: \[ \mathscr{R}_n = \sigma\left\{I_1 \times I_2 \times \cdots \times I_n: I_j \in \mathscr{I} \text{ for } j \in \{1, 2, \ldots, n\}\right\} \] The corresponding Lebesgue measure \( \lambda_n \) is the \( n \)-fold power of \( \lambda \). In particular, \[ \lambda_n(A_1 \times A_2 \times \cdots \times A_n) = \lambda(A_1) \lambda(A_2) \cdots \lambda(A_n); \quad A_1, \, \ldots, A_n \in \mathscr{R} \] Specializing further, if \( I_j \in \mathscr{I} \) is an interval for \( j \in \{1, 2, \ldots, n\} \) then \[ \lambda_n\left(I_1 \times I_2 \times \cdots \times I_n\right) = \length(I_1) \length(I_2) \cdots \length(I_n) \] Thus \( \lambda_2 \) extends the area measure on \( \mathscr{R}_2 \) and \( \lambda_3 \) extends the volume measure on \( \mathscr{R}_3 \). In general, \( \lambda_n(A) \) is sometimes referred to as the \( n \)-dimensional volume of \( A \in \mathscr{R}_n \). As in the one-dimensional case, \( \mathscr{R}_n \) can be completed with respect to \( \lambda_n \), essentially adding all subsets of sets of measure 0 to \( \mathscr{R}_n \). The completed \( \sigma \)-algebra is the \( \sigma \)-algebra of Lebesgue measurable sets.
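The formula for products of intervals is easy to evaluate directly. The following Python sketch assumes bounded intervals, each represented by a pair of endpoints \( (a_j, b_j) \) with \( a_j \le b_j \); the function name `box_volume` is a hypothetical helper. Whether the endpoints are included does not matter, since single points have Lebesgue measure 0.

```python
from math import prod

def box_volume(intervals):
    """n-dimensional volume of I_1 x ... x I_n, where I_j has endpoints (a_j, b_j)."""
    return prod(b - a for a, b in intervals)

# A rectangle in R^2: area is length times width.
area = box_volume([(0.0, 2.0), (1.0, 4.0)])

# A box in R^3: volume is the product of the three side lengths.
volume = box_volume([(0.0, 2.0), (1.0, 4.0), (-1.0, 1.0)])

# A degenerate rectangle (a line segment in R^2) has 2-dimensional measure 0.
segment = box_volume([(0.0, 2.0), (3.0, 3.0)])

print(area, volume, segment)
```

The last case illustrates that a set can be uncountable and still have \( n \)-dimensional volume 0.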

Probability Spaces

Our other application of the results on product spaces involves probability, of course.

Suppose \( n \in \N_+ \) and that \( (S_i, \mathscr{S}_i, \P_i) \) is a probability space for \( i \in \{1, 2, \ldots, n\} \). The corresponding product measure space \( (S, \mathscr S, \P) \) is a probability space. If \( X_i: S \to S_i \) is the \( i \)th coordinate function on \( S\) \[ X_i(x_1, x_2, \ldots, x_n) = x_i, \quad (x_1, x_2, \ldots, x_n) \in S \] then \( (X_1, X_2, \ldots, X_n) \) is a sequence of independent random variables on \( (S, \mathscr{S}, \P) \), and \( X_i \) has distribution \( \P_i \) on \( (S_i, \mathscr S_i) \) for each \( i \in \{1, 2, \ldots, n \} \).

Proof:

Of course, the existence of the product space \( (S, \mathscr S, \P) \) follows immediately from the more general result for positive measure spaces above. Clearly \( \P \) is a probability measure since \( \P(S) = \prod_{i=1}^n \P_i(S_i) = 1 \). Suppose that \( A_i \in \mathscr S_i \) for \( i \in \{1, 2, \ldots, n\} \). Then \( \{X_1 \in A_1, X_2 \in A_2, \ldots, X_n \in A_n\} = \prod_{i=1}^n A_i \in \mathscr S\). Hence \[ \P(X_1 \in A_1, X_2 \in A_2, \ldots, X_n \in A_n) = \prod_{i=1}^n \P_i(A_i) \] If we fix \( i \in \{1, 2, \ldots, n\} \) and let \( A_j = S_j \) for \( j \ne i \), then the displayed equation gives \( \P(X_i \in A_i) = \P_i(A_i) \), so \( X_i \) has distribution \( \P_i \) on \( (S_i, \mathscr S_i) \). Returning to the displayed equation we have \[ \P(X_1 \in A_1, X_2 \in A_2, \ldots, X_n \in A_n) = \prod_{i=1}^n \P(X_i \in A_i) \] so the random variables \( X_1, X_2, \ldots, X_n \) are independent.

Intuitively, the given probability spaces correspond to \( n \) random experiments. The product space then is the probability space that corresponds to the experiments performed independently. When modeling a random experiment, if we say that we have a finite sequence of independent random variables with specified distributions, we can rest assured that there actually is a probability space that supports this statement.
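The theorem can be verified by exhaustive enumeration in a small discrete case. The sketch below uses two hypothetical experiments, a biased coin and a fair four-sided die, builds the product probability measure from its point masses, and checks both the marginal distributions and the independence of the coordinate variables \( X_1 \) and \( X_2 \). All names in the code are assumptions made for the example.

```python
from fractions import Fraction
from itertools import chain, combinations, product

# Hypothetical experiments: a biased coin (P1) and a fair four-sided die (P2).
P1 = {"H": Fraction(2, 3), "T": Fraction(1, 3)}
P2 = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}

# Product probability measure on S = S_1 x S_2, given by its point masses.
P = {(x, y): P1[x] * P2[y] for x, y in product(P1, P2)}

def prob(masses, event):
    """Probability of an event (a collection of outcomes) by additivity."""
    return sum((masses[s] for s in event), Fraction(0))

def subsets(points):
    """All subsets of a finite set, as tuples."""
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))

def x1_in(A):
    """The event {X_1 in A} = A x S_2."""
    return {s for s in P if s[0] in A}

def x2_in(B):
    """The event {X_2 in B} = S_1 x B."""
    return {s for s in P if s[1] in B}

# X_i has distribution P_i.
marginals_ok = (
    all(prob(P, x1_in(A)) == prob(P1, A) for A in subsets(P1))
    and all(prob(P, x2_in(B)) == prob(P2, B) for B in subsets(P2))
)

# P(X_1 in A, X_2 in B) = P(X_1 in A) P(X_2 in B) for all A and B.
independent = all(
    prob(P, x1_in(A) & x2_in(B)) == prob(P, x1_in(A)) * prob(P, x2_in(B))
    for A in subsets(P1)
    for B in subsets(P2)
)
print(marginals_ok, independent)
```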

We can extend the last result to an infinite sequence of probability spaces. Suppose that \( (S_i, \mathscr S_i) \) is a measurable space for each \( i \in \N_+ \). Recall that the product space \( \prod_{i=1}^\infty S_i \) consists of all sequences \( \bs x = (x_1, x_2, \ldots) \) such that \( x_i \in S_i \) for each \( i \in \N_+ \). The corresponding product \( \sigma \)-algebra \( \mathscr S \) is the \( \sigma \)-algebra generated by the collection of cylinder sets \[ \mathscr{B} = \left\{\prod_{i=1}^\infty A_i: A_i \in \mathscr S_i \text{ for each } i \in \N_+ \text{ and } A_i = S_i \text{ for all but finitely many } i \in \N_+\right\} \]

Suppose that \( (S_i, \mathscr{S}_i, \P_i) \) is a probability space for \(i \in \N_+ \). Let \( (S, \mathscr S) \) denote the product measurable space and \( \mathscr B \) the collection of cylinder sets. Then there exists a unique probability measure \( \P \) on \( (S, \mathscr S) \) that satisfies \[ \P\left(\prod_{i=1}^\infty A_i\right) = \prod_{i=1}^\infty \P_i(A_i), \quad \prod_{i=1}^\infty A_i \in \mathscr B\] If \( X_i: S \to S_i \) is the \( i \)th coordinate function on \( S\) for \( i \in \N_+ \) then \( (X_1, X_2, \ldots) \) is a sequence of independent random variables on \( (S, \mathscr{S}, \P) \), and \( X_i \) has distribution \( \P_i \) on \( (S_i, \mathscr S_i) \) for each \( i \in \N_+ \).

Proof:

The proof is similar to the one above for positive measure spaces. First recall that the collection of cylinder sets \( \mathscr B \) is a semi-algebra. We define \( \P: \mathscr{B} \to [0, 1] \) as in the statement of the theorem. Note that all but finitely many factors are 1. The consistency conditions above hold, so \( \P \) can be extended to a probability measure on the algebra \( \mathscr{A} \) generated by \( \mathscr{B} \); \( \mathscr{A} \) is the collection of all finite, disjoint unions of cylinder sets. The standard extension and uniqueness theorems now apply, so \( \P\) can be extended uniquely to a measure on \( \mathscr S = \sigma(\mathscr{A})\). The proof that \( (X_1, X_2, \ldots) \) are independent and that \( X_i \) has distribution \( \P_i \) for each \( i \in \N_+ \) is just as in the previous theorem.

Once again, if we model a random process by starting with an infinite sequence of independent random variables, we can be sure that there exists a probability space that supports this sequence. The particular probability space constructed in the last theorem is called the canonical probability space associated with the sequence of random variables. Note also that it was important that we had probability measures rather than just general positive measures in the construction: since \( \P_i(S_i) = 1 \) for each \( i \in \N_+ \), all but finitely many factors of the infinite product \( \prod_{i=1}^\infty \P_i(A_i) \) are 1 for a cylinder set, so the product is always well defined.
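The role of the factors equal to 1 can be seen in a short computation. The following Python sketch assumes, purely for illustration, that every factor is the same fair-coin space with \( S_i = \{0, 1\} \). A cylinder set is represented by a dictionary that lists only the finitely many constrained coordinates; the names `coin` and `cylinder_probability` are hypothetical.

```python
from fractions import Fraction
from math import prod

def coin(A):
    """P_i(A) for a subset A of S_i = {0, 1} under a fair coin."""
    return Fraction(len(set(A) & {0, 1}), 2)

def cylinder_probability(constraints):
    """Probability of the cylinder set with A_i = constraints[i] where listed
    and A_i = S_i otherwise. Unlisted coordinates contribute factors of 1,
    so the infinite product reduces to a finite one."""
    return prod((coin(A) for A in constraints.values()), start=Fraction(1))

# The event {X_1 = 1, X_3 = 0}; coordinate 4 is listed but unconstrained.
p = cylinder_probability({1: {1}, 3: {0}, 4: {0, 1}})
print(p)

# With no constraints the cylinder set is all of S.
print(cylinder_probability({}))

# By contrast, if each space had total mass 2 instead of 1, the partial
# products for the whole space would be 2, 4, 8, ... and would not converge.
partial_products = [2 ** n for n in range(1, 6)]
print(partial_products)
```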