\(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\Z}{\mathbb{Z}}\) \(\newcommand{\Q}{\mathbb{Q}}\) \(\newcommand{\D}{\mathbb{D}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\ms}{\mathscr}\)

11. Measurable Spaces

In this section we discuss some topics from measure theory that are a bit more advanced than the topics in the early sections of this chapter. However, measure-theoretic ideas are essential for a deep understanding of probability, since probability is itself a measure. The most important of the definitions is the \(\sigma\)-algebra, a collection of subsets of a set with certain closure properties. Such collections play a fundamental role, even for applied probability, in encoding the state of information about a random experiment.

On the other hand, we won't be overly pedantic about measure-theoretic details in this text. Unless we say otherwise, we assume that all sets that appear are measurable (that is, members of the appropriate \(\sigma\)-algebras), and that all functions are measurable (relative to the appropriate \(\sigma\)-algebras).

Although this section is somewhat abstract, many of the proofs are straightforward. Be sure to try the proofs yourself before expanding the details.

Algebras and \( \sigma \)-Algebras

Suppose that \(S\) is a set playing the role of a universal set for a particular mathematical model. It is sometimes impossible to include all subsets of \(S\) in our model, particularly when \(S\) is uncountable. In a sense, the more sets that we include, the harder it is to have consistent theories. However, we almost always want the collection of admissible subsets to be closed under the basic set operations. This leads to some important definitions.

Algebras of Sets

Suppose that \(\ms S\) is a nonempty collection of subsets of \(S\). Then \(\ms S\) is an algebra (or field) if it is closed under complement and union:

  1. If \(A \in \ms S\) then \(A^c \in \ms S\).
  2. If \(A \in \ms S\) and \(B \in \ms S\) then \(A \cup B \in \ms S\).

If \(\ms S\) is an algebra of subsets of \(S\) then

  1. \( S \in \ms S \)
  2. \( \emptyset \in \ms S \)
Details:
  1. Since \( \ms S \) is nonempty, there exists \( A \in \ms S \). Hence \( A^c \in \ms S \) so \( S = A \cup A^c \in \ms S \).
  2. \( \emptyset = S^c \in \ms S \)

Suppose that \(\ms S\) is an algebra of subsets of \(S\) and that \(A_i \in \ms S\) for each \(i\) in a finite index set \(I\).

  1. \(\bigcup_{i \in I} A_i \in \ms S\)
  2. \(\bigcap_{i \in I} A_i \in \ms S\)
Details:
  1. This follows by induction on the number of elements in \(I\).
  2. This follows from (a) and De Morgan's law. If \( A_i \in \ms S \) for \( i \in I \) then \( A_i^c \in \ms S \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \ms S \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \ms S \).

Thus it follows that an algebra of sets is closed under a finite number of set operations. That is, if we start with a finite number of sets in the algebra \( \ms S \), and build a new set with a finite number of set operations (union, intersection, complement), then the new set is also in \( \ms S \). However in many mathematical theories, probability in particular, this is not sufficient; we often need the collection of admissible subsets to be closed under a countable number of set operations.
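For a finite universal set, the two closure axioms can be checked mechanically. The following Python sketch is only an illustration; the function name `is_algebra` and the example collections are assumptions introduced here, not part of the text.

```python
from itertools import combinations


def is_algebra(universe, collection):
    """Return True if `collection` is an algebra of subsets of the finite set `universe`."""
    universe = frozenset(universe)
    sets = {frozenset(a) for a in collection}
    # An algebra must be a nonempty collection of subsets of the universe.
    if not sets or any(not a.issubset(universe) for a in sets):
        return False
    # Axiom 1: closed under complement.
    if any(universe - a not in sets for a in sets):
        return False
    # Axiom 2: closed under (pairwise) union.
    return all(a | b in sets for a, b in combinations(sets, 2))


S = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, S]
bad = [set(), {1}, {2}, S]  # the complement of {1} is missing

print(is_algebra(S, good))  # True
print(is_algebra(S, bad))   # False
```

Checking pairwise unions is enough here, since closure under finite unions then follows by induction, as in the result above.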

\(\sigma\)-Algebras of Sets

Suppose that \(\ms S\) is a nonempty collection of subsets of \(S\). Then \(\ms S\) is a \(\sigma\)-algebra (or \(\sigma\)-field) if the following axioms are satisfied:

  1. If \(A \in \ms S\) then \(A^c \in \ms S\).
  2. If \(A_i \in \ms S\) for each \(i\) in a countable index set \(I\), then \(\bigcup_{i \in I} A_i \in \ms S\).

Clearly a \(\sigma\)-algebra of subsets is also an algebra of subsets, so the basic results for algebras above still hold. In particular, \( S \in \ms S \) and \( \emptyset \in \ms S \).

If \(A_i \in \ms S\) for each \(i\) in a countable index set \(I\), then \(\bigcap_{i \in I} A_i \in \ms S\).

Details:

The proof is just like the one for algebras above. If \( A_i \in \ms S \) for \( i \in I \) then \( A_i^c \in \ms S \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \ms S \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \ms S \).

Thus a \(\sigma\)-algebra of subsets of \(S\) is closed under countable unions and intersections. This is the reason for the symbol \(\sigma\) in the name. As mentioned in the introductory paragraph, \( \sigma \)-algebras are of fundamental importance in mathematics generally and probability theory specifically, and thus deserve a special definition:

If \( S \) is a set and \( \ms S \) a \( \sigma \)-algebra of subsets of \( S \), then \( (S, \ms S) \) is called a measurable space.

The term measurable space will make more sense when we discuss positive measures on such spaces.

Suppose that \(S\) is a set and that \(\ms S\) is a finite algebra of subsets of \(S\). Then \(\ms S\) is also a \(\sigma\)-algebra.

Details:

Any countable union of sets in \(\ms S\) reduces to a finite union.

However, there are algebras that are not \(\sigma\)-algebras. Here is the classic example:

Suppose that \( S \) is an infinite set. The collection of finite and co-finite subsets of \( S \) defined below is an algebra of subsets of \( S \), but not a \(\sigma\)-algebra: \[ \ms F = \{A \subseteq S: A \text{ is finite or } A^c \text{ is finite}\} \]

Details:

\( S \in \ms F \) since \( S^c = \emptyset \) is finite. If \( A \in \ms F \) then \( A^c \in \ms F \) by the symmetry of the definition. Suppose that \( A, \, B \in \ms F \). If \( A \) and \( B \) are both finite then \( A \cup B \) is finite. If \( A^c \) or \( B^c \) is finite, then \( (A \cup B)^c = A^c \cap B^c \) is finite. In either case, \( A \cup B \in \ms F \). Thus \( \ms F \) is an algebra of subsets of \( S \).

Since \( S \) is infinite, it contains a countably infinite subset \( \{x_0, x_1, x_2, \ldots\} \). Let \( A_n = \{x_{2 n}\} \) for \( n \in \N \). Then \( A_n \) is finite, so \( A_n \in \ms F \) for each \( n \in \N \). Let \( E = \bigcup_{n=0}^\infty A_n = \{x_0, x_2, x_4, \ldots\} \). Then \( E \) is infinite by construction. Also \(\{x_1, x_3, x_5, \ldots\} \subseteq E^c \), so \( E^c \) is infinite as well. Hence \( E \notin \ms F \) and so \( \ms F \) is not a \( \sigma \)-algebra.

General Constructions

Recall that \(\ms P(S)\) denotes the collection of all subsets of \(S\), called the power set of \(S\). Trivially, \(\ms P(S)\) is the largest \(\sigma\)-algebra of \(S\). The power set is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but as noted above, is sometimes too large to be useful if \( S \) is uncountable. At the other extreme, the smallest \(\sigma\)-algebra of \(S\) is given next:

The collection \(\{\emptyset, S\}\) is a \(\sigma\)-algebra.

Details:

Clearly \( \{\emptyset, S\} \) is a finite algebra: \( S \) and \( \emptyset \) are complements of each other, and \( S \cup \emptyset = S \). Hence \( \{S, \emptyset\} \) is a \( \sigma \)-algebra by the result above on finite algebras.

In many cases, we want to construct a \(\sigma\)-algebra that contains certain basic sets. The next two results show how to do this.

Suppose that \(\ms S_i\) is a \(\sigma\)-algebra of subsets of \(S\) for each \(i\) in a nonempty index set \(I\). Then \( \ms S = \bigcap_{i \in I} \ms S_i\) is also a \(\sigma\)-algebra of subsets of \(S\).

Details:

The proof is completely straightforward. First, \( S \in \ms S_i \) for each \( i \in I \) so \( S \in \ms S \). If \( A \in \ms S \) then \( A \in \ms S_i \) for each \( i \in I \) and hence \( A^c \in \ms S_i \) for each \( i \in I \). Therefore \( A^c \in \ms S \). Finally suppose that \( A_j \in \ms S \) for each \( j \) in a countable index set \( J \). Then \( A_j \in \ms S_i \) for each \( i \in I \) and \( j \in J \) and therefore \( \bigcup_{j \in J} A_j \in \ms S_i \) for each \( i \in I \). It follows that \( \bigcup_{j \in J} A_j \in \ms S \).
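The intersection result can be illustrated on a small finite set, where (by the result above on finite algebras) it suffices to check complements and pairwise unions. The sketch below is an illustration under that assumption; `is_sigma_algebra` and the two example collections are names introduced here. It also shows that the union of two \(\sigma\)-algebras need not be a \(\sigma\)-algebra.

```python
def is_sigma_algebra(universe, collection):
    """Check the sigma-algebra axioms for subsets of a finite set.

    On a finite set, countable unions reduce to finite unions, so closure
    under complements and pairwise unions is enough.
    """
    universe = frozenset(universe)
    sets = {frozenset(a) for a in collection}
    if not sets:
        return False
    closed_complement = all(universe - a in sets for a in sets)
    closed_union = all(a | b in sets for a in sets for b in sets)
    return closed_complement and closed_union


S = {1, 2, 3}
S1 = {frozenset(), frozenset({1}), frozenset({2, 3}), frozenset(S)}
S2 = {frozenset(), frozenset({2}), frozenset({1, 3}), frozenset(S)}

intersection = S1 & S2  # only the empty set and S survive
union = S1 | S2         # contains {1} and {2} but not {1, 2}

print(is_sigma_algebra(S, S1), is_sigma_algebra(S, S2))  # True True
print(is_sigma_algebra(S, intersection))                 # True
print(is_sigma_algebra(S, union))                        # False
```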

Note that no restrictions are placed on the index set \( I \), other than it be nonempty, so in particular it may well be uncountable.

Suppose that \( S \) is a set and that \(\ms B\) is a collection of subsets of \(S\). The \(\sigma\)-algebra generated by \(\ms B\) is \[\sigma(\ms B) = \bigcap \{\ms S: \ms S \text{ is a } \sigma\text{-algebra of subsets of } S \text{ and } \ms B \subseteq \ms S\}\] A \(\sigma\)-algebra that is generated by a countable collection of sets is said to be countably generated.

So the \(\sigma\)-algebra generated by \(\ms B\) is the intersection of all \(\sigma\)-algebras that contain \(\ms B\), which by the previous result really is a \(\sigma\)-algebra. Note that the collection of \( \sigma \)-algebras in the intersection is not empty, since \( \ms P(S) \) is in the collection. Think of the sets in \(\ms B\) as basic sets that we want to be measurable, but that do not necessarily form a \(\sigma\)-algebra.

The \(\sigma\)-algebra \(\sigma(\ms B)\) is the smallest \(\sigma\)-algebra containing \(\ms B\).

  1. \(\ms B \subseteq \sigma(\ms B)\)
  2. If \(\ms S\) is a \(\sigma\)-algebra of subsets of \(S\) and \(\ms B \subseteq \ms S\) then \(\sigma(\ms B) \subseteq \ms S\).
Details:

Both of these properties follow from the definition of \( \sigma(\ms B) \) above.

Note that the conditions in the previous result completely characterize \( \sigma(\ms B) \). If \( \ms S_1 \) and \( \ms S_2 \) satisfy the conditions, then by (a), \( \ms B \subseteq \ms S_1 \) and \( \ms B \subseteq \ms S_2 \). But then by (b), \( \ms S_1 \subseteq \ms S_2 \) and \( \ms S_2 \subseteq \ms S_1\), so \( \ms S_1 = \ms S_2 \).

If \(A\) is a subset of \(S\) then \(\sigma\{A\} = \{\emptyset, A, A^c, S\}\).

Details:

Let \( \ms S = \{\emptyset, A, A^c, S\} \). Clearly \( \ms S \) is an algebra: \( A \) and \( A^c \) are complements of each other, as are \( \emptyset \) and \( S \). Also, \begin{align*} &A \cup A^c = A \cup S = A^c \cup S = S \cup S = \emptyset \cup S = S \\ &A \cup \emptyset = A \cup A = A \\ &A^c \cup \emptyset = A^c \cup A^c = A^c \\ &\emptyset \cup \emptyset = \emptyset \end{align*} Since \( \ms S \) is finite, it is a \( \sigma \)-algebra by the result above on finite algebras. Next, \( A \in \ms S \). Conversely, if \( \ms T \) is a \( \sigma \)-algebra and \( A \in \ms T \) then of course \( \emptyset, S, A^c \in \ms T \) so \( \ms S \subseteq \ms T \). Hence \( \ms S = \sigma\{A\} \).
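When the universal set is finite, the generated \(\sigma\)-algebra can be computed by repeatedly closing the generators under complement and union until nothing new appears. The Python sketch below is one way to do this, not a construction from the text; the name `generate_sigma_algebra` is hypothetical, and the generators are assumed to be subsets of the universe.

```python
def generate_sigma_algebra(universe, generators):
    """Smallest sigma-algebra on a finite `universe` containing `generators`.

    Assumes every generator is a subset of `universe`. The loop terminates
    because the power set of a finite set is finite.
    """
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        # One round of closure under complement and pairwise union.
        closure = {universe - a for a in sets} | {a | b for a in sets for b in sets}
        if closure.issubset(sets):
            return sets
        sets |= closure


S = range(6)
A = {0, 1}
sigma_A = generate_sigma_algebra(S, [A])
print(sorted(sorted(a) for a in sigma_A))
# [[], [0, 1], [0, 1, 2, 3, 4, 5], [2, 3, 4, 5]]
```

The output consists of the four sets \(\emptyset\), \(A\), \(S\), and \(A^c\), in agreement with the result above.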

We can generalize the previous result. Recall that a collection of subsets \( \ms A = \{A_i: i \in I\} \) is a partition of \( S \) if \( A_i \cap A_j = \emptyset \) for \( i, \; j \in I \) with \( i \ne j \), and \( \bigcup_{i \in I} A_i = S \).

Suppose that \( \ms A = \{A_i: i \in I\} \) is a countable partition of \( S \) into nonempty subsets. Then \( \sigma(\ms A) \) is the collection of all unions of sets in \( \ms A \). That is, \[ \sigma(\ms A) = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \]

Details:

Let \( \ms S = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \). Note that \( S \in \ms S \) since \( S = \bigcup_{i \in I} A_i \). Next, suppose that \( B \in \ms S \). Then \( B = \bigcup_{j \in J} A_j \) for some \( J \subseteq I \). But then \( B^c = \bigcup_{j \in J^c} A_j \), so \( B^c \in \ms S \). Next, suppose that \( B_k \in \ms S \) for \( k \in K \) where \( K \) is a countable index set. Then for each \( k \in K \) there exists \( J_k \subseteq I \) such that \( B_k = \bigcup_{j \in J_k} A_j \). But then \( \bigcup_{k \in K} B_k = \bigcup_{k \in K} \bigcup_{j \in J_k} A_j = \bigcup_{j \in J} A_j \) where \( J = \bigcup_{k \in K} J_k \). Hence \( \bigcup_{k \in K} B_k \in \ms S \). Therefore \( \ms S \) is a \( \sigma \)-algebra of subsets of \( S \). Trivially, \( \ms A \subseteq \ms S \). If \( \ms T \) is a \( \sigma \)-algebra of subsets of \( S \) and \( \ms A \subseteq \ms T \), then since \( I \) is countable, \( \bigcup_{j \in J} A_j \in \ms T \) for every \( J \subseteq I \). Hence \( \ms S \subseteq \ms T\).

A \( \sigma \)-algebra of this form is said to be generated by a countable partition. Note that since \( A_i \ne \emptyset \) for \( i \in I \), the representation of a set in \( \sigma(\ms A) \) as a union of sets in \( \ms A \) is unique. That is, if \( J, \, K \subseteq I \) and \( J \ne K \) then \( \bigcup_{j \in J} A_j \ne \bigcup_{k \in K} A_k \). In particular, if there are \( n \) nonempty sets in \( \ms A \), so that \( \#(I) = n \), then there are \( 2^n \) subsets of \( I \) and hence \( 2^n \) sets in \( \sigma(\ms A) \).
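The count of \( 2^n \) sets is easy to confirm for a small finite partition. The following Python sketch enumerates all unions of blocks; the function name `sigma_from_partition` and the example partition are assumptions made for illustration, and the input is assumed to be a genuine partition into nonempty sets.

```python
from itertools import chain, combinations


def sigma_from_partition(blocks):
    """All unions of blocks of a finite partition (assumed disjoint and nonempty)."""
    blocks = [frozenset(b) for b in blocks]
    n = len(blocks)
    # Every subset J of the index set {0, ..., n - 1}, including the empty one.
    index_sets = chain.from_iterable(combinations(range(n), r) for r in range(n + 1))
    return {frozenset().union(*(blocks[j] for j in J)) for J in index_sets}


partition = [{1, 2}, {3}, {4, 5, 6}]
sigma = sigma_from_partition(partition)
print(len(sigma))  # 8
```

With three nonempty blocks there are \( 2^3 = 8 \) distinct unions, including the empty union \(\emptyset\) and the full union \(S\).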

Suppose now that \( \ms A = \{A_1, A_2, \ldots, A_n\} \) is a collection of \(n\) subsets of \(S\) (not necessarily disjoint). To describe the \( \sigma \)-algebra generated by \( \ms A \) we need a bit more notation. For \( x = (x_1, x_2, \ldots, x_n) \in \{0, 1\}^n \) (a bit string of length \( n \)), let \( B_x = \bigcap_{i=1}^n A_i^{x_i} \) where \( A_i^1 = A_i \) and \( A_i^0 = A_i^c \).

In the setting above,

  1. \( \ms B = \{B_x: x \in \{0, 1\}^n\} \) partitions \( S \).
  2. \( A_i = \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\) for \(i \in \{1, 2, \ldots, n\}\).
  3. \(\sigma(\ms A) = \sigma(\ms B) = \left\{\bigcup_{x \in J} B_x: J \subseteq \{0, 1\}^n\right\}\).
Details:
  1. Suppose that \( x, \; y \in \{0, 1\}^n \) and that \( x \ne y \). Without loss of generality we can suppose that for some \( j \in \{1, 2, \ldots, n\} \), \(x_j = 0 \) while \( y_j = 1 \). Then \( B_x \subseteq A_j^c \) and \( B_y \subseteq A_j \) so \( B_x \) and \( B_y \) are disjoint. Suppose that \( s \in S \). Construct \( x \in \{0, 1\}^n \) by \( x_i = 1 \) if \( s \in A_i \) and \( x_i = 0 \) if \( s \notin A_i \), for each \( i \in \{1, 2, \ldots, n\} \). Then by definition, \( s \in B_x \). Hence \( \ms B \) partitions \( S \).
  2. Fix \( i \in \{1, 2, \ldots, n\}\). Again if \( x \in \{0, 1\}^n \) and \( x_i = 1 \) then \( B_x \subseteq A_i \). Hence \(\bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\} \subseteq A_i\). Conversely, suppose \( s \in A_i \). Define \( y \in \{0, 1\}^n \) by \( y_j = 1 \) if \( s \in A_j \) and \( y_j = 0 \) if \( s \notin A_j \) for each \( j \in \{1, 2, \ldots, n\} \). Then \( y_i = 1 \) and \( s \in B_y \). Hence \( s \in \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\).
  3. Clearly, every \( \sigma \)-algebra of subsets of \( S \) that contains \( \ms A \) must also contain \( \ms B \), and every \( \sigma \)-algebra of subsets of \( S \) that contains \( \ms B \) must also contain \( \ms A \). It follows that \( \sigma(\ms A) = \sigma(\ms B) \). The characterization in terms of unions now follows from the result above on countable partitions (applied to the nonempty sets in \( \ms B \)).

Recall that there are \( 2^n \) bit strings of length \( n \). The sets in \( \ms A \) are said to be in general position if the sets in \( \ms B \) are distinct (and hence there are \( 2^n \) of them) and are nonempty. In this case, there are \( 2^{2^n} \) sets in \( \sigma(\ms A) \).
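The sets \( B_x \) can be computed directly from the bit-string definition when \( S \) is finite. The Python sketch below is illustrative only; the function name `atoms` and the particular sets \( A \) and \( B \) are assumptions chosen so that the two sets are in general position.

```python
from itertools import product


def atoms(universe, sets):
    """Map each bit string x to B_x, the intersection of A_i (x_i = 1) or A_i^c (x_i = 0)."""
    universe = frozenset(universe)
    sets = [frozenset(a) for a in sets]
    result = {}
    for x in product((0, 1), repeat=len(sets)):
        b = universe
        for a, bit in zip(sets, x):
            b &= a if bit else universe - a
        result[x] = b
    return result


S = range(8)
A, B = {0, 1, 2, 3}, {2, 3, 4, 5}
cells = atoms(S, [A, B])
for x, b in cells.items():
    print(x, sorted(b))
# (0, 0) [6, 7]
# (0, 1) [4, 5]
# (1, 0) [0, 1]
# (1, 1) [2, 3]

general_position = all(cells.values()) and len(set(cells.values())) == len(cells)
print(general_position, 2 ** len(cells))  # True 16
```

Here all four sets \( B_x \) are nonempty and distinct, so \( \sigma\{A, B\} \) has \( 2^4 = 16 \) sets.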

Open the Venn diagram app. This app shows two subsets \(A\) and \(B\) of \(S\) in general position, and lists the 16 sets in \( \sigma\{A, B\} \).

  1. Select each of the 4 sets that partition \( S \): \( A \cap B \), \( A \cap B^c \), \( A^c \cap B \), \( A^c \cap B^c \).
  2. Select each of the other 12 sets in \(\sigma\{A, B\}\) and note how each is a union of some of the sets in (a).

Sketch a Venn diagram with sets \( A_1, \, A_2, \, A_3 \) in general position. Identify the set \( B_x \) for each \( x \in \{0, 1\}^3 \).

If a \( \sigma \)-algebra is generated by a collection of basic sets, then each set in the \( \sigma \)-algebra is generated by a countable number of the basic sets.

Suppose that \( S \) is a set and \( \ms B \) a nonempty collection of subsets of \( S \). Then

\[ \sigma(\ms B) = \{A \subseteq S: A \in \sigma(\ms C) \text{ for some countable } \ms C \subseteq \ms B\} \]
Details:

Let \( \ms S \) denote the collection on the right. We first show that \( \ms S \) is a \( \sigma \)-algebra. First, pick \( B \in \ms B \), which we can do since \( \ms B \) is nonempty. Then \( S \in \sigma\{B\} \) so \( S \in \ms S \). Let \( A \in \ms S \) so that \( A \in \sigma(\ms C) \) for some countable \( \ms C \subseteq \ms B \). Then \( A^c \in \sigma(\ms C) \) so \( A^c \in \ms S \). Finally, suppose that \( A_i \in \ms S \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \), there exists a countable \( \ms C_i \subseteq \ms B \) such that \( A_i \in \sigma(\ms C_i) \). But then \( \bigcup_{i \in I} \ms C_i \) is also countable and \( \bigcup_{i \in I} A_i \in \sigma\left(\bigcup_{i \in I} \ms C_i \right) \). Hence \( \bigcup_{i \in I} A_i \in \ms S \).

Next if \( B \in \ms B \) then \( B \in \sigma\{B\} \) so \( B \in \ms S \). Hence \( \sigma(\ms B) \subseteq \ms S \). Conversely, if \( A \in \sigma(\ms C) \) for some countable \( \ms C \subseteq \ms B \) then trivially \( A \in \sigma(\ms B) \).

We have seen examples of finite \(\sigma\)-algebras and infinite \(\sigma\)-algebras. It turns out that a \(\sigma\)-algebra cannot be countably infinite.

Suppose that \(\ms S\) is a \(\sigma\)-algebra of subsets of a set \(S\). Then \(\ms S\) is either finite or uncountable.

Details:

The proof is by contradiction. Suppose that \(\ms S\) is countably infinite. Clearly the base set \(S\) is infinite since \(\ms S \subseteq \ms P(S)\) and \(\ms P(S)\) is finite if \(S\) is finite. Define \[A_x = \bigcap\{A \in \ms S: x \in A\}, \quad x \in S\] Then \(A_x \in \ms S\) for \(x \in S\) since the intersection is over a countable collection of sets in \(\ms S\). Clearly \(A_x\) is the smallest set in \(\ms S\) containing \(x\). We next show that the distinct sets in the collection \(\ms A = \{A_x: x \in S\} \subseteq \ms S\) are disjoint. Suppose that \(x, \, y \in S\) and \(A_x \cap A_y \ne \emptyset\). If \(x \notin A_y\) then \(A_x \setminus A_y \in \ms S\) is a proper subset of \(A_x\) containing \(x\), which contradicts the definition of \(A_x\). Hence \(x \in A_y\) and therefore \(A_x \subseteq A_y\). By a symmetric argument, \(A_y \subseteq A_x\) and hence \(A_x = A_y\). So we conclude that for every \(x, \, y \in S\), either \(A_x \cap A_y = \emptyset\) or \(A_x = A_y\). Trivially, if \(A \in \ms S\) then \(A = \bigcup_{x \in A} A_x\). So if \(\ms A\) is finite then \(\ms S\) is finite, since there can only be a finite number of distinct unions of a finite collection of sets. But this contradicts the assumption that \(\ms S\) is countably infinite. On the other hand, if \(\ms A\) is countably infinite then \(\ms S\) is uncountable, since there are uncountably many distinct unions of a countably infinite collection of sets. But this is also a contradiction.

A \( \sigma \)-algebra on a set naturally leads to a \( \sigma \)-algebra on a subset.

Suppose that \((S, \ms S)\) is a measurable space, and that \(R \subseteq S\). Let \(\ms R = \{A \cap R: A \in \ms S\}\). Then

  1. \( \ms R \) is a \(\sigma\)-algebra of subsets of \(R\).
  2. If \(R \in \ms S\) then \(\ms R = \{B \in \ms S: B \subseteq R\}\).
Details:
  1. First, \( S \in \ms S \) and \( S \cap R = R \) so \( R \in \ms R \). Next suppose that \( B \in \ms R \). Then there exists \( A \in \ms S \) such that \( B = A \cap R \). But then \( A^c \in \ms S \) and \( R \setminus B = R \cap B^c = R \cap A^c \), so \( R \setminus B \in \ms R \). Finally, suppose that \( B_i \in \ms R \) for \( i \) in a countable index set \( I \). For each \( i \in I \) there exists \( A_i \in \ms S \) such that \( B_i = A_i \cap R \). But then \( \bigcup_{i \in I} A_i \in \ms S \) and \( \bigcup_{i \in I} B_i = \left(\bigcup_{i \in I} A_i \right) \cap R \), so \( \bigcup_{i \in I} B_i \in \ms R \).
  2. Suppose that \( R \in \ms S \). Then \( A \cap R \in \ms S \) for every \( A \in \ms S \), and of course, \( A \cap R \subseteq R \). Conversely, if \( B \in \ms S \) and \( B \subseteq R \) then \( B = B \cap R \) so \( B \in \ms R \).
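Both parts of the result can be seen on a small finite example. In the Python sketch below, the helper names `induced` and `contained_in` and the example sets are assumptions introduced for illustration. The second example shows that the description in part (b) can fail when \( R \notin \ms S \).

```python
def induced(sigma_S, R):
    """The collection {A intersect R : A in sigma_S}."""
    R = frozenset(R)
    return {frozenset(a) & R for a in sigma_S}


def contained_in(sigma_S, R):
    """The collection {B in sigma_S : B is a subset of R}."""
    R = frozenset(R)
    return {frozenset(b) for b in sigma_S if frozenset(b).issubset(R)}


S = {1, 2, 3, 4}
sigma_S = [set(), {1, 2}, {3, 4}, S]

R1 = {1, 2}  # R1 is in sigma_S, so part (b) applies
print(induced(sigma_S, R1) == contained_in(sigma_S, R1))  # True

R2 = {2, 3}  # R2 is not in sigma_S
print(sorted(sorted(b) for b in induced(sigma_S, R2)))    # [[], [2], [2, 3], [3]]
print(induced(sigma_S, R2) == contained_in(sigma_S, R2))  # False
```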

The \( \sigma \)-algebra \(\ms R\) is the \(\sigma\)-algebra on \(R\) induced by \(\ms S\). If \(R \in \ms S\) then \((R, \ms R)\) is a subspace of \((S, \ms S)\). The following construction is useful for counterexamples. Compare this example with the example above for finite and co-finite sets.

Let \( S \) be a nonempty set. The collection of countable and co-countable subsets of \( S \) is \[ \ms C = \{A \subseteq S: A \text{ is countable or } A^c \text{ is countable}\} \]

  1. \( \ms C \) is a \( \sigma \)-algebra
  2. \( \ms C = \sigma\{\{x\}: x \in S\} \), the \( \sigma \)-algebra generated by the singleton sets.
Details:
  1. First, \( S \in \ms C \) since \( S^c = \emptyset \) is countable. If \( A \in \ms C \) then \( A^c \in \ms C \) by the symmetry of the definition. Suppose that \( A_i \in \ms C \) for each \( i \) in a countable index set \( I \). If \( A_i \) is countable for each \( i \in I \) then \( \bigcup_{i \in I} A_i \) is countable. If \( A_j^c \) is countable for some \( j \in I \) then \( \left(\bigcup_{i \in I} A_i \right)^c = \bigcap_{i \in I} A_i^c \subseteq A_j^c \) is countable. In either case, \( \bigcup_{i \in I} A_i \in \ms C \).
  2. Let \( \ms{D} = \sigma\{\{x\}: x \in S\} \). Clearly \( \{x\} \in \ms C \) for \( x \in S \). Hence \( \ms{D} \subseteq \ms C \). Conversely, suppose that \( A \in \ms C \). If \( A \) is countable, then \( A = \bigcup_{x \in A} \{x\} \in \ms{D} \). If \( A^c \) is countable, then by an identical argument, \( A^c \in \ms{D} \) and hence \( A \in \ms{D} \).

Of course, if \( S \) is itself countable then \( \ms C = \ms P(S) \). On the other hand, if \( S \) is uncountable, then there exists \( A \subseteq S \) such that \( A \) and \( A^c \) are uncountable. Thus, \( A \notin \ms C \), but \( A = \bigcup_{x \in A} \{x\} \), and of course \( \{x\} \in \ms C \). Thus, we have an example of a \( \sigma \)-algebra that is not closed under general unions. Here is another use of this \(\sigma\)-algebra as a counterexample:

Suppose that \(S\) is an uncountable set. The \(\sigma\)-algebra \(\ms C\) of countable and co-countable sets is not countably generated.

Details:

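The proof is by contradiction. Suppose that \(\ms A = \{A_i: i \in I\} \) is a countable collection of sets in \(\ms C\) and that \(\ms C = \sigma(\ms A)\). For each \( i \in I \), let \(B_i = A_i\) if \(A_i\) is countable and \(B_i = A_i^c\) if \(A_i^c\) is countable. Then \(B_i \in \ms C\) for each \(i \in I\). The countable collection \(\ms B = \{B_i: i \in I\} \) generates the same \(\sigma\)-algebra as \(\ms A\), so \(\ms C = \sigma(\ms B)\). Now let \( B = \bigcup_{i \in I} B_i \). Then \(B\) is a countable union of countable sets, so \(B\) is countable. Let \[ \ms E = \{A \subseteq S: A \subseteq B \text{ or } A^c \subseteq B\} \] Then \(\ms E\) is a \(\sigma\)-algebra of subsets of \(S\): \(S \in \ms E\) since \(S^c = \emptyset \subseteq B\), and \(\ms E\) is closed under complements by the symmetry of the definition. If \(A_j \in \ms E\) for each \(j\) in a countable index set \(J\), then either \(A_j \subseteq B\) for every \(j \in J\), so that \(\bigcup_{j \in J} A_j \subseteq B\), or \(A_k^c \subseteq B\) for some \(k \in J\), so that \(\left(\bigcup_{j \in J} A_j\right)^c \subseteq A_k^c \subseteq B\). Since \(B_i \subseteq B\) for each \(i \in I\), we have \(\ms B \subseteq \ms E\) and hence \(\ms C = \sigma(\ms B) \subseteq \ms E\). But \(S\) is uncountable and \(B\) is countable, so there exists \(x \in S \setminus B\). Then \(\{x\} \in \ms C\), but \(\{x\} \not\subseteq B\), and \(\{x\}^c \not\subseteq B\) since \(\{x\}^c\) is uncountable. Hence \(\{x\} \notin \ms E\), a contradiction.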

Topology and Measure

One of the most important ways to generate a \( \sigma \)-algebra is by means of topology. Recall that a topological space consists of a set \( S \) and a topology \(\ms T\), the collection of open subsets of \( S \). Most spaces that occur in probability and stochastic processes are topological spaces, so it's crucial that the topological and measure-theoretic structures are compatible.

Suppose that \( (S, \ms T) \) is a topological space. Then \( \ms S = \sigma(\ms T) \) is the Borel \( \sigma \)-algebra on \(S\), and \((S, \ms S)\) is a Borel measurable space.

So the Borel \( \sigma \)-algebra on \( S \), named for Émile Borel, is generated by the open subsets of \( S \). Thus, a topological space \( (S, \ms T) \) naturally leads to a measurable space \( (S, \sigma(\ms T))\). Since a closed set is simply the complement of an open set, the Borel \( \sigma \)-algebra contains the closed sets as well (and in fact is generated by the closed sets). Here are some other sets that are in the Borel \(\sigma\)-algebra:

Suppose again that \((S, \ms T)\) is a topological space and let \(\ms S = \sigma(\ms T)\) denote the Borel \(\sigma\)-algebra. Suppose also that \(I\) is a countable index set.

  1. If \(A_i\) is open for each \(i \in I\) then \(\bigcap_{i \in I} A_i \in \ms S\). Such sets are called \(G_\delta\) sets.
  2. If \(A_i\) is closed for each \(i \in I\) then \(\bigcup_{i \in I} A_i \in \ms S\). Such sets are called \(F_\sigma\) sets.
  3. If \((S, \ms T)\) is Hausdorff then \(\{x\} \in \ms S\) for every \(x \in S\).
Details:
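  1. Open sets are in \(\ms S\), so this follows since a \(\sigma\)-algebra is closed under countable intersections.
  2. Closed sets are in \(\ms S\), so this follows since a \(\sigma\)-algebra is closed under countable unions.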
  3. This follows since \(\{x\}\) is closed for each \(x \in S\) if the topology is Hausdorff.

In terms of part (c), recall that a topological space is Hausdorff, named for Felix Hausdorff, if the topology can distinguish individual points. Specifically, if \(x, \, y \in S\) are distinct then there exist disjoint open sets \(U, \, V\) with \(x \in U\) and \(y \in V\). This is a very basic property possessed by almost all topological spaces that occur in applications. A simple corollary of (c) is that if the topological space \((S, \ms T)\) is Hausdorff then \(A \in \ms S\) for every countable \(A \subseteq S\).

Let's note the extreme cases. If \( S \) has the discrete topology \( \ms P(S) \), so that every set is open (and closed), then of course the Borel \( \sigma \)-algebra is also \( \ms P(S) \). As noted above, this is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but is often too large if \( S \) is uncountable. If \(S\) has the trivial topology \(\{S, \emptyset\}\), then the Borel \(\sigma\)-algebra is also \(\{S, \emptyset\}\), and so is also trivial.

Recall that a base for a topological space \( (S, \ms T) \) is a collection \( \ms B \subseteq \ms T \) with the property that every set in \(\ms T\) is a union of a collection of sets in \( \ms B \). In short, every open set is a union of some of the basic open sets.

Suppose that \( (S, \ms T) \) is a topological space with a countable base \( \ms B \). Then \( \sigma(\ms B) = \sigma(\ms T) \).

Details:

Since \( \ms B \subseteq \ms T \) it follows trivially that \( \sigma(\ms B) \subseteq \sigma(\ms T) \). Conversely, if \( U \in \ms T \), there exists a collection of sets in \( \ms B \) whose union is \( U \). Since \( \ms B \) is countable, \( U \in \sigma(\ms B) \).
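As a standard illustration (not taken from the text above), consider \(\R\) with its usual topology. The collection of open intervals with rational endpoints \[ \ms B = \{(a, b): a, \, b \in \Q, \; a \lt b\} \] is a countable base, so by the result above the Borel \(\sigma\)-algebra on \(\R\) is \(\sigma(\ms B)\). For instance, if \(c, \, d \in \R\) with \(c \le d\) then the closed interval \([c, d]\) is a \(G_\delta\) set, and hence a Borel set, since \[ [c, d] = \bigcap_{n=1}^\infty \left(c - \frac{1}{n}, d + \frac{1}{n}\right) \]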

The topological spaces that occur in probability and stochastic processes are usually assumed to have a countable base (along with other nice properties such as the Hausdorff property and local compactness). The \( \sigma \)-algebra used for such a space is usually the Borel \( \sigma \)-algebra, which by the previous result, is countably generated.

Measurable Functions

Recall that a set usually comes with a \(\sigma\)-algebra of admissible subsets. A natural requirement on a function is that the inverse image of an admissible set in the co-domain be admissible in the domain. Here is the formal definition.

Suppose that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces. A function \( f: S \to T \) is measurable if \( f^{-1}(A) \in \ms S \) for every \( A \in \ms T \).

If the \( \sigma \)-algebra in the co-domain is generated by a collection of basic sets, then to check the measurability of a function, we need only consider inverse images of basic sets:

Suppose again that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces, and that \( \ms T = \sigma(\ms B) \) for a collection of subsets \( \ms B \) of \( T \). Then \( f: S \to T \) is measurable if and only if \( f^{-1}(B) \in \ms S \) for every \( B \in \ms B \).

Details:

First \( \ms B \subseteq \ms T \), so if \( f: S \to T \) is measurable then the condition in the theorem trivially holds. Conversely, suppose that the condition in the theorem holds, and let \( \ms{U} = \{A \in \ms T: f^{-1}(A) \in \ms S\} \). Then \( T \in \ms{U} \) since \( f^{-1}(T) = S \in \ms S \). If \( A \in \ms{U} \) then \( f^{-1}(A^c) = \left[f^{-1}(A)\right]^c \in \ms S \), so \( A^c \in \ms{U} \). If \( A_i \in \ms{U} \) for \( i \) in a countable index set \( I \), then \( f^{-1}\left(\bigcup_{i \in I} A_i\right) = \bigcup_{i \in I} f^{-1}(A_i) \in \ms S \), and hence \( \bigcup_{i \in I} A_i \in \ms{U} \). Thus \( \ms{U} \) is a \( \sigma \)-algebra of subsets of \( T \). But \( \ms B \subseteq \ms{U} \) by assumption, so \( \ms T = \sigma(\ms B) \subseteq \ms{U} \). Of course \( \ms{U} \subseteq \ms T \) by definition, so \( \ms{U} = \ms T \) and hence \( f \) is measurable.
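The result can be illustrated on finite sets, where inverse images can be computed by brute force. In the Python sketch below, the helper names `preimage` and `measurable_on`, the spaces, and the functions `f` and `g` are all assumptions made for illustration. Checking the single generating set gives the same answer as checking every set in the generated \(\sigma\)-algebra.

```python
def preimage(f, domain, A):
    """The inverse image of A under f, as a subset of the finite domain."""
    return frozenset(s for s in domain if f(s) in A)


def measurable_on(f, domain, sigma_S, sets_T):
    """True if the inverse image of every set in sets_T belongs to sigma_S."""
    sigma_S = {frozenset(a) for a in sigma_S}
    return all(preimage(f, domain, A) in sigma_S for A in sets_T)


S = {1, 2, 3, 4}
sigma_S = [set(), {1, 2}, {3, 4}, S]

T = {"a", "b", "c"}
generators = [{"a"}]                     # a generating collection
sigma_T = [set(), {"a"}, {"b", "c"}, T]  # the sigma-algebra it generates

f = {1: "a", 2: "a", 3: "b", 4: "c"}.get
g = {1: "a", 2: "b", 3: "b", 4: "c"}.get

print(measurable_on(f, S, sigma_S, generators), measurable_on(f, S, sigma_S, sigma_T))  # True True
print(measurable_on(g, S, sigma_S, generators), measurable_on(g, S, sigma_S, sigma_T))  # False False
```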

If you have reviewed topology then you may have noticed a striking parallel between the definition of continuity for functions on topological spaces and the definition of measurability for functions on measurable spaces: A function from one topological space to another is continuous if the inverse image of an open set in the co-domain is open in the domain. A function from one measurable space to another is measurable if the inverse image of a measurable set in the co-domain is measurable in the domain. If we start with topological spaces, which we often do, and use the Borel \( \sigma \)-algebras to get measurable spaces, then we get the following (hardly surprising) connection.

Suppose that \( (S, \ms S) \) and \( (T, \ms T) \) are topological spaces, and that we give \( S \) and \( T \) the Borel \( \sigma \)-algebras \( \sigma(\ms S) \) and \( \sigma(\ms T) \) respectively. If \( f: S \to T \) is continuous, then \( f \) is measurable.

Details:

If \( V \in \ms T \) then \( f^{-1}(V) \in \ms S \subseteq \sigma(\ms S) \). Since \( \ms T \) generates \( \sigma(\ms T) \), \( f \) is measurable by the result above on generating collections.

Measurability is preserved under composition, the most important method for combining functions.

Suppose that \((R, \ms R)\), \((S, \ms S)\), and \((T, \ms T)\) are measurable spaces. If \(f: R \to S\) is measurable and \(g: S \to T\) is measurable, then \(g \circ f: R \to T\) is measurable.

Details:

If \( A \in \ms T \) then \( g^{-1}(A) \in \ms S \) since \( g \) is measurable, and hence \( (g \circ f)^{-1}(A) = f^{-1}\left[g^{-1}(A)\right] \in \ms R \) since \( f \) is measurable.
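The identity \( (g \circ f)^{-1}(A) = f^{-1}\left[g^{-1}(A)\right] \) used in the proof can be checked by brute force on finite sets. The Python sketch below is only an illustration; the sets and the functions `f` and `g` are hypothetical choices.

```python
def preimage(f, domain, A):
    """The inverse image of A under f, as a subset of the finite domain."""
    return frozenset(x for x in domain if f(x) in A)


R = range(12)
S = range(6)


def f(r):
    """A map from R into S."""
    return r % 6


def g(s):
    """A map from S into T = {0, 1}."""
    return s % 2


A = {1}

lhs = preimage(lambda r: g(f(r)), R, A)   # inverse image under the composition
rhs = preimage(f, R, preimage(g, S, A))   # inverse image taken in two steps

print(sorted(lhs))  # [1, 3, 5, 7, 9, 11]
print(lhs == rhs)   # True
```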

If \( T \) is given the smallest possible \( \sigma \)-algebra or if \( S \) is given the largest one, then any function from \( S \) into \( T \) is measurable.

Every function \( f: S \to T \) is measurable in each of the following cases:

  1. \( \ms T = \{\emptyset, T\} \) and \( \ms S \) is an arbitrary \( \sigma \)-algebra of subsets of \( S \)
  2. \( \ms S = \ms P(S) \) and \( \ms T \) is an arbitrary \( \sigma \)-algebra of subsets of \( T \).
Details:
  1. Suppose that \( \ms T = \{\emptyset, T\} \) and that \( \ms S \) is an arbitrary \( \sigma \)-algebra on \( S \). If \( f: S \to T \), then \( f^{-1}(T) = S \in \ms S \) and \( f^{-1}(\emptyset) = \emptyset \in \ms S \) so \( f \) is measurable.
  2. Suppose that \( \ms S = \ms P(S) \) and that \( \ms T \) is an arbitrary \( \sigma \)-algebra on \( T \). If \( f: S \to T \), then trivially \( f^{-1}(A) \in \ms S \) for every \( A \in \ms T \) so \( f \) is measurable.

When there are several \( \sigma \)-algebras for the same set, then we use the phrase with respect to so that we can be precise. If a function is measurable with respect to a given \( \sigma \)-algebra on its domain, then it's measurable with respect to any larger \( \sigma \)-algebra on the domain. If the function is measurable with respect to a \( \sigma \)-algebra on the co-domain then it's measurable with respect to any smaller \( \sigma \)-algebra on the co-domain.

Suppose that \( S \) has \( \sigma \)-algebras \( \ms R \) and \( \ms S \) with \( \ms R \subseteq \ms S \), and that \( T \) has \( \sigma \)-algebras \( \ms T \) and \( \ms{U} \) with \( \ms T \subseteq \ms{U} \). If \( f: S \to T \) is measurable with respect to \( \ms R \) and \( \ms{U} \), then \( f \) is measurable with respect to \( \ms S \) and \( \ms T \).

Details:

If \( A \in \ms T \) then \( A \in \ms{U} \). Hence \( f^{-1}(A) \in \ms R \) so \( f^{-1}(A) \in \ms S \).
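When the sets involved are finite, the last few results can be checked by brute force. The following Python sketch is only an illustration: the helper names `power_set` and `is_measurable`, and the particular sets and function, are assumptions made for the example and do not come from the text.

```python
from itertools import combinations

def power_set(xs):
    """All subsets of xs, as a set of frozensets."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def is_measurable(f, S, sigma_S, sigma_T):
    """True if the inverse image f^{-1}(A) is in sigma_S for every A in sigma_T."""
    return all(frozenset(x for x in S if f(x) in A) in sigma_S for A in sigma_T)

# Hypothetical example data
S = {1, 2, 3, 4}
T = {"lo", "hi"}
f = lambda x: "lo" if x <= 1 else "hi"

trivial_T = {frozenset(), frozenset(T)}
coarse_S = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(S)}

# Smallest sigma-algebra on T: every function is measurable
with_trivial_T = is_measurable(f, S, coarse_S, trivial_T)
# Largest sigma-algebra on S: every function is measurable
with_power_S = is_measurable(f, S, power_set(S), power_set(T))
# Here f^{-1}({"lo"}) = {1}, which is not in coarse_S
with_coarse_S = is_measurable(f, S, coarse_S, power_set(T))

print(with_trivial_T, with_power_S, with_coarse_S)
```

The third check fails because the inverse image of \( \{\text{lo}\} \) is \( \{1\} \), which is not in the coarse \( \sigma \)-algebra on \( S \). Enlarging the \( \sigma \)-algebra on the domain, or shrinking the one on the co-domain, restores measurability.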

The following construction is particularly important in probability theory:

Suppose that \( S \) is a set and \( (T, \ms T) \) is a measurable space. Suppose also that \(f: S \to T\) and define \(\sigma(f) = \left\{f^{-1}(A): A \in \ms T\right\}\). Then

  1. \( \sigma(f) \) is a \(\sigma\)-algebra on \(S\).
  2. \( \sigma(f) \) is the smallest \( \sigma \)-algebra on \( S \) that makes \( f \) measurable.
Details:
  1. The key to the proof is that the inverse image preserves all set operations. First, \( S \in \sigma(f) \) since \( T \in \ms T \) and \( f^{-1}(T) = S \). If \( B \in \sigma(f) \) then \( B = f^{-1}(A) \) for some \( A \in \ms T \). But then \( A^c \in \ms T \) and hence \( B^c = f^{-1}(A^c) \in \sigma(f) \). Finally, suppose that \( B_i \in \sigma(f) \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \) there exists \( A_i \in \ms T \) such that \( B_i = f^{-1}(A_i) \). But then \( \bigcup_{i \in I} A_i \in \ms T \) and \( \bigcup_{i \in I} B_i = f^{-1}\left(\bigcup_{i \in I} A_i \right) \). Hence \( \bigcup_{i \in I} B_i \in \sigma(f) \).
  2. If \( \ms S \) is a \( \sigma \)-algebra on \( S \) and \( f \) is measurable with respect to \( \ms S \) and \( \ms T \), then by definition \( f^{-1}(A) \in \ms S \) for every \( A \in \ms T \), so \( \sigma(f) \subseteq \ms S \).

Appropriately enough, \( \sigma(f) \) is called the \(\sigma\)-algebra generated by \(f\). Often, \( S \) will have a given \( \sigma \)-algebra \( \ms S \) and \( f \) will be measurable with respect to \( \ms S \) and \( \ms T \). In this case, \( \sigma(f) \subseteq \ms S \). We can generalize to an arbitrary collection of functions on \( S \).

Suppose \( S \) is a set and that \((T_i, \ms T_i)\) is a measurable space for each \(i\) in a nonempty index set \(I\). Suppose also that \(f_i: S \to T_i\) for each \(i \in I\). The \(\sigma\)-algebra generated by this collection of functions is \[ \sigma\left\{f_i: i \in I\right\} = \sigma\left\{\sigma(f_i): i \in I\right\} = \sigma\left\{f_i^{-1}(A): i \in I, \, A \in \ms T_i\right\} \]

Again, this is the smallest \(\sigma\)-algebra on \(S\) that makes \(f_i\) measurable for each \(i \in I\).
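For a single function on a finite set, \( \sigma(f) \) can be listed explicitly. The sketch below is a hedged illustration: the names `power_set`, `sigma_of` and `is_sigma_algebra` and the function \( x \mapsto x \bmod 3 \) are hypothetical choices for the example. On a finite set, closure under pairwise unions is the same as closure under countable unions.

```python
from itertools import combinations

def power_set(xs):
    """All subsets of xs, as a set of frozensets."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

def sigma_of(f, domain, sigma_T):
    """sigma(f) = {f^{-1}(A) : A in sigma_T}."""
    return {frozenset(x for x in domain if f(x) in A) for A in sigma_T}

def is_sigma_algebra(collection, universe):
    """Check that the collection contains the universe and is closed under
    complement and pairwise union (enough on a finite universe)."""
    universe = frozenset(universe)
    if universe not in collection:
        return False
    for A in collection:
        if universe - A not in collection:
            return False
        for B in collection:
            if A | B not in collection:
                return False
    return True

# Hypothetical example: S = {0, ..., 5}, T = {0, 1, 2} with its power set
S = range(6)
T = {0, 1, 2}
f = lambda x: x % 3

sigma_f = sigma_of(f, S, power_set(T))
print(len(sigma_f), is_sigma_algebra(sigma_f, S))
print(sorted(sorted(B) for B in sigma_f))
```

In this example \( \sigma(f) \) consists of the unions of the sets \( \{0, 3\} \), \( \{1, 4\} \), \( \{2, 5\} \). A set such as \( \{0\} \) is not in \( \sigma(f) \), since \( f \) cannot distinguish 0 from 3.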

Product Sets

Product sets arise naturally in the form of the higher-dimensional Euclidean spaces \( \R^n \) for \( n \in \{2, 3, \ldots\} \). In addition, product spaces are particularly important in probability, where they are used to describe the spaces associated with sequences of random variables. More general product spaces arise in the study of stochastic processes. We start with the product of two sets; the generalization to products of \( n \) sets and to general products is straightforward, although the notation gets more complicated.

Suppose that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces. The product \( \sigma \)-algebra on \( S \times T \) is \[\ms S \times \ms T = \sigma\{A \times B: A \in \ms S, \; B \in \ms T\} \]

So the definition is natural: the product \( \sigma \)-algebra is generated by products of measurable sets. Note however that \(\ms S \times \ms T\) is not the Cartesian product of the collections \(\ms S\) and \(\ms T\), even though the same notation is used. Our next goal is to consider the measurability of functions defined on, or mapping into, product spaces. Of basic importance are the projection functions. If \( S \) and \( T \) are sets, let \( p_1: S \times T \to S \) and \( p_2: S \times T \to T \) be defined by \( p_1(x, y) = x \) and \( p_2(x, y) = y \) for \( (x, y) \in S \times T \). Recall that \( p_1 \) is the projection onto the first coordinate and \( p_2 \) is the projection onto the second coordinate. The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra that makes the projections measurable:

Suppose again that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces. Then \( \ms S \times \ms T = \sigma\{p_1, p_2\} \).

Details:

If \( A \in \ms S \) then \( p_1^{-1}(A) = A \times T \in \ms S \times \ms T\). Similarly, if \( B \in \ms T \) then \( p_2^{-1}(B) = S \times B \in \ms S \times \ms T \). Hence \( p_1 \) and \( p_2 \) are measurable, so \( \sigma\{p_1, p_2\} \subseteq \ms S \times \ms T \). Conversely, if \( A \in \ms S \) and \( B \in \ms T \) then \( A \times B = p_1^{-1}(A) \cap p_2^{-1}(B) \in \sigma\{p_1, p_2\}\). Since sets of this form generate the product \( \sigma \)-algebra, we have \( \ms S \times \ms T \subseteq \sigma\{p_1, p_2\} \).
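The equality \( \ms S \times \ms T = \sigma\{p_1, p_2\} \) can be verified by brute force on a small finite example. The following sketch is an illustration under stated assumptions: the helper `generate_sigma_algebra` and the particular spaces are hypothetical, and the closure loop relies on the universe being finite.

```python
from itertools import product

def generate_sigma_algebra(generators, universe):
    """Smallest collection containing the generators that is closed under
    complement and union. Terminates because the universe is finite."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        new = {universe - A for A in sets} | {A | B for A in sets for B in sets}
        if new <= sets:
            return sets
        sets |= new

# Hypothetical measurable spaces (S, sigma_S) and (T, sigma_T)
S = {0, 1, 2}
T = {"a", "b"}
sigma_S = [frozenset(), frozenset({0}), frozenset({1, 2}), frozenset(S)]
sigma_T = [frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset(T)]
universe = frozenset(product(S, T))

# Generators of the product sigma-algebra: measurable rectangles A x B
rectangles = [frozenset(product(A, B)) for A in sigma_S for B in sigma_T]
# Generators of sigma{p1, p2}: inverse images A x T and S x B under the projections
cylinders = ([frozenset(product(A, T)) for A in sigma_S]
             + [frozenset(product(S, B)) for B in sigma_T])

from_rectangles = generate_sigma_algebra(rectangles, universe)
from_projections = generate_sigma_algebra(cylinders, universe)
print(from_rectangles == from_projections, len(from_rectangles))
```

The two generated collections coincide. In this example the product \( \sigma \)-algebra has four atoms (such as \( \{1, 2\} \times \{a\} \)) and hence 16 sets, so it is strictly smaller than the power set of \( S \times T \).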

Projection functions make it easy to study functions mapping into a product space.

Suppose that \( (R, \ms R) \), \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \ms S \times \ms T \). Suppose also that \( f: R \to S \times T \), so that \( f(x) = \left(f_1(x), f_2(x)\right) \) for \( x \in R \), where \( f_1: R \to S \) and \( f_2: R \to T \) are the coordinate functions. Then \( f \) is measurable if and only if \( f_1 \) and \( f_2 \) are measurable.

Details:

Note that \( f_1 = p_1 \circ f \) and \( f_2 = p_2 \circ f \). So if \( f \) is measurable then \( f_1 \) and \( f_2 \) are compositions of measurable functions, and hence are measurable by the composition result above. Conversely, suppose that \( f_1 \) and \( f_2 \) are measurable. If \( A \in \ms S \) and \( B \in \ms T \) then \( f^{-1}(A \times B) = f_1^{-1}(A) \cap f_2^{-1}(B) \in \ms R \). Since products of measurable sets generate \( \ms S \times \ms T \), it follows that \( f \) is measurable.

Our next goal is to consider cross sections of sets in a product space and cross sections of functions defined on a product space. It will help to introduce some new functions, which in a sense are complementary to the projection functions.

Suppose again that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \ms S \times \ms T \).

  1. For \( x \in S \) the function \( 1_x : T \to S \times T \), defined by \( 1_x(y) = (x, y) \) for \( y \in T \), is measurable.
  2. For \( y \in T \) the function \( 2_y: S \to S \times T \), defined by \( 2_y(x) = (x, y) \) for \( x \in S \), is measurable.
Details:

To show that the functions are measurable, it suffices to consider inverse images of products of measurable sets, since such sets generate \( \ms S \times \ms T \). Thus, let \( A \in \ms S \) and \( B \in \ms T \).

  1. For \( x \in S \) note that \( 1_x^{-1}(A \times B) \) is \( B \) if \( x \in A \) and is \( \emptyset \) if \( x \notin A \). In either case, \( 1_x^{-1}(A \times B) \in \ms T \).
  2. Similarly, for \( y \in T \) note that \( 2_y^{-1}(A \times B) \) is \( A \) if \( y \in B \) and is \( \emptyset \) if \( y \notin B \). In either case, \( 2_y^{-1}(A \times B) \in \ms S \).

Now our work is easy.

Suppose again that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces, and that \( C \in \ms S \times \ms T \). Then

  1. For \( x \in S \), \( \{y \in T: (x, y) \in C\} \in \ms T \).
  2. For \( y \in T \), \( \{x \in S: (x, y) \in C\} \in \ms S\).
Details:

These results follow immediately from the measurability of the functions \( 1_x \) and \( 2_y \) established above:

  1. For \( x \in S \), \( 1_x^{-1}(C) = \{y \in T: (x, y) \in C\} \).
  2. For \( y \in T \), \( 2_y^{-1}(C) = \{x \in S: (x, y) \in C\} \).

The set in (a) is the cross section of \( C \) in the first coordinate at \( x \), and the set in (b) is the cross section of \( C \) in the second coordinate at \( y \). As a simple corollary to the theorem, note that if \( A \subseteq S \) and \( B \subseteq T \) are nonempty and \( A \times B \in \ms S \times \ms T \) then \( A \in \ms S \) and \( B \in \ms T \). That is, the only nonempty measurable product sets are products of measurable sets. Here is the measurability result for cross-sectional functions:

Suppose again that \( (S, \ms S) \) and \( (T, \ms T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \ms S \times \ms T \). Suppose also that \( (U, \ms{U}) \) is another measurable space, and that \( f: S \times T \to U \) is measurable. Then

  1. The function \( y \mapsto f(x, y) \) from \( T \) to \( U \) is measurable for each \( x \in S \).
  2. The function \( x \mapsto f(x, y) \) from \( S \) to \( U \) is measurable for each \( y \in T \).
Details:

Note that the function in (a) is just \( f \circ 1_x\), and the function in (b) is just \( f \circ 2_y \). Both are compositions of measurable functions.
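As a quick worked example of cross sections (the sets \( A_1, \, A_2, \, B_1, \, B_2 \) are introduced here only for illustration), suppose that \( A_1, \, A_2 \in \ms S \) and \( B_1, \, B_2 \in \ms T \), and let \( C = (A_1 \times B_1) \cup (A_2 \times B_2) \). For \( x \in S \), the cross section of \( C \) in the first coordinate at \( x \) is \[ \{y \in T: (x, y) \in C\} = \begin{cases} B_1 \cup B_2, & x \in A_1 \cap A_2 \\ B_1, & x \in A_1 \setminus A_2 \\ B_2, & x \in A_2 \setminus A_1 \\ \emptyset, & x \notin A_1 \cup A_2 \end{cases} \] In every case the cross section is in \( \ms T \), in agreement with the theorem on cross sections of measurable sets.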

A measurable space \((S, \ms S)\) has a measurable diagonal if \[D = \{(x, x): x \in S\} \in \ms S \times \ms S\]

A space \((S, \ms S)\) with a measurable diagonal has many nice properties. First, \(\{x\} \in \ms S\) for \(x \in S\). Even more impressive is the following theorem:

Suppose that the measurable space \((T, \ms T)\) has a measurable diagonal. If \((S, \ms S)\) is another measurable space and \(f: S \to T\) is a measurable function, then the graph of \(f\) is measurable: \[\{(x, f(x)): x \in S\} \in \ms S \times \ms T\]
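Both properties can be checked with the tools already developed. The following is a sketch, with the auxiliary function \( g \) introduced here only for the argument. First, if \( (S, \ms S) \) has measurable diagonal \( D \) and \( x \in S \), then the cross section of \( D \) in the first coordinate at \( x \) is \( \{y \in S: (x, y) \in D\} = \{x\} \), which is in \( \ms S \) since cross sections of measurable sets are measurable. For the graph, let \( D \) now denote the diagonal of \( T \times T \) and define \( g: S \times T \to T \times T \) by \( g(x, y) = (f(x), y) \). The coordinate functions of \( g \) are \( f \circ p_1 \) and \( p_2 \), both measurable, so \( g \) is measurable. Then \[ \{(x, f(x)): x \in S\} = \{(x, y) \in S \times T: (f(x), y) \in D\} = g^{-1}(D) \in \ms S \times \ms T \]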

Because of properties such as these, measurable diagonal is sometimes used as an assumption. Here is a technical characterization: A space \((S, \ms S)\) has a measurable diagonal if and only if \(\ms S\) is generated by a countable collection of sets \(\ms A\) that separates points. That is, if \(x, \, y \in S\) are distinct then there exists \(A \in \ms A\) such that \(x \in A\) and \(y \notin A\), or \(y \in A\) and \(x \notin A\). Here is a standard example of a measurable space without a measurable diagonal:

If \(S\) is an uncountable set and \(\ms C\) is the \(\sigma\)-algebra of countable and co-countable subsets, then \((S, \ms C)\) does not have a measurable diagonal.

Details:

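This follows from the characterization above. A countable collection \( \ms A \subseteq \ms C \) cannot separate points: if \( N \) denotes the union of the countable sets in \( \ms A \) together with the complements of the co-countable sets in \( \ms A \), then \( N \) is countable, and no set in \( \ms A \) separates two distinct points of the uncountable set \( S \setminus N \).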

The results for products of two spaces generalize in a completely straightforward way to a product of \( n \) spaces.

Suppose \( n \in \N_+ \) and that \( (S_i, \ms S_i) \) is a measurable space for each \( i \in \{1, 2, \ldots, n\} \). The product \( \sigma \)-algebra on the Cartesian product set \( S_1 \times S_2 \times \cdots \times S_n \) is \[ \ms S_1 \times \ms S_2 \times \cdots \times \ms S_n = \sigma\left\{ A_1 \times A_2 \times \cdots \times A_n: A_i \in \ms S_i \text{ for all } i \in \{1, 2, \ldots, n\}\right\} \]

So again, the product \( \sigma \)-algebra is generated by products of measurable sets. Results analogous to the theorems above hold. In the special case that \( (S_i, \ms S_i) = (S, \ms S) \) for \( i \in \{1, 2, \ldots, n\} \), the Cartesian product becomes \( S^n \) and the corresponding product \( \sigma \)-algebra is denoted \( \ms S^n \). The notation is natural, but again potentially confusing. Note that \( \ms S^n \) is not the Cartesian product of \( \ms S \) of order \( n \), but rather the \( \sigma \)-algebra generated by sets of the form \( A_1 \times A_2 \times \cdots \times A_n \) where \( A_i \in \ms S \) for \( i \in \{1, 2, \ldots, n\} \).

We can also extend these ideas to a general product. To recall the definition, suppose that \( S_i \) is a set for each \( i \) in a nonempty index set \( I \). The product set \( \prod_{i \in I} S_i \) consists of all functions \( x: I \to \bigcup_{i \in I} S_i \) such that \( x(i) \in S_i \) for each \( i \in I \). To make the notation look more like a simple Cartesian product, we often write \( x_i \) instead of \( x(i) \) for the value of a function in the product set at \( i \in I \). The next definition gives the appropriate \( \sigma \)-algebra for the product set.

Suppose that \( (S_i, \ms S_i) \) is a measurable space for each \(i \) in a nonempty index set \( I \). The product \( \sigma \)-algebra on the product set \( \prod_{i \in I} S_i \) is \[ \prod_{i \in I} \ms S_i = \sigma\left\{\prod_{i \in I} A_i: A_i \in \ms S_i \text{ for each } i \in I \text{ and } A_i = S_i \text{ for all but finitely many } i \in I \right\}\]

The definition can also be understood in terms of projections. Recall that the projection onto coordinate \( j \in I \) is the function \( p_j: \prod_{i \in I} S_i \to S_j \) given by \( p_j(x) = x_j \). The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra on the product set that makes all of the projections measurable.

Suppose again that \( (S_i, \ms S_i)\) is a measurable space for each \( i \) in a nonempty index set \( I \). Then \(\prod_{i \in I} \ms S_i = \sigma\{p_i: i \in I\} \).

Details:

Let \( j \in I \) and \( A \in \ms S_j \). Then \( p_j^{-1}(A) = \prod_{i \in I} A_i \) where \( A_i = S_i \) for \( i \ne j \) and \( A_j = A \). This set is in \( \prod_{i \in I} \ms S_i \) so \( p_j \) is measurable. Hence \( \sigma\{p_i: i \in I\} \subseteq \prod_{i \in I} \ms S_i \). For the other direction, consider a product set \( \prod_{i \in I} A_i \) where \( A_i = S_i \) except for \( i \in J \), where \( J \subseteq I \) is finite. Then \( \prod_{i \in I} A_i = \bigcap_{j \in J} p_j^{-1}(A_j) \). This set is in \( \sigma\{p_i: i \in I\} \). Product sets of this form generate \( \prod_{i \in I} \ms S_i \) so it follows that \( \prod_{i \in I} \ms S_i \subseteq \sigma\{p_i: i \in I\} \).

In the special case that \( (S, \ms S) \) is a fixed measurable space and \( (S_i, \ms S_i) = (S, \ms S) \) for all \( i \in I \), the product set \( \prod_{i \in I} S \) is just the collection of functions from \( I \) into \( S \), often denoted \( S^I \). The product \( \sigma \)-algebra is then denoted \( \ms S^I \), a notation that is natural, but again potentially confusing. Here is the main measurability result for a function mapping into a product space.

Suppose that \( (R, \ms R) \) is a measurable space, and that \( (S_i, \ms S_i) \) is a measurable space for each \( i \) in a nonempty index set \( I \). As before, let \(\prod_{i \in I} S_i \) have the product \( \sigma \)-algebra. Suppose now that \( f: R \to \prod_{i \in I} S_i \). For \( i \in I \) let \( f_i: R \to S_i \) denote the \( i \)th coordinate function of \( f \), so that \( f_i(x) = [f(x)]_i \) for \( x \in R \). Then \( f \) is measurable if and only if \( f_i \) is measurable for each \( i \in I \).

Details:

Suppose that \( f \) is measurable. For \( i \in I \) note that \( f_i = p_i \circ f \) is a composition of measurable functions, and hence is measurable by the composition result above. Conversely, suppose that \( f_i \) is measurable for each \( i \in I \). To show measurability of \( f \) we need only consider inverse images of sets that generate the product \( \sigma \)-algebra. Thus, suppose that \( A_j \in \ms S_j \) for \( j \) in a finite subset \( J \subseteq I \), and let \( A_i = S_i \) for \( i \in I - J \). Then \( f^{-1}\left(\prod_{i \in I} A_i\right) = \bigcap_{j \in J} f_j^{-1}(A_j) \). This set is in \( \ms R \) since the intersection is over a finite index set.

Just as with the product of two sets, cross-sectional sets and functions are measurable with respect to the product \( \sigma \)-algebra. Again, it's best to work with some special functions.

Suppose that \( (S_i, \ms S_i) \) is a measurable space for each \( i \) in an index set \( I \) with at least two elements. For \( j \in I \) and \( u \in S_j \), define the function \( j_u: \prod_{i \in I - \{j\}} S_i \to \prod_{i \in I} S_i \) by \( j_u(x) = y \) where \( y_i = x_i \) for \( i \ne j \) and \( y_j = u \). Then \( j_u \) is measurable with respect to the product \( \sigma \)-algebras.

Details:

Once again, it suffices to consider the inverse image of the sets that generate the product \( \sigma \)-algebra. So suppose \( A_i \in \ms S_i \) for \( i \in I \) with \( A_i = S_i \) for all but finitely many \( i \in I \). Then \( j_u^{-1}\left(\prod_{i \in I} A_i\right) = \prod_{i \in I - \{j\}} A_i \) if \( u \in A_j \), and the inverse image is \( \emptyset \) otherwise. In either case, \( j_u^{-1}\left(\prod_{i \in I} A_i\right) \) is in the product \( \sigma \)-algebra on \( \prod_{i \in I - \{j\}} S_i \).

In words, for \( j \in I \) and \( u \in S_j \), the function \( j_u \) takes a point in the product set \( \prod_{ i \in I - \{j\}} S_i \) and assigns \( u \) to coordinate \( j \) to give a point in \( \prod_{i \in I} S_i \). If \( A \subseteq \prod_{i \in I} S_i \), then \( j_u^{-1}(A) \) is the cross section of \( A \) in coordinate \( j \) at \( u \). So it follows immediately from the previous result that the cross sections of a measurable set are measurable. Cross sections of measurable functions are also measurable. Suppose that \( (T, \ms T) \) is another measurable space, and that \( f: \prod_{i \in I} S_i \to T \) is measurable. The cross section of \( f \) in coordinate \( j \in I \) at \( u \in S_j \) is simply \( f \circ j_u: \prod_{i \in I - \{j\}} S_i \to T\), a composition of measurable functions.

However, a non-measurable set can have measurable cross sections, even in a product of two spaces.

Suppose that \( S \) is an uncountable set with the \( \sigma \)-algebra \( \ms C \) of countable and co-countable sets, as in the example above. Consider \( S \times S \) with the product \( \sigma \)-algebra \( \ms C \times \ms C \). Let \( D = \{(x, x): x \in S\}\), the diagonal of \( S \times S \). Then \( D \) has measurable cross sections, but \( D \) is not measurable.

Details:

For \( x \in S \), the cross section of \( D \) in the first coordinate at \( x \) is \( \{y \in S: (x, y) \in D\} = \{x\} \in \ms C \). Similarly, for \( y \in S \), the cross section of \( D \) in the second coordinate at \( y \) is \( \{x \in S: (x, y) \in D\} = \{ y\} \in \ms C \). But as noted above, \(D\) is not measurable.

In terms of topology, suppose that \((S, \ms U)\) and \((T, \ms V)\) are topological spaces. Recall that the product topology on \(S \times T\) is the topology with base \(\ms B = \{A \times B: A \in \ms U, B \in \ms V\}\). Given the similarities of the definitions, you might think that the Borel \(\sigma\)-algebra on \(S \times T\) corresponding to the product topology is the product of the Borel \(\sigma\)-algebras of \(S\) and \(T\). That fails in general, but is true if the topological spaces are sufficiently nice.

Suppose that \((S, \ms U)\) and \((T, \ms V)\) are topological spaces corresponding to separable metric spaces. Then the Borel \(\sigma\)-algebra on \(S \times T\) corresponding to the product topology is the product of the Borel \(\sigma\)-algebras on \(S\) and \(T\). In symbols \[\sigma(\ms U \times \ms V) = \sigma(\ms U) \times \sigma(\ms V)\]

As noted above, having a measurable diagonal is a simple property that implies a number of seemingly stronger properties. Here is the connection to topology.

Suppose that \((S, \ms U)\) is a topological space corresponding to a separable metric space and let \(\ms S = \sigma(\ms U)\) be the Borel \(\sigma\)-algebra. Then \((S, \ms S)\) has a measurable diagonal.

Details:

Since \((S, \ms U)\) is Hausdorff, the diagonal \(D = \{(x, x): x \in S\}\) is closed in the product topology, and hence is measurable for the Borel \(\sigma\)-algebra corresponding to the product topology. But by the previous theorem, this \(\sigma\)-algebra is the product \(\ms S \times \ms S\) of the Borel \(\sigma\)-algebras on \(S\).

In particular, the previous results apply to the standard LCCB topological spaces.

Special Cases

Most of the sets encountered in applied probability are either countable, or subsets of \(\R^n\) for some \(n\), or more generally, subsets of a product of a countable number of sets of these types. In the study of stochastic processes, various spaces of functions play an important role. In this subsection, we will explore the most important special cases.

Discrete Spaces

If \(S\) is countable and \(\ms S = \ms P(S)\) is the collection of all subsets of \(S\), then \((S, \ms S)\) is a discrete measurable space.

Thus if \((S, \ms S)\) is discrete, all subsets of \( S \) are measurable and every function from \( S \) to another measurable space is measurable. The power set is also the discrete topology on \( S \), so \( \ms S \) is a Borel \( \sigma \)-algebra as well. As a topological space, \( (S, \ms S) \) is complete, locally compact, Hausdorff, and since \( S \) is countable, separable. Moreover, the discrete topology corresponds to the discrete metric \( d \), defined by \( d(x, x) = 0 \) for \( x \in S \) and \( d(x, y) = 1 \) for \( x, \, y \in S \) with \( x \ne y \).

Euclidean Spaces

Recall that for \(n \in \N_+\), the Euclidean topology on \(\R^n\) is generated by the standard Euclidean metric \( d_n \) given by \[ d_n(\bs{x}, \bs{y}) = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}, \quad \bs x = (x_1, x_2, \ldots, x_n), \, \bs y = (y_1, y_2, \ldots, y_n) \in \R^n \] With this topology, \( \R^n \) is complete, connected, locally compact, Hausdorff, and separable.

For \(n \in \N_+\), the \(n\)-dimensional Euclidean measurable space is \((\R^n, \ms R^n)\) where \(\ms R^n\) is the Borel \(\sigma\)-algebra corresponding to the standard Euclidean topology on \(\R^n\).

The one-dimensional case is particularly important. In this case, the standard Euclidean metric \( d \) is given by \( d(x, y) = \left|x - y\right| \) for \( x, \, y \in \R \). The Borel \(\sigma\)-algebra \(\ms R\) can be generated by various collections of intervals.

Each of the following collections generates \( \ms R \).

  1. \( \ms B_1 = \{I \subseteq \R: I \text{ is an interval} \} \)
  2. \( \ms B_2 = \{(a, b]: a, \, b \in \R, \; a \lt b \}\)
  3. \( \ms B_3 = \{(-\infty, b]: b \in \R \} \)
Details:

The proof involves showing that each set in any one of the collections is in the \( \sigma \)-algebra of any other collection. Let \( \ms S_i = \sigma(\ms B_i) \) for \( i \in \{1, 2, 3\} \).

  1. Clearly \( \ms B_2 \subseteq \ms B_1 \) and \( \ms B_3 \subseteq \ms B_1 \) so \( \ms S_2 \subseteq \ms S_1 \) and \( \ms S_3 \subseteq \ms S_1 \).
  2. If \( a, \, b \in \R \) with \( a \le b \) then \( [a, b] = \bigcap_{n=1}^\infty \left(a - \frac{1}{n}, b\right] \) and \( (a, b) = \bigcup_{n=1}^\infty \left(a, b - \frac{1}{n}\right] \), so \( [a, b], \, (a, b) \in \ms S_2 \). Also \( [a, b) = \bigcup_{n=1}^\infty \left[a, b - \frac{1}{n}\right] \) so \( [a, b) \in \ms S_2 \). Thus all bounded intervals are in \( \ms S_2 \). Next, \( [a, \infty) = \bigcup_{n=1}^\infty [a, a + n) \), \( (a, \infty) = \bigcup_{n=1}^\infty (a, a + n) \), \( (-\infty, a] = \bigcup_{n=1}^\infty (a - n, a] \), and \( (-\infty, a) = \bigcup_{n=1}^\infty (a - n, a) \), so each of these intervals is in \( \ms S_2 \). Of course \( \R \in \ms S_2 \), so we now have that \( I \in \ms S_2 \) for every interval \( I \). Thus \( \ms S_1 \subseteq \ms S_2 \), and so from (a), \( \ms S_2 = \ms S_1\).
  3. If \( a, \, b \in \R \) with \( a \lt b \) then \( (a, b] = (-\infty, b] - (-\infty, a] \) so \( (a, b] \in \ms S_3 \). Hence \( \ms S_2 \subseteq \ms S_3 \). But then from (a) and (b) it follows that \( \ms S_3 = \ms S_1 \).

Since the Euclidean topology has a countable base, \(\ms R\) is countably generated. In fact each collection of intervals above, but with endpoints restricted to \( \Q \), generates \(\ms R\). Moreover, \( \ms R \) can also be constructed from \( \sigma \)-algebras that are generated by countable partitions. First recall that for \( n \in \N \), the set of dyadic rationals (or binary rationals) of rank \( n \) or less is \( \D_n = \{j / 2^n: j \in \Z\} \). Note that \( \D_n \) is countable and \( \D_n \subseteq \D_{n+1} \) for \( n \in \N \). Moreover, the set \( \D = \bigcup_{n \in \N} \D_n \) of all dyadic rationals is dense in \( \R \). The dyadic rationals are often useful in various applications because \( \D_n \) has the natural ordered enumeration \( j \mapsto j / 2^n \) for each \( n \in \N \). Now let \[ \ms{D}_n = \left\{\left(j / 2^n, (j + 1) / 2^n\right]: j \in \Z\right\}, \quad n \in \N \] Then \( \ms{D}_n \) is a countable partition of \( \R \) into nonempty intervals of equal length \( 1 / 2^n \), so \( \ms{E}_n = \sigma(\ms{D}_n) \) consists of the unions of sets in \( \ms{D}_n \). Every set in \( \ms{D}_{n} \) is the union of two sets in \( \ms{D}_{n+1} \) so clearly \( \ms{E}_n \subseteq \ms{E}_{n+1} \) for \( n \in \N \). Finally, the Borel \( \sigma \)-algebra on \( \R \) is \( \ms R = \sigma\left(\bigcup_{n=0}^\infty \ms{E}_n\right) = \sigma\left(\bigcup_{n=0}^\infty \ms{D}_n\right) \). This construction turns out to be useful in a number of settings.
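The nested dyadic partitions are easy to compute with. The sketch below finds, for a given \( x \) and rank \( n \), the unique interval \( \left(j / 2^n, (j + 1) / 2^n\right] \) of the partition that contains \( x \). The function name `dyadic_interval` and the sample point are assumptions made for the illustration, and exact rational arithmetic is used to avoid rounding at the endpoints.

```python
import math
from fractions import Fraction

def dyadic_interval(x, n):
    """Return (a, b) with a < x <= b, where (a, b] is the interval
    (j / 2^n, (j + 1) / 2^n] of the rank-n dyadic partition containing x."""
    scale = 2 ** n
    # x lies in (j / 2^n, (j + 1) / 2^n] exactly when j + 1 = ceil(x * 2^n)
    j = math.ceil(Fraction(x) * scale) - 1
    return Fraction(j, scale), Fraction(j + 1, scale)

# Hypothetical sample point
x = Fraction(3, 8)
intervals = [dyadic_interval(x, n) for n in range(5)]
for n, (a, b) in enumerate(intervals):
    print(f"rank {n}: ({a}, {b}]")

# Each interval is contained in the one of the previous rank
nested = all(a0 <= a1 and b1 <= b0
             for (a0, b0), (a1, b1) in zip(intervals, intervals[1:]))
print(nested)
```

As the rank increases, the interval containing \( x \) halves in length and stays inside the previous one, which reflects the inclusion \( \ms{E}_n \subseteq \ms{E}_{n+1} \).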

For \( n \in \{2, 3, \ldots\} \), the Euclidean topology on \(\R^n\) is the \( n \)-fold product topology formed from the Euclidean topology on \( \R \). So the Borel \( \sigma \)-algebra \( \ms R^n \) is also the \( n \)-fold product \( \sigma \)-algebra formed from \( \ms R \). Finally, \( \ms R^n \) can be generated by \( n \)-fold products of sets in any of the three collections above.

Space of Real Functions

Suppose that \( (S, \ms S) \) is a measurable space. Recall that the usual arithmetic operations on functions from \( S \) into \( \R \) are defined pointwise.

If \( f: S \to \R \) and \( g: S \to \R \) are measurable and \( a \in \R \), then each of the following functions from \( S \) into \( \R \) is also measurable:

  1. \( f + g \)
  2. \( f - g \)
  3. \( f g \)
  4. \( a f \)
Details:

These results follow from the fact that the arithmetic operators are continuous, and hence measurable. That is, \( (x, y) \mapsto x + y \), \( (x, y) \mapsto x - y \), and \( (x, y) \mapsto x y \) are continuous as functions from \( \R^2 \) into \( \R \). Thus, if \( f, \, g: S \to \R \) are measurable, then \( (f, g): S \to \R^2 \) is measurable by the result above on functions into a product space. Then, \( f + g \), \( f - g \), \( f g \) are the compositions, respectively, of \( + \), \( - \), \( \cdot \) with \( (f, g) \). Of course, (d) is a simple corollary of (c).
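Closure under addition can also be checked directly, without appealing to continuity. The following is a sketch, assuming the standard fact that the open rays \( (-\infty, c) \) with \( c \in \R \) also generate \( \ms R \) (note that \( (-\infty, b] = \bigcap_{n=1}^\infty \left(-\infty, b + \frac{1}{n}\right) \) for \( b \in \R \)). For \( c \in \R \), \[ \{x \in S: f(x) + g(x) \lt c\} = \bigcup_{r \in \Q} \left( \{x \in S: f(x) \lt r\} \cap \{x \in S: g(x) \lt c - r\} \right) \] Indeed, if \( f(x) + g(x) \lt c \) then there exists \( r \in \Q \) with \( f(x) \lt r \lt c - g(x) \), and the reverse inclusion is clear. The right side is a countable union of sets in \( \ms S \), so \( f + g \) is measurable.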

Similarly, if \( f: S \to \R \setminus \{0\} \) is measurable, then so is \( 1 / f \). Recall that the set of functions from \( S \) into \( \R \) is a vector space, under the pointwise definitions of addition and scalar multiplication. But once again, we usually want to restrict our attention to measurable functions. Thus, it's nice to know that the measurable functions from \( S \) into \( \R \) also form a vector space. This follows immediately from the closure properties (a) and (d) of the previous theorem. Of particular importance in probability and stochastic processes is the vector space of bounded, measurable functions \( f: S \to \R \), with the supremum norm \[ \|f\| = \sup\left\{\left|f(x)\right|: x \in S \right\} \]

The elementary functions that we encounter in calculus and other areas of applied mathematics are functions from subsets of \( \R \) into \( \R \). The elementary functions include algebraic functions (which in turn include the polynomial and rational functions), the usual transcendental functions (exponential, logarithm, trigonometric), and the usual functions constructed from these by composition, the arithmetic operations, and by piecing together. As we might hope, all of the elementary functions are measurable.