\(\newcommand{\R}{\mathbb{R}}\)
\(\newcommand{\N}{\mathbb{N}}\)
\(\newcommand{\Z}{\mathbb{Z}}\)
\(\newcommand{\Q}{\mathbb{Q}}\)
\(\newcommand{\D}{\mathbb{D}}\)

In this section we discuss some topics from measure theory that are a bit more advanced than the topics in the previous sections of this chapter. However, measure-theoretic ideas are essential for a deep understanding of probability, since probability is itself a measure. The most important of the definitions is that of a \(\sigma\)-algebra, a collection of subsets of a set with certain closure properties. Such collections play a fundamental role, even for applied probability, in encoding the state of information about a random experiment.

On the other hand, we won't be overly pedantic about measure-theoretic details in this text. Unless we say otherwise, we assume that all sets that appear are measurable (that is, members of the appropriate \(\sigma\)-algebras), and that all functions are measurable (relative to the appropriate \(\sigma\)-algebras).

Although this section is somewhat abstract, many of the proofs are straightforward. Be sure to try the proofs yourself before reading the ones in the text.

Suppose that \(S\) is a set, playing the role of a universal set for a particular mathematical model. It is sometimes impossible to include *all* subsets of \(S\) in our model, particularly when \(S\) is uncountable. In a sense, the more sets that we include, the harder it is to have consistent theories. However, we almost always want the collection of admissible subsets to be closed under the basic set operations. This leads to some important definitions.

Suppose that \(\mathscr{S}\) is a nonempty collection of subsets of \(S\). Then \(\mathscr{S}\) is said to be an algebra (or field) if it is closed under complement and union:

- If \(A \in \mathscr{S}\) then \(A^c \in \mathscr{S}\).
- If \(A \in \mathscr{S}\) and \(B \in \mathscr{S}\) then \(A \cup B \in \mathscr{S}\).

If \(\mathscr{S}\) is an algebra of subsets of \(S\) then

- \( S \in \mathscr{S} \)
- \( \emptyset \in \mathscr{S} \)

- Since \( \mathscr{S} \) is nonempty, there exists \( A \in \mathscr{S} \). Hence \( A^c \in \mathscr{S} \) so \( S = A \cup A^c \in \mathscr{S} \).
- \( \emptyset = S^c \in \mathscr{S} \)

Suppose that \(\mathscr{S}\) is an algebra of subsets of \(S\) and that \(A_i \in \mathscr{S}\) for each \(i\) in a finite index set \(I\).

- \(\bigcup_{i \in I} A_i \in \mathscr{S}\)
- \(\bigcap_{i \in I} A_i \in \mathscr{S}\)

- This follows by induction on the number of elements in \(I\).
- This follows from (a) and De Morgan's law. If \( A_i \in \mathscr{S} \) for \( i \in I \) then \( A_i^c \in \mathscr{S} \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \mathscr{S} \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \mathscr{S} \).
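For a finite collection of subsets of a finite set, the two axioms can be checked directly. The following is a minimal sketch rather than a general tool; the function name `is_algebra` and the example collections are our own illustrative choices.

```python
from itertools import combinations

def is_algebra(universe, collection):
    """Check the two algebra axioms for a finite collection of subsets."""
    universe = frozenset(universe)
    sets = {frozenset(a) for a in collection}
    if not sets:
        return False  # an algebra must be nonempty
    # Axiom 1: closed under complement (relative to the universe)
    closed_under_complement = all(universe - a in sets for a in sets)
    # Axiom 2: closed under pairwise union
    closed_under_union = all(a | b in sets for a, b in combinations(sets, 2))
    return closed_under_complement and closed_under_union

S = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, S]
bad = [set(), {1}, {2}, S]  # the complement of {1} is {2, 3, 4}, which is missing

print(is_algebra(S, good))  # True
print(is_algebra(S, bad))   # False
```

Checking pairwise unions is enough here because closure under finite unions follows by induction, as in part (a).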

Thus it follows that an algebra of sets is closed under a finite number of set operations. That is, if we start with a finite number of sets in the algebra \( \mathscr{S} \), and build a new set with a finite number of set operations (union, intersection, complement), then the new set is also in \( \mathscr{S} \). However, in many mathematical theories, and in probability in particular, this is not sufficient; we often need the collection of admissible subsets to be closed under a *countable* number of set operations.

Suppose that \(\mathscr{S}\) is a nonempty collection of subsets of \(S\). Then \(\mathscr{S}\) is said to be a \(\sigma\)-algebra (or \(\sigma\)-field) if the following axioms are satisfied:

- If \(A \in \mathscr{S}\) then \(A^c \in \mathscr{S}\).
- If \(A_i \in \mathscr{S}\) for each \(i\) in a countable index set \(I\), then \(\bigcup_{i \in I} A_i \in \mathscr{S}\).

Clearly a \(\sigma\)-algebra of subsets is also an algebra of subsets, so the basic results for algebras above still hold. In particular, \( S \in \mathscr{S} \) and \( \emptyset \in \mathscr{S} \).

If \(A_i \in \mathscr{S}\) for each \(i\) in a countable index set \(I\), then \(\bigcap_{i \in I} A_i \in \mathscr{S}\).

The proof is just like the one above for algebras. If \( A_i \in \mathscr{S} \) for \( i \in I \) then \( A_i^c \in \mathscr{S} \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \mathscr{S} \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \mathscr{S} \).

Thus a \(\sigma\)-algebra of subsets of \(S\) is closed under countable unions and intersections. This is the reason for the symbol \(\sigma\) in the name. As mentioned in the introductory paragraph, \( \sigma \)-algebras are of fundamental importance in mathematics generally and probability theory specifically. If \( S \) is a set and \( \mathscr{S} \) a \( \sigma \)-algebra of subsets of \( S \), then the pair \( (S, \mathscr{S}) \) is called a measurable space.

Suppose that \(S\) is a set and that \(\mathscr{S}\) is a finite algebra of subsets of \(S\). Then \(\mathscr{S}\) is also a \(\sigma\)-algebra.

Any countable union of sets in \(\mathscr{S}\) reduces to a finite union.

However, there *are* algebras that are not \(\sigma\)-algebras. Here is the classic example:

Suppose that \( S \) is an infinite set. The collection of finite and co-finite subsets of \( S \) defined below is an algebra of subsets of \( S \), but not a \(\sigma\)-algebra: \[ \mathscr{F} = \{A \subseteq S: A \text{ is finite or } A^c \text{ is finite}\} \]

\( S \in \mathscr{F} \) since \( S^c = \emptyset \) is finite. If \( A \in \mathscr{F} \) then \( A^c \in \mathscr{F} \) by the symmetry of the definition. Suppose that \( A, \, B \in \mathscr{F} \). If \( A \) and \( B \) are both finite then \( A \cup B \) is finite. If \( A^c \) or \( B^c \) is finite, then \( (A \cup B)^c = A^c \cap B^c \) is finite. In either case, \( A \cup B \in \mathscr{F} \). Thus \( \mathscr{F} \) is an algebra of subsets of \( S \).

Since \( S \) is infinite, it contains a countably infinite subset \( \{x_0, x_1, x_2, \ldots\} \). Let \( A_n = \{x_{2 n}\} \) for \( n \in \N \). Then \( A_n \) is finite, so \( A_n \in \mathscr{F} \) for each \( n \in \N \). Let \( E = \bigcup_{n=0}^\infty A_n = \{x_0, x_2, x_4, \ldots\} \). Then \( E \) is infinite by construction. Also \(\{x_1, x_3, x_5, \ldots\} \subseteq E^c \), so \( E^c \) is infinite as well. Hence \( E \notin \mathscr{F} \) and so \( \mathscr{F} \) is not a \( \sigma \)-algebra.
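The finite and co-finite subsets of \( \N \) can be represented exactly on a computer, which makes the algebra property concrete. The sketch below is only an illustration under our own conventions: the class name `FinCofin` and its fields are hypothetical, and a set is stored either as its finite list of points or as the finite list of points it omits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FinCofin:
    """A finite or co-finite subset of N.

    `points` is the set itself when `cofinite` is False,
    and the complement of the set when `cofinite` is True.
    """
    points: frozenset
    cofinite: bool = False

    def complement(self):
        # Complementing just flips the interpretation of `points`
        return FinCofin(self.points, not self.cofinite)

    def union(self, other):
        if not self.cofinite and not other.cofinite:
            # finite union finite is finite
            return FinCofin(self.points | other.points, False)
        if self.cofinite and other.cofinite:
            # complement of the union is the intersection of the complements
            return FinCofin(self.points & other.points, True)
        fin, cof = (self, other) if other.cofinite else (other, self)
        # complement of the union is the omitted points not covered by the finite set
        return FinCofin(cof.points - fin.points, True)

A = FinCofin(frozenset({0, 2}))                 # the finite set {0, 2}
B = FinCofin(frozenset({2, 3}), cofinite=True)  # everything except 2 and 3
print(A.union(B))                               # everything except 3

# Every partial union of the singletons {0}, {2}, {4}, ... stays finite
partial = FinCofin(frozenset())
for n in range(5):
    partial = partial.union(FinCofin(frozenset({2 * n})))
print(sorted(partial.points), partial.cofinite)  # [0, 2, 4, 6, 8] False
```

Every finite union is representable in this class, but the full countable union \( E \) of the even singletons has no such representation, which is exactly the failure of the \( \sigma \)-algebra axiom.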

Recall that \(\mathscr{P}(S)\) denotes the collection of *all* subsets of \(S\), called the power set of \(S\). Trivially, \(\mathscr{P}(S)\) is the largest \(\sigma\)-algebra of \(S\). The power set is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but as noted above, is sometimes too large to be useful if \( S \) is uncountable. At the other extreme, the smallest \(\sigma\)-algebra of \(S\) is given in the following exercise.

The collection \(\{\emptyset, S\}\) is a \(\sigma\)-algebra.

Clearly \( \{\emptyset, S\} \) is a finite algebra: \( S \) and \( \emptyset \) are complements of each other, and \( S \cup \emptyset = S \). Hence \( \{S, \emptyset\} \) is a \( \sigma \)-algebra by the result above for finite algebras.

In many cases, we want to construct a \(\sigma\)-algebra that contains certain basic sets. The next two results show how to do this.

Suppose that \(\mathscr{S}_i\) is a \(\sigma\)-algebra of subsets of \(S\) for each \(i\) in a nonempty index set \(I\). Then \( \mathscr{S} = \bigcap_{i \in I} \mathscr{S}_i\) is also a \(\sigma\)-algebra of subsets of \(S\).

The proof is completely straightforward. First, \( S \in \mathscr{S}_i \) for each \( i \in I \) so \( S \in \mathscr{S} \). If \( A \in \mathscr{S} \) then \( A \in \mathscr{S}_i \) for each \( i \in I \) and hence \( A^c \in \mathscr{S}_i \) for each \( i \in I \). Therefore \( A^c \in \mathscr{S} \). Finally suppose that \( A_j \in \mathscr{S} \) for each \( j \) in a countable index set \( J \). Then \( A_j \in \mathscr{S}_i \) for each \( i \in I \) and \( j \in J \) and therefore \( \bigcup_{j \in J} A_j \in \mathscr{S}_i \) for each \( i \in I \). It follows that \( \bigcup_{j \in J} A_j \in \mathscr{S} \).
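A small finite example may help here. The sketch below checks that the intersection of two \( \sigma \)-algebras on a three-point set is again a \( \sigma \)-algebra, and also shows that the corresponding statement for unions can fail (a standard fact that is not part of the result above). The helper `is_sigma_algebra` is a hypothetical name, and on a finite set it suffices to test pairwise unions.

```python
def is_sigma_algebra(universe, collection):
    """Check the axioms on a finite universe, where pairwise unions suffice."""
    universe = frozenset(universe)
    sets = set(collection)
    return (
        universe in sets
        and all(universe - a in sets for a in sets)
        and all(a | b in sets for a in sets for b in sets)
    )

S = frozenset({1, 2, 3})
F1 = {frozenset(), frozenset({1}), frozenset({2, 3}), S}
F2 = {frozenset(), frozenset({2}), frozenset({1, 3}), S}

print(is_sigma_algebra(S, F1 & F2))  # True: the intersection is the trivial sigma-algebra
print(is_sigma_algebra(S, F1 | F2))  # False: {1} and {2} are present but {1, 2} is not
```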

Note that no restrictions are placed on the index set \( I \), other than it be nonempty, so in particular it may well be uncountable.

Suppose that \( S \) is a set and that \(\mathscr{B}\) is a collection of subsets of \(S\). The \(\sigma\)-algebra generated by \(\mathscr{B}\) is \[\sigma(\mathscr{B}) = \bigcap \{\mathscr{S}: \mathscr{S} \text{ is a } \sigma\text{-algebra of subsets of } S \text{ and } \mathscr{B} \subseteq \mathscr{S}\}\] If \( \mathscr{B} \) is countable then \( \mathscr{S} = \sigma(\mathscr{B}) \) is said to be countably generated.

So the \(\sigma\)-algebra generated by \(\mathscr{B}\) is the intersection of all \(\sigma\)-algebras that contain \(\mathscr{B}\), which by the previous result really is a \(\sigma\)-algebra. Note that the collection of \( \sigma \)-algebras in the intersection is not empty, since \( \mathscr{P}(S) \) is in the collection. Think of the sets in \(\mathscr{B}\) as *basic sets* that we want to be measurable, but that do not form a \(\sigma\)-algebra.

The \(\sigma\)-algebra \(\sigma(\mathscr{B})\) is the smallest \(\sigma\)-algebra containing \(\mathscr{B}\).

- \(\mathscr{B} \subseteq \sigma(\mathscr{B})\)
- If \(\mathscr{S}\) is a \(\sigma\)-algebra of subsets of \(S\) and \(\mathscr{B} \subseteq \mathscr{S}\) then \(\sigma(\mathscr{B}) \subseteq \mathscr{S}\).

Both of these properties follow from the definition of \( \sigma(\mathscr{B}) \) as the intersection of all \( \sigma \)-algebras that contain \( \mathscr{B} \).

Note that the conditions in the previous theorem completely characterize \( \sigma(\mathscr{B}) \). If \( \mathscr{S}_1 \) and \( \mathscr{S}_2 \) satisfy the conditions, then by (a), \( \mathscr{B} \subseteq \mathscr{S}_1 \) and \( \mathscr{B} \subseteq \mathscr{S}_2 \). But then by (b), \( \mathscr{S}_1 \subseteq \mathscr{S}_2 \) and \( \mathscr{S}_2 \subseteq \mathscr{S}_1\).

If \(A\) is a subset of \(S\) then \(\sigma\{A\} = \{\emptyset, A, A^c, S\}\).

Let \( \mathscr{S} = \{\emptyset, A, A^c, S\} \). Clearly \( \mathscr{S} \) is an algebra: \( A \) and \( A^c \) are complements of each other, as are \( \emptyset \) and \( S \). Also, \( A \cup A^c = A \cup S = A^c \cup S = S \cup S = \emptyset \cup S = S \), \( A \cup \emptyset = A \cup A = A \), \( A^c \cup \emptyset = A^c \cup A^c = A^c \), and \( \emptyset \cup \emptyset = \emptyset \). Since \( \mathscr{S} \) is finite, it is a \( \sigma \)-algebra by the result above for finite algebras.

Next, \( A \in \mathscr{S} \). Conversely, if \( \mathscr{T} \) is a \( \sigma \)-algebra and \( A \in \mathscr{T} \) then of course \( \emptyset, S, A^c \in \mathscr{T} \) so \( \mathscr{S} \subseteq \mathscr{T} \). Hence \( \mathscr{S} = \sigma\{A\} \).
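When \( S \) is finite, the generated \( \sigma \)-algebra can be computed by brute force: start with the generating sets and keep adding complements and unions until nothing new appears. The following is a minimal sketch under that finiteness assumption; the function name `generate_sigma_algebra` is our own, and the approach is only practical for very small sets.

```python
def generate_sigma_algebra(universe, generators):
    """Smallest collection containing the generators that is closed
    under complement and union (finite universe assumed)."""
    universe = frozenset(universe)
    sigma = {frozenset(), universe} | {frozenset(b) for b in generators}
    while True:
        new = {universe - a for a in sigma}           # complements
        new |= {a | b for a in sigma for b in sigma}  # pairwise unions
        if new <= sigma:   # nothing new: the collection is closed
            return sigma
        sigma |= new

S = {1, 2, 3, 4, 5}
A = {1, 2}
sigma_A = generate_sigma_algebra(S, [A])
print(sorted(sorted(a) for a in sigma_A))
# [[], [1, 2], [1, 2, 3, 4, 5], [3, 4, 5]]
```

The output consists of \( \emptyset \), \( A \), \( S \), and \( A^c \), in agreement with the result.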

We can generalize the previous result. Recall that a collection of subsets \( \mathscr{A} = \{A_i: i \in I\} \) is a partition of \( S \) if \( A_i \cap A_j = \emptyset \) for \( i, \; j \in I \) with \( i \ne j \), and \( \bigcup_{i \in I} A_i = S \).

Suppose that \( \mathscr{A} = \{A_i: i \in I\} \) is a countable partition of \( S \) into nonempty subsets. Then \( \sigma(\mathscr{A}) \) is the collection of all unions of sets in \( \mathscr{A} \). That is, \[ \sigma(\mathscr{A}) = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \]

Let \( \mathscr{S} = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \). Note that \( S \in \mathscr{S} \) since \( S = \bigcup_{i \in I} A_i \). Next, suppose that \( B \in \mathscr{S} \). Then \( B = \bigcup_{j \in J} A_j \) for some \( J \subseteq I \). Since \( \mathscr{A} \) partitions \( S \), it follows that \( B^c = \bigcup_{j \in J^c} A_j \) where \( J^c = I \setminus J \), so \( B^c \in \mathscr{S} \). Next, suppose that \( B_k \in \mathscr{S} \) for \( k \in K \) where \( K \) is a countable index set. Then for each \( k \in K \) there exists \( J_k \subseteq I \) such that \( B_k = \bigcup_{j \in J_k} A_j \). But then \( \bigcup_{k \in K} B_k = \bigcup_{k \in K} \bigcup_{j \in J_k} A_j = \bigcup_{j \in J} A_j \) where \( J = \bigcup_{k \in K} J_k \). Hence \( \bigcup_{k \in K} B_k \in \mathscr{S} \). Therefore \( \mathscr{S} \) is a \( \sigma \)-algebra of subsets of \( S \). Trivially, \( \mathscr{A} \subseteq \mathscr{S} \). If \( \mathscr{T} \) is a \( \sigma \)-algebra of subsets of \( S \) and \( \mathscr{A} \subseteq \mathscr{T} \), then clearly \( \bigcup_{j \in J} A_j \in \mathscr{T} \) for every \( J \subseteq I \). Hence \( \mathscr{S} \subseteq \mathscr{T}\).

A \( \sigma \)-algebra of this form is said to be generated by a countable partition. Note that since \( A_i \ne \emptyset \) for \( i \in I \), the representation of a set in \( \sigma(\mathscr{A}) \) as a union of sets in \( \mathscr{A} \) is unique. That is, if \( J, \; K \subseteq I \) and \( J \ne K \) then \( \bigcup_{j \in J} A_j \ne \bigcup_{k \in K} A_k \). In particular, if there are \( n \) nonempty sets in \( \mathscr{A} \), so that \( \#(I) = n \), then there are \( 2^n \) subsets of \( I \) and hence \( 2^n \) sets in \( \sigma(\mathscr{A}) \).
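For a finite partition, the counting argument can be checked by listing one union for each subset \( J \) of the index set. This is a small sketch with an arbitrary example partition; the function name `sigma_from_partition` is hypothetical.

```python
from itertools import chain, combinations

def sigma_from_partition(blocks):
    """All unions of blocks of a finite partition, one per subset J of the index set."""
    blocks = [frozenset(b) for b in blocks]
    n = len(blocks)
    # Enumerate every subset J of {0, 1, ..., n - 1}
    index_subsets = chain.from_iterable(
        combinations(range(n), r) for r in range(n + 1)
    )
    return [frozenset().union(*(blocks[j] for j in J)) for J in index_subsets]

partition = [{1, 2}, {3}, {4, 5, 6}]    # three nonempty, disjoint blocks
sets = sigma_from_partition(partition)
print(len(sets), len(set(sets)))        # 8 8
```

The two counts agree, so distinct index subsets give distinct unions, and there are \( 2^3 = 8 \) sets in the generated \( \sigma \)-algebra.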

Suppose now that \( \mathscr{A} = \{A_1, A_2, \ldots, A_n\} \) is a collection of \(n\) subsets of \(S\) (not necessarily disjoint). To describe the \( \sigma \)-algebra generated by \( \mathscr{A} \) we need a bit more notation. For \( x = (x_1, x_2, \ldots, x_n) \in \{0, 1\}^n \) (a bit string of length \( n \)), let \( B_x = \bigcap_{i=1}^n A_i^{x_i} \) where \( A_i^1 = A_i \) and \( A_i^0 = A_i^c \).

In the setting above,

- \( \mathscr{B} = \{B_x: x \in \{0, 1\}^n\} \) partitions \( S \).
- \( A_i = \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\) for \(i \in \{1, 2, \ldots, n\}\).
- \(\sigma(\mathscr{A}) = \sigma(\mathscr{B}) = \left\{\bigcup_{x \in J} B_x: J \subseteq \{0, 1\}^n\right\}\).

- Suppose that \( x, \; y \in \{0, 1\}^n \) and that \( x \ne y \). Without loss of generality we can suppose that for some \( j \in \{1, 2, \ldots, n\} \), \(x_j = 0 \) while \( y_j = 1 \). Then \( B_x \subseteq A_j^c \) and \( B_y \subseteq A_j \) so \( B_x \) and \( B_y \) are disjoint. Suppose that \( s \in S \). Construct \( x \in \{0, 1\}^n \) by \( x_i = 1 \) if \( s \in A_i \) and \( x_i = 0 \) if \( s \notin A_i \), for each \( i \in \{1, 2, \ldots, n\} \). Then by definition, \( s \in B_x \). Hence \( \mathscr{B} \) partitions \( S \).
- Fix \( i \in \{1, 2, \ldots, n\}\). Again if \( x \in \{0, 1\}^n \) and \( x_i = 1 \) then \( B_x \subseteq A_i \). Hence \(\bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\} \subseteq A_i\). Conversely, suppose \( s \in A_i \). Define \( y \in \{0, 1\}^n \) by \( y_j = 1 \) if \( s \in A_j \) and \( y_j = 0 \) if \( s \notin A_j \) for each \( j \in \{1, 2, \ldots, n\} \). Then \( y_i = 1 \) and \( s \in B_y \). Hence \( s \in \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\).
- Clearly, every \( \sigma \)-algebra of subsets of \( S \) that contains \( \mathscr{A} \) must also contain \( \mathscr{B} \), and every \( \sigma \)-algebra of subsets of \( S \) that contains \( \mathscr{B} \) must also contain \( \mathscr{A} \). It follows that \( \sigma(\mathscr{A}) = \sigma(\mathscr{B}) \). The characterization in terms of unions now follows from the previous result.

Recall that there are \( 2^n \) bit strings of length \( n \). The sets in \( \mathscr{A} \) are said to be in general position if the sets in \( \mathscr{B} \) are distinct (and hence there are \( 2^n \) of them) and are nonempty. In this case, there are \( 2^{2^n} \) sets in \( \sigma(\mathscr{A}) \).
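The sets \( B_x \) are easy to compute when \( S \) is finite. The sketch below builds \( B_x \) for every bit string \( x \) and tests for general position; the function name `atoms` and the example sets are illustrative assumptions, not notation from the text.

```python
from itertools import product

def atoms(universe, sets):
    """Map each bit string x to B_x: intersect A_i when x_i = 1, its complement when x_i = 0."""
    universe = frozenset(universe)
    sets = [frozenset(a) for a in sets]
    result = {}
    for x in product((0, 1), repeat=len(sets)):
        b = universe
        for a, bit in zip(sets, x):
            b &= a if bit else universe - a
        result[x] = b
    return result

S = {1, 2, 3, 4}
A1, A2 = {1, 2}, {2, 3}
B = atoms(S, [A1, A2])
for x, b in B.items():
    print(x, sorted(b))
# (0, 0) [4]
# (0, 1) [3]
# (1, 0) [1]
# (1, 1) [2]

# General position: all B_x nonempty and distinct
in_general_position = all(B.values()) and len(set(B.values())) == len(B)
print(in_general_position, 2 ** len(B))  # True 16
```

Here \( n = 2 \), so there are \( 2^2 = 4 \) sets \( B_x \) and \( 2^4 = 16 \) sets in \( \sigma\{A_1, A_2\} \).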

Open the Venn diagram app. This app shows two subsets \(A\) and \(B\) of \(S\) in general position, and lists the 16 sets in \( \sigma\{A, B\} \).

- Select each of the 4 sets that partition \( S \): \( A \cap B \), \( A \cap B^c \), \( A^c \cap B \), \( A^c \cap B^c \).
- Select each of the other 12 sets in \(\sigma\{A, B\}\) and note how each is a union of some of the sets in (a).

Sketch a Venn diagram with sets \( A_1, \; A_2, \; A_3 \) in general position. Identify the set \( B_x \) for each \( x \in \{0, 1\}^3 \).

One of the most important ways to generate a \( \sigma \)-algebra is by means of topology. Recall that a topological space consists of a set \( S \) and a topology \(\mathscr{S}\), the collection of open subsets of \( S \). Most sets that occur in probability and stochastic processes have natural topologies on them.

Suppose that \( (S, \mathscr{S}) \) is a topological space. Then \( \sigma(\mathscr{S}) \) is the Borel \( \sigma \)-algebra of the space.

So the Borel \( \sigma \)-algebra on \( S \), named for Émile Borel, is generated by the open subsets of \( S \). Thus, a *topological space* \( (S, \mathscr{S}) \) naturally leads to a *measurable space* \( (S, \sigma(\mathscr{S}))\). Since a closed set is simply the complement of an open set, the Borel \( \sigma \)-algebra contains the closed sets as well. More generally, it contains countable intersections of open sets (sometimes called \( G_\delta \) sets) and countable unions of closed sets (sometimes called \( F_\sigma \) sets). Of course, the Borel \( \sigma \)-algebra will typically contain lots of other sets as well.
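As a concrete illustration, assume that \( S = \R \) with the usual topology (an example of our own choosing). A singleton is a countable intersection of open intervals, and a half-open interval \( [a, b) \) with \( a \lt b \) is a countable union of closed intervals: \[ \{0\} = \bigcap_{n=1}^\infty \left(-\frac{1}{n}, \frac{1}{n}\right), \qquad [a, b) = \bigcup_{n=1}^\infty \left[a, b - \frac{1}{n}\right] \] Here \( \left[a, b - \frac{1}{n}\right] \) is interpreted as the empty set if \( b - \frac{1}{n} \lt a \). So \( \{0\} \) is a \( G_\delta \) set and \( [a, b) \) is an \( F_\sigma \) set, and both are Borel sets even though \( [a, b) \) is neither open nor closed.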

As a trivial special case, if \( S \) has the discrete topology \( \mathscr{P}(S) \), so that every set is open (and closed), then of course the Borel \( \sigma \)-algebra is also \( \mathscr{P}(S) \). As noted above, this is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but is often too large if \( S \) is uncountable.

Recall that a base for a topological space \( (S, \mathscr{S}) \) is a collection \( \mathscr{B} \subseteq \mathscr{S} \) with the property that every set in \(\mathscr{S}\) is a union of a collection of sets in \( \mathscr{B} \). In short, every open set is a union of some of the basic open sets.

Suppose that \( (S, \mathscr{S}) \) is a topological space with a countable base \( \mathscr{B} \). Then \( \sigma(\mathscr{B}) = \sigma(\mathscr{S}) \).

Since \( \mathscr{B} \subseteq \mathscr{S} \) it follows trivially that \( \sigma(\mathscr{B}) \subseteq \sigma(\mathscr{S}) \). Conversely, if \( U \in \mathscr{S} \), there exists a collection of sets in \( \mathscr{B} \) whose union is \( U \). Since \( \mathscr{B} \) is countable, this union is a countable union, so \( U \in \sigma(\mathscr{B}) \). Thus \( \mathscr{S} \subseteq \sigma(\mathscr{B}) \) and hence \( \sigma(\mathscr{S}) \subseteq \sigma(\mathscr{B}) \).
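The standard example, stated here as an illustration and assuming the usual topology on \( \R \), is the collection of open intervals with rational endpoints: \[ \mathscr{B} = \{(a, b): a, \, b \in \Q, \; a \lt b\} \] This collection is countable since \( \Q \times \Q \) is countable, and it is a base since for real numbers \( c \lt d \), \[ (c, d) = \bigcup \left\{(a, b): a, \, b \in \Q, \; c \lt a \lt b \lt d\right\} \] and every open subset of \( \R \) is a union of open intervals. Hence the Borel \( \sigma \)-algebra on \( \R \) is generated by the open intervals with rational endpoints.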

The topological spaces that occur in probability and stochastic processes are usually assumed to have a countable base (along with other nice properties such as Hausdorff and locally compact). The \( \sigma \)-algebra used for such a space is usually the Borel \( \sigma \)-algebra, which by the previous result, is countably generated.

If a \( \sigma \)-algebra is generated by a collection of basic sets, then each set in the \( \sigma \)-algebra is generated by a countable number of the basic sets.

Suppose that \( S \) is a set and \( \mathscr{B} \) a nonempty collection of subsets of \( S \). Then

\[ \sigma(\mathscr{B}) = \{A \subseteq S: A \in \sigma(\mathscr{C}) \text{ for some countable } \mathscr{C} \subseteq \mathscr{B}\} \]

Let \( \mathscr{S} \) denote the collection on the right. We first show that \( \mathscr{S} \) is a \( \sigma \)-algebra. First, pick \( B \in \mathscr{B} \), which we can do since \( \mathscr{B} \) is nonempty. Then \( S \in \sigma\{B\} \) so \( S \in \mathscr{S} \). Let \( A \in \mathscr{S} \) so that \( A \in \sigma(\mathscr{C}) \) for some countable \( \mathscr{C} \subseteq \mathscr{B} \). Then \( A^c \in \sigma(\mathscr{C}) \) so \( A^c \in \mathscr{S} \). Finally, suppose that \( A_i \in \mathscr{S} \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \), there exists a countable \( \mathscr{C}_i \subseteq \mathscr{B} \) such that \( A_i \in \sigma(\mathscr{C}_i) \). But then \( \bigcup_{i \in I} \mathscr{C}_i \) is also countable and \( \bigcup_{i \in I} A_i \in \sigma\left(\bigcup_{i \in I} \mathscr{C}_i \right) \). Hence \( \bigcup_{i \in I} A_i \in \mathscr{S} \).

Next if \( B \in \mathscr{B} \) then \( B \in \sigma\{B\} \) so \( B \in \mathscr{S} \). Hence \( \sigma(\mathscr{B}) \subseteq \mathscr{S} \). Conversely, if \( A \in \sigma(\mathscr{C}) \) for some countable \( \mathscr{C} \subseteq \mathscr{B} \) then trivially \( A \in \sigma(\mathscr{B}) \).

Suppose that \((S, \mathscr{S})\) is a measurable space, and that \(R \subseteq S\). Let \(\mathscr{R} = \{A \cap R: A \in \mathscr{S}\}\). Then

- \( \mathscr{R} \) is a \(\sigma\)-algebra of subsets of \(R\).
- If \(R \in \mathscr{S}\) then \(\mathscr{R} = \{B \in \mathscr{S}: B \subseteq R\}\).

- First, \( S \in \mathscr{S} \) and \( S \cap R = R \) so \( R \in \mathscr{R} \). Next suppose that \( B \in \mathscr{R} \). Then there exists \( A \in \mathscr{S} \) such that \( B = A \cap R \). But then \( A^c \in \mathscr{S} \) and \( R \setminus B = R \cap B^c = R \cap A^c \), so \( R \setminus B \in \mathscr{R} \). Finally, suppose that \( B_i \in \mathscr{R} \) for \( i \) in a countable index set \( I \). For each \( i \in I \) there exists \( A_i \in \mathscr{S} \) such that \( B_i = A_i \cap R \). But then \( \bigcup_{i \in I} A_i \in \mathscr{S} \) and \( \bigcup_{i \in I} B_i = \left(\bigcup_{i \in I} A_i \right) \cap R \), so \( \bigcup_{i \in I} B_i \in \mathscr{R} \).
- Suppose that \( R \in \mathscr{S} \). Then \( A \cap R \in \mathscr{S} \) for every \( A \in \mathscr{S} \), and of course, \( A \cap R \subseteq R \). Conversely, if \( B \in \mathscr{S} \) and \( B \subseteq R \) then \( B = B \cap R \) so \( B \in \mathscr{R} \).
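Both parts of the result can be seen on a small finite example. In the sketch below, the function name `trace` and the particular sets are our own illustrative choices.

```python
def trace(sigma, R):
    """The collection {A & R : A in sigma} of subsets of R."""
    R = frozenset(R)
    return {frozenset(a) & R for a in sigma}

S = frozenset({1, 2, 3, 4})
sigma = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), S}

# Case 1: R is not a member of sigma
R1 = {2, 3}
print(sorted(sorted(b) for b in trace(sigma, R1)))
# [[], [2], [2, 3], [3]]

# Case 2: R is a member of sigma, so the trace consists of
# the members of sigma that are subsets of R
R2 = frozenset({1, 2})
print(trace(sigma, R2) == {b for b in sigma if b <= R2})  # True
```

In the first case, the induced \( \sigma \)-algebra contains the sets \( \{2\} \) and \( \{3\} \), neither of which is in the original \( \sigma \)-algebra, so part (b) really does require \( R \in \mathscr{S} \).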

The \( \sigma \)-algebra \(\mathscr{R}\) is the \(\sigma\)-algebra on \(R\) induced by \(\mathscr{S}\). The following construction is useful for counterexamples. Compare this example with the one above for finite and co-finite sets:

Let \( S \) be a nonempty set. The collection of countable and co-countable subsets of \( S \) is \[ \mathscr{C} = \{A \subseteq S: A \text{ is countable or } A^c \text{ is countable}\} \]

- \( \mathscr{C} \) is a \( \sigma \)-algebra
- \( \mathscr{C} = \sigma\{\{x\}: x \in S\} \), the \( \sigma \)-algebra generated by the singleton sets.

- First, \( S \in \mathscr{C} \) since \( S^c = \emptyset \) is countable. If \( A \in \mathscr{C} \) then \( A^c \in \mathscr{C} \) by the symmetry of the definition. Suppose that \( A_i \in \mathscr{C} \) for each \( i \) in a countable index set \( I \). If \( A_i \) is countable for each \( i \in I \) then \( \bigcup_{i \in I} A_i \) is countable. If \( A_j^c \) is countable for some \( j \in I \) then \( \left(\bigcup_{i \in I} A_i \right)^c = \bigcap_{i \in I} A_i^c \subseteq A_j^c \) is countable. In either case, \( \bigcup_{i \in I} A_i \in \mathscr{C} \).
- Let \( \mathscr{D} = \sigma\{\{x\}: x \in S\} \). Clearly \( \{x\} \in \mathscr{C} \) for \( x \in S \). Hence \( \mathscr{D} \subseteq \mathscr{C} \). Conversely, suppose that \( A \in \mathscr{C} \). If \( A \) is countable, then \( A = \bigcup_{x \in A} \{x\} \in \mathscr{D} \). If \( A^c \) is countable, then by an identical argument, \( A^c \in \mathscr{D} \) and hence \( A \in \mathscr{D} \).

Of course, if \( S \) is itself countable then \( \mathscr{C} = \mathscr{P}(S) \). On the other hand, if \( S \) is uncountable, then there exists \( A \subseteq S \) such that \( A \) and \( A^c \) are uncountable. Thus, \( A \notin \mathscr{C} \), but \( A = \bigcup_{x \in A} \{x\} \), and of course \( \{x\} \in \mathscr{C} \). Thus, we have an example of a \( \sigma \)-algebra that is not closed under general unions.

Recall that a set usually comes with a \(\sigma\)-algebra of admissible subsets. A natural requirement on a function is that the inverse image of an admissible set in the range space be admissible in the domain space. Here is the formal definition.

Suppose that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces. A function \( f: S \to T \) is measurable if \( f^{-1}(A) \in \mathscr{S} \) for every \( A \in \mathscr{T} \).

If the \( \sigma \)-algebra in the range space is generated by a collection of basic sets, then to check the measurability of a function, we need only consider inverse images of basic sets:

Suppose again that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces, and that \( \mathscr{T} = \sigma(\mathscr{B}) \) for a collection of subsets \( \mathscr{B} \) of \( T \). Then \( f: S \to T \) is measurable if and only if \( f^{-1}(B) \in \mathscr{S} \) for every \( B \in \mathscr{B} \).

First \( \mathscr{B} \subseteq \mathscr{T} \), so if \( f: S \to T \) is measurable then the condition in the theorem trivially holds. Conversely, suppose that the condition in the theorem holds, and let \( \mathscr{U} = \{A \in \mathscr{T}: f^{-1}(A) \in \mathscr{S}\} \). Then \( T \in \mathscr{U} \) since \( f^{-1}(T) = S \in \mathscr{S} \). If \( A \in \mathscr{U} \) then \( f^{-1}(A^c) = \left[f^{-1}(A)\right]^c \in \mathscr{S} \), so \( A^c \in \mathscr{U} \). If \( A_i \in \mathscr{U} \) for \( i \) in a countable index set \( I \), then \( f^{-1}\left(\bigcup_{i \in I} A_i\right) = \bigcup_{i \in I} f^{-1}(A_i) \in \mathscr{S} \), and hence \( \bigcup_{i \in I} A_i \in \mathscr{U} \). Thus \( \mathscr{U} \) is a \( \sigma \)-algebra of subsets of \( T \). But \( \mathscr{B} \subseteq \mathscr{U} \) by assumption, so \( \mathscr{T} = \sigma(\mathscr{B}) \subseteq \mathscr{U} \). Of course \( \mathscr{U} \subseteq \mathscr{T} \) by definition, so \( \mathscr{U} = \mathscr{T} \) and hence \( f \) is measurable.
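On finite spaces the theorem can be verified by exhaustive checking. The sketch below compares the test on a generating collection with the test on the full \( \sigma \)-algebra; the helper names `preimage` and `is_measurable`, and the functions `f` and `g`, are hypothetical examples of our own.

```python
from itertools import combinations

def preimage(f, domain, target_set):
    """The inverse image of target_set under f."""
    return frozenset(x for x in domain if f(x) in target_set)

def is_measurable(f, domain, sigma_domain, sets_in_range):
    """True if the inverse image of every listed set is in sigma_domain."""
    return all(preimage(f, domain, b) in sigma_domain for b in sets_in_range)

S = frozenset({1, 2, 3, 4})
sigma_S = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), S}

T = frozenset({'a', 'b', 'c'})
# The power set of T, generated by the two singletons below
power_T = [frozenset(c) for r in range(len(T) + 1)
           for c in combinations(sorted(T), r)]
generators = [frozenset({'a'}), frozenset({'b'})]

f = {1: 'a', 2: 'a', 3: 'b', 4: 'b'}.get   # constant on {1, 2} and on {3, 4}
g = {1: 'a', 2: 'b', 3: 'b', 4: 'b'}.get   # separates 1 from 2

print(is_measurable(f, S, sigma_S, generators),
      is_measurable(f, S, sigma_S, power_T))   # True True
print(is_measurable(g, S, sigma_S, generators),
      is_measurable(g, S, sigma_S, power_T))   # False False
```

For each function, checking the two generating sets gives the same answer as checking all eight subsets of \( T \).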

If you have reviewed the section on topology then you may have noticed a striking parallel between the definition of *continuity* for functions on topological spaces and the definition of *measurability* for functions on measurable spaces: A function from one topological space to another is continuous if the inverse image of an open set in the range space is open in the domain space. A function from one measurable space to another is measurable if the inverse image of a measurable set in the range space is measurable in the domain space. If we start with topological spaces, which we often do, and use the Borel \( \sigma \)-algebras to get measurable spaces, then we get the following (hardly surprising) connection.

Suppose that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are topological spaces, and that we give \( S \) and \( T \) the Borel \( \sigma \)-algebras \( \sigma(\mathscr{S}) \) and \( \sigma(\mathscr{T}) \) respectively. If \( f: S \to T \) is continuous, then \( f \) is measurable.

If \( V \in \mathscr{T} \) then \( f^{-1}(V) \in \mathscr{S} \subseteq \sigma(\mathscr{S}) \). Hence \( f \) is measurable by the previous theorem.

Measurability is preserved under composition, the most important method for combining functions.

Suppose that \((R, \mathscr{R})\), \((S, \mathscr{S})\), and \((T, \mathscr{T})\) are measurable spaces. If \(f: R \to S\) is measurable and \(g: S \to T\) is measurable, then \(g \circ f: R \to T\) is measurable.

If \( A \in \mathscr{T} \) then \( g^{-1}(A) \in \mathscr{S} \) since \( g \) is measurable, and hence \( (g \circ f)^{-1}(A) = f^{-1}\left[g^{-1}(A)\right] \in \mathscr{R} \) since \( f \) is measurable.

If \( T \) is given the smallest possible \( \sigma \)-algebra or if \( S \) is given the largest one, then any function from \( S \) into \( T \) is measurable.

Every function \( f: S \to T \) is measurable in each of the following cases:

- \( \mathscr{T} = \{\emptyset, T\} \) and \( \mathscr{S} \) is an arbitrary \( \sigma \)-algebra of subsets of \( S \)
- \( \mathscr{S} = \mathscr{P}(S) \) and \( \mathscr{T} \) is an arbitrary \( \sigma \)-algebra of subsets of \( T \).

- Suppose that \( \mathscr{T} = \{\emptyset, T\} \) and that \( \mathscr{S} \) is an arbitrary \( \sigma \)-algebra on \( S \). If \( f: S \to T \), then \( f^{-1}(T) = S \in \mathscr{S} \) and \( f^{-1}(\emptyset) = \emptyset \in \mathscr{S} \) so \( f \) is measurable.
- Suppose that \( \mathscr{S} = \mathscr{P}(S) \) and that \( \mathscr{T} \) is an arbitrary \( \sigma \)-algebra on \( T \). If \( f: S \to T \), then trivially \( f^{-1}(A) \in \mathscr{S} \) for every \( A \in \mathscr{T} \) so \( f \) is measurable.

When a set has several \( \sigma \)-algebras, we use the phrase *with respect to* so that we can be precise about which one is meant. If a function is measurable with respect to a given \( \sigma \)-algebra on its domain, then it is measurable with respect to any larger \( \sigma \)-algebra on this space. If the function is measurable with respect to a \( \sigma \)-algebra on the range space, then it is measurable with respect to any smaller \( \sigma \)-algebra on this space.

Suppose that \( S \) has \( \sigma \)-algebras \( \mathscr{R} \) and \( \mathscr{S} \) with \( \mathscr{R} \subseteq \mathscr{S} \), and that \( T \) has \( \sigma \)-algebras \( \mathscr{T} \) and \( \mathscr{U} \) with \( \mathscr{T} \subseteq \mathscr{U} \). If \( f: S \to T \) is measurable with respect to \( \mathscr{R} \) and \( \mathscr{U} \), then \( f \) is measurable with respect to \( \mathscr{S} \) and \( \mathscr{T} \).

If \( A \in \mathscr{T} \) then \( A \in \mathscr{U} \). Hence \( f^{-1}(A) \in \mathscr{R} \) so \( f^{-1}(A) \in \mathscr{S} \).

The following construction is particularly important in probability theory:

Suppose that \( S \) is a set and \( (T, \mathscr{T}) \) is a measurable space. Suppose also that \(f: S \to T\) and define \(\sigma(f) = \left\{f^{-1}(A): A \in \mathscr{T}\right\}\). Then

- \( \sigma(f) \) is a \(\sigma\)-algebra on \(S\).
- \( \sigma(f) \) is the smallest \( \sigma \)-algebra on \( S \) that makes \( f \) measurable.

- The key to the proof is that the inverse image preserves all set operations. First, \( S \in \sigma(f) \) since \( T \in \mathscr{T} \) and \( f^{-1}(T) = S \). If \( B \in \sigma(f) \) then \( B = f^{-1}(A) \) for some \( A \in \mathscr{T} \). But then \( A^c \in \mathscr{T} \) and hence \( B^c = f^{-1}(A^c) \in \sigma(f) \). Finally, suppose that \( B_i \in \sigma(f) \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \) there exists \( A_i \in \mathscr{T} \) such that \( B_i = f^{-1}(A_i) \). But then \( \bigcup_{i \in I} A_i \in \mathscr{T} \) and \( \bigcup_{i \in I} B_i = f^{-1}\left(\bigcup_{i \in I} A_i \right) \). Hence \( \bigcup_{i \in I} B_i \in \sigma(f) \).
- If \( \mathscr{S} \) is a \( \sigma \)-algebra on \( S \) and \( f \) is measurable with respect to \( \mathscr{S} \) and \( \mathscr{T} \), then by definition \( f^{-1}(A) \in \mathscr{S} \) for every \( A \in \mathscr{T} \), so \( \sigma(f) \subseteq \mathscr{S} \).
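For a function on a finite set, \( \sigma(f) \) can be listed directly from the definition. The sketch below uses a hypothetical example, the parity of the score on a die; the names `sigma_of` and `parity` are our own.

```python
def sigma_of(f, domain, sigma_T):
    """The collection of inverse images of the sets in sigma_T under f."""
    return {frozenset(x for x in domain if f(x) in a) for a in sigma_T}

S = range(1, 7)                       # the scores on a standard die
T = frozenset({'even', 'odd'})
sigma_T = {frozenset(), frozenset({'even'}), frozenset({'odd'}), T}

def parity(x):
    return 'even' if x % 2 == 0 else 'odd'

sigma_f = sigma_of(parity, S, sigma_T)
print(sorted(sorted(b) for b in sigma_f))
# [[], [1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]]
```

The four sets in \( \sigma(f) \) are exactly the events that can be decided by knowing only the parity of the score, which is the sense in which a \( \sigma \)-algebra encodes a state of information.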

Appropriately enough, \( \sigma(f) \) is called the \(\sigma\)-algebra generated by \(f\). Often, \( S \) will have a given \( \sigma \)-algebra \( \mathscr{S} \) and \( f \) will be measurable with respect to \( \mathscr{S} \) and \( \mathscr{T} \). In this case, \( \sigma(f) \subseteq \mathscr{S} \). We can generalize to an arbitrary collection of functions on \( S \).

Suppose \( S \) is a set and that \((T_i, \mathscr{T}_i)\) is a measurable space for each \(i\) in a nonempty index set \(I\). Suppose also that \(f_i: S \to T_i\) for each \(i \in I\). The \(\sigma\)-algebra generated by this collection of functions is \[ \sigma\left\{f_i: i \in I\right\} = \sigma\left\{\sigma(f_i): i \in I\right\} = \sigma\left\{f_i^{-1}(A): i \in I, \, A \in \mathscr{T}_i\right\} \]

Product sets arise naturally in the form of the higher-dimensional Euclidean spaces \( \R^n \) for \( n \in \{2, 3, \ldots\} \). In addition, product spaces are particularly important in probability, where they are used to describe the spaces associated with sequences of random variables. More general product spaces arise in the study of stochastic processes. We start with the product of two sets; the generalization to products of \( n \) sets and to general products is straightforward, although the notation gets more complicated.

Suppose that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces. The product \( \sigma \)-algebra on \( S \times T \) is \[\mathscr{S} \otimes \mathscr{T} = \sigma\{A \times B: A \in \mathscr{S}, \; B \in \mathscr{T}\} \]

So the definition is natural: the product \( \sigma \)-algebra is generated by products of measurable sets. Our next goal is to consider the measurability of functions defined on, or mapping into, product spaces. Of basic importance are the projection functions. If \( S \) and \( T \) are sets, let \( p_1: S \times T \to S \) and \( p_2: S \times T \to T \) be defined by \( p_1(x, y) = x \) and \( p_2(x, y) = y \) for \( (x, y) \in S \times T \). Recall that \( p_1 \) is the projection onto the first coordinate and \( p_2 \) is the projection onto the second coordinate. The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra that makes the projections measurable:

Suppose again that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces. Then \( \mathscr{S} \otimes \mathscr{T} = \sigma\{p_1, p_2\} \).

If \( A \in \mathscr{S} \) then \( p_1^{-1}(A) = A \times T \in \mathscr{S} \otimes \mathscr{T}\). Similarly, if \( B \in \mathscr{T} \) then \( p_2^{-1}(B) = S \times B \in \mathscr{S} \otimes \mathscr{T} \). Hence \( p_1 \) and \( p_2 \) are measurable, so \( \sigma\{p_1, p_2\} \subseteq \mathscr{S} \otimes \mathscr{T} \). Conversely, if \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \) then \( A \times B = p_1^{-1}(A) \cap p_2^{-1}(B) \in \sigma\{p_1, p_2\}\). Since sets of this form generate the product \( \sigma \)-algebra, we have \( \mathscr{S} \otimes \mathscr{T} \subseteq \sigma\{p_1, p_2\} \).
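For small finite spaces, the identity \( \mathscr{S} \otimes \mathscr{T} = \sigma\{p_1, p_2\} \) can be verified by brute force. The sketch below is only an illustration under stated assumptions: the helper `generate_sigma_algebra` and the example collections `S_sets` and `T_sets` are hypothetical, and the closure loop relies on the universe being finite, so that closure under complements and pairwise unions already gives a \( \sigma \)-algebra.

```python
from itertools import combinations, product

def generate_sigma_algebra(universe, generators):
    """Smallest collection containing the generators that is closed under
    complement and pairwise union (a sigma-algebra when universe is finite)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        new_sets = {universe - a for a in current}
        new_sets |= {a | b for a, b in combinations(current, 2)}
        if not new_sets <= sets:
            sets |= new_sets
            changed = True
    return sets

# Assumed sigma-algebras on S = {0, 1, 2} and T = {"a", "b"}.
S_sets = [frozenset(), frozenset({0}), frozenset({1, 2}), frozenset({0, 1, 2})]
T_sets = [frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]
S, T = S_sets[-1], T_sets[-1]
ST = frozenset(product(S, T))

# Generators A x B, as in the definition of the product sigma-algebra.
rectangles = [frozenset(product(A, B)) for A in S_sets for B in T_sets]
product_sigma = generate_sigma_algebra(ST, rectangles)

# Generators p_1^{-1}(A) = A x T and p_2^{-1}(B) = S x B.
cylinders = [frozenset(product(A, T)) for A in S_sets]
cylinders += [frozenset(product(S, B)) for B in T_sets]
projection_sigma = generate_sigma_algebra(ST, cylinders)

same = product_sigma == projection_sigma
```

In this example both collections consist of the 16 unions of the four atoms \( \{0\} \times \{a\} \), \( \{0\} \times \{b\} \), \( \{1, 2\} \times \{a\} \), and \( \{1, 2\} \times \{b\} \).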

Projection functions make it easy to study functions mapping into a product space.

Suppose that \( (R, \mathscr{R}) \), \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr{S} \otimes \mathscr{T} \). Suppose also that \( f: R \to S \times T \), so that \( f(x) = \left(f_1(x), f_2(x)\right) \) for \( x \in R \), where \( f_1: R \to S \) and \( f_2: R \to T \) are the coordinate functions. Then \( f \) is measurable if and only if \( f_1 \) and \( f_2 \) are measurable.

Note that \( f_1 = p_1 \circ f \) and \( f_2 = p_2 \circ f \). So if \( f \) is measurable then \( f_1 \) and \( f_2 \) are compositions of measurable functions, and hence are measurable. Conversely, suppose that \( f_1 \) and \( f_2 \) are measurable. If \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \) then \( f^{-1}(A \times B) = f_1^{-1}(A) \cap f_2^{-1}(B) \in \mathscr{R} \). Since products of measurable sets generate \( \mathscr{S} \otimes \mathscr{T} \), it follows that \( f \) is measurable.
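The key identity in the converse direction, \( f^{-1}(A \times B) = f_1^{-1}(A) \cap f_2^{-1}(B) \), is easy to test numerically. The following sketch uses hypothetical coordinate functions `f1` and `f2` on an assumed finite domain `R`, and checks the identity for every pair of subsets \( A \), \( B \).

```python
from itertools import combinations

def subsets(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def inverse_image(g, domain, A):
    """Return g^{-1}(A) = {x in domain : g(x) in A}."""
    return frozenset(x for x in domain if g(x) in A)

# Hypothetical example: R = {0, ..., 11}, f1(x) = x mod 2, f2(x) = x mod 3.
R = range(12)
f1 = lambda x: x % 2
f2 = lambda x: x % 3
f = lambda x: (f1(x), f2(x))

# Check f^{-1}(A x B) = f1^{-1}(A) & f2^{-1}(B) for all A, B.
identity_holds = all(
    inverse_image(f, R, {(a, b) for a in A for b in B})
    == inverse_image(f1, R, A) & inverse_image(f2, R, B)
    for A in subsets({0, 1})
    for B in subsets({0, 1, 2})
)
```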

Our next goal is to consider cross sections of sets in a product space and cross sections of functions defined on a product space. It will help to introduce some new functions, which in a sense are complementary to the projection functions.

Suppose again that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr{S} \otimes \mathscr{T} \).

- For \( x \in S \) the function \( 1_x : T \to S \times T \), defined by \( 1_x(y) = (x, y) \) for \( y \in T \), is measurable.
- For \( y \in T \) the function \( 2_y: S \to S \times T \), defined by \( 2_y(x) = (x, y) \) for \( x \in S \), is measurable.

To show that the functions are measurable, it suffices to consider inverse images of products of measurable sets, since such sets generate \( \mathscr{S} \otimes \mathscr{T} \). Thus, let \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \).

- For \( x \in S \) note that \( 1_x^{-1}(A \times B) \) is \( B \) if \( x \in A \) and is \( \emptyset \) if \( x \notin A \). In either case, \( 1_x^{-1}(A \times B) \in \mathscr{T} \).
- Similarly, for \( y \in T \) note that \( 2_y^{-1}(A \times B) \) is \( A \) if \( y \in B \) and is \( \emptyset \) if \( y \notin B \). In either case, \( 2_y^{-1}(A \times B) \in \mathscr{S} \).

Now our work is easy.

Suppose again that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces, and that \( C \in \mathscr{S} \otimes \mathscr{T} \). Then

- For \( x \in S \), \( \{y \in T: (x, y) \in C\} \in \mathscr{T} \).
- For \( y \in T \), \( \{x \in S: (x, y) \in C\} \in \mathscr{S}\).

These results follow immediately from the measurability of the functions \( 1_x \) and \( 2_y \):

- For \( x \in S \), \( 1_x^{-1}(C) = \{y \in T: (x, y) \in C\} \).
- For \( y \in T \), \( 2_y^{-1}(C) = \{x \in S: (x, y) \in C\} \).
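Cross sections are simple to compute when the product set is finite. The sketch below is an illustration only: the function names `section_first` and `section_second` and the example sets are assumptions, not notation from the text.

```python
def section_first(C, x):
    """Cross section of C in the first coordinate at x: {y : (x, y) in C}."""
    return frozenset(b for (a, b) in C if a == x)

def section_second(C, y):
    """Cross section of C in the second coordinate at y: {x : (x, y) in C}."""
    return frozenset(a for (a, b) in C if b == y)

# A hypothetical subset of {0, 1, 2} x {"a", "b"}.
C = {(0, "a"), (0, "b"), (1, "a")}

# For a product set A x B, the section at x is B if x is in A, else empty,
# matching the computation of the inverse image of A x B under 1_x.
A, B = {0, 1}, {"a", "b"}
rect = {(a, b) for a in A for b in B}
```

For example, `section_first(C, 0)` is the set containing `"a"` and `"b"`, while `section_first(rect, 5)` is empty because `5` is not in `A`.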

The set in (a) is the cross section of \( C \) in the first coordinate at \( x \), and the set in (b) is the cross section of \( C \) in the second coordinate at \( y \). As a simple corollary to the theorem, note that if \( A \subseteq S \), \( B \subseteq T \) and \( A \times B \in \mathscr{S} \otimes \mathscr{T} \) then \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \). That is, the only measurable product sets are products of measurable sets. Here is the measurability result for cross-sectional functions:

Suppose again that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr{S} \otimes \mathscr{T} \). Suppose also that \( (U, \mathscr{U}) \) is another measurable space, and that \( f: S \times T \to U \) is measurable. Then

- The function \( y \mapsto f(x, y) \) from \( T \) to \( U \) is measurable for each \( x \in S \).
- The function \( x \mapsto f(x, y) \) from \( S \) to \( U \) is measurable for each \( y \in T \).

Note that the function in (a) is just \( f \circ 1_x\), and the function in (b) is just \( f \circ 2_y \). Both are compositions of measurable functions, and hence are measurable.

The results for products of two spaces generalize in a completely straightforward way to a product of \( n \) spaces. Thus, suppose that \( (S_i, \mathscr{S}_i) \) is a measurable space for each \( i \in \{1, 2, \ldots, n\} \). For the Cartesian product set \( S_1 \times S_2 \times \cdots \times S_n \), we usually use the product \( \sigma \)-algebra generated by products of measurable sets: \[ \mathscr{S}_1 \otimes \mathscr{S}_2 \otimes \cdots \otimes \mathscr{S}_n = \sigma\left\{ A_1 \times A_2 \times \cdots \times A_n: A_i \in \mathscr{S}_i \text{ for all } i \in \{1, 2, \ldots, n\}\right\} \] Results analogous to the theorems above hold. We can also extend these ideas to a general product. To recall the definition, suppose that \( S_i \) is a set for each \( i \) in a nonempty index set \( I \). The product set \( \prod_{i \in I} S_i \) consists of all functions \( x: I \to \bigcup_{i \in I} S_i \) such that \( x(i) \in S_i \) for each \( i \in I \). To make the notation look more like a simple Cartesian product, we often write \( x_i \) instead of \( x(i) \) for the value of a function in the product set at \( i \in I \). The next definition gives the appropriate \( \sigma \)-algebra for the product set.

Suppose that \( (S_i, \mathscr{S}_i) \) is a measurable space for each \(i \) in a nonempty index set \( I \). The product \( \sigma \)-algebra on the product set \( \prod_{i \in I} S_i \) is \[ \sigma\left\{\prod_{i \in I} A_i: A_i \in \mathscr{S}_i \text{ for each } i \in I \text{ and } A_i = S_i \text{ for all but finitely many } i \in I \right\}\]

If you have reviewed the section on topology, the definition should look familiar. If the spaces were *topological spaces* instead of *measurable spaces*, with \( \mathscr{S}_i \) the topology of \( S_i \) for \( i \in I \), then the set of products in the displayed expression above is a base for the product topology on \( \prod_{i \in I} S_i \).

The definition can also be understood in terms of projections. Recall that the projection onto coordinate \( j \in I \) is the function \( p_j: \prod_{i \in I} S_i \to S_j \) given by \( p_j(x) = x_j \). The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra on the product set that makes all of the projections measurable.

Suppose again that \( (S_i, \mathscr{S}_i)\) is a measurable space for each \( i \) in a nonempty index set \( I \), and let \( \mathfrak{S} \) denote the product \( \sigma \)-algebra on the product set \( S_I = \prod_{i \in I} S_i \). Then \(\mathfrak{S} = \sigma\{p_i: i \in I\} \).

Let \( j \in I \) and \( A \in \mathscr{S}_j \). Then \( p_j^{-1}(A) = \prod_{i \in I} A_i \) where \( A_i = S_i \) for \( i \ne j \) and \( A_j = A \). This set is in \( \mathfrak{S} \) so \( p_j \) is measurable. Hence \( \sigma\{p_i: i \in I\} \subseteq \mathfrak{S} \). For the other direction, consider a product set \( \prod_{i \in I} A_i \) where \( A_i = S_i \) except for \( i \in J \), where \( J \subseteq I \) is finite. Then \( \prod_{i \in I} A_i = \bigcap_{j \in J} p_j^{-1}(A_j) \). This set is in \( \sigma\{p_i: i \in I\} \). Product sets of this form generate \( \mathfrak{S} \) so it follows that \( \mathfrak{S} \subseteq \sigma\{p_i: i \in I\} \).

In the special case that \( (S, \mathscr{S}) \) is a fixed measurable space and \( (S_i, \mathscr{S}_i) = (S, \mathscr{S}) \) for all \( i \in I \), the product set \( \prod_{i \in I} S \) is just the collection of functions from \( I \) into \( S \), often denoted \( S^I \). The product \( \sigma \)-algebra is a natural \( \sigma \)-algebra to use on this space of functions. However, do not confuse this with the \( \sigma \)-algebra on the common domain of a collection of functions, discussed above. Here is the main measurability result for a function mapping into a product space.

Suppose that \( (R, \mathscr{R}) \) is a measurable space, and that \( (S_i, \mathscr{S}_i) \) is a measurable space for each \( i \) in a nonempty index set \( I \). As before, let \(\prod_{i \in I} S_i \) have the product \( \sigma \)-algebra. Suppose now that \( f: R \to \prod_{i \in I} S_i \). For \( i \in I \) let \( f_i: R \to S_i \) denote the \( i \)th coordinate function of \( f \), so that \( f_i(x) = [f(x)]_i \) for \( x \in R \). Then \( f \) is measurable if and only if \( f_i \) is measurable for each \( i \in I \).

Suppose that \( f \) is measurable. For \( i \in I \) note that \( f_i = p_i \circ f \) is a composition of measurable functions, and hence is measurable. Conversely, suppose that \( f_i \) is measurable for each \( i \in I \). To show measurability of \( f \) we need only consider inverse images of sets that generate the product \( \sigma \)-algebra. Thus, suppose that \( A_j \in \mathscr{S}_j \) for \( j \) in a finite subset \( J \subseteq I \), and let \( A_i = S_i \) for \( i \in I - J \). Then \( f^{-1}\left(\prod_{i \in I} A_i\right) = \bigcap_{j \in J} f_j^{-1}(A_j) \). This set is in \( \mathscr{R} \) since the intersection is over a finite index set.

Just as with the product of two sets, cross-sectional sets and functions are measurable with respect to the product \( \sigma \)-algebra. Again, it's best to work with some special functions.

Suppose that \( (S_i, \mathscr{S}_i) \) is a measurable space for each \( i \) in an index set \( I \) with at least two elements. For \( j \in I \) and \( u \in S_j \), define the function \( j_u: \prod_{i \in I - \{j\}} S_i \to \prod_{i \in I} S_i \) by \( j_u(x) = y \) where \( y_i = x_i \) for \( i \ne j \) and \( y_j = u \). Then \( j_u \) is measurable with respect to the product \( \sigma \)-algebras.

Once again, it suffices to consider the inverse image of the sets that generate the product \( \sigma \)-algebra. So suppose \( A_i \in \mathscr{S}_i \) for \( i \in I \) with \( A_i = S_i \) for all but finitely many \( i \in I \). Then \( j_u^{-1}\left(\prod_{i \in I} A_i\right) = \prod_{i \in I - \{j\}} A_i \) if \( u \in A_j \), and the inverse image is \( \emptyset \) otherwise. In either case, \( j_u^{-1}\left(\prod_{i \in I} A_i\right) \) is in the product \( \sigma \)-algebra on \( \prod_{i \in I - \{j\}} S_i \).

In words, for \( j \in I \) and \( u \in S_j \), the function \( j_u \) takes a point in the product set \( \prod_{ i \in I - \{j\}} S_i \) and assigns \( u \) to coordinate \( j \) to give a point in \( \prod_{i \in I} S_i \). If \( A \subseteq \prod_{i \in I} S_i \), then \( j_u^{-1}(A) \) is the cross section of \( A \) in coordinate \( j \) at \( u \). So it follows immediately from the previous result that the cross sections of a measurable set are measurable. Cross sections of measurable functions are also measurable. Suppose that \( (T, \mathscr{T}) \) is another measurable space, and that \( f: \prod_{i \in I} S_i \to T \) is measurable. The cross section of \( f \) in coordinate \( j \in I \) at \( u \in S_j \) is simply \( f \circ j_u: \prod_{i \in I - \{j\}} S_i \to T\), a composition of measurable functions.

However, a non-measurable set can have measurable cross sections, even in a product of two spaces.

Suppose that \( S \) is an uncountable set with the \( \sigma \)-algebra \( \mathscr{C} \) of countable and co-countable sets. Consider \( S \times S \) with the product \( \sigma \)-algebra \( \mathscr{C} \otimes \mathscr{C} \). Let \( D = \{(x, x): x \in S\}\), the diagonal of \( S \times S \). Then \( D \) has measurable cross sections, but \( D \) is not measurable.

For \( x \in S \), the cross section of \( D \) in the first coordinate at \( x \) is \( \{y \in S: (x, y) \in D\} = \{x\} \in \mathscr{C} \). Similarly, for \( y \in S \), the cross section of \( D \) in the second coordinate at \( y \) is \( \{x \in S: (x, y) \in D\} = \{ y\} \in \mathscr{C} \). But \( D \) cannot be generated by a countable collection of sets of the form \( A \times B \) with \( A, \, B \in \mathscr{C} \), so \( D \notin \mathscr{C} \otimes \mathscr{C} \), by the result above.

Most of the sets encountered in applied probability are either countable, or subsets of \(\R^n\) for some \(n\), or more generally, subsets of a product of a countable number of sets of these types. In the study of stochastic processes, various spaces of functions play an important role. In this subsection, we will explore some of the special cases.

If \(S\) is countable, we usually use the power set \(\mathscr{P}(S)\) as the basic \(\sigma\)-algebra. Thus, all subsets of \( S \) are measurable and every function from \( S \) to another measurable space is measurable. The power set is also the discrete topology on \( S \), so \( \mathscr{P}(S) \) is a Borel \( \sigma \)-algebra as well. With the discrete topology, \( S \) is complete, locally compact, Hausdorff, and since \( S \) is countable, separable. Moreover, the discrete topology corresponds to the discrete metric \( d \), defined by \( d(x, x) = 0 \) for \( x \in S \) and \( d(x, y) = 1 \) for \( x, \, y \in S \) with \( x \ne y \).

The set of real numbers \( \R \) is usually given the Borel \( \sigma \)-algebra \( \mathscr{R} \) corresponding to the standard Euclidean topology. In turn, this topology is generated by the standard Euclidean metric \( d \) given by \( d(x, y) = \left|x - y\right| \) for \( x, \, y \in \R \). With the Euclidean topology, \( \R \) is complete, connected, locally compact, Hausdorff, and separable.

Each of the following collections generates the Borel \( \sigma \)-algebra of subsets of \( \R \).

- \( \mathscr{B}_1 = \{I \subseteq \R: I \text{ is an interval} \} \)
- \( \mathscr{B}_2 = \{(a, b]: a \in \R, \; b \in \R, \; a \lt b \}\)
- \( \mathscr{B}_3 = \{(-\infty, b]: b \in \R \} \)

The proof involves showing that each set in any one of the collections is in the \( \sigma \)-algebra of any other collection. Let \( \mathscr{R}_i = \sigma(\mathscr{B}_i) \) for \( i \in \{1, 2, 3\} \).

- Clearly \( \mathscr{B}_2 \subseteq \mathscr{B}_1 \) and \( \mathscr{B}_3 \subseteq \mathscr{B}_1 \) so \( \mathscr{R}_2 \subseteq \mathscr{R}_1 \) and \( \mathscr{R}_3 \subseteq \mathscr{R}_1 \).
- If \( a, \, b \in \R \) with \( a \le b \) then \( [a, b] = \bigcap_{n=1}^\infty \left(a - \frac{1}{n}, b\right] \) and \( (a, b) = \bigcup_{n=1}^\infty \left(a, b - \frac{1}{n}\right] \), so \( [a, b], \, (a, b) \in \mathscr{R}_2 \). Also \( [a, b) = \bigcup_{n=1}^\infty \left[a, b - \frac{1}{n}\right] \) so \( [a, b) \in \mathscr{R}_2 \). Thus all bounded intervals are in \( \mathscr{R}_2 \). Next, \( [a, \infty) = \bigcup_{n=1}^\infty [a, a + n) \), \( (a, \infty) = \bigcup_{n=1}^\infty (a, a + n) \), \( (-\infty, a] = \bigcup_{n=1}^\infty (a - n, a] \), and \( (-\infty, a) = \bigcup_{n=1}^\infty (a - n, a) \), so each of these intervals is in \( \mathscr{R}_2 \). Of course \( \R \in \mathscr{R}_2 \), so we now have that \( I \in \mathscr{R}_2 \) for every interval \( I \). Thus \( \mathscr{R}_1 \subseteq \mathscr{R}_2 \), and so from (a), \( \mathscr{R}_2 = \mathscr{R}_1\).
- If \( a, \; b \in \R \) with \( a \lt b \) then \( (a, b] = (-\infty, b] - (-\infty, a] \) so \( (a, b] \in \mathscr{R}_3 \). Hence \( \mathscr{R}_2 \subseteq \mathscr{R}_3 \). But then from (a) and (b) it follows that \( \mathscr{R}_3 = \mathscr{R}_1 \).

Since the Euclidean topology has a countable base, the Borel \( \sigma \)-algebra is countably generated. In fact each collection of intervals above, but with endpoints restricted to \( \Q \), generates the Borel \( \sigma \)-algebra. The Borel \( \sigma \)-algebra on \( \R \) can also be constructed from \( \sigma \)-algebras that are generated by countable partitions. First recall that for \( n \in \N \), the set of dyadic rationals (or binary rationals) of rank \( n \) or less is \( \D_n = \{j / 2^n: j \in \Z\} \). Note that \( \D_n \) is countable and \( \D_n \subseteq \D_{n+1} \) for \( n \in \N \). Moreover, the set \( \D = \bigcup_{n \in \N} \D_n \) of *all* dyadic rationals is dense in \( \R \). The dyadic rationals are often useful in various applications because \( \D_n \) has the natural ordered enumeration \( j \mapsto j / 2^n \) for each \( n \in \N \). Now let
\[ \mathscr{D}_n = \left\{\left(\frac{j}{2^n}, \frac{j + 1}{2^n}\right]: j \in \Z\right\}, \quad n \in \N \]
Then \( \mathscr{D}_n \) is a countable partition of \( \R \) into nonempty intervals of equal size \( 1 / 2^n \), so \( \mathscr{E}_n = \sigma(\mathscr{D}_n) \) consists of unions of sets in \( \mathscr{D}_n \) as described above. Every set in \( \mathscr{D}_{n} \) is the union of two sets in \( \mathscr{D}_{n+1} \) so clearly \( \mathscr{E}_n \subseteq \mathscr{E}_{n+1} \) for \( n \in \N \). Finally, the Borel \( \sigma \)-algebra on \( \R \) is \( \mathscr{R} = \sigma\left(\bigcup_{n=0}^\infty \mathscr{E}_n\right) = \sigma\left(\bigcup_{n=0}^\infty \mathscr{D}_n\right) \). This construction turns out to be useful in a number of settings.
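The nested structure of the dyadic partitions can be seen by locating the interval of \( \mathscr{D}_n \) that contains a given point. The following sketch uses exact rational arithmetic; the function name `dyadic_cell` is an assumption made for illustration. It uses the fact that \( x \in \left(j / 2^n, (j + 1) / 2^n\right] \) if and only if \( j = \lceil 2^n x \rceil - 1 \).

```python
from fractions import Fraction
import math

def dyadic_cell(x, n):
    """Endpoints (a, b) of the unique interval (a, b] in D_n that contains x."""
    x = Fraction(x)
    j = math.ceil(x * 2**n) - 1      # index j with j/2^n < x <= (j+1)/2^n
    return Fraction(j, 2**n), Fraction(j + 1, 2**n)

# Cells containing x = 3/4 at ranks 0, 1, 2, 3.
x = Fraction(3, 4)
cells = [dyadic_cell(x, n) for n in range(4)]

# Each cell of rank n + 1 lies inside the cell of rank n.
nested = all(
    cells[n][0] <= cells[n + 1][0] and cells[n + 1][1] <= cells[n][1]
    for n in range(3)
)
```

For \( x = 3/4 \) the cells are \( (0, 1] \), \( (1/2, 1] \), \( (1/2, 3/4] \), and \( (5/8, 3/4] \), each contained in the one before.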

For \( n \in \{2, 3, \ldots\} \), \(\R^n\) is also given the Borel \( \sigma \)-algebra corresponding to the standard Euclidean topology. This topology is the \( n \)-fold product topology formed from the Euclidean topology on \( \R \). So the Borel \( \sigma \)-algebra on \( \R^n \) is also the \( n \)-fold product \( \sigma \)-algebra formed from the Borel \( \sigma \)-algebra on \( \R \). Finally, the Borel \( \sigma \)-algebra on \( \R^n \) can be generated by \( n \)-fold products of sets in any of the three collections in the previous result.

Of course, the Euclidean topology is generated by the standard Euclidean metric \( d_n \) given by \[ d_n(x, y) = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}, \quad x = (x_1, x_2, \ldots, x_n), \, y = (y_1, y_2, \ldots, y_n) \in \R^n \] With this topology, \( \R^n \) is complete, connected, locally compact, Hausdorff, and separable.

Suppose that \( (S, \mathscr{S}) \) is a measurable space. From our general discussion of functions, recall that the usual arithmetic operations on functions from \( S \) into \( \R \) are defined pointwise.

If \( f: S \to \R \) and \( g: S \to \R \) are measurable and \( a \in \R \), then each of the following functions from \( S \) into \( \R \) is also measurable:

- \( f + g \)
- \( f - g \)
- \( f g \)
- \( a f \)

These results follow from the fact that the arithmetic operators are continuous, and hence measurable. That is, \( (x, y) \mapsto x + y \), \( (x, y) \mapsto x - y \), and \( (x, y) \mapsto x y \) are continuous as functions from \( \R^2 \) into \( \R \). Thus, if \( f, \, g: S \to \R \) are measurable, then \( (f, g): S \to \R^2 \) is measurable by the result above. Then, \( f + g \), \( f - g \), \( f g \) are the compositions, respectively, of \( + \), \( - \), \( \cdot \) with \( (f, g) \). Of course, (d) is a simple corollary of (c).
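The closure properties can be illustrated on a finite set whose \( \sigma \)-algebra is generated by a partition, so that the measurable functions are exactly those that are constant on each block. The sketch below is an assumption-laden illustration: the partition `blocks`, the functions `f` and `g`, and the helpers `unions_of_blocks` and `is_measurable` are all hypothetical. Measurability is tested on the sets \( \{h \le b\} \), which suffices because the intervals \( (-\infty, b] \) generate the Borel \( \sigma \)-algebra.

```python
from itertools import combinations

def unions_of_blocks(blocks):
    """The sigma-algebra generated by a finite partition: all unions of blocks."""
    return {
        frozenset().union(*combo)
        for r in range(len(blocks) + 1)
        for combo in combinations(blocks, r)
    }

def is_measurable(h, domain, sigma):
    """Check that {x : h(x) <= b} is in sigma for every value b taken by h.
    On a finite domain these are the only thresholds that need checking."""
    values = {h(x) for x in domain}
    return all(frozenset(x for x in domain if h(x) <= b) in sigma for b in values)

# Hypothetical example: S = {0, ..., 5} partitioned into three blocks.
S = range(6)
blocks = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
sigma = unions_of_blocks(blocks)

f = lambda x: x // 2               # constant on each block
g = lambda x: (x // 2) ** 2 - 1    # constant on each block
a = 3

results = {
    "f + g": is_measurable(lambda x: f(x) + g(x), S, sigma),
    "f - g": is_measurable(lambda x: f(x) - g(x), S, sigma),
    "f g": is_measurable(lambda x: f(x) * g(x), S, sigma),
    "a f": is_measurable(lambda x: a * f(x), S, sigma),
}

# The identity function separates points within a block, so it is not measurable.
identity_measurable = is_measurable(lambda x: x, S, sigma)
```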

Similarly, if \( f: S \to \R \setminus \{0\} \) is measurable, then so is \( 1 / f \). Recall that the set of functions from \( S \) into \( \R \) is a vector space, under the pointwise definitions of addition and scalar multiplication. But once again, we usually want to restrict our attention to *measurable* functions. Thus, it's nice to know that the measurable functions from \( S \) into \( \R \) also form a vector space. This follows immediately from the closure properties (a) and (d) of the previous theorem. Of particular importance in probability and stochastic processes is the vector space of bounded, measurable functions \( f: S \to \R \), with the supremum norm
\[ \|f\| = \sup\left\{\left|f(x)\right|: x \in S \right\} \]

The elementary functions that we encounter in calculus and other areas of applied mathematics are functions from subsets of \( \R \) into \( \R \). The elementary functions include algebraic functions (which in turn include the polynomial and rational functions), the usual transcendental functions (exponential, logarithm, trigonometric), and the usual functions constructed from these by composition, the arithmetic operations, and by piecing together. As we might hope, all of the elementary functions are measurable.