\( \newcommand{\P}{\mathbb{P}} \) \( \newcommand{\R}{\mathbb{R}} \) \( \newcommand{\N}{\mathbb{N}} \) \( \newcommand{\Z}{\mathbb{Z}} \) \( \newcommand{\bs}{\boldsymbol} \) \( \newcommand{\ms}{\mathscr} \) \(\newcommand{\supp}{\text{supp}}\)

12. Measure Spaces

In this section we discuss positive measure spaces, which include probability spaces, the fundamental topic of this text. The section on measurable spaces is an essential prerequisite.

Positive Measure

Definitions

Suppose that \( S \) is a set, playing the role of a universal set for a mathematical theory. As we have noted before, \( S \) usually comes with a \( \sigma \)-algebra \( \ms S \) of admissible subsets of \( S \), so that \( (S, \ms S) \) is a measurable space. Here is the fundamental definition of this section.

A positive measure on \((S, \ms S)\) is a function \(\mu: \ms S \to [0, \infty] \) that satisfies the following axioms:

  1. \( \mu(\emptyset) = 0 \)
  2. If \(\{A_i: i \in I\}\) is a countable, pairwise disjoint collection of sets in \(\ms S\) then \[\mu\left(\bigcup_{i \in I} A_i\right) = \sum_{i \in I} \mu(A_i)\]

The triple \((S, \ms S, \mu)\) is a measure space.

Axiom (b) of the definition is called countable additivity, and is the essential property. The measure of a set that consists of a countable union of disjoint pieces is the sum of the measures of the pieces. Note also that since the terms in the sum are nonnegative, there is no issue with the order of the terms in the sum, although of course, \( \infty \) is a possible value.
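As a concrete check of the two axioms, here is a minimal Python sketch of a discrete measure on a small finite set. The set, the point masses in `weights`, and the helper `mu` are hypothetical choices for illustration only; on a finite set, countable additivity reduces to finite additivity.

```python
from fractions import Fraction

# Hypothetical point masses on a small universal set S (illustrative values only).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6), "d": Fraction(0)}

def mu(A):
    """Measure of a subset A of S: the sum of the point masses of the points in A."""
    return sum((weights[x] for x in A), Fraction(0))

# Axiom (a): the empty set has measure 0.
empty_measure = mu(set())

# Axiom (b), finite case: the measure of a disjoint union is the sum of the measures.
pieces = [{"a"}, {"b", "d"}, {"c"}]
union = set().union(*pieces)
additive = mu(union) == sum(mu(A) for A in pieces)

print(empty_measure, additive, mu(union))
```

Since the masses in this sketch sum to 1, the measure is in fact a probability measure on the collection of all subsets of the set.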

Figure: A union of four disjoint sets

So perhaps the term measurable space for \( (S, \ms S) \) makes a little more sense now—a measurable space is one that can have a positive measure defined on it.

Suppose that \( (S, \ms S, \mu) \) is a measure space.

  1. If \( \mu(S) \lt \infty \) then \( (S, \ms S, \mu) \) is a finite measure space.
  2. If \( \mu(S) = 1 \) then \( (S, \ms S, \mu) \) is a probability space.

So probability measures, the fundamental objects of study in this text, are positive measures. But positive measures are important beyond the application to probability. The standard measures on the Euclidean spaces are all positive measures: the extension of length for measurable subsets of \( \R \), the extension of area for measurable subsets of \( \R^2 \), the extension of volume for measurable subsets of \( \R^3 \), and the higher dimensional analogues. We will actually construct these measures in the section on existence and uniqueness. In addition, counting measure \( \# \) is a positive measure on the subsets of a set \( S \). We will explore even more general measures that can take positive and negative values.

Properties

The following results give some simple properties of a positive measure space \( (S, \ms S, \mu) \). The proofs are relatively simple, but the measure of a set may be infinite so we must be careful to avoid the dreaded indeterminate form \( \infty - \infty \).

If \( A, \, B \in \ms S \), then \( \mu(B) = \mu(A \cap B) + \mu(B \setminus A) \).

Details:

Note that \( B = (A \cap B) \cup (B \setminus A) \), and the sets in the union are disjoint.

If \( A, \, B \in \ms S \) and \( A \subseteq B \) then

  1. \( \mu(B) = \mu(A) + \mu(B \setminus A) \)
  2. \( \mu(A) \le \mu(B) \)
Details:

Part (a) follows from the previous result, since \( A \cap B = A \). Part (b) follows from part (a), since \( \mu(B \setminus A) \ge 0 \).

Thus \( \mu \) is an increasing function, relative to the subset partial order \( \subseteq \) on \( \ms S \) and the ordinary order \( \le \) on \( [0, \infty] \). In particular, if \( \mu \) is a finite measure, then \( \mu(A) \lt \infty \) for every \( A \in \ms S \). The following results are simple corollaries. Part (a) is the difference rule, part (b) is the proper difference rule, and part (c) is the complement rule.

Suppose that \(A, \, B \in \ms S\).

  1. If \(\mu(B) \lt \infty\) then \(\mu(B \setminus A) = \mu(B) - \mu(A \cap B)\)
  2. If \(\mu(B) \lt \infty\) and \(A \subseteq B\) then \(\mu(B \setminus A) = \mu(B) - \mu(A)\)
  3. If \(\mu\) is a finite measure then \(\mu(A^c) = \mu(S) - \mu(A)\).
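The difference and complement rules can be checked numerically on a small example. The sketch below assumes a hypothetical discrete measure given by integer point masses (the names `weights`, `mu`, `A`, `B` are illustrative). The last line is only an analogy for why the finiteness assumptions are needed: floating point arithmetic also refuses to assign a value to \( \infty - \infty \).

```python
import math

# Hypothetical point masses on S = {0, ..., 5} (illustrative values only).
weights = {0: 1, 1: 2, 2: 4, 3: 0, 4: 3, 5: 6}
S = set(weights)

def mu(A):
    """Measure of A: the sum of the point masses of the points in A."""
    return sum(weights[x] for x in A)

A = {0, 1, 2}
B = {1, 2, 3, 4}

# (a) Difference rule, valid because mu(B) is finite.
difference_ok = mu(B - A) == mu(B) - mu(A & B)

# (c) Complement rule, valid because mu(S) is finite.
complement_ok = mu(S - A) == mu(S) - mu(A)

# With infinite measures the right-hand side would be "infinity minus infinity",
# which IEEE floating point arithmetic evaluates to NaN rather than a number.
indeterminate = math.isnan(math.inf - math.inf)

print(difference_ok, complement_ok, indeterminate)
```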

The next result is referred to as the subadditive property. In the application to probability, the result is referred to as Boole's inequality, named for George Boole.

Suppose that \( A_i \in \ms S \) for \( i \) in a countable index set \( I \). Then \[ \mu\left(\bigcup_{i \in I} A_i \right) \le \sum_{i \in I} \mu(A_i) \]

Details:

Assume that \( I = \N_+ \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus (A_1 \cup \ldots \cup A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Then \( \{B_i: i \in I\} \) is a disjoint collection of sets in \( \ms S \) with the same union as \( \{A_i: i \in I\} \). Also \( B_i \subseteq A_i \) for each \( i \) so \( \mu(B_i) \le \mu(A_i) \). Hence \[ \mu\left(\bigcup_{i \in I} A_i \right) = \mu\left(\bigcup_{i \in I} B_i \right) = \sum_{i \in I} \mu(B_i) \le \sum_{i \in I} \mu(A_i) \]
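The construction in the proof can be sketched in Python for a finite list of finite sets, with counting measure (`len`) standing in for \( \mu \). The function name `disjointify` and the sample sets are hypothetical.

```python
def disjointify(sets):
    """Return [B_1, B_2, ...] with B_1 = A_1 and B_i = A_i minus (A_1 union ... union A_{i-1})."""
    seen = set()      # running union A_1, ..., A_{i-1}
    result = []
    for A in sets:
        result.append(set(A) - seen)
        seen |= set(A)
    return result

A = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5}]
B = disjointify(A)

union_A = set().union(*A)
union_B = set().union(*B)

# Counting measure: the measure of a finite set is its number of elements.
measure_of_union = len(union_A)
sum_of_B = sum(len(b) for b in B)
sum_of_A = sum(len(a) for a in A)

print(B, measure_of_union, sum_of_B, sum_of_A)
```

The sets in `B` are pairwise disjoint with the same union as the sets in `A`, so the measure of the union equals `sum_of_B`, which is at most `sum_of_A`.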

The following result is the inclusion-exclusion formula. It generalizes the one that we saw for counting measure and the one that we will study later for probability measures.

Suppose that \(A_i \in \ms S\) for each \(i \in I\) where \(\#(I) = n\), and that \( \mu(A_i) \lt \infty \) for \( i \in I \). Then \[\mu \left( \bigcup_{i \in I} A_i \right) = \sum_{k = 1}^n (-1)^{k - 1} \sum_{J \subseteq I, \; \#(J) = k} \mu \left( \bigcap_{j \in J} A_j \right)\]

Details:

The proof is by induction on \(n\). The proof for \( n = 2 \) is simple: \( A_1 \cup A_2 = A_1 \cup (A_2 \setminus A_1) \). The union on the right is disjoint, so using additivity and the difference rule, \[ \mu(A_1 \cup A_2) = \mu (A_1) + \mu(A_2 \setminus A_1) = \mu(A_1) + \mu(A_2) - \mu(A_1 \cap A_2) \] Suppose now that the inclusion-exclusion formula holds for a given \( n \in \N_+ \), and consider the case \( n + 1 \). Then \[ \bigcup_{i=1}^{n + 1} A_i = \left(\bigcup_{i=1}^n A_i \right) \cup \left[ A_{n+1} \setminus \left(\bigcup_{i=1}^n A_i\right) \right] \] As before, the set in parentheses and the set in square brackets are disjoint. Thus using the additivity axiom, the difference rule, and the distributive rule we have \[ \mu\left(\bigcup_{i=1}^{n+1} A_i\right) = \mu\left(\bigcup_{i=1}^n A_i\right) + \mu(A_{n+1}) - \mu\left(\bigcup_{i=1}^n (A_{n+1} \cap A_i) \right) \] By the induction hypothesis, the inclusion-exclusion formula holds for each union of \( n \) sets on the right. Applying the formula and simplifying gives the inclusion-exclusion formula for \( n + 1 \) sets.
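The formula translates directly into code. The following Python sketch evaluates the right-hand side for a finite list of finite sets and compares it with the measure of the union, using counting measure as the example. The function name `inclusion_exclusion` and the sample sets are hypothetical.

```python
from itertools import combinations

def inclusion_exclusion(sets, mu):
    """Evaluate sum over k of (-1)^(k-1) times the sum of mu(intersection) over all k-subfamilies."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        for family in combinations(sets, k):
            total += sign * mu(set.intersection(*family))
    return total

sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

lhs = len(set().union(*sets))             # counting measure of the union
rhs = inclusion_exclusion(sets, len)      # alternating sum over intersections

print(lhs, rhs)
```

Note that the number of terms grows like \( 2^n \), so the formula is practical as a computation only for small \( n \).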

The following results are referred to as the continuity theorems.

Suppose that \( (A_1, A_2, \ldots) \) is a sequence of sets in \( \ms S \).

  1. If the sequence is increasing then \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).
  2. If the sequence is decreasing and \( \mu(A_1) \lt \infty \) then \( \mu\left(\bigcap_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).
Details:
  1. Note that if \( \mu(A_k) = \infty \) for some \( k \) then \( \mu(A_n) = \infty \) for \( n \ge k \) and \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \infty \), so the result holds in this case. Thus, suppose that \( \mu(A_i) \lt \infty \) for each \( i \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus A_{i-1} \) for \( i \in \{2, 3, \ldots\} \). Then \( (B_1, B_2, \ldots) \) is a disjoint sequence with the same union as \( (A_1, A_2, \ldots) \). Also, \( \mu(B_1) = \mu(A_1) \) and by the proper difference rule above, \( \mu(B_i) = \mu(A_i) - \mu(A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Hence \[ \mu\left(\bigcup_{i=1}^\infty A_i \right) = \mu \left(\bigcup_{i=1}^\infty B_i \right) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) \] But \( \sum_{i=1}^n \mu(B_i) = \mu(A_1) + \sum_{i=2}^n [\mu(A_i) - \mu(A_{i-1})] = \mu(A_n) \).
  2. Note that \( A_1 \setminus A_n \) is increasing in \( n \). Hence using the continuity result for increasing sets, \begin{align} \mu \left(\bigcap_{n=1}^\infty A_n \right) & = \mu\left[A_1 \setminus \bigcup_{n=1}^\infty (A_1 \setminus A_n) \right] = \mu(A_1) - \mu\left[\bigcup_{n=1}^\infty (A_1 \setminus A_n)\right]\\ & = \mu(A_1) - \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} \left[\mu(A_1) - \mu(A_n)\right] = \lim_{n \to \infty} \mu(A_n) \end{align} The subtractions are valid by the proper difference rule, since \( \mu(A_1) \lt \infty \).

If \( (A_1, A_2, \ldots) \) is increasing, \( \bigcup_{i=1}^\infty A_i \) is sometimes denoted \( \lim_{n \to \infty} A_n \), and if \( (A_1, A_2, \ldots) \) is decreasing, \( \bigcap_{i=1}^\infty A_i \) is also denoted \( \lim_{n \to \infty} A_n \). In both cases, the continuity theorem has the form \[\mu\left(\lim_{n \to \infty} A_n\right) = \lim_{n \to \infty} \mu(A_n) \] The continuity theorem for decreasing sets fails without the additional assumption of finite measure. For a simple counterexample, consider counting measure \( \# \) on \( \N_+ \) and \( A_n = \{n, n + 1, \ldots\} \): then \( \#(A_n) = \infty \) for every \( n \in \N_+ \), but \( \bigcap_{n=1}^\infty A_n = \emptyset \) has measure 0.
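To see both continuity theorems at work, here is a small Python sketch under an assumed example that is not in the text: the measure on \( \N_+ \) with point masses \( \mu(\{k\}) = 2^{-k} \), so that \( \mu(\N_+) = 1 \) by the geometric series. The increasing sets \( \{1, \ldots, n\} \) have union \( \N_+ \), and the decreasing tails \( \{n, n + 1, \ldots\} \) have empty intersection. The function names are hypothetical.

```python
from fractions import Fraction

def mass(k):
    """Hypothetical point mass at k in N_+: mu({k}) = 2^(-k), so that mu(N_+) = 1."""
    return Fraction(1, 2 ** k)

def mu_initial(n):
    """mu of the increasing set {1, ..., n}, by finite additivity."""
    return sum((mass(k) for k in range(1, n + 1)), Fraction(0))

def mu_tail(n):
    """mu of the decreasing set {n, n+1, ...}, by the complement rule (mu is finite)."""
    return 1 - mu_initial(n - 1)

for n in (1, 5, 10, 20):
    # First column increases to mu(N_+) = 1; second decreases to mu(empty set) = 0.
    print(n, float(mu_initial(n)), float(mu_tail(n)))
```

Here the finiteness of \( \mu \) is what makes the tail computation legitimate; no such subtraction is available for counting measure on \( \N_+ \), where every tail has infinite measure.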

The following corollary of the inclusion-exclusion law gives a condition for countable additivity that does not require that the sets be disjoint, but only that the intersections have measure 0. The result is used below in the discussion of completion.

Suppose that \( A_i \in \ms S \) for each \( i \) in a countable index set \( I \) and that \( \mu(A_i) \lt \infty \) for \( i \in I \) and \( \mu(A_i \cap A_j) = 0 \) for distinct \( i, \, j \in I \). Then \[ \mu\left(\bigcup_{i \in I} A_i \right) = \sum_{i \in I} \mu(A_i) \]

Details:

We will assume that \( I = \N_+ \). For \( n \in \N_+ \), \[ \mu\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n \mu(A_i) \] as an immediate consequence of the inclusion-exclusion law, under the assumption that \( \mu(A_i \cap A_j) = 0 \) for distinct \( i, j \in \{1, 2, \ldots, n\} \). Next \( \bigcup_{i=1}^n A_i \uparrow \bigcup_{i=1}^\infty A_i \) as \( n \to \infty \), and hence by the continuity theorem for increasing events, \( \mu\left(\bigcup_{i=1}^n A_i\right) \to \mu\left(\bigcup_{i=1}^\infty A_i\right) \) as \( n \to \infty \). On the other hand, \( \sum_{i=1}^n \mu(A_i) \to \sum_{i=1}^\infty \mu(A_i) \) as \( n \to \infty \) by the definition of an infinite series of nonnegative terms.

More Definitions

If a positive measure is not finite, then the following definition gives the next best thing.

The measure space \( (S, \ms S, \mu) \) is \( \sigma \)-finite if there exists a countable collection \(\{A_i: i \in I\} \subseteq \ms S\) with \( \bigcup_{i \in I} A_i = S \) and \( \mu(A_i) \lt \infty \) for each \( i \in I \).

So of course, if \(\mu\) is a finite measure on \((S, \ms S)\) then \(\mu\) is \(\sigma\)-finite, but not conversely in general. On the other hand, for \( i \in I \), let \( \ms S_i = \{A \in \ms S: A \subseteq A_i\} \). Then \( \ms S_i \) is a \( \sigma \)-algebra of subsets of \( A_i \) and \( \mu \) restricted to \( \ms S_i \) is a finite measure. The point of this (and the reason for the definition) is that often nice properties of finite measures can be extended to \( \sigma \)-finite measures. In particular, \( \sigma \)-finite measure spaces play a crucial role in the construction of product measure spaces, and in the completion of a measure space.

Suppose that \( (S, \ms S, \mu) \) is a \( \sigma \)-finite measure space.

  1. There exists an increasing sequence satisfying the conditions in the definition of \( \sigma \)-finiteness.
  2. There exists a disjoint sequence satisfying the conditions in the definition of \( \sigma \)-finiteness.
Details:

Without loss of generality, we can take \(\N_+\) as the index set in the definition. So there exists \( A_n \in \ms S\) for \(n \in \N_+ \) such that \( \mu(A_n) \lt \infty \) for each \( n \in \N_+ \) and \( S = \bigcup_{n=1}^\infty A_n \). The proof uses some of the same tricks that we have seen before.

  1. Let \( B_n = \bigcup_{i = 1}^n A_i \). Then \( B_n \in \ms S \) for \( n \in \N_+ \) and this sequence is increasing. Moreover, \( \mu(B_n) \le \sum_{i=1}^n \mu(A_i) \lt \infty \) for \( n \in \N_+ \) and \( \bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = S \).
  2. Let \( C_1 = A_1 \) and let \( C_n = A_n \setminus \bigcup_{i=1}^{n-1} A_i \) for \( n \in \{2, 3, \ldots\} \). Then \( C_n \in \ms S \) for each \( n \in \N_+ \) and this sequence is disjoint. Moreover, \( C_n \subseteq A_n \) so \( \mu(C_n) \le \mu(A_n) \lt \infty \) and \( \bigcup_{n=1}^\infty C_n = \bigcup_{n=1}^\infty A_n = S \).

Our next definition concerns sets where a measure is concentrated, in a certain sense.

Suppose that \((S, \ms S, \mu)\) is a measure space. An atom of the space is a set \(A \in \ms S\) with the following properties:

  1. \(\mu(A) \gt 0\)
  2. If \(B \in \ms S\) and \(B \subseteq A\) then either \(\mu(B) = \mu(A)\) or \(\mu(B) = 0\).

A measure space that has no atoms is called non-atomic or diffuse.

In probability theory, we are often particularly interested in atoms that are singleton sets. Note that \( \{x\} \in \ms S \) is an atom if and only if \( \mu(\{x\}) \gt 0 \), since the only subsets of \( \{x\} \) are \( \{x\} \) itself and \( \emptyset \).

Constructions

There are several simple ways to construct new positive measures from existing ones. As usual, we start with a measurable space \( (S, \ms S) \).

Suppose that \( (R, \ms R) \) is a measurable subspace of \( (S, \ms S) \). If \( \mu \) is a positive measure on \( (S, \ms S) \) then \( \mu \) restricted to \( \ms R \) is a positive measure on \( (R, \ms R) \). If \( \mu \) is a finite measure on \( (S, \ms S) \) then \( \mu \) is a finite measure on \( (R, \ms R) \).

Details:

The assumption is that \( \ms R \) is a \( \sigma \)-algebra of subsets of \( R \) and \( \ms R \subseteq \ms S \). In particular \( R \in \ms S \). Since the additivity property of \( \mu \) holds for a countable, disjoint collection of events in \( \ms S \), it trivially holds for a countable, disjoint collection of events in \( \ms R \). Finally, by the increasing property, \( \mu(R) \le \mu(S) \) so if \( \mu(S) \lt \infty \) then \( \mu(R) \lt \infty \).

However, if \(\mu\) is \(\sigma\)-finite on \( (S, \ms S) \), it is not necessarily true that \(\mu\) is \(\sigma\)-finite on \( (R, \ms R) \). A counterexample is given below. The previous result applies, in particular, when \( R = S \) so that \( \ms R \) is a sub \( \sigma \)-algebra of \( \ms S \). Next, a positive multiple of a positive measure gives another positive measure.

If \( \mu \) is a positive measure on \( (S, \ms S) \) and \( c \in (0, \infty) \), then \( c \mu \) is also a positive measure on \( (S, \ms S) \).

  1. If \(\mu\) is finite then \(c \mu\) is finite.
  2. If \(\mu\) is \(\sigma\)-finite then \(c \mu\) is \(\sigma\)-finite.
Details:

Clearly \( c \mu: \ms S \to [0, \infty] \). Also \( (c \mu)(\emptyset) = c \mu(\emptyset) = 0 \). Next if \( \{A_i: i \in I\} \) is a countable, disjoint collection of events in \( \ms S \) then \[ (c \mu)\left(\bigcup_{i \in I} A_i\right) = c \mu\left(\bigcup_{i \in I} A_i\right) = c \sum_{i \in I} \mu(A_i) = \sum_{i \in I} c \mu(A_i) \] Finally, since \( \mu(A) \lt \infty \) if and only if \( (c \mu)(A) \lt \infty \) for \( A \in \ms S \), the finiteness and \( \sigma \)-finiteness properties are trivially preserved.

Sums of positive measures are also positive measures.

If \( \mu_i \) is a positive measure on \( (S, \ms S) \) for each \( i \) in a countable index set \( I \) then \( \mu = \sum_{i \in I} \mu_i \) is also a positive measure on \( (S, \ms S) \).

  1. If \( I \) is finite and \( \mu_i \) is finite for each \(i \in I\) then \(\mu\) is finite.
  2. If \( I \) is finite and \(\mu_i\) is \( \sigma \)-finite for each \( i \in I \) then \( \mu \) is \( \sigma \)-finite.
Details:

Clearly \( \mu: \ms S \to [0, \infty] \). First \( \mu(\emptyset) = \sum_{i \in I} \mu_i(\emptyset) = 0 \). Next if \( \{A_j: j \in J\} \) is a countable, disjoint collection of events in \( \ms S \) then \[ \mu\left(\bigcup_{j \in J} A_j\right) = \sum_{i \in I} \mu_i \left(\bigcup_{j \in J} A_j\right) = \sum_{i \in I} \sum_{j \in J} \mu_i(A_j) = \sum_{j \in J} \sum_{i \in I} \mu_i(A_j) = \sum_{j \in J} \mu(A_j) \] The interchange of sums is permissible since the terms are nonnegative. Suppose now that \( I \) is finite.

  1. If \( \mu_i \) is finite for each \( i \in I \) then \( \mu(S) = \sum_{i \in I} \mu_i(S) \lt \infty \) so \( \mu \) is finite.
  2. Suppose that \( \mu_i \) is \( \sigma \)-finite for each \( i \in I \). Then for each \( i \in I \) there exists an increasing sequence \( (A_{i 1}, A_{i 2}, \ldots) \) of sets in \( \ms S \) such that \( \bigcup_{j=1}^\infty A_{i j} = S \) and \( \mu_i(A_{i j}) \lt \infty \) for each \( j \in \N_+ \), by the result above on increasing sequences. For \( j \in \N_+ \), let \( B_j = \bigcap_{i \in I} A_{i j} \). Then \( B_j \in \ms S \) for each \( j \in \N_+ \). If \( x \in S \) then for each \( i \in I \) there exists \( j_i \in \N_+ \) with \( x \in A_{i j_i} \). Let \( j = \max\{j_i: i \in I\} \), which exists since \( I \) is finite. Since the sequences are increasing, \( x \in A_{i j} \) for every \( i \in I \) and hence \( x \in B_j \). Thus \( \bigcup_{j=1}^\infty B_j = S \). Moreover, \[ \mu(B_j) = \sum_{i \in I} \mu_i(B_j) \le \sum_{i \in I} \mu_i(A_{i j}) \lt \infty \] so \( \mu \) is \( \sigma \)-finite.

In the context of the previous result, if \(I\) is countably infinite and \(\mu_i\) is finite for each \(i \in I\), then \(\mu\) is not necessarily \(\sigma\)-finite. A counterexample is given below. In this case, \(\mu\) is said to be \(s\)-finite, but we've had enough definitions, so we won't pursue this one. From the last two results, note that a positive linear combination of positive measures is a positive measure. The next method of constructing a measure is fundamentally important, and is sometimes referred to as a change of variables.

Suppose that \( (S, \ms S, \mu) \) is a measure space. Suppose also that \( (T, \ms T) \) is another measurable space and that \( f: S \to T \) is measurable. Then \( \nu \) defined as follows is a positive measure on \( (T, \ms T) \) \[ \nu(B) = \mu\left[f^{-1}(B)\right], \quad B \in \ms T \] If \( \mu \) is finite then \( \nu \) is finite.

Details:

Clearly \(\nu: \ms T \to [0, \infty]\). The proof is easy since inverse images preserve all set operations. First \( f^{-1}(\emptyset) = \emptyset \) so \( \nu(\emptyset) = 0 \). Next, if \( \left\{B_i: i \in I\right\} \) is a countable, disjoint collection of sets in \( \ms T \), then \( \left\{f^{-1}(B_i): i \in I\right\} \) is a countable, disjoint collection of sets in \( \ms S \), and \( f^{-1}\left(\bigcup_{i \in I} B_i\right) = \bigcup_{i \in I} f^{-1}(B_i) \). Hence \[ \nu\left(\bigcup_{i \in I} B_i\right) = \mu\left[f^{-1}\left(\bigcup_{i \in I} B_i\right)\right] = \mu\left[\bigcup_{i \in I} f^{-1}(B_i)\right] = \sum_{i \in I} \mu\left[f^{-1}(B_i)\right] = \sum_{i \in I} \nu(B_i) \] Finally, if \(\mu\) is finite then \(\nu(T) = \mu[f^{-1}(T)] = \mu(S) \lt \infty\) so \(\nu\) is finite.
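For a discrete measure, the change of variables construction amounts to adding up the mass of all points that map to the same value. The Python sketch below assumes a hypothetical uniform measure on \( \{1, \ldots, 6\} \) and the map \( x \mapsto x \bmod 3 \); the names `pushforward` and `measure` are illustrative.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical measure mu on S = {1, ..., 6}: uniform point masses.
mu_weights = {x: Fraction(1, 6) for x in range(1, 7)}

def f(x):
    """A map from S onto T = {0, 1, 2}."""
    return x % 3

def pushforward(weights, f):
    """Point masses of nu, where nu(B) = mu(f^{-1}(B))."""
    nu = defaultdict(Fraction)   # Fraction() is 0
    for x, w in weights.items():
        nu[f(x)] += w
    return dict(nu)

def measure(weights, B):
    """Measure of B under the discrete measure with the given point masses."""
    return sum((w for y, w in weights.items() if y in B), Fraction(0))

nu_weights = pushforward(mu_weights, f)

B = {0, 1}
preimage = {x for x in mu_weights if f(x) in B}

# nu(B), computed from the point masses of nu, agrees with mu(f^{-1}(B)).
print(measure(nu_weights, B), measure(mu_weights, preimage))
```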

In the context of the previous result, if \(\mu\) is \(\sigma\)-finite on \((S, \ms S)\), it is not necessarily true that \(\nu\) is \(\sigma\)-finite on \((T, \ms T)\), even if \(f\) is one-to-one. A counterexample is given below. The takeaway is that \(\sigma\)-finiteness of \(\nu\) depends very much on the nature of the \(\sigma\)-algebra \(\ms T\). Our next result shows that it's easy to explicitly construct a positive measure on a \( \sigma \)-algebra generated by a countable partition. Such \( \sigma \)-algebras are important for counterexamples and to gain insight, and also because many \( \sigma \)-algebras that occur in applications can be constructed from them.

Suppose that \( \ms A = \{A_i: i \in I\} \) is a countable partition of \( S \) into nonempty sets, and that \( \ms S = \sigma(\ms{A}) \), the \( \sigma \)-algebra generated by the partition. For \( i \in I \), define \( \mu(A_i) \in [0, \infty] \) arbitrarily. For \( A = \bigcup_{j \in J} A_j \) where \( J \subseteq I \), define \[ \mu(A) = \sum_{j \in J} \mu(A_j) \] Then \( \mu \) is a positive measure on \( (S, \ms S) \).

  1. The atoms of the measure are the sets of the form \(A = \bigcup_{j \in J} A_j\) where \(J \subseteq I\) and where \(\mu(A_j) \gt 0\) for one and only one \(j \in J\).
  2. If \(\mu(A_i) \lt \infty\) for \(i \in I\) and \(I\) is finite then \(\mu\) is finite.
  3. If \(\mu(A_i) \lt \infty\) for \(i \in I\) and \(I\) is countably infinite then \(\mu\) is \(\sigma\)-finite.
Details:

Recall that every \( A \in \ms S \) has a unique representation of the form \( A = \bigcup_{j \in J} A_j \) where \( J \subseteq I \). In particular, \( J = \emptyset \) in this representation gives \( A = \emptyset \). The sum over an empty index set is 0, so \( \mu(\emptyset) = 0 \). Next suppose that \( \{B_k: k \in K\} \) is a countable, disjoint collection of sets in \( \ms S \). Then there exists a disjoint collection \(\{J_k: k \in K\}\) of subsets of \(I\) such that \( B_k = \bigcup_{j \in J_k} A_j \). Hence \[ \mu\left(\bigcup_{k \in K} B_k\right) = \mu\left(\bigcup_{k \in K} \bigcup_{j \in J_k} A_j\right) = \sum_{k \in K}\sum_{j \in J_k} \mu(A_j) = \sum_{k \in K} \mu(B_k) \] The fact that the terms are all nonnegative means that we do not have to worry about the order of summation.

  1. Again, every \(A \in \ms S\) has the unique representation \(A = \bigcup_{j \in J} A_j\) where \(J \subseteq I\). The subsets of \(A\) that are in \(\ms S\) are \(\bigcup_{k \in K} A_k\) where \(K \subseteq J\). Hence \(A\) is an atom if and only if \(\mu(A_j) \gt 0\) for one and only one \(j \in J\).
  2. If \(I\) is finite and \(\mu(A_i) \lt \infty\) for \(i \in I\) then \(\mu(S) = \sum_{i \in I} \mu(A_i) \lt \infty\), so \(\mu\) is finite.
  3. If \(I\) is countably infinite and \(\mu(A_i) \lt \infty\) for \(i \in I\) then \(\ms A\) satisfies the condition for \(\mu\) to be \(\sigma\)-finite.
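When the partition is finite, the description of the atoms in part (a) can be verified by brute force. In the Python sketch below, a set \( \bigcup_{j \in J} A_j \) in the \( \sigma \)-algebra is represented by its index set \( J \); the index set \( \{0, 1, 2, 3\} \), the masses in `mass`, and the function names are hypothetical choices for illustration.

```python
from itertools import chain, combinations
import math

# Hypothetical masses mu(A_i) for a partition indexed by I = {0, 1, 2, 3}.
mass = {0: 0.0, 1: 2.0, 2: math.inf, 3: 0.0}

def mu(J):
    """Measure of the union of the A_j over j in J."""
    return sum(mass[j] for j in J)

def subsets(J):
    """All subsets of the index collection J, as tuples."""
    J = list(J)
    return chain.from_iterable(combinations(J, r) for r in range(len(J) + 1))

def is_atom(J):
    """Check the definition of an atom directly: positive measure, and every
    measurable subset has measure 0 or the full measure."""
    m = mu(J)
    return m > 0 and all(mu(K) in (0, m) for K in subsets(J))

def is_atom_by_rule(J):
    """Part (a): exactly one block of the union has positive measure."""
    return sum(1 for j in J if mass[j] > 0) == 1

agree = all(is_atom(J) == is_atom_by_rule(J) for J in subsets(mass))
atoms = [J for J in subsets(mass) if is_atom(J)]
print(agree, atoms)
```

In this example the block with index 2 is an atom of infinite measure, a situation that is relevant to the last result of this subsection.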

One of the most general ways to construct new measures from old ones is via the theory of integration with respect to a positive measure. The construction of positive measures more or less from scratch is considered in the section on existence and uniqueness. We close this discussion with a simple result that is useful for counterexamples.

Suppose that the measure space \( (S, \ms S, \mu) \) has an atom \( A \in \ms S \) with \( \mu(A) = \infty \). Then the space is not \( \sigma \)-finite.

Details:

Let \( \{A_i: i \in I\} \) be a countable disjoint collection of sets in \( \ms S \) that partitions \( S \). Then \( \{A \cap A_i: i \in I\} \) partitions \( A \). Since \( \mu(A) = \sum_{i \in I} \mu(A \cap A_i) \) and \( \mu(A) \gt 0 \), we must have \( \mu(A \cap A_i) \gt 0 \) for some \( i \in I \). Since \( A \) is an atom and \( A \cap A_i \subseteq A \) it follows that \( \mu(A \cap A_i) = \mu(A) = \infty \). Hence \( \mu(A_i) = \infty \) by the increasing property. So no countable partition of \( S \) consists of sets of finite measure, and by the result above on disjoint sequences, the space is not \( \sigma \)-finite.

Measure and Topology

Often the spaces that occur in probability and stochastic processes are topological spaces. Recall that a topological space \( (S, \ms T) \) consists of a set \( S \) and a topology \( \ms T \) on \( S \) (the collection of open sets). The topology as well as the measure theory plays an important role, so it's natural to want these two types of structures to be compatible. We have already seen the most important step in this direction: Recall that \( \ms S = \sigma(\ms T) \), the \( \sigma \)-algebra generated by the topology, is the Borel \( \sigma \)-algebra on \( S \), named for Émile Borel. Since the complement of an open set is a closed set, \(\ms S\) is also the \(\sigma\)-algebra generated by the collection of closed sets. Moreover, \(\ms S\) contains countable intersections of open sets (called \(G_\delta\) sets) and countable unions of closed sets (called \(F_\sigma\) sets).

Suppose that \( (S, \ms T) \) is a topological space and let \(\ms S = \sigma(\ms T)\) be the Borel \(\sigma\)-algebra. A positive measure \( \mu \) on \( (S, \ms S) \) is a Borel measure and then \((S, \ms S, \mu)\) is a Borel measure space.

The next definition concerns the subset on which a Borel measure is concentrated, in a certain sense.

Suppose that \((S, \ms S, \mu)\) is a Borel measure space. The support of \(\mu\) is \[\supp(\mu) = \{x \in S: \mu(U) \gt 0 \text{ for every open neighborhood } U \text{ of } x\}\] The set \(\supp(\mu)\) is closed.

Details:

Let \(A = \supp(\mu)\). For \(x \in A^c\), there exists an open neighborhood \(V_x\) of \(x\) such that \(\mu(V_x) = 0\). If \(y \in V_x\), then \(V_x\) is also an open neighborhood of \(y\), so \(y \in A^c\). Hence \(V_x \subseteq A^c\) for every \(x \in A^c\) and so \( A^c \) is open.

The term Borel measure has different definitions in the literature. Often the topological space is required to be locally compact, Hausdorff, and with a countable base (LCCB). Then a Borel measure \( \mu \) is required to have the additional condition that \( \mu(C) \lt \infty \) if \( C \subseteq S \) is compact. In this text, we use the term Borel measures in this more restricted sense.

Suppose that \((S, \ms S, \mu)\) is a Borel measure space corresponding to an LCCB topology. Then the space is \(\sigma\)-finite.

Details:

Since the topological space is locally compact and has a countable base, \(S = \bigcup_{i \in I} C_i\) where \(\{C_i: i \in I\}\) is a countable collection of compact sets. Since \(\mu\) is a Borel measure, \(\mu(C_i) \lt \infty\) and hence \(\mu\) is \(\sigma\)-finite.

Here are a couple of other definitions that are important for Borel measures, again linking topology and measure in natural ways.

Suppose again that \( (S, \ms S, \mu) \) is a Borel measure space.

  1. \( \mu \) is inner regular if \( \mu(A) = \sup\{\mu(C): C \text{ is compact and } C \subseteq A\} \) for \( A \in \ms S \).
  2. \( \mu \) is outer regular if \( \mu(A) = \inf\{\mu(U): U \text{ is open and } A \subseteq U\} \) for \( A \in \ms S \).
  3. \( \mu \) is regular if it is both inner regular and outer regular.

The measure spaces that occur in probability and stochastic processes are usually regular Borel spaces associated with LCCB topologies.

Null Sets and Equivalence

Sets of measure 0 in a measure space turn out to be very important precisely because we can often ignore the differences between mathematical objects on such sets. In this discussion, we assume that we have a fixed measure space \((S, \ms S, \mu)\).

A set \(A \in \ms S\) is null if \(\mu(A) = 0\).

Consider a measurable statement with \( x \in S \) as a free variable. (Technically, such a statement is a predicate on \( S \).) If the statement is true for all \( x \in S \) except for \( x \) in a null set, we say that the statement holds almost everywhere on \( S \). This terminology is used often in measure theory and captures the importance of the definition.

Let \( \ms D = \{A \in \ms S: \mu(A) = 0 \text{ or } \mu(A^c) = 0\}\), the collection of null and co-null sets. Then \( \ms D \) is a sub \(\sigma\)-algebra of \( \ms S \).

Details:

Trivially \( S \in \ms D \) since \(S^c = \emptyset\) and \(\mu(\emptyset) = 0\). Next if \(A \in \ms D\) then \(A^c \in \ms D\) by the symmetry of the definition. Finally, suppose that \( A_i \in \ms D \) for \( i \in I \) where \( I \) is a countable index set. If \( \mu(A_i) = 0 \) for every \( i \in I \) then \( \mu\left(\bigcup_{i \in I} A_i \right) \le \sum_{i \in I} \mu(A_i) = 0 \) by the subadditive property above. On the other hand, if \( \mu(A_j^c) = 0 \) for some \( j \in I \) then \( \mu\left[\left(\bigcup_{i \in I} A_i \right)^c\right] = \mu\left(\bigcap_{i \in I} A_i^c\right) \le \mu(A_j^c) = 0 \). In either case, \( \bigcup_{i \in I} A_i \in \ms D \).

Of course \(\mu\) restricted to \(\ms D\) is not very interesting since \(\mu(A) = 0\) or \(\mu(A) = \mu(S)\) for every \(A \in \ms D\). Our next definition is a type of equivalence between sets in \(\ms S\). To make this precise, recall first that the symmetric difference between subsets \( A \) and \( B \) of \(S\) is \( A \bigtriangleup B = (A \setminus B) \cup (B \setminus A) \). This is the set that consists of points in one of the two sets, but not both, and corresponds to exclusive or.

Sets \(A, \, B \in \ms S\) are equivalent (with respect to the measure \(\mu\)) if \(\mu(A \bigtriangleup B) = 0 \), and we denote this by \( A \equiv B \).

Thus \(A \equiv B\) if and only if \(\mu(A \bigtriangleup B) = \mu(A \setminus B) + \mu(B \setminus A) = 0\) if and only if \(\mu(A \setminus B) = \mu(B \setminus A) = 0\). In the predicate terminology mentioned above, the statement \[ x \in A \text{ if and only if } x \in B \] is true for almost every \( x \in S \). As the name suggests, the relation \( \equiv \) really is an equivalence relation on \( \ms S \) and hence \( \ms S \) is partitioned into disjoint classes of mutually equivalent sets. Two sets in the same equivalence class differ by a set of measure 0.

The relation \( \equiv \) is an equivalence relation on \( \ms S \). That is, for \( A, \, B, \, C \in \ms S \),

  1. \(A \equiv A\) (the reflexive property).
  2. If \(A \equiv B\) then \(B \equiv A\) (the symmetric property).
  3. If \(A \equiv B\) and \(B \equiv C\) then \(A \equiv C\) (the transitive property).
Details:
  1. The reflexive property is trivial since \(A \bigtriangleup A = \emptyset\).
  2. The symmetric property is also trivial since \(A \bigtriangleup B = B \bigtriangleup A\).
  3. For the transitive property, suppose that \( A \equiv B \) and \( B \equiv C \). Note that \( A \setminus C \subseteq (A \setminus B) \cup (B \setminus C) \), and hence \( \mu(A \setminus C) = 0 \). By a symmetric argument, \( \mu(C \setminus A) = 0 \).

Equivalence is preserved under the standard set operations.

If \( A, \, B \in \ms S \) and \( A \equiv B \) then \( A^c \equiv B^c \).

Details:

Note that \( A^c \setminus B^c = B \setminus A \) and \( B^c \setminus A^c = A \setminus B \), so \( A^c \bigtriangleup B^c = A \bigtriangleup B \).

Suppose that \( A_i, \, B_i \in \ms S \) and that \( A_i \equiv B_i \) for \( i \) in a countable index set \( I \). Then

  1. \( \bigcup_{i \in I} A_i \equiv \bigcup_{i \in I} B_i \)
  2. \( \bigcap_{i \in I} A_i \equiv \bigcap_{i \in I} B_i \)
Details:
  1. Note that \[ \left(\bigcup_{i \in I} A_i\right) \bigtriangleup \left(\bigcup_{i \in I} B_i\right) \subseteq \bigcup_{i \in I} (A_i \bigtriangleup B_i) \] To see this, note that if \( x \) is in the set on the left then either \( x \in A_j \) for some \( j \in I \) and \( x \notin B_i \) for every \( i \in I \), or \( x \notin A_i \) for every \( i \in I \) and \( x \in B_j \) for some \( j \in I \). In either case, \( x \in A_j \bigtriangleup B_j \) for some \( j \in I \).
  2. Similarly \[ \left(\bigcap_{i \in I} A_i\right) \bigtriangleup \left(\bigcap_{i \in I} B_i\right) \subseteq \bigcup_{i \in I} (A_i \bigtriangleup B_i) \] If \( x \) is in the set on the left then \( x \in A_i \) for every \( i \in I \) and \( x \notin B_j \) for some \( j \in I \), or \( x \in B_i \) for every \( i \in I \) and \( x \notin A_j \) for some \( j \in I \). In either case, \( x \in A_j \bigtriangleup B_j \) for some \( j \in I \).

In both parts, the proof is completed by noting that the common set on the right in the displayed equations is null: \[ \mu\left[\bigcup_{i \in I} (A_i \bigtriangleup B_i) \right] \le \sum_{i \in I} \mu(A_i \bigtriangleup B_i) = 0 \]

Equivalent sets have the same measure.

If \( A, \, B \in \ms S \) and \(A \equiv B\) then \(\mu(A) = \mu(B)\).

Details:

Note again that \( A = (A \cap B) \cup (A \setminus B) \). If \( A \equiv B \) then \( \mu(A) = \mu(A \cap B) \). By a symmetric argument, \( \mu(B) = \mu(A \cap B) \).
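The following Python sketch illustrates equivalence of sets for a hypothetical discrete measure in which some points carry no mass. It also shows that sets of equal measure need not be equivalent. The names `weights`, `mu`, and `equivalent` are illustrative.

```python
from fractions import Fraction

# Hypothetical point masses; the points 2 and 4 form a null set.
weights = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 4),
           4: Fraction(0), 5: Fraction(1, 4)}

def mu(A):
    """Measure of A: the sum of the point masses of the points in A."""
    return sum((weights[x] for x in A), Fraction(0))

def equivalent(A, B):
    """A and B are equivalent if their symmetric difference is null."""
    return mu(A ^ B) == 0

A = {1, 3}
B = {1, 2, 3, 4}   # differs from A only on the null set {2, 4}
C = {1, 5}         # same measure as A, but differs from A on {3, 5}

print(equivalent(A, B), mu(A) == mu(B))   # equivalent sets have the same measure
print(equivalent(A, C), mu(A) == mu(C))   # equal measure without equivalence
```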

The converse trivially fails, and a counterexample is given below. However, the collection of null sets and the collection of co-null sets do form equivalence classes.

Suppose that \( A \in \ms S \).

  1. If \(\mu(A) = 0\) then \(A \equiv B\) if and only if \(\mu(B) = 0\).
  2. If \(\mu(A^c) = 0\) then \(A \equiv B\) if and only if \(\mu(B^c) = 0\).
Details:
  1. Suppose that \( \mu(A) = 0 \) and \( A \equiv B\). Then \( \mu(B) = 0 \) since equivalent sets have the same measure. Conversely, note that \( A \setminus B \subseteq A \) and \( B \setminus A \subseteq B \) so if \( \mu(A) = \mu(B) = 0 \) then \( \mu(A \bigtriangleup B) = 0 \) so \( A \equiv B \).
  2. Part (b) follows from part (a) and the result above that equivalence is preserved under complements.

We can extend the notion of equivalence to measurable functions with a common range space. Thus suppose that \( (T, \ms T) \) is another measurable space. If \( f, \, g: S \to T \) are measurable, then \( (f, g): S \to T \times T \) is measurable with respect to the usual product \( \sigma \)-algebra \( \ms T \times \ms T \). We also assume that \((T, \ms T)\) has a measurable diagonal so that \(D = \{(y, y): y \in T\} \in \ms T \times \ms T\).

Measurable functions \(f, \, g: S \to T\) are equivalent (with respect to \(\mu\)) if \( \mu\{x \in S: f(x) \ne g(x)\} = 0 \). Again we write \( f \equiv g \).

Details:

Note that \(\{x \in S: f(x) \ne g(x)\} = \{x \in S: (f(x), g(x)) \in D\}^c \in \ms S\) by our assumption, so the definition makes sense.

In the terminology discussed earlier, \( f \equiv g \) means that \( f(x) = g(x) \) almost everywhere on \( S \). As with measurable sets, the relation \( \equiv \) really does define an equivalence relation on the collection of measurable functions from \(S\) to \(T\). Thus, the collection of such functions is partitioned into disjoint classes of mutually equivalent functions.

The relation \( \equiv \) is an equivalence relation on the collection of measurable functions from \(S\) to \(T\). That is, for measurable \(f, \, g, \, h: S \to T\),

  1. \(f \equiv f\) (the reflexive property).
  2. If \(f \equiv g\) then \(g \equiv f\) (the symmetric property).
  3. If \( f \equiv g\) and \(g \equiv h\) then \(f \equiv h\) (the transitive property).
Details:

Parts (a) and (b) are trivial. For (c) note that \( f(x) = g(x) \) and \( g(x) = h(x) \) implies \( f(x) = h(x) \) for \( x \in S \). Negating this statement gives \( f(x) \ne h(x) \) implies \( f(x) \ne g(x) \) or \( g(x) \ne h(x) \). So \[ \{x \in S: f(x) \ne h(x)\} \subseteq \{x \in S: f(x) \ne g(x)\} \cup \{ x \in S: g(x) \ne h(x)\} \] Since \( f \equiv g \) and \( g \equiv h \), the two sets on the right have measure 0. Hence, so does the set on the left.

Suppose again that \(f, \, g: S \to T\) are measurable and that \(f \equiv g\). Then for every \(B \in \ms T\), the sets \(f^{-1}(B)\) and \(g^{-1}(B)\) are equivalent: \(f^{-1}(B) \equiv g^{-1}(B)\).

Details:

Note that \( f^{-1}(B) \bigtriangleup g^{-1}(B) \subseteq \{x \in S: f(x) \ne g(x)\} \).

Thus if \( f, \, g: S \to T \) are measurable and \( f \equiv g \), then by the previous result, \(\nu_f = \nu_g\) where \(\nu_f, \, \nu_g\) are the measures on \((T, \ms T)\) induced by \( f \) and \( g \), that is, \( \nu_f(B) = \mu\left[f^{-1}(B)\right] \) and \( \nu_g(B) = \mu\left[g^{-1}(B)\right] \) for \( B \in \ms T \). Again, the converse fails with a passion.

It often happens that a definition for functions subsumes the corresponding definition for sets, by considering the indicator functions of the sets. So it is with equivalence. In the following result, we can take \(T = \{0, 1\}\) with \(\ms T\) the collection of all subsets.

Suppose that \(A, \, B \in \ms S\). Then \(A \equiv B\) if and only if \(\bs{1_A} \equiv \bs{1_B}\).

Details:

Note that \( \left\{x \in S: \bs 1_A(x) \ne \bs 1_B(x) \right\} = A \bigtriangleup B \).

Equivalence is preserved under composition. For the next result, suppose that \((U, \ms U)\) is yet another measurable space.

Suppose that \(f, \, g: S \to T\) are measurable and that \(h: T \to U\) is measurable. If \(f \equiv g\) then \(h \circ f \equiv h \circ g\).

Details:

Note that \( \{x \in S: h[f(x)] \ne h[g(x)]\} \subseteq \{x \in S: f(x) \ne g(x)\} \).
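The following sketch illustrates equivalence of functions and its preservation under composition on a toy finite measure space. The set, the weights, and the functions `f`, `g`, `h` are all assumptions made for illustration; they do not come from the text.

```python
# Toy setting (an assumption for illustration): S = {0, 1, 2, 3} with point
# weights, where the point 3 has weight 0 and so forms a null set.
weights = {0: 1.0, 1: 1.0, 2: 2.0, 3: 0.0}

def mu(A):
    """Measure of a subset A of S."""
    return sum(weights[x] for x in A)

def equivalent(f, g):
    """f and g are equivalent when {x : f(x) != g(x)} is a null set."""
    return mu({x for x in weights if f(x) != g(x)}) == 0

f = lambda x: x
g = lambda x: x if x != 3 else -1   # differs from f only at the null point 3
h = lambda y: y * y                 # a function on the common range space

print(equivalent(f, g))                                  # True
print(equivalent(lambda x: h(f(x)), lambda x: h(g(x))))  # True
```

The compositions can only disagree where `f` and `g` already disagree, which is the set inclusion used in the proof.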

Suppose again that \( (S, \ms S, \mu) \) is a measure space. Let \( \ms V \) denote the collection of all measurable real-valued functions from \( S \) into \( \R \). (As usual, \(\R\) is given the Borel \(\sigma\)-algebra.) From our previous discussion of measurable spaces, we know that with the usual definitions of addition and scalar multiplication, \( (\ms V, +, \cdot) \) is a vector space. However, in measure theory, we often do not want to distinguish between functions that are equivalent, so it's nice to know that the vector space structure is preserved when we identify equivalent functions. Formally, let \( [f] \) denote the equivalence class generated by \( f \in \ms V \), and let \( \ms W \) denote the collection of all such equivalence classes. In modular notation, \( \ms W\) is \(\ms V \big/ \equiv \). We define addition and scalar multiplication on \( \ms W \) by \[ [f] + [g] = [f + g], \; c [f] = [c f]; \quad f, \, g \in \ms V, \; c \in \R \]

\( (\ms W, +, \cdot) \) is a vector space.

Details:

All that we have to show is that addition and scalar multiplication are well defined. That is, we must show that the definitions do not depend on the particular representative of the equivalence class. Then the other properties that define a vector space are inherited from \( (\ms V, +, \cdot) \). Thus we must show that if \( f_1 \equiv f_2 \) and \( g_1 \equiv g_2 \), and if \( c \in \R \), then \( f_1 + g_1 \equiv f_2 + g_2 \) and \( c f_1 \equiv c f_2 \). For the first problem, note that \((f_1, g_1)\) and \((f_2, g_2)\) are measurable functions from \(S\) to \(\R^2\). (\(\R^2\) is given the product \(\sigma\)-algebra which also happens to be the Borel \(\sigma\)-algebra corresponding to the standard Euclidean topology.) Moreover, \((f_1, g_1) \equiv (f_2, g_2)\) since \[\{x \in S: (f_1(x), g_1(x)) \ne (f_2(x), g_2(x))\} = \{x \in S: f_1(x) \ne f_2(x)\} \cup \{x \in S: g_1(x) \ne g_2(x)\}\] But the function \((a, b) \mapsto a + b\) from \(\R^2\) into \(\R\) is measurable and hence, from the composition result above, it follows that \(f_1 + g_1 \equiv f_2 + g_2\). The second problem is easier. The function \(a \mapsto c \cdot a\) from \(\R\) into \(\R\) is measurable so again it follows from the composition result that \(c f_1 \equiv c f_2\).

Often we don't bother to use the special notation for the equivalence class associated with a function. Rather, it's understood that equivalent functions represent the same object. Spaces of functions in a measure space are studied further in a separate section.

Completion

Suppose that \( (S, \ms S, \mu) \) is a measure space and let \( \ms N = \{A \in \ms S: \mu(A) = 0\} \) denote the collection of null sets of the space. If \( A \in \ms N \) and \( B \in \ms S \) is a subset of \( A \), then we know that \( \mu(B) = 0 \) so \( B \in \ms N \) also. However, in general there might be subsets of \( A \) that are not in \( \ms S \). This leads naturally to the following definition.

The measure space \( (S, \ms S, \mu) \) is complete if \( A \in \ms N \) and \( B \subseteq A \) imply \( B \in \ms S \) (and hence \( B \in \ms N \)).

Our goal in this discussion is to show that if \( (S, \ms S, \mu) \) is a \( \sigma \)-finite measure space that is not complete, then it can be completed. That is, \( \mu \) can be extended to a \( \sigma \)-algebra that includes all of the sets in \( \ms S \) and all subsets of null sets. The first step is to extend the equivalence relation defined in our previous discussion to \( \ms P(S) \).

For \( A, \, B \subseteq S \), define \( A \equiv B \) if and only if there exists \( N \in \ms N \) such that \( A \bigtriangleup B \subseteq N \). The relation \( \equiv \) is an equivalence relation on \( \ms{P}(S) \): For \( A, \, B, \, C \subseteq S \),

  1. \( A \equiv A \) (the reflexive property).
  2. If \( A \equiv B \) then \( B \equiv A \) (the symmetric property).
  3. If \( A \equiv B \) and \( B \equiv C \) then \( A \equiv C \) (the transitive property).
Details:
  1. Note that \( A \bigtriangleup A = \emptyset \) and \( \emptyset \in \ms N \).
  2. Suppose that \( A \bigtriangleup B \subseteq N \) where \( N \in \ms N \). Then \( B \bigtriangleup A = A \bigtriangleup B \subseteq N\).
  3. Suppose that \( A \bigtriangleup B \subseteq N_1 \) and \( B \bigtriangleup C \subseteq N_2\) where \( N_1, \; N_2 \in \ms N \). Then \( A \bigtriangleup C \subseteq (A \bigtriangleup B) \cup (B \bigtriangleup C) \subseteq N_1 \cup N_2 \), and \( N_1 \cup N_2 \in \ms N \).

So the equivalence relation \( \equiv \) partitions \( \ms P(S) \) into mutually disjoint equivalence classes. Two sets in an equivalence class differ by a subset of a null set. In particular, \( A \equiv \emptyset \) if and only if \( A \subseteq N \) for some \( N \in \ms N \). The extended relation \( \equiv \) is preserved under the set operations, just as before. Our next step is to enlarge the \( \sigma \)-algebra \( \ms S \) by adding any set that is equivalent to a set in \( \ms S \).

Let \( \ms S_0 = \{A \subseteq S: A \equiv B \text{ for some } B \in \ms S \} \). Then \( \ms S_0 \) is a \( \sigma \)-algebra of subsets of \( S \), and in fact is the \( \sigma \)-algebra generated by \( \ms S \cup \{A \subseteq S: A \equiv \emptyset\} \).

Details:

Note that if \( A \in \ms S \) then \( A \equiv A \) so \( A \in \ms S_0 \). In particular, \( S \in \ms S_0 \). Also, \( \emptyset \in \ms S \) so if \( A \equiv \emptyset \) then \( A \in \ms S_0 \). Suppose that \( A \in \ms S_0 \) so that \( A \equiv B \) for some \( B \in \ms S \). Then \( B^c \in \ms S \) and \( A^c \equiv B^c \) so \( A^c \in \ms S_0 \). Next suppose that \( A_i \in \ms S_0 \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \) there exists \( B_i \in \ms S \) such that \( A_i \equiv B_i \). But then \( \bigcup_{i \in I} B_i \in \ms S \) and \( \bigcup_{i \in I} A_i \equiv \bigcup_{i \in I} B_i \), so \( \bigcup_{i \in I} A_i \in \ms S_0 \). Therefore \( \ms S_0 \) is a \( \sigma \)-algebra of subsets of \( S \). Finally, suppose that \( \ms T \) is a \( \sigma \)-algebra of subsets of \( S \) and that \( \ms S \cup \{A \subseteq S: A \equiv \emptyset\} \subseteq \ms T \). We need to show that \( \ms S_0 \subseteq \ms T \). Thus, suppose that \( A \in \ms S_0 \). Then there exists \( B \in \ms S \) such that \( A \equiv B \). But \( B \in \ms T \) and \( A \bigtriangleup B \in \ms T \) (since \( A \bigtriangleup B \equiv \emptyset \)) so \( A \cap B = B \setminus (A \bigtriangleup B) \in \ms T\). Also \( A \setminus B \in \ms T \) (since \( A \setminus B \subseteq A \bigtriangleup B \) and hence \( A \setminus B \equiv \emptyset \)), so \( A = (A \cap B) \cup (A \setminus B) \in \ms T \).

Our last step is to extend \( \mu \) to a positive measure on the enlarged \( \sigma \)-algebra \( \ms S_0 \).

Suppose that \( A \in \ms S_0 \) so that \( A \equiv B \) for some \( B \in \ms S \). Define \( \mu_0(A) = \mu(B) \). Then

  1. \( \mu_0 \) is well defined.
  2. \( \mu_0(A) = \mu(A) \) for \( A \in \ms S \).
  3. \( \mu_0 \) is a positive measure on \( \ms S_0 \).

The measure space \( (S, \ms S_0, \mu_0) \) is complete and is known as the completion of \( (S, \ms S, \mu) \).

Details:
  1. Suppose that \( A \in \ms S_0 \) and that \( A \equiv B_1 \) and \( A \equiv B_2 \) where \( B_1, \, B_2 \in \ms S \). Then \(B_1 \equiv B_2 \) so, since equivalent sets in \( \ms S \) have the same measure, \( \mu(B_1) = \mu(B_2) \). Thus, \( \mu_0 \) is well-defined.
  2. Next, if \( A \in \ms S \) then of course \( A \equiv A \) so \( \mu_0(A) = \mu(A) \).
  3. Trivially \( \mu_0(A) \ge 0 \) for \( A \in \ms S_0 \). Thus we just need to show the countable additivity property. To understand the proof you need to keep several facts in mind: the functions \( \mu \) and \( \mu_0 \) agree on \( \ms S \) (property (b)); equivalence is preserved under set operations; equivalent sets have the same value under \( \mu_0 \) (property (a)). Since the measure space \( (S, \ms S, \mu) \) is \( \sigma \)-finite, there exists a countable disjoint collection \( \{C_i: i \in I\} \) of sets in \( \ms S \) such that \( S = \bigcup_{i \in I} C_i \) and \( \mu(C_i) \lt \infty \) for each \( i \in I \). Suppose first that \( A \in \ms S_0 \), so that there exists \( B \in \ms S \) with \( A \equiv B \). Then \[\mu_0(A) = \mu_0\left[\bigcup_{i \in I} (A \cap C_i)\right] = \mu\left[\bigcup_{i \in I} (B \cap C_i)\right] = \sum_{i \in I} \mu(B \cap C_i) = \sum_{i \in I} \mu_0(A \cap C_i)\] Suppose next that \( (A_1, A_2, \ldots) \) is a sequence of pairwise disjoint sets in \( \ms S_0 \) so that there exists a sequence \( (B_1, B_2, \ldots) \) of sets in \( \ms S \) such that \( A_i \equiv B_i \) for each \( i \in \N_+ \). For fixed \( i \in I \), \[ \mu_0\left[\bigcup_{n=1}^\infty (A_n \cap C_i)\right] = \mu_0\left[\bigcup_{n=1}^\infty (B_n \cap C_i)\right] = \mu\left[\bigcup_{n=1}^\infty (B_n \cap C_i)\right] = \sum_{n=1}^\infty \mu(B_n \cap C_i) = \sum_{n=1}^\infty \mu_0(A_n \cap C_i) \] The next-to-the-last equality uses the inclusion-exclusion law, since we don't know (and it's probably not true) that the sequence \( (B_1, B_2, \ldots) \) is disjoint. The use of inclusion-exclusion is why we need \( (S, \ms S, \mu) \) to be \( \sigma \)-finite.
Finally, using the previous displayed equations, \begin{align*} \mu_0\left(\bigcup_{n=1}^\infty A_n\right) & = \sum_{i \in I} \mu_0\left[\left(\bigcup_{n=1}^\infty A_n\right) \cap C_i\right] = \sum_{i \in I} \mu_0\left(\bigcup_{n=1}^\infty A_n \cap C_i \right) \\ & = \sum_{i \in I} \sum_{n=1}^\infty \mu_0(A_n \cap C_i) = \sum_{n=1}^\infty \sum_{i \in I} \mu_0(A_n \cap C_i) = \sum_{n=1}^\infty \mu_0(A_n) \end{align*}
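The construction can be traced by hand on a very small space. The sketch below is only an illustration under assumed data: a three-point set with a \( \sigma \)-algebra in which \( \{2, 3\} \) is null but \( \{2\} \) and \( \{3\} \) are not measurable. The helper names `powerset` and `completion` are hypothetical. The code builds \( \ms S_0 \) and \( \mu_0 \) directly from the definitions.

```python
from itertools import combinations

S = frozenset({1, 2, 3})
# sigma-algebra and measure, chosen for illustration: {2, 3} is a null set,
# but its subsets {2} and {3} are not measurable, so the space is not complete.
mu = {
    frozenset(): 0,
    frozenset({1}): 1,
    frozenset({2, 3}): 0,
    S: 1,
}
null_sets = [N for N, m in mu.items() if m == 0]

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def completion(mu, null_sets, S):
    """Return mu_0 as a dict on the enlarged sigma-algebra S_0."""
    mu0 = {}
    for A in powerset(S):
        for B, m in mu.items():
            # A is equivalent to B if their symmetric difference lies in a null set
            if any((A ^ B) <= N for N in null_sets):
                mu0[A] = m   # well defined: every valid B gives the same value
                break
    return mu0

mu0 = completion(mu, null_sets, S)
for A in sorted(mu0, key=lambda A: (len(A), sorted(A))):
    print(sorted(A), mu0[A])
```

In this toy case every subset of \( S \) ends up in \( \ms S_0 \), and \( \mu_0(A) \) is 1 or 0 according to whether \( 1 \in A \) or not.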

Examples and Exercises

As always, be sure to try the computational exercises and proofs yourself before expanding the details. Recall that a discrete measure space consists of a countable set, with the \( \sigma \)-algebra of all subsets, and with counting measure \( \# \).

Counterexamples

The continuity theorem for decreasing events can fail if the events do not have finite measure.

Consider \( \Z \) with counting measure \( \# \) on the \( \sigma \)-algebra of all subsets. Let \( A_n = \{ z \in \Z: z \le -n\} \) for \( n \in \N_+ \). The continuity theorem fails for \( (A_1, A_2, \ldots) \).

Details:

The sequence is decreasing and \( \#(A_n) = \infty \) for each \( n \), but \( \# \left(\bigcap_{i=1}^\infty A_i\right) = \#(\emptyset) = 0 \).

Equal measure certainly does not imply equivalent sets.

Suppose that \( (S, \ms S, \mu) \) is a measure space with the property that there exist disjoint sets \( A, \, B \in \ms S\) such that \( \mu(A) = \mu(B) \gt 0 \). Then \( A \) and \( B \) are not equivalent.

Details:

Note that \( A \bigtriangleup B = A \cup B \) and \( \mu(A \cup B) \gt 0 \).

For a concrete example, we could take \( S = \{0, 1\} \) with counting measure \( \# \) on the \( \sigma \)-algebra of all subsets, and \( A = \{0\} \), \( B = \{1\} \).

The \( \sigma \)-finite property is not necessarily inherited by a sub-measure space. To set the stage for the counterexample, let \( \ms R \) denote the Borel \( \sigma \)-algebra of \( \R \), that is, the \( \sigma \)-algebra generated by the standard Euclidean topology. There exists a positive measure \( \lambda \) on \( (\R, \ms R) \) that generalizes length. The measure \( \lambda \), known as Lebesgue measure, is constructed in the section on existence and uniqueness. Next let \( \ms C \) denote the \( \sigma \)-algebra of countable and co-countable sets: \[ \ms C = \{A \subseteq \R: A \text{ is countable or } A^c \text{ is countable}\} \] Recall that \( \ms C \) is a \( \sigma \)-algebra.

\( (\R, \ms C) \) is a subspace of \( (\R, \ms R) \). Moreover, \( (\R, \ms R, \lambda) \) is \( \sigma \)-finite but \( (\R, \ms C, \lambda) \) is not.

Details:

If \( x \in \R \), then the singleton \( \{x\} \) is closed and hence is in \( \ms R \). A countable set is a countable union of singletons, so if \( A \) is countable then \( A \in \ms R \). It follows that \( \ms C \subset \ms R \). Next, let \( I_n \) denote the interval \( [n, n + 1) \) for \( n \in \Z \). Then \( \lambda(I_n) = 1 \) for \( n \in \Z \) and \( \R = \bigcup_{n \in \Z} I_n \), so \( (\R, \ms R, \lambda) \) is \( \sigma \)-finite. On the other hand, \( \lambda\{x\} = 0 \) for \( x \in \R \) (since the set is an interval of length 0). Therefore \( \lambda(A) = 0 \) if \( A \) is countable and \( \lambda(A) = \infty \) if \( A^c \) is countable. It follows that \( \R \) cannot be written as a countable union of sets in \( \ms C \), each with finite measure.

We can modify the previous example to construct a measure induced by a measurable function that is not \(\sigma\)-finite, even though the original measure is.

Consider again the measure space \((\R, \ms R, \lambda)\) where \(\ms R\) is the Borel \(\sigma\)-algebra on \(\R\) and where \(\lambda\) is Lebesgue (length) measure on \((\R, \ms R)\). Let \(\ms C\) denote the \(\sigma\)-algebra of countable and co-countable subsets of \(\R\). Let \(f: \R \to \R\) be the identity function so that \(f(x) = x\) for \(x \in \R\). Then

  1. \(f\) is measurable relative to \((\R, \ms R)\) and \((\R, \ms C)\).
  2. \(f\) is one-to-one.
  3. \((\R, \ms R, \lambda)\) is \(\sigma\)-finite.
  4. \((\R, \ms C, \nu)\) is not \(\sigma\)-finite where \(\nu\) is the measure induced by \(f\).
Details:

Note that \(f^{-1}(A) = A\) for \(A \in \ms C\). Part (a) follows since countable and co-countable sets are in \(\ms R\). Part (b) is trivial and part (c) was shown in the previous example. For part (d), note that \(\nu(A) = \lambda(A)\) for \(A \in \ms C\), so \(\nu(A) = 0\) if \(A\) is countable and \(\nu(A) = \infty\) if \(A^c\) is countable. Hence \((\R, \ms C, \nu)\) is not \(\sigma\)-finite.

A sum of finite measures may not be \( \sigma \)-finite.

Let \( S \) be a nonempty, finite set with the \( \sigma \)-algebra \( \ms S \) of all subsets. Let \( \mu_n\) be counting measure \(\#\) on \( (S, \ms S) \) for \( n \in \N_+ \). Then \( \mu_n \) is a finite measure for each \( n \in \N_+ \), but \( \mu = \sum_{n \in \N_+} \mu_n \) is not \( \sigma \)-finite.

Details:

Note that \( \mu \) is the trivial measure on \( (S, \ms S) \) given by \( \mu(A) = \infty \) if \( A \ne \emptyset \) (and of course \( \mu(\emptyset) = 0 \)).
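To see why \( \mu \) fails to be \( \sigma \)-finite, one can enumerate the subsets of a small example and collect those of finite measure. This is a minimal sketch with an arbitrarily chosen three-point set; the names are illustrative only.

```python
import math
from itertools import combinations

S = ["a", "b", "c"]   # an arbitrary nonempty finite set, chosen for illustration

def mu(A):
    """Sum over n of counting measure: len(A) + len(A) + ... ."""
    return 0 if len(A) == 0 else math.inf

subsets = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]
finite_sets = [A for A in subsets if mu(A) < math.inf]

# The union of all sets of finite measure; sigma-finiteness would need this to be S.
covered = set().union(*finite_sets)
print(finite_sets, covered == set(S))
```

Only the empty set has finite measure, so no countable collection of sets of finite measure can cover \( S \).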

Basic Properties

In the following problems, \( \mu \) is a positive measure on the measurable space \( (S, \ms S) \).

Suppose that \( \mu(S) = 20 \) and that \(A, B \in \ms S\) with \(\mu(A) = 5\), \(\mu(B) = 6 \), \(\mu(A \cap B) = 2\). Find the measure of each of the following sets:

  1. \(A \setminus B\)
  2. \(A \cup B\)
  3. \(A^c \cup B^c\)
  4. \(A^c \cap B^c\)
  5. \(A \cup B^c\)
Details:
  1. 3
  2. 9
  3. 18
  4. 11
  5. 16
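The answers follow from the difference rule, inclusion-exclusion, the complement rule (valid here since \( \mu(S) \lt \infty \)), and De Morgan's laws. A short script, with variable names chosen only for illustration, reproduces the arithmetic:

```python
# Given data from the exercise
mu_S, mu_A, mu_B, mu_AB = 20, 5, 6, 2

a_minus_b = mu_A - mu_AB            # difference rule
a_union_b = mu_A + mu_B - mu_AB     # inclusion-exclusion
ac_union_bc = mu_S - mu_AB          # De Morgan: complement of the intersection
ac_inter_bc = mu_S - a_union_b      # De Morgan: complement of the union
b_minus_a = mu_B - mu_AB
a_union_bc = mu_S - b_minus_a       # A union B^c is the complement of B \ A

print(a_minus_b, a_union_b, ac_union_bc, ac_inter_bc, a_union_bc)
# prints: 3 9 18 11 16
```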

Suppose that \( \mu(S) = \infty \) and that \(A, \, B \in \ms S\) with \(\mu(A \setminus B) = 2\), \(\mu(B \setminus A) = 3\), and \(\mu(A \cap B) = 4\). Find the measure of each of the following sets:

  1. \(A\)
  2. \(B\)
  3. \(A \cup B\)
  4. \( A^c \cap B^c \)
  5. \( A^c \cup B^c \)
Details:
  1. 6
  2. 7
  3. 9
  4. \(\infty\)
  5. \(\infty\)
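Here the three given sets are disjoint pieces, so the first three answers are sums. The last two rely on the fact that removing a set of finite measure from a set of infinite measure leaves infinite measure. The sketch below mirrors this with Python's `math.inf`; the variable names are illustrative.

```python
import math

mu_S = math.inf
mu_A_only, mu_B_only, mu_AB = 2, 3, 4   # measures of A \ B, B \ A, and A intersect B

mu_A = mu_A_only + mu_AB
mu_B = mu_B_only + mu_AB
mu_union = mu_A_only + mu_B_only + mu_AB
# Subtraction is legitimate here because the set removed has finite measure
mu_neither = mu_S - mu_union            # complement of the union
mu_not_both = mu_S - mu_AB              # complement of the intersection

print(mu_A, mu_B, mu_union, mu_neither, mu_not_both)
# prints: 6 7 9 inf inf
```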

Suppose that \( \mu(S) = 10 \) and that \(A, \, B \in \ms S\) with \(\mu(A) = 3\), \(\mu(A \cup B) = 7\), and \(\mu(A \cap B) = 2\). Find the measure of each of the following sets:

  1. \(B\)
  2. \(A \setminus B\)
  3. \(B \setminus A\)
  4. \(A^c \cup B^c\)
  5. \(A^c \cap B^c\)
Details:
  1. 6
  2. 1
  3. 4
  4. 8
  5. 3

Suppose that \( A, \, B, \, C \in \ms S \) with \( \mu(A) = 10 \), \( \mu(B) = 12 \), \( \mu(C) = 15 \), \( \mu(A \cap B) = 3 \), \( \mu(A \cap C) = 4 \), \( \mu(B \cap C) = 5 \), and \( \mu(A \cap B \cap C) = 1 \). Find the measures of the various unions:

  1. \( A \cup B \)
  2. \( A \cup C \)
  3. \( B \cup C \)
  4. \( A \cup B \cup C \)
Details:
  1. 19
  2. 21
  3. 22
  4. 26
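The inclusion-exclusion arithmetic for this exercise can be checked with a short script. This is a sketch that takes the measure of the triple intersection to be 1 (an assumption about the intended data) and cross-checks the three-set formula against the measures of the seven disjoint pieces of the Venn diagram. All variable names are illustrative.

```python
# Given data, reading the measure of the triple intersection as 1 (an assumption)
a, b, c = 10, 12, 15
ab, ac, bc, abc = 3, 4, 5, 1

# Inclusion-exclusion for two and three sets
union_ab = a + b - ab
union_ac = a + c - ac
union_bc = b + c - bc
union_abc = a + b + c - ab - ac - bc + abc

# Cross-check: measures of the seven disjoint pieces of the Venn diagram
only_ab, only_ac, only_bc = ab - abc, ac - abc, bc - abc
only_a = a - only_ab - only_ac - abc
only_b = b - only_ab - only_bc - abc
only_c = c - only_ac - only_bc - abc
pieces = [only_a, only_b, only_c, only_ab, only_ac, only_bc, abc]

assert all(p >= 0 for p in pieces)   # the data are consistent
assert sum(pieces) == union_abc      # additivity agrees with inclusion-exclusion

print(union_ab, union_ac, union_bc, union_abc)
```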