\(\newcommand{\R}{\mathbb{R}}\)
\(\newcommand{\N}{\mathbb{N}}\)
\(\newcommand{\E}{\mathbb{E}}\)
\(\newcommand{\P}{\mathbb{P}}\)
\( \newcommand{\bs}{\boldsymbol} \)

- General Uniform Distributions

This section explores uniform distributions in an abstract setting. If you are a new student of probability, or are not familiar with measure theory, you may want to skip this section and read the sections on the uniform distribution on an interval and the discrete uniform distributions.

Suppose that \( (S, \mathscr{S}, \lambda) \) is a measure space. That is, \( S \) is a set, \( \mathscr{S} \) a \( \sigma \)-algebra of subsets of \( S \), and \( \lambda \) a positive measure on \( \mathscr{S} \). Suppose also that \( 0 \lt \lambda(S) \lt \infty \), so that \( \lambda \) is a finite positive measure.

A random variable \( X \) with values in \( S \) has the uniform distribution on \( S \) (with respect to \( \lambda \)) if \[ \P(X \in A) = \frac{\lambda(A)}{\lambda(S)}, \quad A \in \mathscr{S} \]

Thus, the probability assigned to a set \( A \in \mathscr{S}\) depends only on the size of \( A \) (as measured by \( \lambda \)). The most common special cases are as follows:

- \( S \) is a measurable subset of \( \R^n \) (with positive, finite measure), \( \mathscr{S} \) is the collection of measurable subsets of \( S \), and \( \lambda = \lambda_n \), the usual Lebesgue measure on \( \R^n \). Recall that \( \lambda_1 \) is length measure on \( \R \), \( \lambda_2 \) is area measure on \( \R^2 \), \( \lambda_3 \) is volume measure on \( \R^3 \), and in general \( \lambda_n \) is sometimes referred to as \( n \)-dimensional volume. Thus, \( S \subseteq \R^n \) is a set with positive, finite volume.
- \( S \) is a non-empty, finite set, \( \mathscr{S} \) is the \( \sigma \)-algebra of all subsets of \( S \), and \( \lambda = \# \) (counting measure).
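
The two special cases above can be illustrated with a short Python sketch. The function names `uniform_prob_counting` and `uniform_prob_interval` are hypothetical helpers introduced only for this example, and the second one assumes the simplest Lebesgue case, where \( S = [a, b] \) is an interval and \( A = [c, d] \).

```python
from fractions import Fraction

def uniform_prob_counting(A, S):
    """P(X in A) = #(A) / #(S) for X uniform on a non-empty finite set S."""
    S = set(S)
    if not S:
        raise ValueError("S must be non-empty")
    A = set(A) & S  # only the part of A inside S carries probability
    return Fraction(len(A), len(S))

def uniform_prob_interval(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b] with respect to length measure."""
    if not a < b:
        raise ValueError("need a < b so that the length of S is positive and finite")
    lo, hi = max(a, c), min(b, d)  # intersect [c, d] with [a, b]
    return max(hi - lo, 0.0) / (b - a)

die = range(1, 7)
print(uniform_prob_counting({2, 4, 6}, die))        # 1/2
print(uniform_prob_interval(0.25, 0.75, 0.0, 2.0))  # 0.25
```

In both cases the probability is a ratio of sizes; only the notion of size (count or length) changes.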

Suppose that \( X \) is uniformly distributed on \( S \), as above.

The probability density function \( f \) of \( X \) (with respect to \( \lambda \)) is \[ f(x) = \frac{1}{\lambda(S)}, \quad x \in S \]

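This follows directly from the definition of a probability density function, since \( \int_A \frac{1}{\lambda(S)} \, d\lambda(x) = \frac{\lambda(A)}{\lambda(S)} = \P(X \in A) \) for \( A \in \mathscr{S} \).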

Thus, the defining property of the uniform distribution on a set is constant density on that set. Another basic property is that uniform distributions are preserved under conditioning.

Suppose that \( R \in \mathscr{S} \) with \( \lambda(R) \gt 0 \). The conditional distribution of \( X \) given \( X \in R \) is uniform on \( R \).

If \( A \in \mathscr{S} \) and \( A \subseteq R \), then \( \{X \in A\} \subseteq \{X \in R\} \), so \[ \P(X \in A \mid X \in R) = \frac{\P(X \in A)}{\P(X \in R)} = \frac{\lambda(A)/\lambda(S)}{\lambda(R)/\lambda(S)} = \frac{\lambda(A)}{\lambda(R)} \]
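
As a small sanity check of the conditioning result, the following sketch works through a finite case exactly. The particular sets \( S = \{1, \ldots, 12\} \) and \( R = \{3, 6, 9, 12\} \), and the helper names `prob` and `cond_prob`, are assumptions chosen only for illustration.

```python
from fractions import Fraction

S = set(range(1, 13))             # X uniform on S = {1, ..., 12}
R = {n for n in S if n % 3 == 0}  # condition on X in R = {3, 6, 9, 12}

def prob(A):
    """P(X in A) = #(A) / #(S) under the uniform distribution on S."""
    return Fraction(len(A & S), len(S))

def cond_prob(A, R):
    """P(X in A | X in R), computed from the definition of conditional probability."""
    return prob(A & R) / prob(R)

# Each point of R should receive the same conditional probability, 1 / #(R).
conditional_pmf = {x: cond_prob({x}, R) for x in sorted(R)}
print(conditional_pmf)
```

Every value in `conditional_pmf` equals \( 1 / \#(R) \), which is the density of the uniform distribution on \( R \).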

In the setting of the previous result, suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of independent variables, each uniformly distributed on \( S \). Let \( N = \min\{n \in \N_+: X_n \in R\} \). Then \( N \) has the geometric distribution on \( \N_+ \) with success parameter \( p = \P(X \in R) = \lambda(R) \big/ \lambda(S) \). More importantly, the distribution of \( X_N \) is the same as the conditional distribution of \( X \) given \( X \in R \), and hence is uniform on \( R \). This is the basis of the rejection method of simulation. If we can simulate a uniform distribution on \( S \), then we can simulate a uniform distribution on \( R \).
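
The rejection method described above can be sketched in Python. This is a minimal illustration under assumed choices, not a definitive implementation: \( S \) is taken to be the square \( [-1, 1]^2 \), \( R \) the unit disk, and the function name `uniform_on_disk` and the seed are arbitrary.

```python
import math
import random

def uniform_on_disk(rng):
    """Rejection method: draw uniform points in the square S = [-1, 1]^2
    until one lands in the unit disk R. Returns the accepted point X_N
    and the number of trials N."""
    n = 0
    while True:
        n += 1
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return (x, y), n

rng = random.Random(12345)  # fixed seed, chosen arbitrarily for reproducibility
runs = [uniform_on_disk(rng) for _ in range(20000)]

p = math.pi / 4  # P(X in R) = area(R) / area(S)
mean_trials = sum(n for _, n in runs) / len(runs)
print(f"sample mean of N: {mean_trials:.3f}  (expected value 1/p = {1 / p:.3f})")

# If X_N is uniform on the disk, the disk of radius 1/2 should catch about 1/4 of the points.
inner = sum(1 for (x, y), _ in runs if x * x + y * y <= 0.25) / len(runs)
print(f"fraction within radius 1/2: {inner:.3f}  (area ratio = 0.250)")
```

The sample mean of \( N \) should be close to \( 1 / p = 4 / \pi \), consistent with the geometric distribution, and the accepted points should spread over the disk in proportion to area.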

If \( h \) is a real-valued function on \( S \), then \( \E[h(X)] \) is the average value of \( h \) on \( S \), as measured by \( \lambda \):

If \( h: S \to \R \) is integrable with respect to \( \lambda \), then \[ \E[h(X)] = \frac{1}{\lambda(S)} \int_S h(x) \, d\lambda(x) \]

This result follows from the change of variables theorem for expected value, since \( \E[h(X)] = \int_S h(x) f(x) \, d\lambda(x) \).
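
To make the "average value" interpretation concrete, here is a short sketch covering both special cases. The helper names and the test functions (\( h(x) = x^2 \) on a die, \( h = \sin \) on \( [0, \pi] \)) are assumptions for illustration; the interval case uses a plain Monte Carlo estimate rather than exact integration.

```python
import math
import random
from fractions import Fraction

def expected_value_finite(h, S):
    """E[h(X)] for X uniform on a finite set S: the plain average of h over S."""
    S = list(S)
    return sum(Fraction(h(x)) for x in S) / len(S)

def expected_value_interval_mc(h, a, b, n, rng):
    """Monte Carlo estimate of E[h(X)] = (1 / (b - a)) * integral of h over [a, b]."""
    return sum(h(rng.uniform(a, b)) for _ in range(n)) / n

print(expected_value_finite(lambda x: x * x, range(1, 7)))  # 91/6

rng = random.Random(2024)  # arbitrary seed for reproducibility
est = expected_value_interval_mc(math.sin, 0.0, math.pi, 100_000, rng)
print(est, 2 / math.pi)  # the average value of sin on [0, pi] is 2 / pi
```

In the finite case the integral with respect to counting measure is a sum, so the formula reduces to the arithmetic mean of the values of \( h \).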

The entropy of the uniform distribution on \( S \) depends only on the size of \( S \), as measured by \( \lambda \):

The entropy of \( X \) is \( H(X) = \ln[\lambda(S)] \).
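
The entropy formula can be verified in one line. The following is a sketch that assumes the usual definition of entropy relative to \( \lambda \), namely \( H(X) = \E[-\ln f(X)] \) where \( f \) is the density of \( X \) with respect to \( \lambda \) (this definition is not restated in this section). Since \( f(x) = 1 \big/ \lambda(S) \) for \( x \in S \), \[ H(X) = -\int_S f(x) \ln f(x) \, d\lambda(x) = \int_S \frac{1}{\lambda(S)} \ln[\lambda(S)] \, d\lambda(x) = \frac{\lambda(S)}{\lambda(S)} \ln[\lambda(S)] = \ln[\lambda(S)] \]

For example, with counting measure on a finite set this gives \( H(X) = \ln[\#(S)] \), and with length measure on an interval \( [a, b] \) it gives \( H(X) = \ln(b - a) \), which is negative when \( b - a \lt 1 \).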

Suppose now that \( (S, \mathscr{S}, \lambda) \) and \( (T, \mathscr{T}, \mu) \) are measure spaces as above, with \( 0 \lt \lambda(S) \lt \infty \) and \( 0 \lt \mu(T) \lt \infty \). Recall the product space \( (S \times T, \mathscr{S} \otimes \mathscr{T}, \lambda \otimes \mu) \).

\( (X, Y) \) is uniformly distributed on \( S \times T \) if and only if \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent.

Suppose first that \( (X, Y) \) is uniformly distributed on \( S \times T\). If \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \) then \[ \P(X \in A, Y \in B) = \P[(X, Y) \in A \times B] = \frac{(\lambda \otimes \mu)(A \times B)}{(\lambda \otimes \mu)(S \times T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} \] Taking \( B = T \) in the displayed equation gives \( \P(X \in A) = \lambda(A) \big/ \lambda(S) \) for \( A \in \mathscr{S} \), so \( X \) is uniformly distributed on \( S \). Taking \( A = S \) in the displayed equation gives \( \P(Y \in B) = \mu(B) \big/ \mu(T) \) for \( B \in \mathscr{T} \), so \( Y \) is uniformly distributed on \( T \). Returning to the displayed equation generally gives \( \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) \) for \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \), so \( X \) and \( Y \) are independent.

Conversely, suppose that \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent. Then for \( A \in \mathscr{S} \) and \( B \in \mathscr{T} \), \[ \P[(X, Y) \in A \times B] = \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{(\lambda \otimes \mu)(A \times B)}{(\lambda \otimes \mu)(S \times T)} \] It then follows (see the section on existence and uniqueness of measures) that \( \P[(X, Y) \in C] = (\lambda \otimes \mu)(C) / (\lambda \otimes \mu)(S \times T) \) for every \( C \in \mathscr{S} \otimes \mathscr{T} \), so \( (X, Y) \) is uniformly distributed on \( S \times T \).
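
The forward direction of the product result can be checked exactly in a small finite case. In this sketch the sets `S` and `T` are hypothetical examples with counting measure, and it suffices to check single points because a finite joint density that factors into its marginals implies independence.

```python
from fractions import Fraction
from itertools import product

S = ["a", "b", "c"]  # hypothetical finite sets with counting measure
T = [0, 1]

# Start from the uniform distribution on S x T.
joint = {(x, y): Fraction(1, len(S) * len(T)) for x, y in product(S, T)}

# Marginals are obtained by summing the joint density over the other coordinate.
marg_x = {x: sum(joint[x, y] for y in T) for x in S}
marg_y = {y: sum(joint[x, y] for x in S) for y in T}

uniform_marginals = (all(p == Fraction(1, len(S)) for p in marg_x.values())
                     and all(p == Fraction(1, len(T)) for p in marg_y.values()))
independent = all(joint[x, y] == marg_x[x] * marg_y[y] for x, y in product(S, T))
print(uniform_marginals, independent)  # True True
```

Reading the factorization `joint[x, y] == marg_x[x] * marg_y[y]` from right to left gives the converse: independent uniform marginals multiply to the constant density \( 1 \big/ [\#(S) \, \#(T)] \) on \( S \times T \).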