\( \newcommand{\R}{\mathbb{R}} \)
\( \newcommand{\N}{\mathbb{N}} \)
\( \newcommand{\Z}{\mathbb{Z}} \)
\( \newcommand{\E}{\mathbb{E}} \)
\( \newcommand{\P}{\mathbb{P}} \)
\( \newcommand{\var}{\text{var}} \)
\( \newcommand{\sd}{\text{sd}} \)
\( \newcommand{\bs}{\boldsymbol} \)
\( \newcommand{\sgn}{\text{sgn}} \)
\( \newcommand{\skw}{\text{skew}} \)
\( \newcommand{\kur}{\text{kurt}} \)

The Cauchy Distribution

The Cauchy distribution, named of course for the ubiquitous Augustin Cauchy, is interesting for a couple of reasons. First, it is a simple family of distributions for which the expected value (and other moments) do not exist. Second, the family is closed under the formation of sums of independent variables, and hence is an infinitely divisible family of distributions.

Random variable \( X \) has the standard Cauchy distribution if \( X \) has a continuous distribution on \( \R \) with probability density function \( g \) given by \[ g(x) = \frac{1}{\pi \left(1 + x^2\right)}, \quad x \in \R \]

Yes, \( g \) really is a probability density function.

- \( g \) is symmetric about \( x = 0 \).
- \(g\) increases and then decreases, with mode \( x = 0 \).
- \(g\) is concave upward, then downward, and then upward again, with inflection points at \(x = \pm \frac{1}{\sqrt{3}}\).
- \( g(x) \to 0 \) as \( x \to \infty \) and as \( x \to -\infty \).

Note that \[ \int_{-\infty}^\infty \frac{1}{1 + x^2} \, dx = \arctan(x) \biggm|_{-\infty}^\infty = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi \] and hence \( g \) is a PDF. Parts (a)–(d) follow from basic calculus.

Thus, the graph of \(g\) has a simple, symmetric, unimodal shape that is qualitatively (but certainly not quantitatively) like the standard normal probability density function. The probability density function \(g\) is obtained by normalizing the function \[ x \mapsto \frac{1}{1 + x^2}, \quad x \in \R \] The graph of this function is known as the witch of Agnesi, named for the Italian mathematician Maria Agnesi.

Open the special distribution simulator and select the Cauchy distribution. Keep the default parameter values to get the standard Cauchy distribution and note the shape and location of the probability density function. Run the simulation 1000 times and compare the empirical density function to the probability density function.

\( X \) has distribution function \( G \) given by \( G(x) = \frac{1}{2} + \frac{1}{\pi} \arctan(x)\) for \(x \in \R \)

For \(x \in \R\), \[ G(x) = \int_{-\infty}^x g(t) \, dt = \frac{1}{\pi} \arctan(t) \biggm|_{-\infty}^x = \frac{1}{\pi} \arctan(x) + \frac{1}{2}\]

\(X\) has quantile function \(G^{-1}\) given by \( G^{-1}(p) = \tan\left[\pi\left(p - \frac{1}{2}\right)\right]\) for \(p \in (0, 1) \). In particular,

- The first quartile is \(G^{-1}\left(\frac{1}{4}\right) = -1\).
- The median is \(G^{-1}\left(\frac{1}{2}\right) = 0\).
- The third quartile is \(G^{-1}\left(\frac{3}{4}\right) = 1\).

Of course, the fact that the median is 0 also follows from the symmetry of the distribution, as does the fact that \(G^{-1}(1 - p) = -G^{-1}(p)\) for \(p \in (0, 1)\).
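The formulas for \(g\), \(G\), and \(G^{-1}\) translate directly into code. The following is a minimal sketch in Python using only the standard library; the function names `cauchy_pdf`, `cauchy_cdf`, and `cauchy_quantile` are our own choices for illustration.

```python
import math

def cauchy_pdf(x):
    """Standard Cauchy density g(x) = 1 / (pi (1 + x^2))."""
    return 1.0 / (math.pi * (1.0 + x * x))

def cauchy_cdf(x):
    """Standard Cauchy distribution function G(x) = 1/2 + arctan(x) / pi."""
    return 0.5 + math.atan(x) / math.pi

def cauchy_quantile(p):
    """Standard Cauchy quantile function G^{-1}(p) = tan(pi (p - 1/2)) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.tan(math.pi * (p - 0.5))

# The quartiles should be close to -1, 0, 1 (up to floating-point rounding)
quartiles = [cauchy_quantile(p) for p in (0.25, 0.5, 0.75)]
print(quartiles)
```

Evaluating `cauchy_cdf(cauchy_quantile(p))` for a few values of `p` is a simple way to check that the two functions are inverses of each other.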

Open the special distribution calculator and select the Cauchy distribution. Keep the default parameter values and note the shape of the distribution and probability density functions. Compute a few quantiles.

Suppose again that \(X\) has the standard Cauchy distribution. As we noted in the introduction, part of the fame of this distribution comes from the fact that the expected value does not exist.

\(\E(X)\) does not exist.

By definition, \(\E(X) = \int_{-\infty}^\infty x g(x) \, dx\). For the improper integral to exist, even as an extended real number, at least one of the integrals \(\int_{-\infty}^a x g(x) \, dx\) and \(\int_a^\infty x g(x) \, dx\) must be finite, for some (and hence any) \(a \in \R\). But by a simple substitution, \[ \int_a^\infty x g(x) \, dx = \int_a^\infty x \frac{1}{\pi (1 + x^2)} \, dx = \frac{1}{2 \pi} \ln(1 + x^2) \biggm|_a^\infty = \infty \] and similarly, \(\int_{-\infty}^a x g(x) \, dx = - \infty\).

By symmetry, if the expected value *did* exist, it would have to be 0, just like the median and the mode, but alas the mean does not exist. Moreover, this is not just an artifact of how mathematicians define improper integrals, but has real consequences. Recall that if we think of the probability distribution as a mass distribution, then the mean is the center of mass, the balance point, the point where the moment (in the sense of physics) to the right is balanced by the moment to the left. But as the proof above shows, the moments to the right and to the left at any point \(a \in \R\) are infinite. In this sense, 0 is no more important than any other \(a \in \R\). Finally, if you are not convinced by the argument from physics, the next exercise may convince you that the law of large numbers fails as well.

Open the special distribution simulator and select the Cauchy distribution. Keep the default parameter values, which give the standard Cauchy distribution. Run the simulation 1000 times and note the behavior of the sample mean.
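If the simulator is not at hand, the same experiment can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `running_means`, the seed, and the sample size are arbitrary, and the values are simulated by applying the quantile function \(G^{-1}\) given above to uniform random numbers.

```python
import math
import random

def running_means(n, seed=0):
    """Running sample means M_1, ..., M_n of n simulated standard Cauchy values."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for k in range(1, n + 1):
        # Quantile function applied to a uniform variable: tan(pi (U - 1/2))
        total += math.tan(math.pi * (rng.random() - 0.5))
        means.append(total / k)
    return means

means = running_means(100_000)
for k in (10, 100, 1_000, 10_000, 100_000):
    print(k, means[k - 1])
```

For a distribution with a finite mean, the printed values would be expected to settle down as \(k\) grows. Here, occasional extreme observations can move the running mean by a large amount even late in the run; try several seeds to see the effect.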

Earlier we noted some superficial similarities between the standard Cauchy distribution and the standard normal distribution (unimodal, symmetric about 0). But clearly there are huge quantitative differences. The Cauchy distribution is a heavy tailed distribution because the probability density function \(g(x)\) decreases at a polynomial rate as \(x \to \infty\) and \(x \to -\infty\), as opposed to an exponential rate. This is yet another way to understand why the expected value does not exist.
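To get a feel for the size of the difference, the following sketch compares the two-sided tail probabilities \(\P(\left|X\right| \gt x)\) for the standard Cauchy and standard normal distributions. The function names are ours; the Cauchy tail comes from the distribution function \(G\) above, and the normal tail uses the standard identity \(\P(\left|Z\right| \gt x) = \text{erfc}\left(x / \sqrt{2}\right)\).

```python
import math

def cauchy_tail(x):
    """P(|X| > x) for standard Cauchy X and x > 0, from G(x) = 1/2 + arctan(x) / pi."""
    return 1.0 - 2.0 * math.atan(x) / math.pi

def normal_tail(x):
    """P(|Z| > x) for standard normal Z and x > 0, via the complementary error function."""
    return math.erfc(x / math.sqrt(2.0))

for x in (1.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  Cauchy: {cauchy_tail(x):.3e}  normal: {normal_tail(x):.3e}")
```

The Cauchy tail decays roughly like \(2 / (\pi x)\), while the normal tail is already negligible a few units from the origin.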

In terms of the higher moments, \(\E\left(X^n\right)\) does not exist if \(n\) is odd, and is \(\infty\) if \(n\) is even. It follows that the moment generating function \(m(t) = \E\left(e^{t X}\right)\) cannot be finite in an interval about 0. In fact, \(m(t) = \infty\) for every \(t \ne 0\), so this generating function is of no use to us. But *every* distribution on \(\R\) has a characteristic function, and for the Cauchy distribution, this generating function will be quite useful.

\(X\) has characteristic function \(\chi_0\) given by \( \chi_0(t) = \exp\left(-\left|t\right|\right)\) for \(t \in \R \).

By definition, \[ \chi_0(t) = \E(e^{i t X}) = \int_{-\infty}^\infty e^{i t x} \frac{1}{\pi (1 + x^2)} \, dx \] We will compute this integral by evaluating a related contour integral in the complex plane using, appropriately enough, Cauchy's integral formula (named for you know who).

Suppose first that \(t \ge 0\). For \(r \gt 1\), let \(\Gamma_r\) denote the curve in the complex plane consisting of the line segment \(L_r\) on the \(x\)-axis from \(-r\) to \(r\) and the upper half circle \(C_r\) of radius \(r\) centered at the origin. We give \(\Gamma_r\) the usual counter-clockwise orientation. On the one hand we have \[ \int_{\Gamma_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz = \int_{L_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz + \int_{C_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz\] On \(L_r\), \(z = x\) and \(dz = dx\) so \[ \int_{L_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz = \int_{-r}^r \frac{e^{i t x}}{\pi (1 + x^2)} dx \]

On \(C_r\), let \(z = x + i y\). Then \(e^{i t z} = e^{-t y + i t x} = e^{-t y} [\cos(t x) + i \sin(t x)]\). Since \(y \ge 0\) on \(C_r\) and \(t \ge 0\), we have \(|e^{i t z} | \le 1\). Also, \(\left|\frac{1}{1 + z^2}\right| \le \frac{1}{r^2 - 1}\) on \(C_r\). It follows that \[ \left|\int_{C_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz \right| \le \frac{1}{\pi (r^2 - 1)} \pi r = \frac{r}{r^2 - 1} \to 0 \text{ as } r \to \infty \]

On the other hand, \(e^{i t z} / [\pi (1 + z^2)]\) has one singularity inside \(\Gamma_r\), at \(i\). The residue is \[ \lim_{z \to i} (z - i) \frac{e^{i t z}}{\pi (1 + z^2)} = \lim_{z \to i} \frac{e^{i t z}}{\pi(z + i)} = \frac{e^{-t}}{2 \pi i} \] Hence by Cauchy's integral formula, \[ \int_{\Gamma_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz = 2 \pi i \frac{e^{-t}}{2 \pi i} = e^{-t} \] Putting the pieces together we have \[ e^{-t} = \int_{-r}^r \frac{e^{i t x}}{\pi (1 + x^2)} dx + \int_{C_r} \frac{e^{i t z}}{\pi (1 + z^2)} dz \] Letting \(r \to \infty\) gives \[ \int_{-\infty}^\infty \frac{e^{i t x }}{\pi (1 + x^2)} dx = e^{-t} \]

For \(t \lt 0\), we can use the substitution \(u = - x\) and our previous result to get \[ \int_{-\infty}^\infty \frac{e^{i t x}}{\pi (1 + x^2)} dx = \int_{-\infty}^\infty \frac{e^{i (-t) u}}{\pi (1 + u^2)} du = e^{t} \] In both cases, \(\chi_0(t) = e^{-\left|t\right|}\).
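The result can also be checked numerically. The sketch below is a Monte Carlo illustration, not part of the proof: it estimates the real part of \(\E\left(e^{i t X}\right)\), namely \(\E[\cos(t X)]\), from simulated values and compares it with \(e^{-\left|t\right|}\). The imaginary part \(\E[\sin(t X)]\) is 0 by symmetry. The function name, seed, and sample size are our own assumptions.

```python
import math
import random

def empirical_cf_real(t, samples):
    """Real part of the empirical characteristic function: the mean of cos(t x)."""
    return sum(math.cos(t * x) for x in samples) / len(samples)

rng = random.Random(12345)
# Simulate standard Cauchy values as tan(pi (U - 1/2)) with U uniform on [0, 1)
samples = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(200_000)]

for t in (-2.0, -0.5, 0.0, 0.5, 2.0):
    estimate = empirical_cf_real(t, samples)
    print(f"t = {t:5.1f}  empirical: {estimate:.4f}  exp(-|t|): {math.exp(-abs(t)):.4f}")
```

Since \(\left|\cos(t X)\right| \le 1\), the ordinary law of large numbers applies to these averages even though it fails for the sample mean of \(X\) itself.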

The standard Cauchy distribution is a member of the Student \(t\) family of distributions.

The standard Cauchy distribution is the Student \( t \) distribution with one degree of freedom.

The Student \( t \) distribution with one degree of freedom has PDF \( g \) given by \[ g(t) = \frac{\Gamma(1)}{\sqrt{\pi} \Gamma(1/2)} \left(1 + t^2\right)^{-1} = \frac{1}{\pi (1 + t^2)}, \quad t \in \R \] which is the standard Cauchy PDF.

The standard Cauchy distribution also arises naturally as the ratio of independent standard normal variables.

Suppose that \(Z\) and \(W\) are independent random variables, each with the standard normal distribution. Then \(X = Z / W\) has the standard Cauchy distribution. Equivalently, the standard Cauchy distribution is the Student \(t\) distribution with 1 degree of freedom.

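By definition, \(W^2\) has the chi-square distribution with 1 degree of freedom, and is independent of \(Z\). Hence, also by definition, \(Z / \sqrt{W^2} = Z / \left|W\right|\) has the Student \(t\) distribution with 1 degree of freedom. Since the distribution of \(Z\) is symmetric about 0 and \(Z\) is independent of \(W\), the variable \(X = Z / W\) has the same distribution as \(Z / \left|W\right|\), so the theorem follows from the previous result.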

If \(X\) has the standard Cauchy distribution, then so does \(Y = 1 / X\).

This is a corollary of the previous result. Suppose that \(Z\) and \(W\) are independent variables, each with the standard normal distribution. Then \(X = Z / W\) has the standard Cauchy distribution. But then \(1/X = W/Z\) also has the standard Cauchy distribution.
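Both results are easy to explore by simulation. The following sketch (with our own helper names, seed, and sample size) simulates the ratio \(Z / W\) of independent standard normal variables and its reciprocal, and prints the empirical quartiles, which should be close to the theoretical quartiles \(-1\), \(0\), and \(1\) of the standard Cauchy distribution.

```python
import random

def ratio_of_normals(rng):
    """Simulate Z / W for independent standard normal Z and W."""
    z = rng.gauss(0.0, 1.0)
    w = rng.gauss(0.0, 1.0)
    while w == 0.0:  # probability-zero event; guards against division by zero
        w = rng.gauss(0.0, 1.0)
    return z / w

def sample_quartiles(values):
    """Empirical first quartile, median, and third quartile (simple order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 4], ordered[n // 2], ordered[(3 * n) // 4]

rng = random.Random(2024)
ratios = [ratio_of_normals(rng) for _ in range(100_000)]
reciprocals = [1.0 / x for x in ratios if x != 0.0]

print("Z / W:", sample_quartiles(ratios))
print("W / Z:", sample_quartiles(reciprocals))
```

Quartiles are used for the comparison, rather than the sample mean and standard deviation, because the moments of the Cauchy distribution do not exist.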

The standard Cauchy distribution has the usual connections to the standard uniform distribution via the distribution and quantile functions.

The standard Cauchy distribution and the standard uniform distribution are related as follows:

- If \(U\) has the standard uniform distribution then \( X = G^{-1}(U) = \tan\left[\pi \left(U - \frac{1}{2}\right)\right] \) has the standard Cauchy distribution.
- If \( X \) has the standard Cauchy distribution then \( U = G(X) = \frac{1}{2} + \frac{1}{\pi} \arctan(X) \) has the standard uniform distribution.

Recall that if \( U \) has the standard uniform distribution, then \( G^{-1}(U) \) has distribution function \( G \). Conversely, if \( X \) has distribution function \( G \), then since \( G \) is strictly increasing, \( G(X) \) has the standard uniform distribution.

Since the quantile function has a simple, closed form, it's easy to simulate the standard Cauchy distribution using the random quantile method.

Open the random quantile experiment and select the Cauchy distribution. Keep the default parameter values and note again the shape and location of the distribution and probability density functions. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function. Note the behavior of the empirical mean and standard deviation.

For the Cauchy distribution, the random quantile method has a nice physical interpretation. Suppose that a light source is 1 unit away from position 0 of an infinite, straight wall. We shine the light at the wall at an angle \( \Theta \) (to the perpendicular) that is uniformly distributed on the interval \( \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \). Then the position \( X = \tan(\Theta) \) of the light beam on the wall has the standard Cauchy distribution. Note that this follows since \( \Theta \) has the same distribution as \( \pi \left(U - \frac{1}{2}\right) \) where \( U \) has the standard uniform distribution.
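The light beam experiment can be sketched in code as follows. The helper names `light_beam_position` and `empirical_cdf`, the seed, and the number of runs are assumptions made for illustration; the sketch compares the empirical distribution function of the simulated positions with \(G\).

```python
import math
import random

def light_beam_position(rng):
    """Position on the wall hit by a beam whose angle is uniform on (-pi/2, pi/2)."""
    theta = rng.uniform(-math.pi / 2, math.pi / 2)
    return math.tan(theta)

def empirical_cdf(values, x):
    """Proportion of simulated values that are less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)

rng = random.Random(7)
positions = [light_beam_position(rng) for _ in range(100_000)]

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    exact = 0.5 + math.atan(x) / math.pi  # G(x)
    print(f"x = {x:5.1f}  empirical: {empirical_cdf(positions, x):.4f}  G(x): {exact:.4f}")
```

Replacing `theta` with `math.pi * (rng.random() - 0.5)` gives the random quantile method in its usual form; the two are the same up to how the uniform angle is generated.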

Open the Cauchy experiment and keep the default parameter values.

- Run the experiment in single-step mode a few times, to make sure that you understand the experiment.
- For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function. Note the behavior of the empirical mean and standard deviation.

Like so many other standard distributions, the Cauchy distribution is generalized by adding location and scale parameters. Most of the results in this subsection follow immediately from results for the standard Cauchy distribution above and general results for location scale families.

Suppose that \(Z\) has the standard Cauchy distribution and that \(a \in \R\) and \(b \in (0, \infty)\). Then \(X = a + b Z\) has the Cauchy distribution with location parameter \(a\) and scale parameter \(b\).

Suppose that \(X\) has the Cauchy distribution with location parameter \(a \in \R\) and scale parameter \(b \in (0, \infty)\).

\(X\) has probability density function \(f\) given by \[ f(x) = \frac{b}{\pi [b^2 + (x - a)^2]}, \quad x \in \R\]

- \( f \) is symmetric about \( x = a \).
- \( f \) increases and then decreases, with mode \( x = a \).
- \( f \) is concave upward, then downward, then upward again, with inflection points at \( x = a \pm \frac{1}{\sqrt{3}} b \).
- \( f(x) \to 0 \) as \( x \to \infty \) and as \( x \to -\infty \).

Recall that \[ f(x) = \frac{1}{b} g\left(\frac{x - a}{b}\right) \] where \( g \) is the standard Cauchy PDF given above.

Open the special distribution simulator and select the Cauchy distribution. Vary the parameters and note the location and shape of the probability density function. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function.

\(X\) has distribution function \(F\) given by \[ F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan\left(\frac{x - a}{b} \right), \quad x \in \R \]

Recall that \[ F(x) = G\left(\frac{x - a}{b}\right) \] where \( G \) is the standard Cauchy CDF given above.

\(X\) has quantile function \(F^{-1}\) given by \[ F^{-1}(p) = a + b \tan\left[\pi \left(p - \frac{1}{2}\right)\right], \quad p \in (0, 1) \] In particular,

- The first quartile is \(F^{-1}\left(\frac{1}{4}\right) = a - b\).
- The median is \(F^{-1}\left(\frac{1}{2}\right) = a\).
- The third quartile is \(F^{-1}\left(\frac{3}{4}\right) = a + b\).

Recall that \(F^{-1}(p) = a + b G^{-1}(p)\) where \( G^{-1} \) is the standard Cauchy quantile function given above.
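The density, distribution function, and quantile function of the general Cauchy distribution can be coded directly from the formulas above. This is a minimal sketch with function names of our own choosing; the defaults `a=0.0` and `b=1.0` recover the standard distribution, and the values \(a = 2\), \(b = 3\) are purely illustrative.

```python
import math

def cauchy_pdf(x, a=0.0, b=1.0):
    """Density f(x) = b / (pi [b^2 + (x - a)^2]) with location a and scale b > 0."""
    return b / (math.pi * (b * b + (x - a) ** 2))

def cauchy_cdf(x, a=0.0, b=1.0):
    """Distribution function F(x) = 1/2 + arctan((x - a) / b) / pi."""
    return 0.5 + math.atan((x - a) / b) / math.pi

def cauchy_quantile(p, a=0.0, b=1.0):
    """Quantile function F^{-1}(p) = a + b tan(pi (p - 1/2)) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return a + b * math.tan(math.pi * (p - 0.5))

a, b = 2.0, 3.0  # illustrative parameter values
# The quartiles should be close to a - b, a, a + b
print([cauchy_quantile(p, a, b) for p in (0.25, 0.5, 0.75)])
```

Note that the interquartile range is \(2 b\), so the scale parameter can be read off as half the distance between the first and third quartiles.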

Open the special distribution calculator and select the Cauchy distribution. Vary the parameters and note the shape and location of the distribution and probability density functions. Compute a few values of the distribution and quantile functions.

Since the mean and other moments of the standard Cauchy distribution do not exist, they don't exist for the general Cauchy distribution either.

Open the special distribution simulator and select the Cauchy distribution. For selected values of the parameters, run the simulation 1000 times and note the behavior of the sample mean.

But of course the characteristic function of the Cauchy distribution exists and is easy to obtain from the characteristic function of the standard distribution.

\(X\) has characteristic function \(\chi\) given by \( \chi(t) = \exp\left(a i t - b \left|t\right|\right)\) for \(t \in \R \).

Recall that \(\chi(t) = e^{i t a} \chi_0( b t)\) where \( \chi_0 \) is the standard Cauchy characteristic function given above.
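In more detail, write \(X = a + b Z\) where \(Z\) has the standard Cauchy distribution. Then \[ \chi(t) = \E\left(e^{i t (a + b Z)}\right) = e^{i t a} \E\left(e^{i (b t) Z}\right) = e^{i t a} \chi_0(b t) = e^{i t a} e^{-\left|b t\right|} = \exp\left(a i t - b \left|t\right|\right) \] where the last step uses \(\left|b t\right| = b \left|t\right|\) since \(b \gt 0\).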

Like all location-scale families, the general Cauchy distribution is closed under location-scale transformations.

Suppose that \(X\) has the Cauchy distribution with location parameter \(a \in \R\) and scale parameter \(b \in (0, \infty)\), and that \(c \in \R\) and \(d \in (0, \infty)\). Then \(Y = c + d X\) has the Cauchy distribution with location parameter \(c + d a\) and scale parameter \(b d\).

Much more interesting is the fact that the Cauchy family is closed under sums of independent variables. In fact, this is the main reason that the generalization to a location-scale family is justified.

Suppose that \(X_i\) has the Cauchy distribution with location parameter \(a_i \in \R\) and scale parameter \(b_i \in (0, \infty)\) for \(i \in \{1, 2\}\), and that \(X_1\) and \(X_2\) are independent. Then \(Y = X_1 + X_2\) has the Cauchy distribution with location parameter \(a_1 + a_2\) and scale parameter \(b_1 + b_2\).

This follows easily from characteristic functions. Let \(\chi_i\) denote the characteristic function of \(X_i\) for \(i = 1, 2\) and \(\chi\) the characteristic function of \(Y\). Then \[ \chi(t) = \chi_1(t) \chi_2(t) = \exp\left(a_1 i t - b_1 \left|t\right|\right) \exp\left(a_2 i t - b_2 \left|t\right|\right) = \exp\left[\left(a_1 + a_2\right) i t - \left(b_1 + b_2\right) \left|t\right|\right] \]
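A simulation gives a quick sanity check of the result. In the sketch below, the helper names, the seed, and the parameter values \(a_1 = 1\), \(b_1 = 2\), \(a_2 = -3\), \(b_2 = 0.5\) are our own illustrative choices. If the sum has location parameter \(a_1 + a_2 = -2\) and scale parameter \(b_1 + b_2 = 2.5\), its quartiles should be close to \(-4.5\), \(-2\), and \(0.5\).

```python
import math
import random

def sample_cauchy(rng, a, b):
    """Random quantile method: a + b tan(pi (U - 1/2)) with U uniform on [0, 1)."""
    return a + b * math.tan(math.pi * (rng.random() - 0.5))

def sample_quartiles(values):
    """Empirical first quartile, median, and third quartile (simple order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 4], ordered[n // 2], ordered[(3 * n) // 4]

rng = random.Random(99)
a1, b1 = 1.0, 2.0    # illustrative parameters for X_1
a2, b2 = -3.0, 0.5   # illustrative parameters for X_2
sums = [sample_cauchy(rng, a1, b1) + sample_cauchy(rng, a2, b2) for _ in range(100_000)]

a, b = a1 + a2, b1 + b2
print("empirical quartiles:  ", sample_quartiles(sums))
print("theoretical quartiles:", (a - b, a, a + b))
```

Note that the scale parameters add, not their squares. For independent normal variables, by contrast, it is the variances (the squares of the scale parameters) that add.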

As a corollary, the Cauchy distribution is stable, with index \( \alpha = 1 \):

If \( \bs{X} = (X_1, X_2, \ldots, X_n) \) is a sequence of independent variables, each with the Cauchy distribution with location parameter \( a \in \R \) and scale parameter \( b \in (0, \infty) \), then \( X_1 + X_2 + \cdots + X_n \) has the Cauchy distribution with location parameter \( n a \) and scale parameter \( n b \).

Another corollary is the strange property that the sample mean of a random sample from a Cauchy distribution has that same Cauchy distribution. No wonder the expected value does not exist!

Suppose that \(\bs{X} = (X_1, X_2, \ldots, X_n)\) is a sequence of independent random variables, each with the Cauchy distribution with location parameter \(a \in \R\) and scale parameter \(b \in (0, \infty)\). (That is, \(\bs{X}\) is a random sample of size \( n \) from the Cauchy distribution.) Then the sample mean \(M = \frac{1}{n} \sum_{i=1}^n X_i\) also has the Cauchy distribution with location parameter \(a\) and scale parameter \(b\).

From the previous result, \(Y = \sum_{i=1}^n X_i\) has the Cauchy distribution with location parameter \(n a\) and scale parameter \(n b\). But then by the scaling result above, \(M = Y / n\) has the Cauchy distribution with location parameter \(a\) and scale parameter \(b\).
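The following sketch illustrates this result by simulation. The helper names, seed, and sizes are our own assumptions. It simulates many sample means of size \(n\) and prints their empirical quartiles, which should stay close to \(a - b\), \(a\), and \(a + b\) no matter how large \(n\) is.

```python
import math
import random

def sample_cauchy(rng, a, b):
    """Random quantile method: a + b tan(pi (U - 1/2)) with U uniform on [0, 1)."""
    return a + b * math.tan(math.pi * (rng.random() - 0.5))

def sample_quartiles(values):
    """Empirical first quartile, median, and third quartile (simple order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 4], ordered[n // 2], ordered[(3 * n) // 4]

def simulated_sample_means(rng, a, b, n, replications):
    """Simulate the sample mean of n Cauchy variables, repeated `replications` times."""
    return [sum(sample_cauchy(rng, a, b) for _ in range(n)) / n
            for _ in range(replications)]

rng = random.Random(314)
a, b = 0.0, 1.0  # standard Cauchy, for illustration
for n in (1, 20):
    sample_means = simulated_sample_means(rng, a, b, n, 20_000)
    print(f"n = {n:2d}  quartiles of M:", sample_quartiles(sample_means))
```

For a distribution with finite variance, the interquartile range of the sample mean would shrink at the rate \(1 / \sqrt{n}\). Here averaging gives no reduction in spread at all.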

The next result shows explicitly that the Cauchy distribution is infinitely divisible. But of course, infinite divisibility is also a consequence of stability.

Suppose that \( a \in \R \) and \( b \in (0, \infty) \). For every \(n \in \N_+\) the Cauchy distribution with location parameter \(a\) and scale parameter \(b\) is the distribution of the sum of \(n\) independent variables, each of which has the Cauchy distribution with location parameter \(a / n\) and scale parameter \(b/n\).
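As a quick check with characteristic functions, let \(\chi_n\) denote the characteristic function of the Cauchy distribution with location parameter \(a / n\) and scale parameter \(b / n\) (the symbol \(\chi_n\) is introduced here only for this check). The sum of \(n\) independent variables with this distribution has characteristic function \[ \left[\chi_n(t)\right]^n = \left[\exp\left(\frac{a}{n} i t - \frac{b}{n} \left|t\right|\right)\right]^n = \exp\left(a i t - b \left|t\right|\right), \quad t \in \R \] which is the characteristic function of the Cauchy distribution with location parameter \(a\) and scale parameter \(b\).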

Our next result is a very slight generalization of the reciprocal result above for the standard Cauchy distribution.

Suppose that \(X\) has the Cauchy distribution with location parameter \(0\) and scale parameter \(b \in (0, \infty)\). Then \(Y = 1 / X\) has the Cauchy distribution with location parameter \(0\) and scale parameter \(1 / b\).

\(X\) has the same distribution as \(b Z\) where \(Z\) has the standard Cauchy distribution. Hence \(\frac{1}{X}\) has the same distribution as \(\frac{1}{b} \frac{1}{Z}\). But by the reciprocal result above for the standard Cauchy distribution, \(\frac{1}{Z}\) also has the standard Cauchy distribution, so \(\frac{1}{b} \frac{1}{Z}\) has the Cauchy distribution with location parameter \(0\) and scale parameter \(1 / b\).

As with its standard cousin, the general Cauchy distribution has simple connections with the standard uniform distribution via the distribution and quantile functions, and in particular, can be simulated via the random quantile method.

Suppose that \( a \in \R \) and \( b \in (0, \infty) \).

- If \( U \) has the standard uniform distribution, then \( X = F^{-1}(U) = a + b \tan\left[\pi \left(U - \frac{1}{2}\right)\right] \) has the Cauchy distribution with location parameter \( a \) and scale parameter \( b \).
- If \( X \) has the Cauchy distribution with location parameter \( a \) and scale parameter \( b \), then \( U = F(X) = \frac{1}{2} + \frac{1}{\pi} \arctan\left(\frac{X - a}{b} \right) \) has the standard uniform distribution.

Open the random quantile experiment and select the Cauchy distribution. Vary the parameters and note again the shape and location of the distribution and probability density functions. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function. Note the behavior of the empirical mean and standard deviation.

As before, the random quantile method has a nice physical interpretation. Suppose that a light source is \( b \) units away from position \( a \) of an infinite, straight wall. We shine the light at the wall at an angle \( \Theta \) (to the perpendicular) that is uniformly distributed on the interval \( \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \). Then the position \( X = a + b \tan(\Theta) \) of the light beam on the wall has the Cauchy distribution with location parameter \( a \) and scale parameter \( b \).

Open the Cauchy experiment. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function. Note the behavior of the empirical mean and standard deviation.