The Lognormal Distribution

$\newcommand{\R}{\mathbb{R}}$ $\newcommand{\N}{\mathbb{N}}$ $\newcommand{\E}{\mathbb{E}}$ $\newcommand{\P}{\mathbb{P}}$ $\newcommand{\var}{\text{var}}$ $\newcommand{\sd}{\text{sd}}$ $\newcommand{\cov}{\text{cov}}$ $\newcommand{\cor}{\text{cor}}$ $\newcommand{\skw}{\text{skew}}$ $\newcommand{\kur}{\text{kurt}}$

Basic Theory

Definition

Suppose that $Y$ has the normal distribution with mean $\mu \in \R$ and standard deviation $\sigma \in (0, \infty)$. Then $X = e^Y$ has the lognormal distribution with parameters $\mu$ and $\sigma$.

The parameter $ \sigma $ is the shape parameter of the distribution.
The parameter $ e^\mu$ is the scale parameter of the distribution.

If $Z$ has the standard normal distribution then $W = e^Z$ has the standard lognormal distribution.

So equivalently, if $X$ has a lognormal distribution then $\ln X$ has a normal distribution, hence the name. The lognormal distribution is a continuous distribution on $(0, \infty)$ and is used to model random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.

Distribution Functions

Suppose that $X$ has the lognormal distribution with parameters $\mu \in \R$ and $\sigma \in (0, \infty)$.

The probability density function $f$ of $X$ is given by \[ f(x) = \frac{1}{\sqrt{2 \pi} \sigma x} \exp \left[-\frac{\left(\ln x - \mu\right)^2}{2 \sigma^2} \right], \quad x \in (0, \infty) \]

$ f $ increases and then decreases with mode at $ x = \exp\left(\mu - \sigma^2\right) $.
$ f $ is concave upward then downward then upward again, with inflection points at $ x = \exp\left(\mu - \frac{3}{2} \sigma^2 \pm \frac{1}{2} \sigma \sqrt{\sigma^2 + 4}\right) $.
$ f(x) \to 0 $ as $ x \downarrow 0 $ and as $ x \to \infty $.

Details:

The form of the PDF follows from the change of variables theorem. Let $ g $ denote the PDF of the normal distribution with mean $ \mu $ and standard deviation $ \sigma $, so that \[ g(y) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left[-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^2\right], \quad y \in \R \] The mapping $ x = e^y $ maps $ \R $ one-to-one onto $ (0, \infty) $ with inverse $ y = \ln x $. Hence the PDF $ f $ of $ X = e^Y $ is \[ f(x) = g(y) \frac{dy}{dx} = g\left(\ln x\right) \frac{1}{x} \] Substituting gives the result. The three properties of $ f $ listed above follow from standard calculus.
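The density and its mode can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `lognormal_pdf` and the parameter values are illustrative assumptions, not part of the text.

```python
import math

def lognormal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the lognormal distribution with parameters mu and sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma * x)

mu, sigma = 0.5, 0.75  # illustrative values
mode = math.exp(mu - sigma ** 2)

# The density at the mode should exceed the density at nearby points.
h = 1e-3
f_mode = lognormal_pdf(mode, mu, sigma)
is_local_max = (f_mode > lognormal_pdf(mode - h, mu, sigma)
                and f_mode > lognormal_pdf(mode + h, mu, sigma))

# Midpoint-rule integral of f, computed in the variable y = ln x (so dx = x dy).
n = 20000
lo, hi = mu - 10 * sigma, mu + 10 * sigma
dy = (hi - lo) / n
total = 0.0
for i in range(n):
    x = math.exp(lo + (i + 0.5) * dy)
    total += lognormal_pdf(x, mu, sigma) * x * dy

print(is_local_max, total)
```

Integrating in $ y = \ln x $ rather than in $ x $ keeps the grid uniform on the scale where the integrand is a normal density, which is why a simple midpoint rule is adequate here.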

In the special distribution simulator, select the lognormal distribution. Vary the parameters and note the shape and location of the probability density function. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the true probability density function.

Let $\Phi$ denote the standard normal distribution function, so that $\Phi^{-1}$ is the standard normal quantile function. Recall that values of $\Phi$ and $\Phi^{-1}$ can be obtained from standard mathematical and statistical software packages, and in fact these functions are considered to be special functions in mathematics. The following two results show how to compute the lognormal distribution function and quantiles in terms of the standard normal distribution function and quantiles.

The distribution function $F$ of $X$ is given by \[ F(x) = \Phi \left( \frac{\ln x - \mu}{\sigma} \right), \quad x \in (0, \infty) \]

Details:

Write $ X = e^{\mu + \sigma Z} $ where $ Z $ has the standard normal distribution. For $ x \gt 0 $, taking logarithms inside the event gives \[ F(x) = \P(X \le x) = \P\left(Z \le \frac{\ln x - \mu}{\sigma}\right) = \Phi \left( \frac{\ln x - \mu}{\sigma} \right) \]
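As a rough illustration, the distribution function can be computed with Python's standard library, where `statistics.NormalDist().cdf` plays the role of $ \Phi $. The helper name `lognormal_cdf` is an assumption made for this sketch.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = Phi((ln x - mu) / sigma) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return NormalDist().cdf((math.log(x) - mu) / sigma)

# At x = e^mu the standard score is 0, so F(x) should be 1/2.
median_check = lognormal_cdf(math.exp(2.0), mu=2.0, sigma=1.5)
print(median_check)
```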

The quantile function of $X$ is given by \[ F^{-1}(p) = \exp\left[\mu + \sigma \Phi^{-1}(p)\right], \quad p \in (0, 1) \]

Details:

This follows by solving $ p = F(x) $ for $ x $ in terms of $ p $.
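A minimal sketch of the quantile function, again using only the Python standard library, with `statistics.NormalDist().inv_cdf` standing in for $ \Phi^{-1} $. The helper names and parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = Phi((ln x - mu) / sigma) for x > 0."""
    return NormalDist().cdf((math.log(x) - mu) / sigma)

def lognormal_quantile(p, mu=0.0, sigma=1.0):
    """F^{-1}(p) = exp(mu + sigma * Phi^{-1}(p)) for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

mu, sigma = 1.0, 0.5  # illustrative values
probs = (0.1, 0.25, 0.5, 0.75, 0.9)

# Applying F to F^{-1}(p) should return p, up to rounding error.
round_trip_errors = [
    abs(lognormal_cdf(lognormal_quantile(p, mu, sigma), mu, sigma) - p)
    for p in probs
]

# Since Phi^{-1}(1/2) = 0, the median is e^mu.
median = lognormal_quantile(0.5, mu, sigma)
print(median, max(round_trip_errors))
```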

In the quantile app, select the lognormal distribution. Vary the parameters and note the shape and location of the probability density function and the distribution function. With $\mu = 0$ and $\sigma = 1$, find the median and the first and third quartiles.

Moments

The moments of the lognormal distribution can be computed from the moment generating function of the normal distribution. Once again, we assume that $X$ has the lognormal distribution with parameters $\mu \in \R$ and $\sigma \in (0, \infty)$.

For $ t \in \R $, \[ \E\left(X^t\right) = \exp \left( \mu t + \frac{1}{2} \sigma^2 t^2 \right) \]

Details:

Recall that if $ Y $ has the normal distribution with mean $ \mu \in \R $ and standard deviation $ \sigma \in (0, \infty) $, then $ Y $ has moment generating function given by \[ \E\left(e^{t Y}\right) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right), \quad t \in \R \] Hence the result follows immediately since $ \E\left(X^t\right) = \E\left(e^{t Y}\right) $.

In particular, the mean and variance of $X$ are

$\E(X) = \exp\left(\mu + \frac{1}{2} \sigma^2\right)$
$\var(X) = \exp\left[2 (\mu + \sigma^2)\right] - \exp\left(2 \mu + \sigma^2\right)$
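The mean and variance follow from the general moment formula with $ t = 1 $ and $ t = 2 $, since $ \var(X) = \E\left(X^2\right) - [\E(X)]^2 $. A short Python sketch of that arithmetic, with an assumed helper name `lognormal_moment` and illustrative parameter values:

```python
import math

def lognormal_moment(t, mu, sigma):
    """E(X^t) = exp(mu * t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

mu, sigma = 0.3, 0.8  # illustrative values

mean = lognormal_moment(1, mu, sigma)
variance = lognormal_moment(2, mu, sigma) - mean ** 2

# Closed form for the variance, for comparison.
closed_form_variance = (math.exp(2 * (mu + sigma ** 2))
                        - math.exp(2 * mu + sigma ** 2))
print(mean, variance, closed_form_variance)
```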

In the special distribution simulator, select the lognormal distribution. Vary the parameters and note the shape and location of the mean $ \pm $ standard deviation bar. For selected values of the parameters, run the simulation 1000 times and compare the empirical moments to the true moments.

From the general formula for the moments, we can also compute the skewness and kurtosis of the lognormal distribution.

The skewness and kurtosis of $X$ are

$ \skw(X) = \left(e^{\sigma^2} + 2\right) \sqrt{e^{\sigma^2} - 1} $
$\kur(X) = e^{4 \sigma^2} + 2 e^{3 \sigma^2} + 3 e^{2 \sigma^2} - 3$

Details:

These results follow from the first four moments of the lognormal distribution, given by the general moment formula above, and the standard computational formulas for skewness and kurtosis.

The fact that the skewness and kurtosis do not depend on $ \mu $ is due to the fact that $ e^\mu $ is a scale parameter. Recall that skewness and kurtosis are defined in terms of the standard score and so are independent of location and scale parameters. Naturally, the lognormal distribution is positively skewed. Finally, note that the excess kurtosis is \[ \kur(X) - 3 = e^{4 \sigma^2} + 2 e^{3 \sigma^2} + 3 e^{2 \sigma^2} - 6 \]
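The computation can be sketched in Python: the raw moments $ \E\left(X^k\right) $ for $ k \in \{1, 2, 3, 4\} $ are converted to central moments, and the results are compared with the closed forms. The function names are assumptions made for this sketch.

```python
import math

def lognormal_moment(t, mu, sigma):
    """E(X^t) = exp(mu * t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

def skew_kurt_from_moments(mu, sigma):
    """Skewness and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (lognormal_moment(k, mu, sigma) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / var ** 2
    return skew, kurt

def skew_kurt_closed_form(sigma):
    """Closed forms in terms of w = exp(sigma^2)."""
    w = math.exp(sigma ** 2)
    return (w + 2) * math.sqrt(w - 1), w ** 4 + 2 * w ** 3 + 3 * w ** 2 - 3

sigma = 0.5  # illustrative value
closed = skew_kurt_closed_form(sigma)
# Two different values of mu should give the same skewness and kurtosis.
results = [skew_kurt_from_moments(mu, sigma) for mu in (0.0, 1.5)]
print(closed, results)
```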

Even though the lognormal distribution has finite moments of all orders, the moment generating function is infinite at any positive number. This property is one of the reasons for the fame of the lognormal distribution.

$\E\left(e^{t X}\right) = \infty$ for every $t \gt 0$.

Details:

By definition, $X = e^Y$ where $Y$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$. Using the change of variables formula for expected value we have \[\E\left(e^{t X}\right) = \E\left(e^{t e^Y}\right) = \int_{-\infty}^\infty \exp(t e^y) \frac{1}{\sqrt{2 \pi} \sigma} \exp\left[-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^2\right] dy = \frac{1}{\sqrt{2 \pi} \sigma} \int_{-\infty}^\infty \exp\left[t e^y - \frac{1}{2} \left(\frac{y - \mu}{\sigma}\right)^2\right] dy\] If $t \gt 0$ the integrand in the last integral diverges to $\infty$ as $y \to \infty$, so there is no hope that the integral converges.
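To see how quickly the exponential term dominates, one can tabulate the exponent $ t e^y - \frac{1}{2} \left(\frac{y - \mu}{\sigma}\right)^2 $ of the integrand for a small positive $ t $. This is only an illustrative Python sketch with assumed values $ t = 0.01 $, $ \mu = 0 $, $ \sigma = 1 $; the exponent itself is computed rather than its exponential, to avoid floating-point overflow.

```python
import math

def mgf_integrand_exponent(y, t, mu=0.0, sigma=1.0):
    """Exponent of the integrand in the integral for E(exp(t X))."""
    return t * math.exp(y) - 0.5 * ((y - mu) / sigma) ** 2

t = 0.01  # even a small positive t eventually wins
ys = (0, 5, 10, 20)
exponents = [mgf_integrand_exponent(y, t) for y in ys]
for y, e in zip(ys, exponents):
    print(y, e)
```

The exponent dips below zero for moderate $ y $ but then grows without bound, since $ e^y $ grows faster than any quadratic in $ y $.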

Related Distributions

The most important relations are the ones between the lognormal and normal distributions in the definition: if $X$ has a lognormal distribution then $\ln X$ has a normal distribution; conversely if $Y$ has a normal distribution then $e^Y$ has a lognormal distribution. It's easy to write a general lognormal variable in terms of a standard lognormal variable.

Suppose that $W$ has the standard lognormal distribution and that $\mu \in \R$ and $\sigma \in (0, \infty)$. Then $X = e^\mu W^\sigma$ has the lognormal distribution with parameters $\mu$ and $\sigma$.

Details:

Suppose that $Z$ has the standard normal distribution and let $W = e^Z$ so that $W$ has the standard lognormal distribution. If $\mu \in \R$ and $\sigma \in (0, \infty)$ then $Y = \mu + \sigma Z$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$ and hence $X = e^Y$ has the lognormal distribution with parameters $\mu$ and $\sigma$. But \[X = e^Y = e^{\mu + \sigma Z} = e^\mu \left(e^Z\right)^\sigma = e^\mu W^\sigma\]
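This representation also gives a simple way to simulate lognormal variables from standard normal ones. The following Python sketch, with an arbitrary fixed seed and illustrative parameter values, builds $ X = e^\mu W^\sigma $ from $ W = e^Z $ and checks that $ \ln X $ has sample mean near $ \mu $ and sample standard deviation near $ \sigma $.

```python
import math
import random
from statistics import fmean

rng = random.Random(12345)  # fixed seed so the run is reproducible
mu, sigma, n = 1.0, 0.5, 100_000  # illustrative values

# W = e^Z is standard lognormal; X = e^mu * W^sigma.
samples = [
    math.exp(mu) * math.exp(rng.gauss(0.0, 1.0)) ** sigma
    for _ in range(n)
]

logs = [math.log(x) for x in samples]
log_mean = fmean(logs)
log_sd = math.sqrt(fmean((v - log_mean) ** 2 for v in logs))
print(log_mean, log_sd)
```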

Suppose that $ X $ has the lognormal distribution with parameters $ \mu \in \R $ and $ \sigma \in (0, \infty) $ and that $ c \in (0, \infty) $. Then $ c X $ has the lognormal distribution with parameters $ \mu + \ln c$ and $ \sigma $.

Details:

From the definition, we can write $ X = e^Y $ where $ Y $ has the normal distribution with mean $ \mu $ and standard deviation $ \sigma $. Hence \[ c X = c e^Y = e^{\ln c} e^Y = e^{\ln c + Y} \] But $ \ln c + Y $ has the normal distribution with mean $ \ln c + \mu $ and standard deviation $ \sigma $.

If $X$ has the lognormal distribution with parameters $\mu \in \R$ and $\sigma \in (0, \infty)$ then $1 / X$ has the lognormal distribution with parameters $-\mu$ and $\sigma$.

Details:

Again from the definition, we can write $ X = e^Y $ where $Y$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$. Hence $1 / X = e^{-Y}$. But $-Y$ has the normal distribution with mean $-\mu$ and standard deviation $\sigma$.

The lognormal distribution is closed under non-zero powers of the underlying variable. In particular, this generalizes the previous result on the reciprocal $1 / X$, which is the case $a = -1$.

Suppose that $X$ has the lognormal distribution with parameters $\mu \in \R$ and $\sigma \in (0, \infty)$ and that $a \in \R \setminus \{0\}$. Then $X^a$ has the lognormal distribution with parameters $a \mu$ and $|a| \sigma$.

Details:

Again from the definition, we can write $ X = e^Y $ where $Y$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$. Hence $X^a = e^{a Y}$. But $a Y$ has the normal distribution with mean $a \mu$ and standard deviation $|a| \sigma$.
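The scaling, reciprocal, and power results can each be checked through the distribution function: the probability of the transformed event, computed directly from the distribution of $X$, should agree with the lognormal distribution function at the transformed parameters. A minimal Python sketch, with assumed helper names and illustrative values:

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """Distribution function of the lognormal distribution at x > 0."""
    return NormalDist().cdf((math.log(x) - mu) / sigma)

def scaled_params(mu, sigma, c):
    """Parameters of c * X for c > 0."""
    return mu + math.log(c), sigma

def reciprocal_params(mu, sigma):
    """Parameters of 1 / X."""
    return -mu, sigma

def power_params(mu, sigma, a):
    """Parameters of X ** a for a != 0."""
    return a * mu, abs(a) * sigma

mu, sigma, x = 0.4, 1.2, 2.5  # illustrative values

# P(c X <= x) = P(X <= x / c)
c = 3.0
scale_direct = lognormal_cdf(x / c, mu, sigma)
scale_formula = lognormal_cdf(x, *scaled_params(mu, sigma, c))

# P(1 / X <= x) = P(X >= 1 / x)
recip_direct = 1 - lognormal_cdf(1 / x, mu, sigma)
recip_formula = lognormal_cdf(x, *reciprocal_params(mu, sigma))

# For a < 0, P(X^a <= x) = P(X >= x^(1/a))
a = -2.0
power_direct = 1 - lognormal_cdf(x ** (1 / a), mu, sigma)
power_formula = lognormal_cdf(x, *power_params(mu, sigma, a))

print(scale_direct, scale_formula)
print(recip_direct, recip_formula)
print(power_direct, power_formula)
```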

Since the normal distribution is closed under sums of independent variables, it's not surprising that the lognormal distribution is closed under products of independent variables.

Suppose that $n \in \N_+$ and that $(X_1, X_2, \ldots, X_n)$ is a sequence of independent variables, where $X_i$ has the lognormal distribution with parameters $\mu_i \in \R$ and $\sigma_i \in (0, \infty)$ for $i \in \{1, 2, \ldots, n\}$. Then $\prod_{i=1}^n X_i$ has the lognormal distribution with parameters $\mu$ and $\sigma$ where $\mu = \sum_{i=1}^n \mu_i$ and $\sigma^2 = \sum_{i=1}^n \sigma_i^2$.

Details:

Again from the definition, we can write $ X_i = e^{Y_i} $ where $Y_i$ has the normal distribution with mean $\mu_i$ and standard deviation $\sigma_i$ for $i \in \{1, 2, \ldots, n\}$ and where $(Y_1, Y_2, \ldots, Y_n)$ is an independent sequence. Hence $\prod_{i=1}^n X_i = \exp\left(\sum_{i=1}^n Y_i\right)$. But $\sum_{i=1}^n Y_i$ has the normal distribution with mean $\sum_{i=1}^n \mu_i$ and variance $\sum_{i=1}^n \sigma_i^2$.
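A quick simulation illustrates the product result. This Python sketch uses `random.lognormvariate` from the standard library with an arbitrary fixed seed; the three parameter pairs are illustrative assumptions. The logarithm of the product should have sample mean near $ \sum_i \mu_i $ and sample standard deviation near $ \sqrt{\sum_i \sigma_i^2} $.

```python
import math
import random
from statistics import fmean

rng = random.Random(2024)  # fixed seed so the run is reproducible
params = [(0.5, 0.5), (-0.2, 0.3), (1.0, 0.4)]  # (mu_i, sigma_i) pairs
n = 50_000

# Simulate products of independent lognormal variables.
products = []
for _ in range(n):
    p = 1.0
    for mu_i, sigma_i in params:
        p *= rng.lognormvariate(mu_i, sigma_i)
    products.append(p)

# Parameters predicted for the product.
mu = sum(m for m, _ in params)
sigma = math.sqrt(sum(s ** 2 for _, s in params))

logs = [math.log(p) for p in products]
log_mean = fmean(logs)
log_sd = math.sqrt(fmean((v - log_mean) ** 2 for v in logs))
print(mu, log_mean)
print(sigma, log_sd)
```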

Suppose that $ X $ has the lognormal distribution with parameters $ \mu \in \R $ and $ \sigma \in (0, \infty) $. The distribution of $ X $ is a 2-parameter exponential family with natural parameters and natural statistics, respectively, given by

$\left( -\frac{1}{2 \sigma^2}, \frac{\mu}{\sigma^2} \right)$
$\left(\ln^2(X), \ln X\right)$

Details:

This follows from the definition of the general exponential family, since we can write the lognormal PDF in the form \[ f(x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(-\frac{\mu^2}{2 \sigma^2}\right) \frac{1}{x} \exp\left[-\frac{1}{2 \sigma^2} \ln^2(x) + \frac{\mu}{\sigma^2} \ln x\right], \quad x \in (0, \infty) \]
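The factorization can be verified numerically by evaluating both forms of the density at a few points. In this Python sketch the function names and test values are assumptions; `eta1`, `eta2` are the natural parameters and `t1`, `t2` the natural statistics evaluated at $ x $.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Standard form of the lognormal density for x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma * x)

def exp_family_pdf(x, mu, sigma):
    """The same density written in exponential-family form."""
    eta1 = -1 / (2 * sigma ** 2)   # natural parameter paired with ln^2(x)
    eta2 = mu / sigma ** 2         # natural parameter paired with ln(x)
    t1 = math.log(x) ** 2
    t2 = math.log(x)
    normalizer = (math.exp(-mu ** 2 / (2 * sigma ** 2))
                  / (math.sqrt(2 * math.pi) * sigma))
    return normalizer * (1 / x) * math.exp(eta1 * t1 + eta2 * t2)

mu, sigma = 1.0, 2.0  # illustrative values
points = (0.1, 1.0, 5.0, 50.0)
pairs = [(lognormal_pdf(x, mu, sigma), exp_family_pdf(x, mu, sigma))
         for x in points]
print(pairs)
```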

Computational Exercises

Suppose that the income $X$ of a randomly chosen person in a certain population (in thousands of dollars) has the lognormal distribution with parameters $\mu = 2$ and $\sigma = 1$. Find $\P(X \gt 20)$.

Details:

$\P(X \gt 20) = 0.1597$
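The probability comes from the distribution function: $ \P(X \gt 20) = 1 - \Phi\left(\frac{\ln 20 - 2}{1}\right) $. A short Python sketch of the arithmetic, using `statistics.NormalDist` from the standard library for $ \Phi $:

```python
import math
from statistics import NormalDist

mu, sigma = 2.0, 1.0

# Standard score of ln 20 under the normal distribution of ln X.
z = (math.log(20) - mu) / sigma

# Right-tail probability P(X > 20) = 1 - Phi(z).
prob = 1 - NormalDist().cdf(z)
print(round(prob, 4))
```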

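Suppose again that the income $X$ (in thousands of dollars) has the lognormal distribution with parameters $\mu = 2$ and $\sigma = 1$. Find each of the following:

$\E(X)$
$\var(X)$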

Details:

$\E(X) = e^{5/2} \approx 12.1825$
$\var(X) = e^6 - e^5 \approx 255.0156$, so $\sd(X) = \sqrt{e^6 - e^5} \approx 15.9692$
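These values can be reproduced from the mean and variance formulas with $ \mu = 2 $ and $ \sigma = 1 $. A minimal Python sketch of the arithmetic (the variable names are illustrative):

```python
import math

mu, sigma = 2.0, 1.0

# E(X) = exp(mu + sigma^2 / 2)
mean = math.exp(mu + 0.5 * sigma ** 2)

# var(X) = exp(2 (mu + sigma^2)) - exp(2 mu + sigma^2)
variance = math.exp(2 * (mu + sigma ** 2)) - math.exp(2 * mu + sigma ** 2)

# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)
print(mean, variance, sd)
```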