Thinning and Superposition

\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\var}{\text{var}}\) \(\newcommand{\sd}{\text{sd}}\) \(\newcommand{\skw}{\text{skew}}\) \(\newcommand{\kur}{\text{kurt}}\)

Recall that a Poisson process with rate \(r \in (0, \infty)\) defines three interrelated stochastic processes:

The sequence of inter-arrival times \( \bs{X} = (X_1, X_2, \ldots) \). These are indpendent random variables and each has the exponential distribution on \([0, \infty)\) with rate \(r\).
The sequence of arrival times \(\bs{T} = (T_0, T_1, \ldots)\). This is the partial sum sequence associated with \(\bs X\), so that \(T_n\) has the gamma distribution on \([0, \infty)\) for \(n \in \N_+\) with shape parameter \(n\) and rate parameter \(r\).
The counting process \(\bs{N} = (N_t: t \ge 0)\) where \(N_t\) is the number of arrivals in \((0, t]\) for \(t \in [0, \infty)\) and has the Poisson distribution on \(\N\) with parameter \(r t\). This process has an inverse relationship with \(\bs T\) so that \(\N_t \ge n\) if and only if \(T_n \le t\) for \(n \in \N\) and \(t \in [0, \infty)\).

Thinning

Thinning or splitting a Poisson process refers to classifying each random point, independently, into one of a finite number of different types. The random points of a given type also form Poisson processes, and these processes are independent. Our exposition will concentrate on the case of just two types, but this case has all of the essential ideas.

The Two-Type Process

Consider a Poisson process with rate \(r \in (0, \infty)\). Suppose that each arrival, independently of the others, is one of two types: type 1 with probability \(p\) and type 0 with probability \(1 - p\), where \(p \in (0, 1)\) is a parameter. Here are some common examples:

The arrivals are radioactive emissions and each emitted particle is either detected (type 1) or missed (type 0) by a counter.
The arrivals are customers at a service station and each customer is classified as either male (type 1) or female (type 0).

We want to consider the type 1 and type 0 random points separately. For this reason, the new random process is usually referred to as thinning or splitting the original Poisson process. In some applications, the type 1 points are accepted while the type 0 points are rejected. The main result of this section is that the type 1 and type 0 points form separate Poisson processes, with rates \(r p\) and \(r (1-p)\) respectively, and are independent. We will explore this important result from several points of view.

In the previous sections, we have explored the analogy between the Bernoulli trials process and the Poisson process. Both have the strong renewal property that at each fixed time and at each arrival time, the process stochastically restarts, independently of the past. The difference, of course, is that time is discrete in the Bernoulli trials process and continuous in the Poisson process. In this section, we have both processes simultaneously, and given our previous explorations, it's perhaps not surprising that this leads to some interesting mathematics.

Consider the two-type Poisson process in . In addition to the processes \(\bs{X}\), \(\bs{T}\), and \(\bs{N}\) in , we have a sequence of Bernoulli trials \(\bs{I} = (I_1, I_2, \ldots)\) with success parameter \(p\). Indicator variable \(I_j\) specifies the type of the \(j\)th arrival. Moreover, because of our assumptions, \(\bs{I}\) is independent of \(\bs{X}\), \(\bs{T}\), and \(\bs{N}\).

The next result shows that the type 1 points form a Poisson process with rate \(p r\).

For the two-type process in , consider just the type 1 points. The time between the arrival of \((k - 1)\)st and \(k\)th type 1 point is \[ Y_k = \sum_{i = V_{k-1} + 1}^{V_k} X_i, \quad k \in \N_+ \] \(\bs{Y} = (Y_1, Y_2, \ldots)\) is a sequence of independent variables and each has the exponential distribution with rate parameter \(p r\).

Details:

Note that \(Y_k\) has \(U_k\) terms. From the renewal properties of the Poisson process and the Bernoulli trials process, the inter-arrival times are independent and identically distributed. Each inter-arrival time is the sum of a random number of independent terms; each term has the exponential distribution with rate \(r\), and the number of terms has the geometric distribution on \(\N_+\) with parameter \(p\). Moreover, the number of terms is independent of the terms themselves. We showed in the section on the exponential distribution that a random sum of this form has the exponential distribution with parameter \(r p\).

Similarly, if \(\bs{Z} = (Z_1, Z_2, \ldots)\) is the sequence of interarrvial times for the type 0 points, then \(\bs{Z}\) is a sequence of independent variables, and each has the exponential distribution with rate \((1 - p) r\). Moreover, \(\bs{Y}\) and \(\bs{Z}\) are independent.

For \(t \in [0, \infty)\), let \(M_t\) denote the number of type 1 arrivals in \((0, t]\) and \(W_t\) the number of type 0 arrivals in \((0, t]\). So \(\bs{M} = (M_t: t \ge 0)\) and \(\bs{W} = (W_t: t \ge 0)\) are the counting processes for the type 1 arrivals and for the type 0 arrivals.

For \(t \ge 0\), \(M_t\) has the Poisson distribution with parameter \(p r\), \(W_t\) has the Poisson distribution with parameter \((1 - p) r\), and \(M_t\) and \(W_t\) are independent.

Details:

The important observation is that the conditional distribution of \(M_t\) given \(N_t = n\) is binomial with parameters \(n\) and \(p\). Thus for \(j \in \N\) and \(k \in \N\), \begin{align} \P(M_t = j, W_t = k) & = \P(M_t = j, N_t = j + k) = \P(N_t = j + k) \P(M_t = j \mid N_t = j + k) \\ & = e^{-r t} \frac{(r t)^{j + k}}{(j + k)!} \frac{(j + k)!}{j! k!} p^j (1 - p)^k \\ & = e^{-p r t} \frac{(p r t)^j}{j!} e^{-(1 - p) r t} \frac{\left[(1 - p) r t\right]^k}{k!} \end{align}

In the two-type Poisson experiment vary \(r\), \(p\), and \(t\) with the scrollbars and note the shape of the probability density functions. For various values of the parameters, run the experiment 1000 times and compare the relative frequency functions to the probability density functions.

Estimating the Number of Arrivals

Suppose that the type 1 arrivals are observable, but not the type 0 arrivals. This setting is natural, for example, if the arrivals are radioactive emissions, and the type 1 arrivals are emissions that are detected by a counter, while the type 0 arrivals are emissions that are missed. Suppose that for a given \(t \gt 0\), we would like to estimate the total number arrivals \(N_t\) after observing the number of type 1 arrivals \(M_t\).

The conditional distribution of \(N_t\) given \(M_t = k\) is the same as the distribution of \(k + W_t\). \[ \P(N_t = n \mid M_t = k) = e^{-(1 - p) r t} \frac{\left[(1 - p) r t\right]^{n - k}}{(n - k)!}, \quad n \in \{k, k + 1, \ldots\} \]

Details:

Recall from that \(M_t\) and \(W_t\) are independent. Thus, for \(n \in \N\), \begin{align} \P(N_t = n \mid M_t = k) & = \frac{\P(N_t = n, M_t = k)}{\P(M_t = k)} = \frac{\P(M_t = k, W_t = n - k)}{\P(M_t = k)} \\ & = \frac{\P(M_t = k) \P(W_t = n - k)}{\P(M_t = k)} = \P(W_t = n - k) \end{align} The form of the probability density function follows since \(W_t\) as the Poisson distribution with parameter \((1 - p) r\).

\(\E(N_t \mid M_t = k) = k + (1 - p)\,r\).

Details:

This follows easily from exercise since \(\E(N_t \mid M_t = k) = \E(k + W_t) = k + (1 - p) r\).

So if the overall rate \(r\) of the process and the probability \(p\) that an arrival is type 1 are known, then it follows form the general theory of conditional expectation that the best estimator of \(N_t\) based on \(M_t\), in the least squares sense, is \[ \E(N_t \mid M_t) = M_t + (1 - p) r \]

The mean square error is \(\E\left(\left[N_t - \E(N_t \mid M_t)\right]^2 \right) = (1 - p) r t\).

Details:

Note that \(N_t - \E(N_t \mid M_t) = W_t - (1 - p) r\). Thus the mean square error is just \(\var(W_t) = (1 - p) r t\).

As you might guess, the results in this subsection generalize from 2 types to \(k\) types for general \(k \in \N_+\). Once again, we start with a Poisson process with rate \(r \gt 0\). Suppose that each arrival, independently of the others, is type \(i\) with probability \(p_i\) for \(i \in \{0, 1, \ldots, k - 1\}\). Of course we must have \(p_i \ge 0\) for each \(i\) and \(\sum_{i=0}^{k-1} p_i = 1\). Then for each \(i\), the type \(i\) points form a Poisson process with rate \(p_i r\), and these processes are independent.

Superposition

Complementary to splitting or thinning a Poisson process is superposition: if we combine the random points in time from independent Poisson processes, then we have a new Poisson processes. The rate of the new process is the sum of the rates of the processes that were combined. Once again, our exposition will concentrate on the superposition of two processes. This case contains all of the essential ideas.

Two Processes

Suppose that we have two independent Poisson processes. We will denote the sequence of inter-arrival times, the sequence of arrival times, and the counting variables for the process \( i \in \{1, 2\} \) by \( \bs{X}^i = \left(X_1^i, X_2^i \ldots\right) \), \( \bs{T}^i = \left(T_1^i, T_2^i, \ldots\right) \), and \( \bs{N}^i = \left(N_t^i: t \in [0, \infty)\right) \), and we assume that process \( i \) has rate \( r_i \in (0, \infty) \). The new process that we want to consider is obtained by simply combining the random points. That is, the new random points are \( \left\{T_n^1: n \in \N_+\right\} \cup \left\{T_n^2: n \in \N_+\right\} \), but of course then ordered in time. We will denote the sequence of inter-arrival times, the sequence of arrival times, and the counting variables for the new process by \( \bs{X} = (X_1, X_2 \ldots) \), \( \bs{T} = (T_1, T_2, \ldots) \), and \( \bs{N} = \left(N_t: t \in [0, \infty)\right) \).

Clearly if \( A \) is an interval in \( [0, \infty) \) then \[ N(A) = N^1(A) + N^2(A) \] the number of combined points in \( A \) is simply the sum of the number of point in \( A \) for processes 1 and 2. It's also worth noting that \[ X_1 = \min\{X^1_1, X^2_1\} \] the first arrival for the combined process is the smaller of the first arrival times for processes 1 and 2. The other inter-arrival times, and hence also the arrival times, for the combined process are harder to state.

The combined process is a Poisson process with rate \( r_1 + r_2 \).

Details:

As noted above, if \( A \) is a subinterval of \( [0, \infty) \) then \( N(A) = N^1(A) + N^2(A) \). The first term has the Poisson distribution with parameter \( r_1 \lambda(A) \), the second term has the Poisson distribution with parameter \( r_2 \lambda(A) \), and the terms are independent. Hence \( N(A) \) has the Poisson distribution with parameter \( r_1 \lambda(A) + r_2 \lambda(A) = (r_1 + r_2) \lambda(A) \). Thus the counting process has stationary, Poisson distributed increments. Next, if \( (A_1, A_2, \ldots, A_n)\) is a sequence of disjoint subintervals of \( [0, \infty) \) then \[ \left(N(A_1), N(A_2), \ldots, N(A_n)\right) = \left(N^1(A_1) + N^2(A_1), N^1(A_2) + N^2(A_2), \ldots, N^1(A_n) + N^2(A_n) \right) \] is an independent sequence, so the counting process has independent increments.

Computational Exercises

In the two-type Poisson experiment, set \(r = 2\), \(t = 3\), and \(p = 0.7\). Run the experiment 1000 times, Compute the appropriate relative frequency functions and investigate empirically the independence of the number of type 1 points and the number of type 0 points.

Suppose that customers arrive at a service station according to the Poisson model, with rate \(r = 20\) per hour. Moreover, each customer, independently, is female with probability 0.6 and male with probability 0.4. Find the probability that in a 2 hour period, there will be at least 20 women and at least 15 men.

Details:

0.5814

In the two-type Poisson experiment, set \(r = 3\), \(t = 4\), and \(p = 0.8\). Run the experiment 100 times.

Compute the estimate of \(N_t\) based on \(M_t\) for each run.
Over the 100 runs, compute average of the sum of the squares of the errors.
Compare the result in (b) with .

Suppose that a piece of radioactive material emits particles according to the Poisson model at a rate of \( r = 100 \) per second. Moreover, assume that a counter detects each emitted particle, independently, with probability 0.9. Suppose that the number of detected particles in a 5 second period is 465.

Estimate the number of particles emitted.
Compute the mean square error of the estimate.

Details: