Endpoint Addition

\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\ms}{\mathscr}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\rta}{\rightarrow}\) \(\newcommand{\lfrta}{\leftrightarrow}\) \(\newcommand{\Rta}{\Rightarrow}\) \(\newcommand{\LfRta}{\Leftrightarrow}\)

In this section we study a simple construction that adds endpoints to a given graph, and leads to the anlysis of the interesting collection of binomial trees.

Basics

Suppose that \((S, \rta)\) is a discrete, graph, so a graph in the combinatorial sense. Let Let \(T = S \cup S^\prime\) where \(S^\prime = \{t_x: x \in S\}\) is a distinct set of elements, disjoint from \(S\) and indexed by \(x \in S\). Extend the relation \(\rta\) to \(T\) by \[ x \rta t_x, \; t_x \rta x, \quad x \in S \] The graph \((T, \rta)\) is the graph obtained from \((S, \rta)\) by endpoint addition.

That is, we simply add an endpoint \(t_x\) to \(x \in S\) by an undirected edge for each \(x \in S\). If the original graph \((S, \rta)\) is symmetric (undirected) then so is \((T, \rta)\). A graph obtained by endpoint addition is always stochastic.

Suppose that \((S, \rta)\) is a discrete graph and \((T, \rta)\) the graph obtained from \((S, \rta)\) by endpoint addition, as in definition .

The (right) \(\sigma\)-algebra associated with \((T, \rta)\) is the reference \(\sigma\)-algebra \(\ms P(T)\).
If \(f\) is a probability density function on \(T\) and \(F\) the corresponding reliability function for \((T, \rta)\) then for \(x \in S\), \begin{align*} f(x) & = F(t_x) \\ f(t_x) & = F(x) - \sum\left\{F(t_y): y \in S, \, x \rta y\right\} \end{align*}

Details:

Let \(A_y\) denote the right neighbor set of \(y \in T\) so that \(\ms A = \sigma(\{A_y: y \in T\})\) is the \(\sigma\)-algebra associated with \((T, \rta)\). Then \(A_{t_x} = \{x\}\) for \(x \in S\), so \(\{x\} \in \ms A\) for \(x \in S\). Hence \(B \in \ms A\) for every \(B \subseteq S\). But then also, \[\{t_x\} = A_x - \{u \in S: x \rta u\} \in \ms A\]
The first part of the displayed equation follows from the construction since \(t_x \rta x\) and \(x\) is the only such point. For the second part, \[F(x) = \sum \{f(y): y \in T, \, x \rta y\} = \sum \{f(y): y \in S, \, x \rta y\} + f(t_x) = \sum\{F(t_y): y \in S, \, x \rta y\} + f(t_x) \]

Given a probability distribution on \(S\), there is a natural way to construct a probability distribution on \(T\).

Suppose that \(f\) is a probability density function on \(S\) and that \(p \in (0, 1)\). Define the probability density function \(g\) on \(T\) by \[ g(x) = p f(x), \; g(t_x) = (1 - p) f(x), \quad x \in S \]

So if \(Y\) is a random variable in \(T\) with density function \(g\) then \(\P(Y \in S) = p\) and \(f\) is the conditional density of \(Y\) given \(Y \in S\). Moreover, \(\P(Y \in S^\prime) = 1 - p\) and the conditional density of \(Y\) given \(Y \in S^\prime\) is \(t_x \mapsto f(x)\).

in the context of definition , let \(F\) denote the reliability function of \(f\) for \((S, \rta)\) and \(G\) the reliability function of \(g\) for \((T, \rta)\). Then \[G(x) = p F(x) + (1 - p) f(x), \; G(t_x) = p f(x), \quad x \in S \]

Details:

Let \(x \in S\). Then by construction, \(G(t_x) = g(x) = p f(x)\) since \(t_x \rta x\) and \(x\) is the only such point. On the other hand, \[G(x) = \sum \{g(y): y \in T, \, x \rta y\} = \sum \{g(y): y \in S, \, x \rta y\} + g(t_x) = \sum \{p f(y): y \in S, \, x \rta y\} + (1 - p) f(x) = p F(x) + (1 - p) f(x)\]

Our main result is that if the density function \(f\) has constant rate for \((S, \rta)\) then with an appropriate choice of \(p\), the density function \(g\) has constant rate for \((T, \rta)\).

Suppose that \(f\) is a probability density function on \(S\) that has constant rate \(\alpha \in (0, \infty)\) for \((S, \rta)\). Let \begin{align*} \beta &= \frac{\sqrt{4 \alpha^2 + 1} - 1}{2 \alpha} \\ p &= \frac{1}{1 + \beta} = \frac{2 \alpha}{\sqrt{4 \alpha^2 + 1} + (2 \alpha - 1)} \end{align*} Then the density \(g\) constructed in from \(f\) and \(p\) has constant rate \(\beta\) for \((T, \rta)\).

Details:

Let \(x \in S\). Using and the fact that \(f\) has constant rate \(\alpha\) for \((S, \rta)\) we have \[g(x) = \frac{1}{1 + \beta} f(x) = \frac{\alpha}{1 + \beta} F(x) = \alpha \left[G(x) - \frac{\beta}{1 + \beta} f(x) \right] = \alpha [G(x) - \beta g(x)]\] Hence \[g(x) = \frac{\alpha}{1 + \alpha \beta} G(x) = \beta G(x)\] The last equality holds since \(\beta\) is defined by the quadratic equation \(\alpha \beta^2 + \beta - \alpha = 0\). Next, \[g(t_x) = \frac{\beta}{1 + \beta} f(x) = \beta g(x) = \beta G(t_x), \quad x \in S\]

In the context of ,

\(\beta \lt \alpha\)
\(\beta\) is an increasing function of \(\alpha\) and \(p\) is a decreasing function of \(\alpha\).
\(\beta \to 0\) and \(p \to 1\) as \(\alpha \to 0\)
\(\beta \to 1\) and \(p \to \frac{1}{2}\) as \(\alpha \to \infty\)

Details:

These results follow from calculus.

Suppose that \((S_1, \rta)\) is a discrete graph that has a distribution with constant rate \(\alpha_1 \in (0, \infty)\). Recursively, for \(n \in \N_+\), let \((S_{n + 1}, \rta)\) denote the graph obtained from \((S_n, \rta)\) by endpoint addition as in . So this graph has a distribution with constant rate \[ \alpha_{n + 1} = \frac{\sqrt{4 \alpha_n^2 + 1} - 1}{2 \alpha_n}, \quad n \in \N_+ \]

In the context of , \(\alpha_n \to 0\) as \(n \to \infty\).

Details:

By construction, \(\alpha_{n + 1} = \varphi(\alpha_n)\) for \(n \in \N_+\) where \[ \varphi(t) = \frac{\sqrt{4 t^2 + 1} - 1}{2 t}, \quad t \in (0, \infty)\] We define \(\varphi(0) = 0\) so that \(\varphi\) is continuous on \([0, \infty)\). So then \(\alpha_n\) is decreasing in \(n\) and converges to the only fixed point of \(\varphi\), namely 0.

Examples

Binomial Trees

The most important application of endpoint addition is to binomial trees, since these trees are generated precisely by this method.

The (undirected) binomial trees are defined recursively as follows:

The binomial tree of order 0 is a single point.
For \(n \in \N\), the binomial tree of order \(n + 1\) is constructed by endpoint addition to the binomial tree of order \(n\).

The binomial tree of order \(n \in \N_+\) has \(2^n\) vertices.

So the vertices of the binomial tree of order \(n \in \N_+\) can be labeled with elements of \(\{0, 1\}^n\), that is, bit strings of length \(n\). It helps to have a consistent way to do that.

The vertices of the binomial trees are labeled with bit strings as follows:

The binomial tree of order 1 is a path with two vertices. Label the vertices 0 and 1 arbitarily.
For \(n \in \N_+\), to construct the labeled binomial tree of order \(n + 1\), start with the labeled binomial tree of order \(n\) but add bit 1 to the end of each string. Add an endpoint to each vertex with the same label as the vertex except that the final bit is changed to 0.

With this labeling, the graph \(B_n = \left(\{0, 1\}^n, \sim\right)\) is the binomial tree of order \(n\). The graph relations are given recursively as follows: \(0 \sim 1\) in \(B_1\) and for \(y, \, z \in \{0, 1\}^n\), \begin{align*} y 0 & \sim y 1 \text{ in } B_{n + 1} \\ y 1 & \sim z 1 \text{ in } B_{n + 1} \text{ if and only if } y \sim z \text{ in } B_n \end{align*}

The following result identifies the degrees of the vertices in the binomial tree. To state the result succinctly, let \(1^{(j)}\) denote the string with 1 repeated \(j\) times \(j \in \N_+\). Let \(1^{(0)}\) denote the empty string.

Let \(B_n\) denote the binomial tree of order \(n \in \N_+\), as defined in .

For \(k \in \N_+\) with \(k \lt n\), there are \(2^{n - k}\) vertices of degree \(k\). These have labels with terminal bit strings of the form \(0 1^{(k - 1)}\)
There are 2 vertices of degree \(n\) these are labeled with terminal bit strings \(1^{(n - 1)}\).

In particulr, the binomial tree of order \(n \ge 2\) has \(2^{n - 1}\) endpoints. The following result restates in the notation of our bit string labeling:

Suppose that \(n \in \N_+\) and that \(f\) is a probability density function on \(\{0, 1\}^{n + 1}\) that has reliability function \(F\) for the binomial tree \(B_{n + 1}\). Then for \(y \in \{0, 1\}^n\), \begin{align*} f(y 1) & = F(y 0) \\ f(y 0) & = F(y 1) - \sum \{F(z 0): z \sim y \text{ in } B_n\} \end{align*}

With the bit string notation, it's easy to describe the probability distributions on the binomial trees generated recursively by the construction in . A bit of additional notation will help. For \(p \in (0, 1)\), let \(p(0) = 1 - p\) and \(p(1) = p\).

Let \(X_0\) have the point mass distribution on the single vertex in the binomial tree of order 0. For \(n \in \N\), let \(X_{n + 1}\) have the distribution on \(\{0, 1\}^{n + 1}\) constructed from the distribution of \(X_n\) on \(\{0, 1\}^n\) as in , with parameter \(p_{n + 1} \in (0, 1)\). Then the random bits \(I_1 I_2 \cdots I_n\) of \(X_n\) are independent Bernoulli variables and \(I_k\) has parameter \(p_k\) for \(k \in \{1, 2, \ldots, n\}\). That is, the probability density function \(f_n\) of \(X_n\) is given by \[f_n(i_1 i_2 \cdots i_n) = p_1(i_1) p_2(i_2) \cdots p_n(i_n), \quad i_1 i_2 \cdots i_n \in \{0, 1\}^n \]

If \(p_k = p \in (0, 1)\) for \(k \in \N_+\) then the bits of \(X_n\) are a sequence of \(n\) Bernoulli trials with parameter \(p\) for \(n \in \N_+\). The following result gives the distribution of the degree of \(X_n\).

For \(n \in \N_+\) let \(X_n\) denote the random variable in \(\{0, 1\}^n\) constructed in and let \(V_n\) denote the degree of \(X_n\) in the binomial tree \(B_n\). Then \(V_n\) has density function \(h_n\) given by \begin{align*} h_n(k) &= (1 - p_{n - k + 1}) \prod_{j = n - k + 2}^n p_j, \quad k \in \{1, 2, \ldots, n - 1\} \\ h_n(n) &=\prod_{j = 2}^{n} p_j \end{align*}

Details:

This follows from and : \begin{align*} \P(V_n = k) &= \P(I_{n - k + 1} = 0, I_j = 1 \text{ for } j \in \{n - k + 2, \ldots, n\}), \quad k \in \{1, 2, \ldots, n - 1\} \\ \P(V_n = n) &= \P(I_j = 1 \text{ for } j \in \{2, 3, \ldots n\}) \end{align*}

When \(p_k = p\) for all \(k \in \N_+\) then \(V_n\) has a truncated geometric distribution:

Suppose that \(p_k = p\) for \(k \in \N_+\) and for \(n \in \N_+\) and let \(V_n = \deg(X_n)\) as in .

\(V_n\) has density function \(h_n\) given by \(h_n(k) = (1 - p) p^{k - 1}\) for \(k \in \{1, 2, \ldots, n - 1\}\) and \(h_n(n) = p^{n - 1}\).
The distribution of \(V_n\) converges to the geometric distribution with success parameter \(1 - p\) as \(n \to \infty\).

Of course for us, the most interesting case is when \(X_n\) has constant rate for the binomial tree \(B_n\) for each \(n \in \N_+\). The following results are a special case of .

Let \(\alpha_1 = 1\) and then recursively let \[ \alpha_{n + 1} = \frac{\sqrt{4 \alpha_n^2 + 1} - 1}{2 \alpha_n}, \quad n \in \N_+ \] Let \(p_n = 1 / (1 + \alpha_n)\) for \(n \in \N_+\). Then random variable \(X_n\) in has constant rate \(\alpha_n\) on the binomial tree \(B_n\) for each \(n \in \N_+\).

Details:

This follows from since the binomial tree of order 1 is a walk on two vertices and hence the uniform distribution has constant rate 1 for this tree.

We can also give a recurrence relation for the parameters \((p_n: n \in \N_+)\) directly: \(p_1 = \frac{1}{2}\) and \[ p_{n + 1} = \frac{2 - 2 p_n}{(2 - 3 p_n) + \sqrt{5 p_n^2 - 8 p_n + 4}}, \quad n \in \N_+ \]

\(\alpha_n \downarrow 0\) and \(p_n \uparrow 1\) as \(n \to \infty\).

Explicit formulas for \(\alpha_n\) and \(p_n\) as a function of \(n \in \N_+\) seem difficult. But we can give the results for the first few values. We already know what happens when \(n = 1\); only the notation is different.

The binomial tree of order 1 is a path on 2 vertices. The rate constant and probability are \(\alpha_1 = 1\) and \(p_1 = 1 / 2\). The constant rate distribution is uniform, with density \(f_1\) given by \(f_1(0) = 1 / 2 = f_1(1) = 1 / 2\).

When \(n = 2\) we also have a path so the result is a special case of general results on paths in Section 7.3.

The binomial tree of order 2 is a path on 4 vertices. The rate constant and probability are \(\alpha_2 = p_2 = \left(\sqrt{5} - 1\right) \big/ 2 \approx 0.6180\). The constant rate distribution has density \(f_2\) given by \begin{align*} f_2(00) & = f_2(10) = \frac{3 - \sqrt{5}}{4} \\ f_2(01) & = f_2(11) = \frac{\sqrt{5} - 1}{4} \end{align*}

For the binomial tree of order 3, the rate constant and probability are \[ \alpha_3 = \frac{\sqrt{7 - 2 \sqrt{5}} - 1}{\sqrt{5} - 1} \approx 0.4773, \; p_3 = \frac{\sqrt{5} - 1}{\sqrt{7 - 2 \sqrt{5}} + \sqrt{5} - 2} \approx 0.6769 \] The constant rate distribution has density \(f_3\) given by \begin{align*} f_3(000) & = f_3(100) = \frac{\left(\sqrt{5} - 3\right) \left(\sqrt{7 - 2 \sqrt{5}} - 1\right)}{4 \left({\sqrt{7 - 2 \sqrt{5}} + \sqrt{5} - 2}\right)} \approx 0.06170\\ f_3(010) & = f_3(110) = \frac{\sqrt{5} - 1}{4} \left(1 - \frac{\sqrt{5} - 1}{\sqrt{7 - 2 \sqrt{5}} + \sqrt{5} - 2}\right) \approx 0.09983\\ f_3(001) & = f_3(101) = \frac{\sqrt{5} - 2}{{\sqrt{7 - 2 \sqrt{5}} + \sqrt{5} - 2}} \approx 0.1293 \\ f_3(011) & = f_3(111) = \frac{\left(\sqrt{5} - 1\right)^2}{4 \left({\sqrt{7 - 2 \sqrt{5}} + \sqrt{5} - 2}\right)} \approx 0.20918 \end{align*}

Of course, the result in on \(V_n = \deg(X_n)\) applies to the constant rate distributions. Rooted, directed binomial trees will be studied in Chapter 5.

Regular Graphs

Another tractible case arises from endpoint addition to a regular graph. So suppose that \((S, \rta)\) is a right regular graph of degree \(k\) with \(n\) vertices where \(n, \, k \in \N_+\) and \(k \le n\). Let \((T, \rta)\) denote the graph obtained from \((S, \rta)\) by endpoint addition.

The graph \((T, \rta)\) has a constant rate distribution with rate \[ \beta = \frac{\sqrt{k^2 + 4} - k}{2} \] and density function \(g\) given by \begin{align*} g(x) &= \frac{1}{n} \frac{2}{\sqrt{k^2 + 4} + 2 - k}, \quad x \in S \\ g(t_x) & = \frac{1}{n} \frac{\sqrt{k^2 + 4} - k}{\sqrt{k^2 + 4} + 2 - k}, \quad x \in S \end{align*}

Details:

The uniform distribution on \(S\) has constant rate \(1 / k\) for \((S, \rta)\), so the result follows from and some algebra.

Note that the rate constant in depends only on \(k\) (but again we must have \(k \le n\)).

Suppose that random variable \(X\) in \(T\) has the distribution in . Then \begin{align*} \P(X \in S) &= \frac{2}{\sqrt{k^2 + 4} + 2 - k} \\ \P(X \in S^\prime) & = \frac{\sqrt{k^2 + 4} - k}{\sqrt{k^2 + 4} + 2 - k} \end{align*} So \(\beta \to 0\), \(\P(X \in S) \to 1\) and \(\P(X \in S^\prime) \to 0\) as \(k \to \infty\).

Let \(v_m\) denote the right walk function of order \(m \in \N_+\) for the graph \((T, \rta)\). Then \(v_m(x) = a_m\) and \(v_m(t_x) = a_{m - 1}\) for \(x \in S\) where \[a_m = \left(\frac{k + 2 - \sqrt{k^2 + 4}}{2 \sqrt{k^2 + 4}}\right) \left(\frac{k + \sqrt{k^2 + 4}}{2}\right)^m + \left(\frac{3 \sqrt{k^2 + 4} - k - 2}{2 \sqrt{k^2 + 4}}\right) \left(\frac{k - \sqrt{k^2 + 4}}{2}\right)^m\]

Details:

Since \((S, \rta)\) is right regular, \(v_m\) is constant on \(S\) and on \(S^\prime\). So let \(a_m\) denote the constant value on \(S\) and \(b_m\) the constant value on \(S^\prime\). Then \begin{align*} a_{m + 1} & = k a_m + b_m \\ b_{m + 1} & = a_m \end{align*} which leads to the second order difference equation \(a_{m + 2} - k a_{m + 1} - a_m = 0\). The corresponding characteristic equation is \(r^2 - k r - 1 = 0\) which has roots \[\frac{k \pm \sqrt{k^2 + 4}}{2}\] Applying the initial conditions \(a_0 = 1\), \(a_1 = k + 1\) then leads to the result.

It follows that the number of walks of length \(m \in \N_+\) is \(w_m = n (a_m + a_{m - 1})\), and hence the right walk distribution of order \(m\) has density function \(g_m\) given by \begin{align*} g_m(x) & = \frac{a_m}{n (a_m + a_{m - 1})}, \quad x \in S \\ g_m(t_x) & = \frac{a_{m - 1}}{n (a_m + a_{m - 1})}, \quad x \in S \end{align*} It's easy to show directly that \(w_m / w_{m + 1} \to \beta\) as \(m \to \infty\) and that \(g_m \to g\) as \(m \to \infty\) where \(\beta\) is the rate constant and \(g\) the constant rate density function in . Of course, these results are guaranteed by the general theory in Section 5. Here are a couple of examples:

Suppose that \((S, \rta)\) is the directed cycle on \(n \in \{3, 4, \ldots\}\) vertices. The constant rate distribution for the graph \((T, \rta)\) obtained by endpoint addition rate \(\beta = (\sqrt{5} - 1) / 2\) and density function \(g\) given by \[g(x) = \frac{\sqrt{5} - 1}{2 n}, \; g(t_x) = \frac{3 - \sqrt{5}}{ 2 n}, \quad x \in S \]

Details:

\((S, \rta)\) is regular with degree \(k = 1\) so the result follows from .

Suppose that \((S, \lfrta)\) is the undirected cycle on \(n \in \{3, 4, \ldots\}\) vertices. The constant rate distribution for the graph \((T, \rta)\) obtained by endpoint addition has rate \(\beta = \sqrt{2} - 1\) and density function \(g\) given by \[g(x) = \frac{1}{\sqrt{2} n}, \; g(t_x) = \left(1 - \frac{1}{\sqrt{2}}\right) \frac{1}{n}, \quad x \in S \]

Details:

\((S, \lfrta)\) is regular with degree \(k = 2\) so the result follows from .