\(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\Z}{\mathbb{Z}}\) \(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\var}{\text{var}}\) \(\newcommand{\sd}{\text{sd}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\cov}{\text{cov}}\)

4. Tests in the Two-Sample Normal Model

In this section, we will study hypothesis tests in the two-sample normal model and in the bivariate normal model. This section parallels the section on Estimation in the Two Sample Normal Model in the chapter on Interval Estimation.

The Two-Sample Normal Model

Suppose that \(\bs{X} = (X_1, X_2, \ldots, X_m)\) is a random sample of size \(m\) from the normal distribution with mean \(\mu\) and standard deviation \(\sigma\), and that \(\bs{Y} = (Y_1, Y_2, \ldots, Y_n)\) is a random sample of size \(n\) from the normal distribution with mean \(\nu\) and standard deviation \(\tau\). Moreover, suppose that the samples \(\bs{X}\) and \(\bs{Y}\) are independent.

This type of situation arises frequently when the random variables represent a measurement of interest for the objects of the population, and the samples correspond to two different treatments. For example, we might be interested in the blood pressure of a certain population of patients. The \(\bs{X}\) vector records the blood pressures of a control sample, while the \(\bs{Y}\) vector records the blood pressures of the sample receiving a new drug. Similarly, we might be interested in the yield of an acre of corn. The \(\bs{X}\) vector records the yields of a sample receiving one type of fertilizer, while the \(\bs{Y}\) vector records the yields of a sample receiving a different type of fertilizer.

Usually our interest is in a comparison of the parameters (either the mean or variance) for the two sampling distributions. In this section we will construct tests for the difference of the means and the ratio of the variances. As with previous estimation problems we have studied, the procedures vary depending on what parameters are known or unknown. Also as before, key elements in the construction of the tests are the sample means and sample variances and the special properties of these statistics when the sampling distribution is normal.

We will use the following notation for the sample mean and sample variance of a generic sample \(\bs{U} = (U_1, U_2, \ldots, U_k)\): \[ M(\bs{U}) = \frac{1}{k} \sum_{i=1}^k U_i, \quad S^2(\bs{U}) = \frac{1}{k - 1} \sum_{i=1}^k [U_i - M(\bs{U})]^2 \]
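
These statistics are easy to compute numerically. Here is a minimal sketch in Python with numpy; the function names M and S2 are our own, chosen to mirror the notation above.

```python
import numpy as np

def M(u):
    """Sample mean M(U) of a generic sample u."""
    return np.asarray(u, dtype=float).mean()

def S2(u):
    """Sample variance S^2(U), with the k - 1 denominator."""
    return np.asarray(u, dtype=float).var(ddof=1)  # ddof=1 matches the definition above
```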

Tests of the Difference in the Means with Known Standard Deviations

Our first discussion concerns tests for the difference in the means \(\nu - \mu\) under the assumption that the standard deviations \(\sigma\) and \(\tau\) are known. This is often, but not always, an unrealistic assumption. In some statistical problems, the variances are stable, and are at least approximately known, while the means may be different because of different treatments. Also this is a good place to start because the analysis is fairly easy.

For a conjectured difference of the means \( \delta \in \R \), define the test statistic \[ Z = \frac{[M(\bs{Y}) - M(\bs{X})] - \delta}{\sqrt{\sigma^2 / m + \tau^2 / n}} \]

  1. If \( \nu - \mu = \delta \) then \( Z \) has the standard normal distribution.
  2. If \( \nu - \mu \ne \delta \) then \(Z\) has the normal distribution with mean \([(\nu - \mu) - \delta] \big/ {\sqrt{\sigma^2 / m + \tau^2 / n}}\) and variance 1.
Details:

From properties of normal samples, \( M(\bs{X}) \) has a normal distribution with mean \( \mu \) and variance \( \sigma^2 / m \) and similarly \( M(\bs{Y}) \) has a normal distribution with mean \( \nu \) and variance \( \tau^2 / n \). Since the samples are independent, \( M(\bs{X}) \) and \( M(\bs{Y}) \) are independent, so \( M(\bs{Y}) - M(\bs{X}) \) has a normal distribution with mean \( \nu - \mu \) and variance \( \sigma^2 / m + \tau^2 / n \). The final result then follows since \( Z \) is a linear function of \( M(\bs{Y}) - M(\bs{X}) \).

Of course (b) actually subsumes (a), but we separate them because the two cases play an important role in the hypothesis tests. In part (b), the non-zero mean can be viewed as a non-centrality parameter.

As usual, for \(p \in (0, 1)\), let \(z(p)\) denote the quantile of order \(p\) for the standard normal distribution. For selected values of \(p\), \(z(p)\) can be obtained from the quantile app or from most statistical software packages. Recall also by symmetry that \(z(1 - p) = -z(p)\).

For every \( \alpha \in (0, 1) \), the following tests have significance level \(\alpha\):

  1. Reject \(H_0: \nu - \mu = \delta\) versus \(H_1: \nu - \mu \ne \delta\) if and only if \(Z \lt -z(1 - \alpha / 2)\) or \(Z \gt z(1 - \alpha / 2)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \gt \delta + z(1 - \alpha / 2) \sqrt{\sigma^2 / m + \tau^2 / n} \) or \( M(\bs{Y}) - M(\bs{X}) \lt \delta - z(1 - \alpha / 2) \sqrt{\sigma^2 / m + \tau^2 / n} \).
  2. Reject \(H_0: \nu - \mu \ge \delta\) versus \(H_1: \nu - \mu \lt \delta\) if and only if \(Z \lt -z(1 - \alpha)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \lt \delta - z(1 - \alpha) \sqrt{\sigma^2 / m + \tau^2 / n} \).
  3. Reject \(H_0: \nu - \mu \le \delta\) versus \(H_1: \nu - \mu \gt \delta\) if and only if \(Z \gt z( 1 - \alpha)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \gt \delta + z(1 - \alpha) \sqrt{\sigma^2 / m + \tau^2 / n} \).
Details:

This follows the same logic that we have seen before. In part (a), \( H_0 \) is a simple hypothesis, and under this hypothesis \( Z \) has the standard normal distribution. Thus, if \( H_0 \) is true then the probability of falsely rejecting \( H_0 \) is \( \alpha \) by definition of the quantiles. In parts (b) and (c), \( H_0 \) specifies a range of values of \( \nu - \mu \), and under \( H_0 \), \( Z \) has a nonstandard normal distribution, by the result above. But the largest type 1 error probability is \( \alpha \) and occurs when \( \nu - \mu = \delta \). The decision rules in terms of \( M(\bs{Y}) - M(\bs{X}) \) are equivalent to those in terms of \( Z \) by simple algebra.
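
As an illustration, the three decision rules can be coded directly, with the standard normal quantile from scipy; this is a sketch, and the helper two_sample_z_test and its interface are our own, not a standard API.

```python
import numpy as np
from scipy.stats import norm

def two_sample_z_test(x, y, sigma, tau, delta=0.0, alpha=0.05, alternative="two-sided"):
    """Level-alpha test of H0 about nu - mu when sigma and tau are known."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    z = ((y.mean() - x.mean()) - delta) / np.sqrt(sigma**2 / m + tau**2 / n)
    if alternative == "two-sided":            # H1: nu - mu != delta
        reject = abs(z) > norm.ppf(1 - alpha / 2)
    elif alternative == "less":               # H1: nu - mu < delta
        reject = z < -norm.ppf(1 - alpha)
    else:                                     # H1: nu - mu > delta
        reject = z > norm.ppf(1 - alpha)
    return z, reject
```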

For each of the tests above, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\delta\) is in the corresponding \(1 - \alpha\) level confidence interval.

  1. \( [M(\bs{Y}) - M(\bs{X})] - z(1 - \alpha / 2) \sqrt{\sigma^2 / m + \tau^2 / n} \le \delta \le [M(\bs{Y}) - M(\bs{X})] + z(1 - \alpha / 2) \sqrt{\sigma^2 / m + \tau^2 / n} \)
  2. \( \delta \le [M(\bs{Y}) - M(\bs{X})] + z(1 - \alpha) \sqrt{\sigma^2 / m + \tau^2 / n} \)
  3. \( \delta \ge [M(\bs{Y}) - M(\bs{X})] - z(1 - \alpha) \sqrt{\sigma^2 / m + \tau^2 / n} \)
Details:

These results follow from the tests above. In each case, we start with the inequality that corresponds to not rejecting the null hypothesis and solve for \( \delta \).
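
Equivalently, the two-sided interval in part (a) can be computed directly; a minimal sketch under the same assumptions (the helper name is ours):

```python
import numpy as np
from scipy.stats import norm

def mean_diff_ci(x, y, sigma, tau, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for nu - mu, known SDs."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    d = y.mean() - x.mean()
    se = np.sqrt(sigma**2 / len(x) + tau**2 / len(y))
    margin = norm.ppf(1 - alpha / 2) * se  # z(1 - alpha/2) times the standard error
    return d - margin, d + margin
```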

Tests of the Difference of the Means with Unknown Standard Deviations

Next we will construct tests for the difference in the means \(\nu - \mu\) under the more realistic assumption that the standard deviations \(\sigma\) and \(\tau\) are unknown. In this case, it is more difficult to find a suitable test statistic, but we can do the analysis in the special case that the standard deviations are the same. Thus, we will assume that \(\sigma = \tau\), and the common value \(\sigma\) is unknown. This assumption is reasonable if there is an inherent variability in the measurement variables that does not change even when different treatments are applied to the objects in the population.

Recall that the pooled estimate of the common variance \(\sigma^2\) is the weighted average of the sample variances, with the degrees of freedom as the weight factors: \[ S^2(\bs{X}, \bs{Y}) = \frac{(m - 1) S^2(\bs{X}) + (n - 1) S^2(\bs{Y})}{m + n - 2} \] The statistic \( S^2(\bs{X}, \bs{Y}) \) is an unbiased and consistent estimator of the common variance \( \sigma^2 \).
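
Numerically, the pooled estimate is a one-liner; a sketch (our own function name):

```python
import numpy as np

def pooled_var(x, y):
    """Pooled estimate S^2(X, Y) of the common variance sigma^2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    return ((m - 1) * x.var(ddof=1) + (n - 1) * y.var(ddof=1)) / (m + n - 2)
```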

For a conjectured \( \delta \in \R \) define the test statistic \[ T = \frac{[M(\bs{Y}) - M(\bs{X})] - \delta}{S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n}} \]

  1. If \( \nu - \mu = \delta \) then \( T \) has the \(t\) distribution with \(m + n - 2\) degrees of freedom.
  2. If \( \nu - \mu \ne \delta \) then \( T \) has a non-central \( t \) distribution with \( m + n - 2 \) degrees of freedom and non-centrality parameter \[ \frac{(\nu - \mu) - \delta}{\sigma \sqrt{1/m + 1 /n}} \]
Details:

Part (b) actually subsumes part (a), since the ordinary \( t \) distribution is a special case of the non-central \( t \) distribution, with non-centrality parameter 0. With some basic algebra, we can write \( T \) in the form \[ T = \frac{Z + a}{\sqrt{V \big/ (m + n - 2)}}\] where \( Z \) is the standard score of \( M(\bs{Y}) - M(\bs{X}) \), \( a \) is the non-centrality parameter given in the theorem, and \( V = \frac{m + n - 2}{\sigma^2} S^2(\bs{X}, \bs{Y}) \). So \( Z \) has the standard normal distribution, \( V \) has the chi-square distribution with \( m + n - 2 \) degrees of freedom, and \( Z \) and \( V \) are independent. Thus by definition, \( T \) has the non-central \( t \) distribution with \( m + n - 2 \) degrees of freedom and non-centrality parameter \( a \).

As usual, for \(k \gt 0\) and \(p \in (0, 1)\), let \(t_k(p)\) denote the quantile of order \(p\) for the \(t\) distribution with \(k\) degrees of freedom. For selected values of \(k\) and \(p\), values of \(t_k(p)\) can be computed from the quantile app, or from most statistical software packages. Recall also that, by symmetry, \(t_k(1 - p) = -t_k(p)\).

The following tests have significance level \(\alpha\):

  1. Reject \(H_0: \nu - \mu = \delta\) versus \(H_1: \nu - \mu \ne \delta\) if and only if \(T \lt -t_{m + n - 2}(1 - \alpha / 2)\) or \(T \gt t_{m + n - 2}(1 - \alpha / 2)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \gt \delta + t_{m+n-2}(1 - \alpha / 2) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \) or \( M(\bs{Y}) - M(\bs{X}) \lt \delta - t_{m+n-2}(1 - \alpha / 2) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \).
  2. Reject \(H_0: \nu - \mu \ge \delta\) versus \(H_1: \nu - \mu \lt \delta\) if and only if \(T \lt -t_{m+n-2}(1 - \alpha)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \lt \delta - t_{m+n-2}(1 - \alpha) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \).
  3. Reject \(H_0: \nu - \mu \le \delta\) versus \(H_1: \nu - \mu \gt \delta\) if and only if \(T \gt t_{m+n-2}(1 - \alpha)\) if and only if \( M(\bs{Y}) - M(\bs{X}) \gt \delta + t_{m+n-2}(1 - \alpha) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \).
Details:

This follows the same logic that we have seen before. In part (a), \( H_0 \) is a simple hypothesis, and under this hypothesis \( T \) has the \( t \) distribution with \( m + n - 2 \) degrees of freedom. Thus, if \( H_0 \) is true then the probability of falsely rejecting \( H_0 \) is \( \alpha \) by definition of the quantiles. In parts (b) and (c), \( H_0 \) specifies a range of values of \( \nu - \mu \), and under \( H_0 \), \( T \) has a non-central \( t \) distribution by the result above. But the largest type 1 error probability is \( \alpha \) and occurs when \( \nu - \mu = \delta \). The decision rules in terms of \( M(\bs{Y}) - M(\bs{X}) \) are equivalent to those in terms of \( T \) by simple algebra.
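
As with the known-variance case, the decision rules translate directly into code; a sketch, with our own helper name. For \( \delta = 0 \) and the two-sided alternative, the statistic should agree with scipy.stats.ttest_ind(y, x, equal_var=True).

```python
import numpy as np
from scipy.stats import t

def two_sample_t_test(x, y, delta=0.0, alpha=0.05, alternative="two-sided"):
    """Level-alpha pooled t test of H0 about nu - mu, common unknown SD."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    df = m + n - 2
    s2 = ((m - 1) * x.var(ddof=1) + (n - 1) * y.var(ddof=1)) / df  # pooled variance
    T = ((y.mean() - x.mean()) - delta) / np.sqrt(s2 * (1 / m + 1 / n))
    if alternative == "two-sided":            # H1: nu - mu != delta
        reject = abs(T) > t.ppf(1 - alpha / 2, df)
    elif alternative == "less":               # H1: nu - mu < delta
        reject = T < -t.ppf(1 - alpha, df)
    else:                                     # H1: nu - mu > delta
        reject = T > t.ppf(1 - alpha, df)
    return T, reject
```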

For each of the tests above, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\delta\) is in the corresponding \(1 - \alpha\) level confidence interval.

  1. \( [M(\bs{Y}) - M(\bs{X})] - t_{m+n-2}(1 - \alpha / 2) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \le \delta \le [M(\bs{Y}) - M(\bs{X})] + t_{m+n-2}(1 - \alpha / 2) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \)
  2. \( \delta \le [M(\bs{Y}) - M(\bs{X})] + t_{m+n-2}(1 - \alpha) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \)
  3. \( \delta \ge [M(\bs{Y}) - M(\bs{X})] - t_{m+n-2}(1 - \alpha) S(\bs{X}, \bs{Y}) \sqrt{1 / m + 1 / n} \)
Details:

These results follow from the tests above. In each case, we start with the inequality that corresponds to not rejecting the null hypothesis and solve for \( \delta \).

Tests of the Ratio of the Variances

Next we will construct tests for the ratio of the distribution variances \(\tau^2 / \sigma^2\). So the basic assumption is that the variances \(\sigma^2\) and \(\tau^2\), and of course the means \(\mu\) and \(\nu\), are unknown.

For a conjectured \( \rho \in (0, \infty) \), define the test statistic \[ F = \frac{S^2(\bs{X})}{S^2(\bs{Y})} \rho \]

  1. If \( \tau^2 / \sigma^2 = \rho \) then \( F \) has the \(F\) distribution with \(m - 1\) degrees of freedom in the numerator and \(n - 1\) degrees of freedom in the denominator.
  2. If \( \tau^2 / \sigma^2 \ne \rho \) then \( F \) has a scaled \( F \) distribution with \( m - 1 \) degrees of freedom in the numerator, \( n - 1 \) degrees of freedom in the denominator, and scale factor \( \rho \frac{\sigma^2}{\tau^2} \).
Details:

Part (b) actually subsumes part (a) when \( \rho = \tau^2 / \sigma^2 \), so we will just prove (b). Note that \[ F = \left(\frac{S^2(\bs{X}) \big/ \sigma^2}{S^2(\bs{Y}) \big/ \tau^2}\right) \rho \frac{\sigma^2}{\tau^2} \] But \( (m - 1) S^2(\bs{X}) \big/ \sigma^2 \) has the chi-square distribution with \( m - 1 \) degrees of freedom, \( (n - 1) S^2(\bs{Y}) \big/ \tau^2 \) has the chi-square distribution with \( n - 1 \) degrees of freedom, and the variables are independent. Hence the ratio in parentheses, which is the ratio of the chi-square variables, each divided by its degrees of freedom, has the \( F \) distribution with \( m - 1 \) degrees of freedom in the numerator and \( n - 1 \) degrees of freedom in the denominator.

As usual, for \(j, \, k \gt 0\) and \(p \in (0, 1)\), let \(f_{j, k}(p)\) denote the quantile of order \(p\) for the \(F\) distribution with \(j\) degrees of freedom in the numerator and \(k\) degrees of freedom in the denominator. For selected values of the parameters, \(f_{j, k}(p)\) can be obtained from the quantile app or from most statistical software packages.

The following tests have significance level \( \alpha \):

  1. Reject \(H_0: \tau^2 / \sigma^2 = \rho\) versus \(H_1: \tau^2 / \sigma^2 \ne \rho\) if and only if \(F \gt f_{m-1, n-1}(1 - \alpha / 2)\) or \(F \lt f_{m-1, n-1}(\alpha / 2 )\).
  2. Reject \(H_0: \tau^2 / \sigma^2 \le \rho\) versus \(H_1: \tau^2 / \sigma^2 \gt \rho\) if and only if \(F \lt f_{m-1, n-1}(\alpha)\).
  3. Reject \(H_0: \tau^2 / \sigma^2 \ge \rho\) versus \(H_1: \tau^2 / \sigma^2 \lt \rho\) if and only if \(F \gt f_{m-1, n-1}(1 - \alpha)\).
Details:

The proof is the usual argument. In part (a), \( H_0 \) is a simple hypothesis, and under this hypothesis \( F \) has the \( F \) distribution with \( m - 1 \) degrees of freedom in the numerator and \( n - 1 \) degrees of freedom in the denominator. Thus, if \( H_0 \) is true then the probability of falsely rejecting \( H_0 \) is \( \alpha \) by definition of the quantiles. In parts (b) and (c), \( H_0 \) specifies a range of values of \( \tau^2 / \sigma^2 \), and under \( H_0 \), \( F \) has a scaled \( F \) distribution by the result above. But the largest type 1 error probability is \( \alpha \) and occurs when \( \tau^2 / \sigma^2 = \rho \).
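
A sketch of the two-sided variance-ratio test in Python, with quantiles from scipy.stats.f; the function name is ours:

```python
import numpy as np
from scipy.stats import f

def variance_ratio_test(x, y, rho=1.0, alpha=0.05):
    """Two-sided level-alpha test of H0: tau^2 / sigma^2 = rho."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    F = (x.var(ddof=1) / y.var(ddof=1)) * rho
    lo = f.ppf(alpha / 2, m - 1, n - 1)       # f_{m-1, n-1}(alpha / 2)
    hi = f.ppf(1 - alpha / 2, m - 1, n - 1)   # f_{m-1, n-1}(1 - alpha / 2)
    return F, (F < lo) or (F > hi)
```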

For each of the tests above, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\rho\) is in the corresponding \(1 - \alpha\) level confidence interval.

  1. \( \frac{S^2(\bs{Y})}{S^2(\bs{X})} f_{m-1,n-1}(\alpha / 2) \le \rho \le \frac{S^2(\bs{Y})}{S^2(\bs{X})} f_{m-1,n-1}(1 - \alpha / 2) \)
  2. \( \rho \ge \frac{S^2(\bs{Y})}{S^2(\bs{X})} f_{m-1,n-1}(\alpha) \)
  3. \( \rho \le \frac{S^2(\bs{Y})}{S^2(\bs{X})} f_{m-1,n-1}(1 - \alpha) \)
Details:

These results follow from the tests above. In each case, we start with the inequality that corresponds to not rejecting the null hypothesis and solve for \( \rho \).
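
The two-sided interval in part (a) translates directly; again a sketch with our own helper name:

```python
import numpy as np
from scipy.stats import f

def variance_ratio_ci(x, y, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for tau^2 / sigma^2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    ratio = y.var(ddof=1) / x.var(ddof=1)     # S^2(Y) / S^2(X)
    return (ratio * f.ppf(alpha / 2, m - 1, n - 1),
            ratio * f.ppf(1 - alpha / 2, m - 1, n - 1))
```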

Tests in the Bivariate Normal Model

In this subsection, we consider a model that is superficially similar to the two-sample normal model, but is actually much simpler. Suppose that \[ ((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)) \] is a random sample of size \(n\) from the bivariate normal distribution of \((X, Y)\) with \(\E(X) = \mu\), \(\E(Y) = \nu\), \(\var(X) = \sigma^2\), \(\var(Y) = \tau^2\), and \(\cov(X, Y) = \delta\).

Thus, instead of a pair of samples, we have a sample of pairs. The fundamental difference is that in this model, variables \( X \) and \( Y \) are measured on the same objects in a sample drawn from the population, while in the previous model, variables \( X \) and \( Y \) are measured on two distinct samples drawn from the population. The bivariate model arises, for example, in before and after experiments, in which a measurement of interest is recorded for a sample of \(n\) objects from the population, both before and after a treatment. For example, we could record the blood pressure of a sample of \(n\) patients, before and after the administration of a certain drug.

We will use our usual notation, given above, for the sample means and variances of \(\bs{X} = (X_1, X_2, \ldots, X_n)\) and \(\bs{Y} = (Y_1, Y_2, \ldots, Y_n)\). Recall also that the sample covariance of \( (\bs{X}, \bs{Y}) \) is \[ S(\bs{X}, \bs{Y}) = \frac{1}{n - 1} \sum_{i=1}^n [X_i - M(\bs{X})][Y_i - M(\bs{Y})] \] (not to be confused with the pooled estimate of the standard deviation defined earlier).

The sequence of differences \(\bs{Y} - \bs{X} = (Y_1 - X_1, Y_2 - X_2, \ldots, Y_n - X_n)\) is a random sample of size \(n\) from the distribution of \(Y - X\). The sampling distribution is normal with

  1. \(\E(Y - X) = \nu - \mu\)
  2. \(\var(Y - X) = \sigma^2 + \tau^2 - 2 \, \delta\)

The sample mean and variance of the sample of differences are

  1. \(M(\bs{Y} - \bs{X}) = M(\bs{Y}) - M(\bs{X})\)
  2. \(S^2(\bs{Y} - \bs{X}) = S^2(\bs{X}) + S^2(\bs{Y}) - 2 \, S(\bs{X}, \bs{Y})\)

The sample of differences \(\bs{Y} - \bs{X}\) fits the normal model for a single variable. The section on tests in the normal model could be used to perform tests for the distribution mean \(\nu - \mu\) and the distribution variance \(\sigma^2 + \tau^2 - 2 \delta\).
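
Concretely, the reduction to the one-variable normal model amounts to testing the sample of differences; here is a sketch of the two-sided test of \( H_0: \nu - \mu = 0 \) (our own helper; scipy.stats.ttest_rel(y, x) should produce the same two-sided statistic):

```python
import numpy as np
from scipy.stats import t

def paired_t_test(x, y, alpha=0.05):
    """Two-sided level-alpha t test of H0: nu - mu = 0 from paired data."""
    d = np.asarray(y, float) - np.asarray(x, float)  # sample of differences
    n = len(d)
    T = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return T, abs(T) > t.ppf(1 - alpha / 2, n - 1)
```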

Computational Exercises

A new drug is being developed to reduce a certain blood chemical. A sample of 36 patients is given a placebo, while a sample of 49 patients is given the drug. The statistics (in mg) are \(m_1 = 87\), \(s_1 = 4\), \(m_2 = 63\), \(s_2 = 6\). Test the following at the 10% significance level:

  1. \(H_0: \sigma_1 = \sigma_2\) versus \(H_1: \sigma_1 \ne \sigma_2\).
  2. \(H_0: \mu_1 \le \mu_2\) versus \(H_1: \mu_1 \gt \mu_2\) (assuming that \(\sigma_1 = \sigma_2\)).
  3. Based on (b), is the drug effective?
Details:
  1. Test statistic 0.44, critical values 0.585, 1.667. Reject \(H_0\).
  2. Test statistic 20.8, critical value 1.292. Reject \(H_0\).
  3. Probably, since the test in (b) gives strong evidence that the mean level is lower for patients receiving the drug.
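
As a check, the arithmetic in the answers can be reproduced from the summary statistics alone, using the scipy quantile functions (the variable names are ours):

```python
from scipy.stats import f, t

m, n = 36, 49                                  # placebo and drug sample sizes
m1, s1, m2, s2 = 87.0, 4.0, 63.0, 6.0
alpha = 0.10

# (a) two-sided F test of sigma_1 = sigma_2
F = s1**2 / s2**2                              # about 0.44
print(F, f.ppf(alpha / 2, m - 1, n - 1), f.ppf(1 - alpha / 2, m - 1, n - 1))

# (b) one-sided pooled t test of mu_1 <= mu_2
sp2 = ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)
T = (m1 - m2) / (sp2**0.5 * (1 / m + 1 / n)**0.5)  # about 20.8
print(T, t.ppf(1 - alpha, m + n - 2))              # critical value about 1.29
```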

A company claims that an herbal supplement improves intelligence. A sample of 25 persons are given a standard IQ test before and after taking the supplement. The before and after statistics are \(m_1 = 105\), \(s_1 = 13\), \(m_2 = 110\), \(s_2 = 17\), \(s_{1, \, 2} = 190\). At the 10% significance level, do you believe the company's claim?

Details:

Test statistic 2.8, critical value 1.3184. Reject \(H_0\).
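
Again the arithmetic can be checked from the summary statistics, using the identity \( S^2(\bs{Y} - \bs{X}) = S^2(\bs{X}) + S^2(\bs{Y}) - 2 S(\bs{X}, \bs{Y}) \) from the bivariate model (variable names are ours):

```python
from scipy.stats import t

n = 25
m1, s1, m2, s2, s12 = 105.0, 13.0, 110.0, 17.0, 190.0
alpha = 0.10

sd2 = s1**2 + s2**2 - 2 * s12                  # variance of the differences: 78
T = (m2 - m1) / (sd2**0.5 / n**0.5)            # about 2.83
print(T, t.ppf(1 - alpha, n - 1))              # critical value about 1.32
```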

In Fisher's iris data, consider the petal length variable for the samples of Versicolor and Virginica irises. Test the following at the 10% significance level:

  1. \(H_0: \sigma_1 = \sigma_2\) versus \(H_1: \sigma_1 \ne \sigma_2\).
  2. \(H_0: \mu_1 \le \mu_2\) versus \(H_1: \mu_1 \gt \mu_2\) (assuming that \(\sigma_1 = \sigma_2\)).
Details:
  1. Test statistic 1.1, critical values 0.6227, 1.6072. Fail to reject \(H_0\).
  2. Test statistic \(-11.4\), critical value \(-1.6602\). Reject \(H_0\).

A plant has two machines that produce a circular rod whose diameter (in cm) is critical. A sample of 100 rods from the first machine has mean 10.3 and standard deviation 1.2. A sample of 100 rods from the second machine has mean 9.8 and standard deviation 1.6. Test the following hypotheses at the 10% level.

  1. \(H_0: \sigma_1 = \sigma_2\) versus \(H_1: \sigma_1 \ne \sigma_2\).
  2. \(H_0: \mu_1 = \mu_2\) versus \(H_1: \mu_1 \ne \mu_2\) (assuming that \(\sigma_1 = \sigma_2\)).
Details:
  1. Test statistic 0.56, critical values 0.7175, 1.3942. Reject \(H_0\).
  2. Test statistic \(-2.5\), critical values \(\pm 1.645\). Reject \(H_0\).