The Dice Goodness of Fit Experiment

Description

The experiment is to select a random sample of size \(n\) from a specified distribution (the true distribution) and perform the chi-square goodness of fit test against another specified distribution (the test distribution), at a specified level of significance. Both distributions are discrete distributions on \(\{1, 2, 3, 4, 5, 6\}\), so the experiment corresponds to rolling a standard die that has the true distribution \(n\) times.

The true distribution and test distribution can be specified with the list boxes. The following distributions can be chosen:

fair: each face has probability \(1 / 6\).
1-6 flat: faces 1 and 6 have probability \(1 / 4\) each; faces 2, 3, 4, and 5 have probability \(1 / 8\) each.
2-5 flat: faces 2 and 5 have probability \(1 / 4\) each; faces 1, 3, 4, and 6 have probability \(1 / 8\) each.
3-4 flat: faces 3 and 4 have probability \(1 / 4\) each; faces 1, 2, 5, and 6 have probability \(1 / 8\) each.
skewed left: face \(i\) has probability \(i / 21\) for \(i \in \{1, 2, 3, 4, 5, 6\}\).
skewed right: face \(i\) has probability \((7 - i) / 21\) for \(i \in \{1, 2, 3, 4, 5, 6\}\).
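The six choices can be written down as probability vectors and sampled from. The following is a minimal sketch in Python, not the applet's own code: the names `FACES`, `DISTRIBUTIONS`, `flat`, and `roll` are illustrative assumptions, and the key "skewed right" is used here for the distribution with probabilities \((7 - i) / 21\). Exact fractions make it easy to confirm that each vector sums to 1.

```python
from fractions import Fraction
import random

FACES = [1, 2, 3, 4, 5, 6]

def flat(high_faces):
    """Probability 1/4 on the two named faces, 1/8 on the other four."""
    return [Fraction(1, 4) if i in high_faces else Fraction(1, 8) for i in FACES]

# Hypothetical lookup table keyed by the names used in the list above.
DISTRIBUTIONS = {
    "fair": [Fraction(1, 6)] * 6,
    "1-6 flat": flat({1, 6}),
    "2-5 flat": flat({2, 5}),
    "3-4 flat": flat({3, 4}),
    "skewed left": [Fraction(i, 21) for i in FACES],
    "skewed right": [Fraction(7 - i, 21) for i in FACES],
}

def roll(name, n, rng=random):
    """Roll a die with the named distribution n times; returns a list of faces."""
    weights = [float(p) for p in DISTRIBUTIONS[name]]
    return rng.choices(FACES, weights=weights, k=n)

# Every distribution must be a valid probability vector.
for name, probs in DISTRIBUTIONS.items():
    assert sum(probs) == 1, name
```

For example, `roll("1-6 flat", 20, random.Random(1))` returns a reproducible list of 20 faces drawn from the 1-6 flat distribution.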

The probability density function of the true distribution is shown in the first graph; the probability density function of the test distribution is shown in the second graph. The significance level and the sample size can be varied with input controls.

The test statistic is \[V = \sum_{i = 1}^6 \frac{(O_i - E_i)^2}{E_i}\] where \(O_i\) is the observed frequency of face \(i\) and \(E_i\) is the expected frequency of face \(i\) under the test distribution, that is, \(n\) times the test probability of face \(i\). If the null hypothesis is true and \(n\) is sufficiently large, then \(V\) has (approximately) the chi-square distribution with 5 degrees of freedom. The probability density function of \(V\) and the critical value are shown in the third graph.
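As an illustration of the statistic, here is a small Python sketch that computes \(V\) from a sample and an approximate \(P\)-value from the chi-square distribution with 5 degrees of freedom. The function names are assumptions made for this example. The tail probability uses the standard closed form for an odd number of degrees of freedom, which is not part of the original description but avoids any dependency beyond the standard library.

```python
import math
from collections import Counter

def chi_square_statistic(sample, test_probs):
    """V = sum over faces of (O_i - E_i)^2 / E_i, with E_i = n * p_i."""
    n = len(sample)
    counts = Counter(sample)
    v = 0.0
    for face, p in zip(range(1, 7), test_probs):
        expected = n * p
        observed = counts[face]  # Counter returns 0 for faces never rolled
        v += (observed - expected) ** 2 / expected
    return v

def chi2_sf_5df(x):
    """P(V > x) when V has the chi-square distribution with 5 degrees of freedom."""
    if x <= 0:
        return 1.0
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3)

# Worked example: 60 rolls with observed counts 8, 12, 10, 9, 11, 10,
# tested against the fair die (expected count 10 for every face).
observed_counts = [8, 12, 10, 9, 11, 10]
sample = [face for face, c in zip(range(1, 7), observed_counts) for _ in range(c)]
v = chi_square_statistic(sample, [1 / 6] * 6)  # (4 + 4 + 0 + 1 + 1 + 0) / 10 = 1.0
p_value = chi2_sf_5df(v)                       # roughly 0.96, so no evidence against fairness
```

At significance level 0.05 the critical value is about 11.07, which is the point where `chi2_sf_5df` returns 0.05.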

On each run, the sample density function is shown in the first two graphs. This empirical density may not look identical in the two graphs because of different scales on the vertical axis. The value of the test statistic \(V\) is shown in the third graph. Note that the null hypothesis is rejected if and only if \(V\) exceeds the critical value. The random variable \(R\) indicates the event that the null hypothesis is rejected. On each update, the empirical density of \(R\) is shown in the fourth graph and is recorded in the distribution table. The values of \(V\) and \(R\) and the \( P \)-value are recorded in the data table on each run. The goodness of fit table shows the expected frequencies, and on each run, the observed frequencies.
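The repeated runs can be imitated in a few lines of Python. This is a rough sketch under stated assumptions, not a reproduction of the applet: `run_once` and `rejection_rate` are hypothetical names, rejection is decided by comparing the \(P\)-value with the significance level (equivalent to comparing \(V\) with the critical value), and the \(P\)-value uses the closed-form tail of the chi-square distribution with 5 degrees of freedom.

```python
import math
import random
from collections import Counter

def p_value_5df(v):
    """P(V > v) for the chi-square distribution with 5 degrees of freedom."""
    if v <= 0:
        return 1.0
    return (math.erfc(math.sqrt(v / 2))
            + math.sqrt(2 * v / math.pi) * math.exp(-v / 2) * (1 + v / 3))

def run_once(true_probs, test_probs, n, alpha, rng):
    """One run: roll n times from true_probs, test against test_probs."""
    sample = rng.choices(range(1, 7), weights=true_probs, k=n)
    counts = Counter(sample)
    v = sum((counts[i + 1] - n * p) ** 2 / (n * p)
            for i, p in enumerate(test_probs))
    p = p_value_5df(v)
    r = 1 if p < alpha else 0  # R = 1 means the null hypothesis is rejected
    return v, p, r

def rejection_rate(true_probs, test_probs, n, alpha, runs, seed=0):
    """Proportion of runs with R = 1, i.e. the empirical density of R at 1."""
    rng = random.Random(seed)
    rejections = sum(run_once(true_probs, test_probs, n, alpha, rng)[2]
                     for _ in range(runs))
    return rejections / runs

fair = [1 / 6] * 6
skewed_left = [i / 21 for i in range(1, 7)]

# True = test distribution: the rate should be close to the significance level.
rate_null = rejection_rate(fair, fair, n=60, alpha=0.05, runs=2000)
# True distribution far from the test distribution: the rate estimates the power.
rate_alt = rejection_rate(skewed_left, fair, n=200, alpha=0.05, runs=500)
```

When the true and test distributions agree, the proportion of rejections estimates the probability of a type 1 error; when they differ, it estimates the power of the test at that sample size.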