| Decision \ State of nature | H0 is true | H0 is false |
| --- | --- | --- |
| Reject | Type I error | No error |
| Do not reject | No error | Type II error |

Ideally, we would like to minimize both type I and type II errors. But unfortunately, for any given sample size, it is not possible to minimize both errors simultaneously. The classical approach to this problem, embodied in the work of Neyman and Pearson, is to assume that a type I error is likely to be more serious in practice than a type II error. Therefore, one should try to keep the probability of committing a type I error at a fairly low level, such as 0.01 or 0.05, and then try to minimize the probability of committing a type II error as much as possible.

In the literature, the probability of a type I error is designated as α and is called the level of significance, and the probability of a type II error is designated as β. The probability of not committing a type II error is called the power of the test. Put differently, the power of a test is its ability to reject a false null hypothesis. The classical approach to hypothesis testing is to fix α at levels such as 0.01 (or 1 percent) or 0.05 (5 percent) and then try to maximize the power of the test; that is, to minimize β.

It is important that the reader understand the concept of the power of a test, which is best explained with an example.⁸

Let X ∼ N(μ, 100); that is, X is normally distributed with mean μ and variance 100. Assume that α = 0.05. Suppose we have a sample of 25 observations, which gives a sample mean value of X̄. Suppose further we entertain the hypothesis H0: μ = 50. Since X is normally distributed, we know that the sample mean is also normally distributed: X̄ ∼ N(μ, 100/25). Hence, under the stated null hypothesis that μ = 50, the 95% confidence interval for X̄ is μ ± 1.96(√(100/25)) = μ ± 3.92, that is, (46.08 to 53.92). Therefore, the critical region consists of all values of X̄ less than 46.08 or greater than 53.92. That is, we will reject the null hypothesis that the true mean is 50 if a sample mean value is found below 46.08 or above 53.92.
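The interval arithmetic above can be sketched in Python using only the standard library (the values 50, 10, 25, and the critical value 1.96 all come from the text):

```python
import math

# Values from the text: H0: mu = 50, variance = 100 (so sigma = 10),
# n = 25, and the two-tailed 5% critical value 1.96.
mu0 = 50.0
sigma = 10.0
n = 25
se = sigma / math.sqrt(n)      # standard error of the sample mean = 2

lower = mu0 - 1.96 * se        # about 46.08
upper = mu0 + 1.96 * se        # about 53.92
print(lower, upper)            # reject H0 if the sample mean falls outside
```

Any sample mean below `lower` or above `upper` lands in the critical region and leads to rejection of H0.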

⁸The following discussion and the figures are based on Helen M. Walker and Joseph Lev, Statistical Inference, Holt, Rinehart and Winston, New York, 1953, pp. 161–162.

APPENDIX A: A REVIEW OF SOME STATISTICAL CONCEPTS 909

But what is the probability that X̄ will lie in the preceding critical region(s) if the true μ has a value different from 50? Suppose there are three alternative hypotheses: μ = 48, μ = 52, and μ = 56. If any of these alternatives is true, it will be the actual mean of the distribution of X̄. The standard error is unchanged for the three alternatives since σ² is still assumed to be 100.

The shaded areas in Figure A.13 show the probabilities that X̄ will fall in the critical region if each of the alternative hypotheses is true. As you can check, these probabilities are 0.17 (for μ = 48), 0.05 (for μ = 50), 0.17 (for μ = 52), and 0.85 (for μ = 56). As you can see from this figure, whenever the true value of μ differs substantially from the hypothesized value (here, μ = 50), the probability of rejecting the hypothesis is high; but when the true value is close to the value given under the null hypothesis, the probability of rejection is small. Intuitively, this should make sense: when the null and alternative hypotheses are very closely bunched, it is hard to tell them apart.
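The four rejection probabilities can be reproduced with a short standard-library Python sketch (the critical limits 46.08 and 53.92 and the standard error of 2 are taken from the text):

```python
import math

def norm_cdf(z):
    # Standard normal CDF via the error function (standard library only)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

se = 10.0 / math.sqrt(25)          # standard error of X_bar = 2
lower, upper = 46.08, 53.92        # critical region from the text

# Probability of rejecting H0: mu = 50 when the true mean is mu
power = {}
for mu in (48, 50, 52, 56):
    power[mu] = norm_cdf((lower - mu) / se) + (1.0 - norm_cdf((upper - mu) / se))

for mu, p in power.items():
    print(mu, round(p, 2))         # 0.17, 0.05, 0.17, 0.85 as in Figure A.13
```

Plotting these probabilities against μ traces out the power curve of Figure A.14.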

This can be seen further if you consider Figure A.14, which is called the power function graph, and the curve shown there is called the power curve.

The reader will by now realize that the confidence coefficient (1 − α) discussed earlier is simply one minus the probability of committing a type I error. Thus a 95 percent confidence coefficient means that we are prepared to accept at the most a 5 percent probability of committing a type I error; we do not want to reject the true hypothesis more than 5 out of 100 times.

FIGURE A.13 Distribution of X̄ when N = 25, σ = 10, and μ = 48, 50, 52, or 56. Under H0: μ = 50, the critical region with α = 0.05 is X̄ < 46.1 and X̄ > 53.9. The shaded area indicates the probability that X̄ will fall into the critical region. This probability is 0.17 if μ = 48, 0.05 if μ = 50, 0.17 if μ = 52, and 0.85 if μ = 56.

FIGURE A.14 Power function of the test of the hypothesis μ = 50 when N = 25, σ = 10, and α = 0.05. (Horizontal axis: scale of μ, from 40 to 60; vertical axis: probability of rejecting H0.)

The p Value, or Exact Level of Significance. Instead of preselecting α at arbitrary levels, such as 1, 5, or 10 percent, one can obtain the p (probability) value, or exact level of significance, of a test statistic. The p value is defined as the lowest significance level at which a null hypothesis can be rejected.

Suppose that in an application involving 20 df we obtain a t value of 3.552. Now the p value, or the exact probability, of obtaining a t value of 3.552 or greater can be seen from Table D.2 as 0.001 (one-tailed) or 0.002 (two-tailed). We can say that the observed t value of 3.552 is statistically significant at the 0.001 or 0.002 level, depending on whether we are using a one-tail or two-tail test.
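As a sketch of where such a p value comes from, the tail area of the t distribution can be computed numerically with only the Python standard library (the density formula is the standard Student's t density; the integration cutoff and step count are arbitrary choices for this illustration):

```python
import math

def t_pdf(x, df):
    # Density of Student's t distribution with df degrees of freedom
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_sf(t, df, hi=60.0, steps=50000):
    # One-tailed p value P(T >= t), by trapezoidal integration of the
    # density from t to hi; the tail beyond hi is negligible here
    h = (hi - t) / steps
    total = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        total += t_pdf(t + i * h, df)
    return total * h

p_one = t_sf(3.552, 20)    # one-tailed p value, about 0.001
p_two = 2 * p_one          # two-tailed p value, about 0.002
print(round(p_one, 3), round(p_two, 3))
```

In practice a statistical package computes this tail area directly, which is exactly the p value it reports.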

Several statistical packages now routinely print out the p value of the estimated test statistics. Therefore, the reader is advised to give the p value wherever possible.

The Test of Significance Approach

Recall that

Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1)

In any given application, X̄ and n are known (or can be estimated), but the true μ and σ are not known. But if σ is specified and we assume (under H0) that μ = μ*, a specific numerical value, then Z can be directly computed, and we can easily look at the normal distribution table to find the probability of obtaining the computed Z value. If this probability is small, say, less than 5 percent or 1 percent, we can reject the null hypothesis, for if the


hypothesis were true, the chances of obtaining the particular Z value would be very small. This is the general idea behind the test of significance approach to hypothesis testing. The key idea here is the test statistic (here the Z statistic) and its probability distribution under the assumed value μ = μ*. Appropriately, in the present case, the test is known as the Z test, since we use the Z (standardized normal) value.
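A minimal Python sketch of this Z-test logic follows (the function is illustrative; the sample values X̄ = 67, μ* = 69, σ = 2.5, and n = 100 are assumptions here, matching the example the text returns to):

```python
import math

def norm_cdf(z):
    # Standard normal CDF via the error function (standard library only)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_statistic(x_bar, mu0, sigma, n):
    # Z = (x_bar - mu0) / (sigma / sqrt(n))
    return (x_bar - mu0) / (sigma / math.sqrt(n))

# Assumed illustrative values: x_bar = 67, mu0 = 69, sigma = 2.5, n = 100
z = z_statistic(67.0, 69.0, 2.5, 100)
p_two = 2.0 * norm_cdf(-abs(z))    # two-tailed probability of such a Z
print(z)                           # -8.0
```

The two-tailed probability of a Z value this extreme is effectively zero, which is why the null hypothesis is rejected below.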

Returning to our example, if μ = μ* = 69, the Z statistic becomes

Z = (X̄ − μ*)/(σ/√n) = (67 − 69)/(2.5/√100) = −8

If we look at the normal distribution table D.1, we see that the probability of obtaining such a Z value is extremely small. (Note: The probability of a Z value exceeding 3 or −3 is about 0.001. Therefore, the probability of Z exceeding 8 in absolute value is even smaller.) Therefore, we can reject the null hypothesis that μ = 69; given this value, our chance of obtaining an X̄ of 67 is extremely small. We therefore doubt that our sample came from a population with a mean value of 69. Diagrammatically, the situation is depicted in Figure A.15.

In the language of tests of significance, when we say that a test (statistic) is significant, we generally mean that we can reject the null hypothesis. And the test statistic is regarded as significant if the probability of our obtaining it is equal to or less than α, the probability of committing a type I error. Thus if α = 0.05, we know that the probability of obtaining a Z value of −1.96 or smaller, or of 1.96 or larger, is 5 percent (2.5 percent in each tail of the standardized normal distribution). In our illustrative example Z was −8. Hence the probability of obtaining such a Z value is much smaller than 2.5 percent, well below our prespecified probability of committing a type I error. That is why the computed value of Z = −8 is statistically significant; that is, we reject the null

FIGURE A.15 The distribution of the Z statistic.



hypothesis that the true μ is 69. Of course, we reached the same conclusion using the confidence interval approach to hypothesis testing.

We now summarize the steps involved in testing a statistical hypothesis:

Step 1. State the null hypothesis H0 and the alternative hypothesis H1 (e.g., H0: μ = 69 and H1: μ ≠ 69).

Step 2. Select the test statistic (e.g., the sample mean X̄).

Step 3. Determine the probability distribution of the test statistic (e.g., X̄ ∼ N(μ, σ²/n)).

Step 4. Choose the level of significance α (i.e., the probability of committing a type I error).

Step 5. Using the probability distribution of the test statistic, establish a 100(1 − α)% confidence interval. If the value of the parameter under the null hypothesis (e.g., μ = μ* = 69) lies in this confidence region, the region of acceptance, do not reject the null hypothesis. But if it falls outside this interval (i.e., it falls into the region of rejection), you may reject the null hypothesis. Keep in mind that in not rejecting or rejecting a null hypothesis you are taking a chance of being wrong α percent of the time.
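The five steps can be tied together in a small Python sketch of the confidence-interval variant (illustrative only; the sample values X̄ = 67, σ = 2.5, and n = 100 are assumptions matching the earlier example):

```python
import math

def ci_test(x_bar, mu0, sigma, n, z_crit=1.96):
    # Steps 1-5: build the 100(1 - alpha)% confidence interval around the
    # sample mean and check whether the hypothesized mean mu0 lies inside it.
    se = sigma / math.sqrt(n)
    lower, upper = x_bar - z_crit * se, x_bar + z_crit * se
    reject = not (lower <= mu0 <= upper)
    return (lower, upper), reject

# Assumed sample values: x_bar = 67, sigma = 2.5, n = 100; H0: mu = 69
interval, reject = ci_test(67.0, 69.0, 2.5, 100)
print(interval, reject)   # 69 lies outside (66.51, 67.49), so H0 is rejected
```

Changing `z_crit` (e.g., to 2.576 for α = 0.01) is how the chosen significance level enters Step 4.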