## Testing The Overall Significance Of The Sample Regression

Throughout the previous section we were concerned with testing the significance of the estimated partial regression coefficients individually, that is, under the separate hypothesis that each true population partial regression coefficient was zero. But now consider the following hypothesis:

H₀: β₂ = β₃ = 0    (8.5.1)

This null hypothesis is a joint hypothesis that β₂ and β₃ are jointly or simultaneously equal to zero. A test of such a hypothesis is called a test of the overall significance of the observed or estimated regression line, that is, a test of whether Y is linearly related to both X₂ and X₃.

Can the joint hypothesis in (8.5.1) be tested by testing the significance of β̂₂ and β̂₃ individually as in Section 8.4? The answer is no, and the reasoning is as follows.

3For our example, the skewness value is 0.2276 and the kurtosis value is 2.9488. Recall that for a normally distributed variable the skewness and kurtosis values are, respectively, 0 and 3.
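The normality check mentioned in footnote 3 can be sketched in Python. The residual series below is simulated standard-normal data, a hypothetical stand-in for actual regression residuals, so the sample statistics only approximate the theoretical values of 0 and 3:

```python
import random

random.seed(1)
# Hypothetical stand-in for regression residuals: draws from N(0, 1)
resid = [random.gauss(0.0, 1.0) for _ in range(10_000)]

n = len(resid)
mean = sum(resid) / n
m2 = sum((e - mean) ** 2 for e in resid) / n  # second central moment
m3 = sum((e - mean) ** 3 for e in resid) / n  # third central moment
m4 = sum((e - mean) ** 4 for e in resid) / n  # fourth central moment

skewness = m3 / m2 ** 1.5  # 0 for a normally distributed variable
kurtosis = m4 / m2 ** 2    # 3 for a normal variable (this is not "excess" kurtosis)
```

With truly normal data the two statistics should land close to 0 and 3, much as the values 0.2276 and 2.9488 quoted in the footnote do.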


In testing the individual significance of an observed partial regression coefficient in Section 8.4, we assumed implicitly that each test of significance was based on a different (i.e., independent) sample. Thus, in testing the significance of β̂₂ under the hypothesis that β₂ = 0, it was assumed tacitly that the testing was based on a different sample from the one used in testing the significance of β̂₃ under the null hypothesis that β₃ = 0. But to test the joint hypothesis of (8.5.1), if we use the same sample data, we shall be violating the assumption underlying the test procedure.4 The matter can be put differently: In (8.4.2) we established a 95% confidence interval for β₂. But if we use the same sample data to establish a confidence interval for β₃, say, with a confidence coefficient of 95%, we cannot assert that both β₂ and β₃ lie in their respective confidence intervals with a probability of (1 − α)(1 − α) = (0.95)(0.95) = 0.9025.

In other words, although the statements

Pr[β̂₂ − t_{α/2} se(β̂₂) < β₂ < β̂₂ + t_{α/2} se(β̂₂)] = 1 − α

Pr[β̂₃ − t_{α/2} se(β̂₃) < β₃ < β̂₃ + t_{α/2} se(β̂₃)] = 1 − α

are individually true, it is not true that the probability that the intervals simultaneously include β₂ and β₃ is (1 − α)², because the intervals may not be independent when the same data are used to derive them. To state the matter differently,

. . . testing a series of single [individual] hypotheses is not equivalent to testing those same hypotheses jointly. The intuitive reason for this is that in a joint test of several hypotheses any single hypothesis is "affected" by the information in the other hypotheses.5

The upshot of the preceding argument is that for a given example (sample) only one confidence interval or only one test of significance can be obtained. How, then, does one test the simultaneous null hypothesis that β₂ = β₃ = 0? The answer follows.
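The coverage argument above can be illustrated with a small Monte Carlo sketch in Python. The setup is purely hypothetical: the two coefficient estimators are drawn as jointly normal with an assumed correlation of 0.9, both true coefficients are taken to be zero, and 1.96 (the two-sided 95% normal critical value) stands in for t_{α/2}:

```python
import random, math

random.seed(42)
rho = 0.9       # hypothetical correlation between the two estimators
z = 1.96        # two-sided 95% normal critical value (stand-in for t)
trials = 100_000
b2_in = b3_in = both = 0

for _ in range(trials):
    u = random.gauss(0.0, 1.0)
    v = random.gauss(0.0, 1.0)
    b2 = u                                      # estimator of beta2 (true value 0)
    b3 = rho * u + math.sqrt(1 - rho**2) * v    # correlated estimator of beta3
    in2 = abs(b2) < z   # does the interval around b2 cover beta2 = 0?
    in3 = abs(b3) < z   # does the interval around b3 cover beta3 = 0?
    b2_in += in2
    b3_in += in3
    both += in2 and in3

print(b2_in / trials)  # each marginal coverage is close to 0.95
print(b3_in / trials)
print(both / trials)   # joint coverage differs from 0.95 * 0.95 = 0.9025
```

Each interval covers its own coefficient about 95% of the time, but because the two estimators are correlated the joint coverage is noticeably above the 0.9025 that independence would imply, which is exactly why the product rule fails.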

### The Analysis of Variance Approach to Testing the Overall Significance of an Observed Multiple Regression: The F Test

For reasons just explained, we cannot use the usual t test to test the joint hypothesis that the true partial slope coefficients are zero simultaneously. However, this joint hypothesis can be tested by the analysis of variance (ANOVA) technique first introduced in Section 5.9, which can be demonstrated as follows.
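As a concrete preview, here is a minimal pure-Python sketch of the overall F test, F = [ESS/(k − 1)] / [RSS/(n − k)], on simulated data. All numbers are hypothetical: the true slopes are set to 0.8 and 0.5, so the joint null β₂ = β₃ = 0 should be rejected:

```python
import random

random.seed(7)
n = 40

# Hypothetical data in which Y depends linearly on both X2 and X3
X2 = [random.gauss(0.0, 1.0) for _ in range(n)]
X3 = [random.gauss(0.0, 1.0) for _ in range(n)]
Y = [1.0 + 0.8 * x2 + 0.5 * x3 + random.gauss(0.0, 1.0)
     for x2, x3 in zip(X2, X3)]

def ols(rows, y):
    """Solve the normal equations (X'X)b = X'y by Gaussian elimination."""
    k = len(rows[0])
    A = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    b = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv], b[c], b[piv] = A[piv], A[c], b[piv], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for r in reversed(range(k)):
        beta[r] = (b[r] - sum(A[r][j] * beta[j] for j in range(r + 1, k))) / A[r][r]
    return beta

X = [[1.0, x2, x3] for x2, x3 in zip(X2, X3)]
beta = ols(X, Y)
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
rss = sum((yi - fi) ** 2 for yi, fi in zip(Y, fitted))  # residual sum of squares
ybar = sum(Y) / n
tss = sum((yi - ybar) ** 2 for yi in Y)                 # total sum of squares
ess = tss - rss                                         # explained sum of squares

k = 3  # number of parameters, including the intercept
F = (ess / (k - 1)) / (rss / (n - k))
```

Since the true slopes are nonzero, the computed F should comfortably exceed the 5% critical value of F(2, 37), roughly 3.25, leading to rejection of the joint null; the ANOVA decomposition behind this statistic is developed next.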

4In any given sample the cov(β̂₂, β̂₃) may not be zero; that is, β̂₂ and β̂₃ may be correlated. See (7.4.17).

5Thomas B. Fomby, R. Carter Hill, and Stanley R. Johnson, Advanced Econometric Methods, Springer-Verlag, New York, 1984, p. 37.
