## Info

¹⁸ For proof, see K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering, John Wiley & Sons, New York, 1960, pp. 278-280.

Gujarati: Basic Econometrics, Fourth Edition | I. Single-Equation Regression Models | 5. Two-Variable Regression: Interval Estimation and Hypothesis Testing | © The McGraw-Hill Companies, 2004


purposes, our sample could not have come from a population with zero $\beta_2$ value and we can conclude with great confidence that X, income, does affect Y, consumption expenditure.

Refer to Theorem 5.7 of Appendix 5A.1, which states that the square of the t value with k df is an F value with 1 df in the numerator and k df in the denominator. For our consumption-income example, if we assume $H_0\colon \beta_2 = 0$, then from (5.3.2) it can be easily verified that the estimated t value is 14.26. This t value has 8 df. Under the same null hypothesis, the F value was 202.87 with 1 and 8 df. Hence $(14.26)^2 \approx 203.35$, which equals the F value except for rounding errors.
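The identity between the squared t value and the F value can be checked numerically. The following is a minimal sketch, not the book's own computation: the function name `ols_t_and_f` and the six data points are hypothetical and chosen only for illustration. It fits the two-variable model by ordinary least squares and computes the t statistic for the slope under the null hypothesis of a zero slope, together with the F statistic from the analysis of variance.

```python
from math import sqrt


def ols_t_and_f(x, y):
    """Return (t, F) for the slope of a two-variable OLS regression.

    t tests the null hypothesis that the slope is zero (n - 2 df);
    F is the ANOVA ratio with 1 and n - 2 df.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross products in deviation form
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx                # slope estimate
    b1 = y_bar - b2 * x_bar       # intercept estimate
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    ess = b2 ** 2 * sxx           # explained sum of squares
    sigma2_hat = rss / (n - 2)    # unbiased estimator of the error variance
    t = b2 / sqrt(sigma2_hat / sxx)
    f = (ess / 1) / sigma2_hat
    return t, f


# Hypothetical data, for illustration only (not the data of Table 3.2)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

t, f = ols_t_and_f(x, y)
print(f"t^2 = {t ** 2:.6f}, F = {f:.6f}")

# The reported values from the text differ only by rounding
print(f"14.26^2 = {14.26 ** 2:.4f} versus reported F = 202.87")
```

In this sketch the two statistics agree up to floating-point error, whereas the values reported in the text differ slightly because the t value was rounded before squaring.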

Thus, the t and the F tests provide us with two alternative but complementary ways of testing the null hypothesis that $\beta_2 = 0$. If this is the case, why not just rely on the t test and not worry about the F test and the accompanying analysis of variance? For the two-variable model there really is no need to resort to the F test. But when we consider the topic of multiple regression we will see that the F test has several interesting applications that make it a very useful and powerful method of testing statistical hypotheses.

## 5.10 Application of Regression Analysis: The Problem of Prediction

On the basis of the sample data of Table 3.2 we obtained the following sample regression:

$$\hat{Y}_i = 24.4545 + 0.5091 X_i \tag{3.6.2}$$

where $\hat{Y}_i$ is the estimator of the true $E(Y_i)$ corresponding to a given $X$. What use can be made of this historical regression? One use is to "predict" or "forecast" the future consumption expenditure Y corresponding to some given level of income X. Now there are two kinds of predictions: (1) prediction of the conditional mean value of Y corresponding to a chosen X, say, $X_0$, that is, the point on the population regression line itself (see Figure 2.2), and (2) prediction of an individual Y value corresponding to $X_0$. We shall call these two predictions the mean prediction and the individual prediction, respectively.

### Mean Prediction¹⁹

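To fix the ideas, assume that $X_0 = 100$ and we want to predict $E(Y \mid X_0 = 100)$. Now it can be shown that the historical regression (3.6.2) provides the point estimate of this mean prediction as follows:

$$\hat{Y}_0 = \hat{\beta}_1 + \hat{\beta}_2 X_0 = 24.4545 + 0.5091(100) = 75.3645 \tag{5.10.1}$$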

¹⁹ For the proofs of the various statements made, see App. 5A, Sec. 5A.4.



where $\hat{Y}_0$ = estimator of $E(Y \mid X_0)$. It can be proved that this point predictor is a best linear unbiased estimator (BLUE).

Since $\hat{Y}_0$ is an estimator, it is likely to be different from its true value. The difference between the two values will give some idea about the prediction or forecast error. To assess this error, we need to find out the sampling distribution of $\hat{Y}_0$. It is shown in Appendix 5A, Section 5A.4, that $\hat{Y}_0$ in Eq. (5.10.1) is normally distributed with mean $(\beta_1 + \beta_2 X_0)$ and variance given by the following formula:

$$\operatorname{var}(\hat{Y}_0) = \sigma^2\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2}\right] \tag{5.10.2}$$

where $x_i = X_i - \bar{X}$.

By replacing the unknown $\sigma^2$ by its unbiased estimator $\hat{\sigma}^2$, we see that the variable

$$t = \frac{\hat{Y}_0 - (\beta_1 + \beta_2 X_0)}{\operatorname{se}(\hat{Y}_0)} \tag{5.10.3}$$

follows the t distribution with $n - 2$ df. The t distribution can therefore be used to derive confidence intervals for the true $E(Y_0 \mid X_0)$ and test hypotheses about it in the usual manner, namely,

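$$\Pr\left[\hat{\beta}_1 + \hat{\beta}_2 X_0 - t_{\alpha/2}\operatorname{se}(\hat{Y}_0) \le \beta_1 + \beta_2 X_0 \le \hat{\beta}_1 + \hat{\beta}_2 X_0 + t_{\alpha/2}\operatorname{se}(\hat{Y}_0)\right] = 1 - \alpha$$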
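The interval above can be sketched in code. This is an illustrative sketch under stated assumptions: the function name `mean_prediction_interval` and the sample data are hypothetical, and the variance of the mean prediction is taken to be the standard two-variable result, the estimated error variance multiplied by $1/n + (X_0 - \bar{X})^2 / \sum x_i^2$. The critical t value is supplied by the caller from a t table rather than computed, so that only the standard library is needed.

```python
from math import sqrt


def mean_prediction_interval(x, y, x0, t_crit):
    """Point estimate, standard error, and confidence interval for E(Y | x0).

    t_crit is the critical t value for n - 2 df at the desired
    confidence level, looked up by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)            # unbiased estimator of sigma^2
    y0_hat = b1 + b2 * x0                 # point estimate of the mean prediction
    # Assumed variance formula for the mean prediction
    var_y0 = sigma2_hat * (1 / n + (x0 - x_bar) ** 2 / sxx)
    se_y0 = sqrt(var_y0)
    half_width = t_crit * se_y0
    return y0_hat, se_y0, (y0_hat - half_width, y0_hat + half_width)


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# With n = 6 there are 4 df; the two-sided 5% critical t value is 2.776
y0_hat, se_y0, (lower, upper) = mean_prediction_interval(x, y, x0=4.5, t_crit=2.776)
print(f"point estimate = {y0_hat:.4f}, se = {se_y0:.4f}")
print(f"95% interval for the mean: [{lower:.4f}, {upper:.4f}]")
```

Under the assumed variance formula, the standard error is smallest when $X_0$ equals the sample mean of X and grows as $X_0$ moves away from it, so the interval widens for values of $X_0$ far from the centre of the data.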

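where $\operatorname{se}(\hat{Y}_0)$ is obtained from (5.10.2). For our data (see Table 3.3),

$$\operatorname{var}(\hat{Y}_0) = 42.159\left[\frac{1}{10} + \frac{(100 - 170)^2}{33{,}000}\right] = 10.4759$$

and

$$\operatorname{se}(\hat{Y}_0) = 3.2366$$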

Therefore, the 95% confidence interval for true $E(Y \mid X_0) = \beta_1 + \beta_2 X_0$ is given by

$$75.3645 - 2.306(3.2366) \le E(Y_0 \mid X = 100) \le 75.3645 + 2.306(3.2366)$$

that is,

$$67.9009 \le E(Y \mid X = 100) \le 82.8281$$
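The arithmetic behind this interval can be reproduced with a few lines of code. The sketch below uses only numbers quoted in the text (the point estimate 75.3645, the variance 10.4759, and the critical t value 2.306 for 8 df); the variable names are illustrative.

```python
from math import sqrt

y0_hat = 75.3645   # point estimate of E(Y | X0 = 100), from the text
var_y0 = 10.4759   # estimated variance of the mean prediction, from the text
t_crit = 2.306     # critical t value for 8 df at the 95% level, from the text

se_y0 = sqrt(var_y0)            # standard error of the mean prediction
half_width = t_crit * se_y0     # half-width of the confidence interval
lower = y0_hat - half_width
upper = y0_hat + half_width

print(f"se = {se_y0:.4f}")
print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")
```

The endpoints may differ from hand calculation in the last decimal place, because the code carries the unrounded standard error through the multiplication.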
