## Info

112 PART ONE: SINGLE-EQUATION REGRESSION MODELS

6. $(n-2)(\hat{\sigma}^2/\sigma^2)$ is distributed as the $\chi^2$ (chi-square) distribution with $(n-2)$ df.³ This knowledge will help us to draw inferences about the true $\sigma^2$ from the estimated $\hat{\sigma}^2$, as we will show in Chapter 5. (The chi-square distribution and its properties are discussed in Appendix A.)

7. $(\hat{\beta}_1, \hat{\beta}_2)$ are distributed independently of $\hat{\sigma}^2$. The importance of this will be explained in the next chapter.

8. $\hat{\beta}_1$ and $\hat{\beta}_2$ have minimum variance in the entire class of unbiased estimators, whether linear or not. This result, due to Rao, is very powerful because, unlike the Gauss-Markov theorem, it is not restricted to the class of linear estimators only.⁴ Therefore, we can say that the least-squares estimators are best unbiased estimators (BUE); that is, they have minimum variance in the entire class of unbiased estimators.

To sum up: The important point to note is that the normality assumption enables us to derive the probability, or sampling, distributions of $\hat{\beta}_1$ and $\hat{\beta}_2$ (both normal) and $\hat{\sigma}^2$ (related to the chi-square). As we will see in the next chapter, this simplifies the task of establishing confidence intervals and testing (statistical) hypotheses.
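The chi-square result in property 6 can be checked informally by simulation. The sketch below is only an illustration under assumed values: the sample size, the true parameters, and the fixed regressor values are all hypothetical choices, not taken from the text. It repeatedly draws normal disturbances, fits the two-variable regression by OLS, and collects $(n-2)\hat{\sigma}^2/\sigma^2$. If the result holds, the draws should have a mean near $n-2$ and a variance near $2(n-2)$.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (b1, b2, rss) for the two-variable regression y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return b1, b2, rss


def simulate_scaled_variance(n=10, sigma=2.0, beta1=1.0, beta2=0.5,
                             reps=20000, seed=0):
    """Draw (n - 2) * sigma2_hat / sigma**2 across repeated samples.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    x = [float(i) for i in range(1, n + 1)]  # regressors held fixed across samples
    draws = []
    for _ in range(reps):
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        _, _, rss = ols_fit(x, y)
        sigma2_hat = rss / (n - 2)  # unbiased OLS estimator of sigma^2
        draws.append((n - 2) * sigma2_hat / sigma ** 2)
    return draws


draws = simulate_scaled_variance()
mean_draws = statistics.fmean(draws)
var_draws = statistics.variance(draws)
# A chi-square variable with 8 df has mean 8 and variance 16.
print(mean_draws, var_draws)
```

The simulation does not prove the result; it only shows that the first two moments of the simulated quantity are close to those of a chi-square variable with $n-2$ df.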

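In passing, note that, with the assumption that $u_i \sim N(0, \sigma^2)$, $Y_i$, being a linear function of $u_i$, is itself normally distributed with the mean and variance given by

$$E(Y_i) = \beta_1 + \beta_2 X_i$$

$$\operatorname{var}(Y_i) = \sigma^2$$

More neatly, we can write

$$Y_i \sim N(\beta_1 + \beta_2 X_i, \sigma^2)$$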

³ The proof of this statement is slightly involved. An accessible source for the proof is Robert V. Hogg and Allen T. Craig, Introduction to Mathematical Statistics, 2d ed., Macmillan, New York, 1965, p. 144.

⁴ C. R. Rao, Linear Statistical Inference and Its Applications, John Wiley & Sons, New York, 1965, p. 258.

4.4 THE METHOD OF MAXIMUM LIKELIHOOD (ML)

A method of point estimation with some stronger theoretical properties than the method of OLS is the method of maximum likelihood (ML). Since this method is slightly involved, it is discussed in the appendix to this chapter. For the general reader, it will suffice to note that if the $u_i$ are assumed to be normally distributed, as we have done for reasons already discussed, the ML and OLS estimators of the regression coefficients, the $\beta$s, are identical, and this is true of simple as well as multiple regressions. The ML estimator of $\sigma^2$ is $\sum \hat{u}_i^2/n$. This estimator is biased, whereas the OLS estimator of $\sigma^2 = \sum \hat{u}_i^2/(n-2)$, as we have seen, is unbiased. But comparing these two estimators of $\sigma^2$, we see that as the sample size $n$ gets larger the two estimators of $\sigma^2$ tend to be equal. Thus, asymptotically (i.e., as $n$ increases indefinitely), the ML estimator of $\sigma^2$ is also unbiased.
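The comparison of the two variance estimators can be illustrated numerically. The following is a minimal sketch under assumed values (the true parameters, regressor values, and sample sizes are hypothetical). It averages the ML estimator (residual sum of squares divided by $n$) and the OLS estimator (divided by $n-2$) over many simulated samples. Because the two differ only by the factor $(n-2)/n$, the ML average should fall below the true $\sigma^2$ in a small sample, while the gap should nearly vanish in a larger one.

```python
import random


def residual_sum_of_squares(x, y):
    """Fit y = b1 + b2*x by OLS and return the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


def average_estimators(n, sigma=2.0, reps=5000, seed=1):
    """Return the simulated averages of the ML and OLS estimators of sigma^2.

    The true model y = 1.0 + 0.5*x + u is a hypothetical choice.
    """
    rng = random.Random(seed)
    x = [float(i) for i in range(1, n + 1)]  # fixed regressors
    ml_total = 0.0
    ols_total = 0.0
    for _ in range(reps):
        y = [1.0 + 0.5 * xi + rng.gauss(0.0, sigma) for xi in x]
        rss = residual_sum_of_squares(x, y)
        ml_total += rss / n          # ML estimator: biased downward
        ols_total += rss / (n - 2)   # OLS estimator: unbiased
    return ml_total / reps, ols_total / reps


# True sigma^2 is 4.0 in both experiments.
ml_small, ols_small = average_estimators(n=10)
ml_large, ols_large = average_estimators(n=200, reps=2000)
print("n = 10: ", ml_small, ols_small)
print("n = 200:", ml_large, ols_large)
```

With $n = 10$ the expected value of the ML estimator is $(8/10)\sigma^2$, a bias of 20 percent; with $n = 200$ the factor is $198/200$, so the two estimators are nearly indistinguishable.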

Since the method of least squares with the added assumption of normality of ui provides us with all the tools necessary for both estimation and hypothesis testing of the linear regression models, there is no loss for readers who may not want to pursue the maximum likelihood method because of its slight mathematical complexity.

4.5 SUMMARY AND CONCLUSIONS

1. This chapter discussed the classical normal linear regression model (CNLRM).

2. This model differs from the classical linear regression model (CLRM) in that it specifically assumes that the disturbance term $u_i$ entering the regression model is normally distributed. The CLRM does not require any assumption about the probability distribution of $u_i$; it only requires that the mean value of $u_i$ is zero and its variance is a finite constant.

3. The theoretical justification for the normality assumption is the central limit theorem.

4. Without the normality assumption, under the other assumptions discussed in Chapter 3, the Gauss-Markov theorem showed that the OLS estimators are BLUE.

5. With the additional assumption of normality, the OLS estimators are not only best unbiased estimators (BUE) but also follow well-known probability distributions. The OLS estimators of the intercept and slope are themselves normally distributed, and the OLS estimator of the variance of $u_i$ (= $\hat{\sigma}^2$) is related to the chi-square distribution.

6. In Chapters 5 and 8 we show how this knowledge is useful in drawing inferences about the values of the population parameters.

7. An alternative to the least-squares method is the method of maximum likelihood (ML). To use this method, however, one must make an assumption about the probability distribution of the disturbance term $u_i$. In the regression context, the assumption most popularly made is that $u_i$ follows the normal distribution.

8. Under the normality assumption, the ML and OLS estimators of the intercept and slope parameters of the regression model are identical. However, the OLS and ML estimators of the variance of $u_i$ are different. In large samples, however, these two estimators converge.

9. Thus the ML method is generally called a large-sample method. The ML method is of broader application in that it can also be applied to regression models that are nonlinear in the parameters. In the latter case, OLS is generally not used. For more on this, see Chapter 14. 
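The claim in point 8 can be illustrated with a small numerical check. The sketch below is not the derivation given in the chapter appendix; it assumes the standard normal log-likelihood for the two-variable model, and the data and perturbation sizes are hypothetical. It evaluates the log-likelihood at the OLS coefficients paired with the ML variance estimate, and compares that value with the log-likelihood at slightly shifted coefficients and at the OLS variance estimate.

```python
import math
import random


def log_likelihood(b1, b2, sigma2, x, y):
    """Normal log-likelihood of y = b1 + b2*x + u with u ~ N(0, sigma2)."""
    n = len(x)
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)


# Hypothetical data generated from y = 1.0 + 0.5*x + u, u ~ N(0, 4).
rng = random.Random(42)
x = [float(i) for i in range(1, 21)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 2.0) for xi in x]

# OLS estimates of the intercept and slope.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b2_ols = sxy / sxx
b1_ols = y_bar - b2_ols * x_bar
rss = sum((yi - b1_ols - b2_ols * xi) ** 2 for xi, yi in zip(x, y))

sigma2_ml = rss / n         # ML estimator of sigma^2
sigma2_ols = rss / (n - 2)  # OLS estimator of sigma^2

# Log-likelihood at the candidate maximum and at nearby points.
ll_at_ols = log_likelihood(b1_ols, b2_ols, sigma2_ml, x, y)
ll_shifted_slope = log_likelihood(b1_ols, b2_ols + 0.05, sigma2_ml, x, y)
ll_shifted_intercept = log_likelihood(b1_ols + 0.05, b2_ols, sigma2_ml, x, y)
ll_ols_variance = log_likelihood(b1_ols, b2_ols, sigma2_ols, x, y)

print(ll_at_ols, ll_shifted_slope, ll_shifted_intercept, ll_ols_variance)
```

In this sketch the log-likelihood is highest at the OLS coefficients combined with the variance estimate that divides by $n$, which is consistent with the statement that the two methods agree on the coefficients but differ on the variance.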