## Info

Consulting the Durbin-Watson table for 39 observations and 1 explanatory variable, we find that d_L = 1.435 and d_U = 1.540 (5 percent level). Since the observed g lies below the lower limit d_L, we do not reject the hypothesis that the true ρ = 1. Keep in mind that although we use the same Durbin-Watson tables, the null hypothesis now is that ρ = 1 and not that ρ = 0. In view of this finding, the results given in (12.9.9) may be acceptable.

41I. I. Berenblut and G. I. Webb, "A New Test for Autocorrelated Errors in the Linear Regression Model," Journal of the Royal Statistical Society, Series B, vol. 35, no. 1, 1973, pp. 33-50.

**ρ Based on Durbin-Watson d Statistic.** If we cannot use the first-difference transformation because ρ is not sufficiently close to unity, we have an easy method of estimating it from the relationship between d and ρ established previously in (12.6.10), from which we can estimate ρ as follows:

ρ̂ ≈ 1 - d/2    (12.9.13)

Thus, in reasonably large samples one can obtain ρ̂ from (12.9.13) and use it to transform the data as shown in the generalized difference equation (12.9.5). Keep in mind that the relationship between ρ and d given in (12.9.13) may not hold in small samples, for which Theil and Nagar have proposed a modification, given in exercise 12.6.

In our wages-productivity regression (12.5.1), we obtained a d value of 0.1229. Using this value in (12.9.13), we obtain ρ̂ ≈ 0.9386. Using this estimated ρ, we can estimate regression (12.9.5): all we have to do is subtract 0.9386 times the previous value of Y from its current value, similarly subtract 0.9386 times the previous value of X from its current value, and run the OLS regression on the variables thus transformed, as in (12.9.6), where Y*_t = (Y_t - 0.9386Y_{t-1}) and X*_t = (X_t - 0.9386X_{t-1}).
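The two steps just described can be sketched in a few lines of Python. This is an illustrative sketch under the AR(1) assumption; the function names are mine, not from any particular package:

```python
def rho_from_d(d):
    """Estimate rho from the Durbin-Watson d via rho-hat ~ 1 - d/2, as in (12.9.13)."""
    return 1.0 - d / 2.0

def quasi_difference(z, rho):
    """Generalized difference Z*_t = Z_t - rho * Z_{t-1}; the first
    observation is lost, as in the transformation leading to (12.9.6)."""
    return [z[t] - rho * z[t - 1] for t in range(1, len(z))]

rho = rho_from_d(0.1229)   # the d value from the wages-productivity regression
# rho comes out at about 0.9386, the value quoted in the text
```

Running OLS on the quasi-differenced Y and X series then gives the transformed regression (12.9.6).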

**ρ Estimated from the Residuals.** If the AR(1) scheme u_t = ρu_{t-1} + ε_t is valid, a simple way to estimate ρ is to regress the residuals û_t on û_{t-1}, for the û_t are consistent estimators of the true u_t, as noted previously. That is, we run the following regression:

û_t = ρ̂ û_{t-1} + v_t    (12.9.14)

where û_t are the residuals obtained from the original (level form) regression and v_t is the error term of this regression. Note that there is no need to introduce an intercept term in (12.9.14), for we know that the OLS residuals sum to zero.

The residuals from our wages-productivity regression given in (12.5.1) are already shown in Table 12.5. Using these residuals, the following regression results were obtained:

482 PART TWO: RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL

As this regression shows, ρ̂ = 0.9142. Using this estimate, one can transform the original model as per (12.9.6). Since the ρ estimated by this procedure is about the same as that obtained from the Durbin-Watson d, the regression results using the ρ̂ of (12.9.15) should not be very different from those obtained from the ρ̂ estimated from the Durbin-Watson d. We leave it to the reader to verify this.
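Because (12.9.14) has no intercept, the OLS estimate of ρ reduces to a simple ratio of sums. A minimal sketch (the function name is mine):

```python
def rho_from_residuals(u):
    """Estimate rho by the no-intercept OLS regression of u_t on u_{t-1},
    as in (12.9.14): rho_hat = sum(u_t * u_{t-1}) / sum(u_{t-1}**2)."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
    return num / den
```

Applied to the residuals of Table 12.5, this calculation is what produces the ρ̂ = 0.9142 reported in (12.9.15).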

**Iterative Methods of Estimating ρ.** All the methods of estimating ρ discussed previously provide us with only a single estimate of ρ. But there are the so-called iterative methods that estimate ρ iteratively, that is, by successive approximation, starting with some initial value of ρ. Among these methods the following may be mentioned: the Cochrane-Orcutt iterative procedure, the Cochrane-Orcutt two-step procedure, the Durbin two-step procedure, and the Hildreth-Lu scanning or search procedure. Of these, the most popular is the Cochrane-Orcutt iterative method. To save space, the iterative methods are discussed by way of exercises. Remember that the ultimate objective of these methods is to provide an estimate of ρ that may be used to obtain GLS estimates of the parameters. One advantage of the Cochrane-Orcutt iterative method is that it can be used to estimate not only an AR(1) scheme but also higher-order autoregressive schemes, such as û_t = ρ̂₁û_{t-1} + ρ̂₂û_{t-2} + v_t, which is AR(2). Having obtained the two ρ's, one can easily extend the generalized difference equation (12.9.6). Of course, the computer can now do all this.
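A minimal pure-Python sketch of the Cochrane-Orcutt iteration for a bivariate model, under the AR(1) assumption, might look as follows. The function names, starting value of ρ, and convergence tolerance are my choices, not from the text:

```python
def ols(y, x):
    """Bivariate OLS with intercept; returns (intercept, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def cochrane_orcutt(y, x, tol=1e-4, max_iter=100):
    """Alternate between (i) OLS on the quasi-differenced data and
    (ii) re-estimating rho from the level-form residuals, until rho settles."""
    rho = 0.0
    for _ in range(max_iter):
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a_star, b = ols(ys, xs)
        a = a_star / (1.0 - rho)   # transformed intercept is beta1 * (1 - rho)
        u = [yt - a - b * xt for yt, xt in zip(y, x)]   # level-form residuals
        num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
        den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
        new_rho = num / den if den > 0 else rho
        if abs(new_rho - rho) < tol:
            return a, b, new_rho
        rho = new_rho
    return a, b, rho
```

Each pass produces a new ρ̂ from the latest residuals, exactly the successive-approximation idea described above; the sequence of ρ̂'s reported in the text (0.9142, 0.9052, ..., 0.8919) is the output of such an iteration.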

Returning to our wages-productivity regression, and assuming an AR(1) scheme, we use the Cochrane-Orcutt iterative method, which gives the following successive estimates of ρ: 0.9142, 0.9052, 0.8992, 0.8956, 0.8935, 0.8924, and 0.8919. The last value, 0.8919, can now be used to transform the original model as in (12.9.6) and estimate it by OLS. Of course, OLS on the transformed model is simply GLS. The results are as follows:

**Dropping the First Observation.** Since the first observation has no antecedent, we drop it in estimating (12.9.6). The regression results are as follows:

Y*_t = 45.105 + 0.5503X*_t
se = (6.190)  (0.0652)                    (12.9.16)
t  = (7.287)  (8.433)      r² = 0.9959

Comparing the results of this regression with the original regression given in (12.5.1), we see that the slope coefficient has dropped dramatically. Notice two things about (12.9.16). First, the intercept coefficient in (12.9.16) is β₁(1 - ρ̂), from which the original β₁ can easily be retrieved, since we know that ρ̂ = 0.8919. Second, the r²'s of the transformed model (12.9.16) and the original model (12.5.1) cannot be directly compared, since the dependent variables in the two models are different.

**Retaining the First Observation à la Prais-Winsten.** We cautioned earlier that keeping or omitting the first observation can make a substantial difference in small samples, although in large samples the difference may be inconsequential.
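The retrieval of the original intercept from the transformed model's intercept β₁(1 - ρ̂) is a one-line computation. The numbers below are hypothetical, chosen only to illustrate the algebra:

```python
def original_intercept(intercept_star, rho):
    """The transformed model's intercept is beta1 * (1 - rho); invert it."""
    return intercept_star / (1.0 - rho)

# hypothetical check: if beta1 were 30 and rho = 0.8919, the transformed
# intercept would be 30 * (1 - 0.8919), and we recover 30 exactly:
beta1 = original_intercept(30.0 * (1.0 - 0.8919), 0.8919)
```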

Retaining the first observation à la Prais-Winsten, we obtain the following regression results42:

Y*_t = 26.454 + 0.7245X*_t
se = (5.4520)  (0.0612)                    (12.9.17)
t  = (4.8521)  (11.8382)      r² = 0.9949

The difference between (12.9.16) and (12.9.17) tells us that the inclusion or exclusion of the first observation can make a substantial difference in the regression results. Also note that the slope coefficient in (12.9.17) is approximately the same as that in (12.5.1).

**General Comments.** There are several points to keep in mind about correcting for autocorrelation by the various methods discussed above.

First, since the OLS estimators are consistent despite autocorrelation, in large samples it makes little difference whether we estimate ρ from the Durbin-Watson d, from the regression of the current-period residuals on the previous-period residuals, or from the Cochrane-Orcutt iterative procedure, because all of these provide consistent estimates of the true ρ. Second, the various methods discussed above are basically two-step methods: in step 1 we obtain an estimate of the unknown ρ, and in step 2 we use that estimate to transform the variables and estimate the generalized difference equation, which is basically GLS. But since we use ρ̂ instead of the true ρ, all these methods of estimation are known in the literature as feasible GLS (FGLS) or estimated GLS (EGLS) methods.

Third, it is important to note that whenever we use an FGLS or EGLS method to estimate the parameters of the transformed model, the estimated coefficients will not necessarily have the usual optimum properties of the classical model, such as BLUE, especially in small samples. Without going into complex technicalities, it may be stated as a general principle that whenever an estimator is used in place of its true value, the estimated OLS coefficients may have the usual optimum properties only asymptotically, that is, in large samples. Moreover, the conventional hypothesis-testing procedures are, strictly speaking, valid only asymptotically. In small samples, therefore, one has to be careful in interpreting the estimated results.

Fourth, in using EGLS, if we do not include the first observation (as was originally the case with the Cochrane-Orcutt procedure), not only the numerical values but also the efficiency of the estimators can be adversely affected, especially if the sample size is small and the regressors are not, strictly speaking, nonstochastic.43 Therefore, in small samples it is important to keep the first observation à la Prais-Winsten. Of course, if the sample size is reasonably large, EGLS, with or without the first observation, gives similar results. Incidentally, in the literature EGLS with the Prais-Winsten transformation is known as full EGLS, or FEGLS for short.

42Including the first observation, the iterated values of rho are 0.9142, 0.9462, 0.9556, 0.9591, 0.9605, and 0.9610. The last value was used in transforming the data to form the generalized difference equation.
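As a sketch of the Prais-Winsten idea under the AR(1) assumption: the first observation is retained but rescaled by √(1 - ρ²), so that every transformed disturbance has the same variance. The function name is illustrative:

```python
import math

def prais_winsten_transform(z, rho):
    """Quasi-difference a series but keep observation 1, rescaled by
    sqrt(1 - rho**2) so all transformed errors share a common variance."""
    out = [math.sqrt(1.0 - rho ** 2) * z[0]]
    out.extend(z[t] - rho * z[t - 1] for t in range(1, len(z)))
    return out
```

Running OLS on Y and X transformed this way (with the intercept column transformed in the same manner) is what the text calls full EGLS, or FEGLS.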