Null hypothesis                               Decision         If
No negative correlation                       No decision      4 - dU < d < 4 - dL
No autocorrelation, positive or negative      Do not reject    dU < d < 4 - dU

25For details, see Thomas B. Fomby, R. Carter Hill, and Stanley R. Johnson, Advanced Econometric Methods, Springer Verlag, New York, 1984, pp. 225-228.


CHAPTER TWELVE: AUTOCORRELATION 471

2. H0: ρ = 0 versus H1: ρ < 0. Reject H0 at α level if the estimated (4 - d) < dU; that is, there is statistically significant evidence of negative autocorrelation.

3. H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at 2α level if d < dU or (4 - d) < dU; that is, there is statistically significant evidence of autocorrelation, positive or negative.
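The three decision rules of the modified d test are easy to mechanize (a minimal sketch; the critical value dU must come from the Durbin-Watson tables, and the 1.54 used below is only an illustrative 5 percent value, not one taken from the tables):

```python
def modified_dw_decision(d, d_U):
    """Modified Durbin-Watson d test: returns True where H0 (rho = 0)
    is rejected against each alternative, given the upper critical
    value d_U from the DW tables (supplied by the caller)."""
    return {
        "rho > 0": d < d_U,                       # reject H0 at alpha level
        "rho < 0": (4 - d) < d_U,                 # reject H0 at alpha level
        "rho != 0": d < d_U or (4 - d) < d_U,     # reject H0 at 2*alpha level
    }

# Wages-productivity example from the text: d = 0.1229; 1.54 is an
# illustrative 5 percent d_U, not a value looked up in the tables.
print(modified_dw_decision(0.1229, 1.54))
```

With d this close to zero, only the positive-autocorrelation rule fires, which matches the conclusion drawn in the text.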

It may be pointed out that the indecisive zone narrows as the sample size increases, which can be seen clearly from the Durbin-Watson tables. For example, with 4 regressors and 20 observations, the 5 percent lower and upper d values are 0.894 and 1.828, respectively, but these values are 1.515 and 1.739 if the sample size is 75.

The computer program SHAZAM performs an exact d test; that is, it gives the p value, the exact probability of the computed d value. With modern computing facilities, it is no longer difficult to find the p value of the computed d statistic. Using SHAZAM (version 9) for our wages-productivity regression, we find that the p value of the computed d of 0.1229 is practically zero, thereby reconfirming our earlier conclusion based on the Durbin-Watson tables.
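Computing d itself from a set of OLS residuals is a one-line formula, d = Σ(e_t - e_{t-1})² / Σe_t². A minimal numpy sketch (the simulated residuals below are my own illustration, not the wages-productivity data):

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson d statistic:
    d = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2"""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Strongly positively autocorrelated residuals push d toward 0;
# independent residuals push it toward 2.
rng = np.random.default_rng(0)
e_pos = np.cumsum(rng.standard_normal(500))   # highly persistent series
e_iid = rng.standard_normal(500)              # white noise
print(durbin_watson(e_pos), durbin_watson(e_iid))
```

The persistent series gives a d near zero, like the 0.1229 in the text, while the white-noise series gives a d near 2.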

The Durbin-Watson d test has become so venerable that practitioners often forget the assumptions underlying the test. In particular, the assumptions that (1) the explanatory variables, or regressors, are nonstochastic; (2) the error term follows the normal distribution; and (3) the regression models do not include the lagged value(s) of the regressand are very important for the application of the d test.

If a regression model contains lagged value(s) of the regressand, the d value in such cases is often around 2, which would suggest that there is no (first-order) autocorrelation in such models. Thus, there is a built-in bias against discovering (first-order) autocorrelation in such models. This does not mean that autoregressive models do not suffer from the autocorrelation problem. As a matter of fact, Durbin has developed the so-called h test to test serial correlation in such models. But this test is not as powerful, in a statistical sense, as the Breusch-Godfrey test to be discussed shortly, so there is no need to use the h test. However, because of its historical importance, it is discussed in exercise 12.36.

Also, if the error terms ut are not NIID, the routinely used d test may not be reliable.26 In this respect the runs test discussed earlier has an advantage in that it does not make any (probability) distributional assumption about the error term. However, if the sample is large (technically, infinite), we can use the Durbin-Watson d, for it can be shown that27

√n (1 - d/2) ~ N(0, 1)    (12.6.12)

26For an advanced discussion, see Ron C. Mittelhammer, George G. Judge, and Douglas J. Miller, Econometric Foundations, Cambridge University Press, New York, 2000, p. 550.

27See James Davidson, Econometric Theory, Blackwell Publishers, New York, 2000, p. 161.

472 PART TWO: RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL

That is, in large samples the d statistic as transformed in (12.6.12) follows the standard normal distribution. Incidentally, in view of the relationship between d and ρ̂, the estimated first-order autocorrelation coefficient, shown in (12.6.10), it follows that

√n ρ̂ ~ N(0, 1)    (12.6.13)

that is, in large samples, the square root of the sample size times the estimated first-order autocorrelation coefficient also follows the standard normal distribution.

As an illustration of the test, for our wages-productivity example, we found that d = 0.1229 with n = 40. Therefore, from (12.6.12) we find that

√40 (1 - 0.1229/2) ≈ 5.94

Asymptotically, if the null hypothesis of zero (first-order) autocorrelation were true, the probability of obtaining a Z value (i.e., a standardized normal variable) of as much as 5.94 or greater is extremely small. Recall that for a standard normal distribution, the (two-tail) critical 5 percent Z value is only 1.96 and the 1 percent critical Z value is about 2.58. Although our sample size is only 40, for practical purposes it may be large enough to use the normal approximation. The conclusion remains the same, namely, that the residuals from the wages-productivity regression suffer from autocorrelation.
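The arithmetic behind this Z value can be checked directly (a sketch using the d and n reported in the text):

```python
import math

# Large-sample normal approximation to the Durbin-Watson test:
# under H0 (no first-order autocorrelation), sqrt(n) * (1 - d/2) ~ N(0, 1).
d, n = 0.1229, 40          # wages-productivity example from the text
z = math.sqrt(n) * (1 - d / 2)
print(round(z, 2))          # about 5.94, far beyond the 1.96 and 2.58 cutoffs
```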

But the most serious problem with the d test is the assumption that the regressors are nonstochastic, that is, their values are fixed in repeated sampling. If this is not the case, then the d test is not valid either in finite, or small, samples or in large samples.28 And since this assumption is usually difficult to maintain in economic models involving time series data, one author contends that the Durbin-Watson statistic may not be useful in econometrics involving time series data.29 In his view, more useful tests of autocorrelation are available, but they are all based on large samples. We discuss one such test below, the Breusch-Godfrey test.

29Fumio Hayashi, Econometrics, Princeton University Press, Princeton, N.J., 2000, p. 45.

30See L. G. Godfrey, "Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables," Econometrica, vol. 46, 1978, pp. 1293-1302, and T. S. Breusch, "Testing for Autocorrelation in Dynamic Linear Models," Australian Economic Papers, vol. 17, 1978, pp. 334-355.

IV. A General Test of Autocorrelation: The Breusch-Godfrey (BG) Test30

To avoid some of the pitfalls of the Durbin-Watson d test of autocorrelation, statisticians Breusch and Godfrey have developed a test of autocorrelation that is general in the sense that it allows for (1) stochastic regressors, such as the lagged values of the regressand; (2) higher-order autoregressive schemes, such as AR(1), AR(2), etc.; and (3) simple or higher-order moving averages of white noise error terms, such as εt in (12.2.1).31

Without going into the mathematical details, which can be obtained from the references, the BG test, which is also known as the LM test,32 proceeds as follows: We use the two-variable regression model to illustrate the test, although many regressors, as well as lagged values of the regressand, can be added to the model. Let

Yt = β1 + β2Xt + ut    (12.6.14)

Assume that the error term ut follows the pth-order autoregressive, AR(p), scheme as follows:

ut = ρ1ut-1 + ρ2ut-2 + ··· + ρput-p + εt    (12.6.15)

where εt is a white noise error term as discussed previously. As you will recognize, this is simply the extension of the AR(1) scheme. The null hypothesis H0 to be tested is that

H0: ρ1 = ρ2 = ··· = ρp = 0    (12.6.16)

That is, there is no serial correlation of any order. The BG test involves the following steps:

1. Estimate (12.6.14) by OLS and obtain the residuals, ût.

2. Regress ût on the original Xt (if there is more than one X variable in the original model, include them also) and ût-1, ût-2, ..., ût-p, where the latter are the lagged values of the estimated residuals in step 1. Thus, if p = 4, we will introduce four lagged values of the residuals as additional regressors in the model. Note that to run this regression we will have only (n - p) observations (why?). In short, run the following regression:

ût = α1 + α2Xt + ρ̂1ût-1 + ρ̂2ût-2 + ··· + ρ̂pût-p + εt    (12.6.17)

and obtain R2 from this (auxiliary) regression.33

3. If the sample size is large (technically, infinite), Breusch and Godfrey have shown that

(n - p)R2 ~ χ2(p)    (12.6.18)

31For example, in the regression Yt = β1 + β2Xt + ut the error term can be represented as ut = εt + λ1εt-1 + λ2εt-2, which represents a three-period moving average of the white noise error term εt.

32The test is based on the Lagrange Multiplier principle briefly mentioned in Chap. 8.

33The reason that the original regressor X is included in the model is to allow for the fact that X may not be strictly nonstochastic. But if it is strictly nonstochastic, it may be omitted from the model. On this, see Jeffrey M. Wooldridge, Introductory Econometrics: A Modern Approach, South-Western Publishing Co., 2000, p. 386.


That is, asymptotically, (n - p) times the R2 value obtained from the auxiliary regression (12.6.17) follows the chi-square distribution with p df. If, in an application, (n - p)R2 exceeds the critical chi-square value at the chosen level of significance, we reject the null hypothesis, in which case at least one ρ in (12.6.15) is statistically significantly different from zero.
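The three steps of the BG test can be sketched in plain numpy (a sketch on simulated data with deliberately strong AR(1) errors; the function and variable names are my own, not the book's):

```python
import numpy as np

def breusch_godfrey(y, X, p):
    """Breusch-Godfrey LM test for AR(p) serial correlation (a sketch).

    y : (n,) regressand;  X : (n, k) regressors including a constant;
    p : number of residual lags. Returns the statistic (n - p) * R^2,
    to be compared against a chi-square critical value with p df."""
    n = len(y)
    # Step 1: OLS on the original model, keep the residuals.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    # Step 2: auxiliary regression of e_t on X_t and e_{t-1}, ..., e_{t-p};
    # only the last (n - p) observations have all p lags available.
    lags = np.column_stack([e[p - j - 1 : n - j - 1] for j in range(p)])
    Z = np.column_stack([X[p:], lags])
    g, *_ = np.linalg.lstsq(Z, e[p:], rcond=None)
    u = e[p:] - Z @ g
    dev = e[p:] - e[p:].mean()
    r2 = 1 - (u @ u) / (dev @ dev)
    # Step 3: the LM statistic (n - p) * R^2.
    return (n - p) * r2

# Illustration on simulated data with strong AR(1) errors (rho = 0.9).
rng = np.random.default_rng(42)
n = 200
x = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.9 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
stat = breusch_godfrey(y, X, p=2)
print(stat > 5.991)  # 5.991 is the 5 percent chi-square critical value, 2 df
```

With errors this persistent, the statistic lands far above the critical value, so the null of no serial correlation is rejected, as it should be.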

The following practical points about the BG test may be noted:

1. The regressors included in the regression model may contain lagged values of the regressand Y, that is, Yt-1, Yt-2, etc., may appear as explanatory variables. Contrast this model with the Durbin-Watson test restriction that there be no lagged values of the regressand among the regressors.

2. As noted earlier, the BG test is applicable even if the disturbances follow a pth-order moving average (MA) process, that is, the ut are generated as follows:

ut = εt + λ1εt-1 + λ2εt-2 + ··· + λpεt-p    (12.6.19)

where εt is a white noise error term, that is, an error term that satisfies all the classical assumptions.

In the chapters on time series econometrics, we will study in some detail the pth-order autoregressive and moving average processes.

3. If in (12.6.15) p = 1, meaning first-order autoregression, then the BG test is known as Durbin's m test.

4. A drawback of the BG test is that the value of p, the length of the lag, cannot be specified a priori. Some experimentation with the p value is inevitable. Sometimes one can use the so-called Akaike and Schwarz information criteria to select the lag length. We will discuss these criteria in Chapter 13 and later in the chapters on time series econometrics.
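One way to experiment with the lag length is to compare an information criterion across auxiliary regressions with p = 1, ..., 6 (a rough sketch using a simple AIC formula on simulated data; comparing AICs across slightly different usable samples is itself an approximation, and the names here are my own):

```python
import numpy as np

def aux_aic(e, X, p):
    """AIC of the BG auxiliary regression with p residual lags (a sketch).

    Uses AIC = m * ln(SSR / m) + 2k on the m = n - p usable observations,
    one common way to compare candidate lag lengths."""
    n = len(e)
    lags = np.column_stack([e[p - j - 1 : n - j - 1] for j in range(p)])
    Z = np.column_stack([X[p:], lags])
    g, *_ = np.linalg.lstsq(Z, e[p:], rcond=None)
    u = e[p:] - Z @ g
    m, k = len(u), Z.shape[1]
    return m * np.log(u @ u / m) + 2 * k

# Simulated model with AR(1) errors; e and X are the OLS residuals and
# design matrix of the original regression.
rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
best_p = min(range(1, 7), key=lambda p: aux_aic(e, X, p))
print(best_p)
```

Since the simulated errors are AR(1), a short lag length should usually win this comparison.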

ILLUSTRATION OF THE BG TEST: THE WAGES-PRODUCTIVITY RELATION

To illustrate the test, we will apply it to our illustrative example. Using an AR(6) scheme, we obtained the results shown in exercise 12.25. From the regression results given there, it can be seen that (n - p) = 34 and R2 = 0.8920. Therefore, multiplying these two, we obtain a chi-square value of 30.328. For 6 df (why?), the probability of obtaining a chi-square value of as much as 30.328 or greater is extremely small; the chi-square table in Appendix D.4 shows that the probability of obtaining a chi-square value of as much as 18.5476 or greater is only 0.005. Therefore, for the same df, the probability of obtaining a chi-square value of about 30 must be extremely small. As a matter of fact, the actual p value is almost zero.
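The arithmetic is immediate (values taken from the text):

```python
# BG statistic for the wages-productivity example: (n - p) * R^2, with
# n - p = 34 and R^2 = 0.8920 from the AR(6) auxiliary regression
# reported in exercise 12.25.
stat = 34 * 0.8920
print(round(stat, 3))  # 30.328, far above the 0.005 critical value of 18.5476 for 6 df
```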

Therefore, the conclusion is that, for our example, at least one of the six autocorrelations must be nonzero.

Trying lag lengths varying from 1 to 6, we find that only the AR(1) coefficient is significant, suggesting that there is no need to consider more than one lag. In essence, the BG test in this case turns out to be Durbin's m test.

Why So Many Tests of Autocorrelation?

The answer to this question is that ". . . no particular test has yet been judged to be unequivocally best [i.e., more powerful in the statistical sense], and thus the analyst is still in the unenviable position of considering a varied collection of test procedures for detecting the presence or structure, or both, of autocorrelation."34 Of course, a similar argument can be made about the various tests of heteroscedasticity discussed in the previous chapter.

12.7 WHAT TO DO WHEN YOU FIND AUTOCORRELATION: REMEDIAL MEASURES

If, after applying one or more of the diagnostic tests of autocorrelation discussed in the previous section, we find that there is autocorrelation, what then? We have four options:

1. Try to find out if the autocorrelation is pure autocorrelation and not the result of mis-specification of the model. As we discussed in Section 12.1, sometimes we observe patterns in residuals because the model is mis-specified, that is, it has excluded some important variables, or because its functional form is incorrect.

2. If it is pure autocorrelation, one can use an appropriate transformation of the original model so that in the transformed model we do not have the problem of (pure) autocorrelation. As in the case of heteroscedasticity, we will have to use some type of generalized least-squares (GLS) method.

3. In large samples, we can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation. This method is actually an extension of White's heteroscedasticity-consistent standard errors method that we discussed in the previous chapter.

4. In some situations we can continue to use the OLS method.

Because of the importance of each of these topics, we devote a section to each one.

Let us return to our wages-productivity regression given in (12.5.1). There we saw that the d value was 0.1229 and, based on the Durbin-Watson d test, we concluded that there was positive correlation in the error term. Could this correlation have arisen because our model was not correctly specified? Since the data underlying regression (12.5.1) are time series data, it is quite possible that both wages and productivity exhibit trends. If that is the case,

34Ron C. Mittelhammer et al., op. cit., p. 547. Recall that the power of a statistical test is one minus the probability of committing a Type II error, that is, one minus the probability of accepting a false hypothesis. The maximum power of a test is 1 and the minimum is 0. The closer the power of a test is to zero, the worse that test is, and the closer it is to 1, the more powerful it is. What these authors are essentially saying is that there is no single most powerful test of autocorrelation.