
FIGURE 12.1 Patterns of autocorrelation and nonautocorrelation.

Starting at the bottom of the recession, when economic recovery starts, most of these series start moving upward. In this upswing, the value of a series at one point in time is greater than its previous value. Thus there is a "momentum" built into them, and it continues until something happens (e.g., increase in interest rate or taxes or both) to slow them down. Therefore, in regressions involving time series data, successive observations are likely to be interdependent.

Specification Bias: Excluded Variables Case. In empirical analysis the researcher often starts with a plausible regression model that may not be the most "perfect" one. After the regression analysis, the researcher does the postmortem to find out whether the results accord with a priori expectations. If not, surgery is begun. For example, the researcher may plot the residuals ût obtained from the fitted regression and may observe patterns such as those shown in Figure 12.1a to d. These residuals (which are proxies for ut) may suggest that some variables that were originally candidates but were not included in the model for a variety of reasons should be included. This is the case of excluded variable specification bias. Often the inclusion of such variables removes the correlation pattern observed among the residuals. For example, suppose we have the following demand model:

Yt = β1 + β2X2t + β3X3t + β4X4t + ut  (12.1.2)

where Y = quantity of beef demanded, X2 = price of beef, X3 = consumer income, X4 = price of pork, and t = time.4 However, for some reason we run the following regression:

Yt = β1 + β2X2t + β3X3t + vt  (12.1.3)

Now if (12.1.2) is the "correct" model or the "truth" or true relation, running (12.1.3) is tantamount to letting vt = β4X4t + ut. And to the extent the price of pork affects the consumption of beef, the error or disturbance term vt will reflect a systematic pattern, thus creating (false) autocorrelation. A simple test of this would be to run both (12.1.2) and (12.1.3) and see whether autocorrelation, if any, observed in model (12.1.3) disappears when (12.1.2) is run.5 The actual mechanics of detecting autocorrelation will be discussed in Section 12.6, where we will show that a plot of the residuals from regressions (12.1.2) and (12.1.3) will often shed considerable light on serial correlation.
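This mechanism is easy to verify numerically. Below is a minimal sketch (not from the text): all series, coefficients, and the random seed are hypothetical, and the demand model is reduced to one included and one omitted regressor. Because the omitted series evolves smoothly over time, the residuals of the short regression inherit its serial correlation.

```python
import random

random.seed(0)
n = 200

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

def ols_resid(y, x):
    """Residuals from a simple OLS regression of y on a constant and x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return [yi - a - b * xi for xi, yi in zip(x, y)]

# x2: included regressor (i.i.d.); x4: omitted regressor, a smooth AR(1)
# series, as many economic time series are
x2 = [random.gauss(0, 1) for _ in range(n)]
x4 = [0.0]
for _ in range(n - 1):
    x4.append(0.95 * x4[-1] + random.gauss(0, 0.3))

# "true" model with a well-behaved i.i.d. disturbance u_t
y = [1.0 + 0.5 * x2[t] + 2.0 * x4[t] + random.gauss(0, 0.5) for t in range(n)]

e_short = ols_resid(y, x2)          # regression that wrongly omits x4
print(round(lag1_corr(e_short), 2)) # strongly positive: "false" autocorrelation
```

The disturbance ut itself is serially independent here; the autocorrelation in the residuals comes entirely from the excluded regressor.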

4As a matter of convention, we shall use the subscript t to denote time series data and the usual subscript i for cross-sectional data.

5If it is found that the real problem is one of specification bias, not autocorrelation, then, as will be shown in Chap. 13, the OLS estimators of the parameters of (12.1.3) may be biased as well as inconsistent.

446 PART TWO: RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL

FIGURE 12.2 Specification bias: incorrect functional form.

Specification Bias: Incorrect Functional Form. Suppose the "true" or correct model in a cost-output study is as follows:

Marginal costi = β1 + β2 outputi + β3 outputi² + ui  (12.1.4)

but we fit the following model:

Marginal costi = α1 + α2 outputi + vi  (12.1.5)

The marginal cost curve corresponding to the "true" model is shown in Figure 12.2 along with the "incorrect" linear cost curve.

As Figure 12.2 shows, between points A and B the linear marginal cost curve will consistently overestimate the true marginal cost, whereas beyond these points it will consistently underestimate the true marginal cost. This result is to be expected, because the disturbance term vi is, in fact, equal to β3 outputi² + ui, and hence will catch the systematic effect of the output² term on marginal cost. In this case, vi will reflect autocorrelation because of the use of an incorrect functional form. In Chapter 13 we will consider several methods of detecting specification bias.
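The same check works for functional-form bias. The sketch below (hypothetical coefficients and noise level, not from the text) generates marginal cost from a quadratic in output with an i.i.d. disturbance, fits the linear model, and measures the serial correlation of the residuals taken in order of output.

```python
import random

random.seed(1)

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

def ols_resid(y, x):
    """Residuals from a simple OLS regression of y on a constant and x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return [yi - a - b * xi for xi, yi in zip(x, y)]

q = [i / 10 for i in range(1, 101)]   # output levels, in increasing order
# "true" quadratic marginal cost with an i.i.d. disturbance u_i
mc = [10 - 2 * x + 0.3 * x * x + random.gauss(0, 0.5) for x in q]

e = ols_resid(mc, q)            # residuals from the misspecified linear fit
print(round(lag1_corr(e), 2))   # near 1: the omitted output^2 term shows up
```

Consistent with Figure 12.2, the residuals are positive at low and high output and negative in between, i.e., a smooth systematic pattern rather than random scatter.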

Cobweb Phenomenon. The supply of many agricultural commodities reflects the so-called cobweb phenomenon, where supply reacts to price with a lag of one time period because supply decisions take time to implement (the gestation period). Thus, at the beginning of this year's planting of crops, farmers are influenced by the price prevailing last year, so that their supply function is

Supplyt = β1 + β2Pt−1 + ut  (12.1.6)

Suppose at the end of period t, price Pt turns out to be lower than Pt−1. Therefore, in period t + 1 farmers may very well decide to produce less than they did in period t. Obviously, in this situation the disturbances ut are not expected to be random because if the farmers overproduce in year t, they are likely to reduce their production in t + 1, and so on, leading to a cobweb pattern.
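A minimal cobweb simulation, with hypothetical linear demand and supply schedules (not from the text), shows the resulting pattern: because supply responds to last period's price, market-clearing prices overshoot the equilibrium and then correct, so successive observations are negatively correlated.

```python
import random

random.seed(2)

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

# Hypothetical schedules: demand Qd_t = 20 - P_t,
# supply  Qs_t = -2 + 0.8*P_{t-1}   (decided before P_t is known).
# Market clearing Qd_t = Qs_t gives P_t = 22 - 0.8*P_{t-1}, plus a
# small demand shock each period.
p = [12.0]
for _ in range(499):
    p.append(22 - 0.8 * p[-1] + random.gauss(0, 0.5))

print(round(lag1_corr(p), 2))   # strongly negative: high prices are followed by low
```

The zig-zag of prices around equilibrium is exactly the "cobweb" traced out in the classic supply-demand diagram.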

Lags. In a time series regression of consumption expenditure on income, it is not uncommon to find that the consumption expenditure in the current period depends, among other things, on the consumption expenditure of the previous period. That is,

Consumptiont = β1 + β2 incomet + β3 consumptiont−1 + ut  (12.1.7)

A regression such as (12.1.7) is known as autoregression because one of the explanatory variables is the lagged value of the dependent variable. (We shall study such models in Chapter 17.) The rationale for a model such as (12.1.7) is simple. Consumers do not change their consumption habits readily for psychological, technological, or institutional reasons. Now if we neglect the lagged term in (12.1.7), the resulting error term will reflect a systematic pattern due to the influence of lagged consumption on current consumption.
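To see the effect of neglecting the lagged term, the sketch below (hypothetical coefficients, not from the text) generates consumption from an autoregression like (12.1.7) with an i.i.d. disturbance and then fits the static regression of consumption on income alone; the residuals absorb the omitted lagged consumption and become serially correlated.

```python
import random

random.seed(3)
n = 2000

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

def ols_resid(y, x):
    """Residuals from a simple OLS regression of y on a constant and x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return [yi - a - b * xi for xi, yi in zip(x, y)]

inc = [random.gauss(0, 1) for _ in range(n)]   # income (i.i.d. for simplicity)
cons = [5 / (1 - 0.7)]                         # start at the unconditional mean
for t in range(1, n):
    # true model: consumption depends on income AND lagged consumption
    cons.append(5 + 0.6 * inc[t] + 0.7 * cons[t - 1] + random.gauss(0, 0.5))

e = ols_resid(cons, inc)        # static regression that neglects the lag
print(round(lag1_corr(e), 2))   # clearly positive serial correlation
```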

"Manipulation" of Data. In empirical analysis, the raw data are often "manipulated." For example, in time series regressions involving quarterly data, such data are usually derived from the monthly data by simply adding three monthly observations and dividing the sum by 3. This averaging introduces smoothness into the data by dampening the fluctuations in the monthly data. Therefore, the graph plotting the quarterly data looks much smoother than that of the monthly data, and this smoothness may itself impart a systematic pattern to the disturbances, thereby introducing autocorrelation. Another source of manipulation is interpolation or extrapolation of data. For example, the Census of Population is conducted every 10 years in this country, the last being in 2000 and the one before that in 1990. Now if there is a need to obtain data for some year within the intercensus period 1990-2000, the common practice is to interpolate on the basis of some ad hoc assumptions. All such data "massaging" techniques might impose upon the data a systematic pattern that might not exist in the original data.6
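The smoothing effect of averaging can be demonstrated directly. The sketch below (not from the text) uses an overlapping three-period moving average, rather than non-overlapping quarterly averages, so that the effect is visible at lag 1: adjacent averaged values share two of their three underlying months, so the lag-1 autocorrelation jumps from about 0 to about 2/3.

```python
import random

random.seed(4)

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

monthly = [random.gauss(0, 1) for _ in range(3000)]   # purely random series
# three-month moving average: adjacent values share two underlying months
smooth = [(monthly[t] + monthly[t - 1] + monthly[t - 2]) / 3
          for t in range(2, len(monthly))]

print(round(lag1_corr(monthly), 2))   # near 0: no autocorrelation in the raw data
print(round(lag1_corr(smooth), 2))    # near 2/3: autocorrelation created by averaging
```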

Data Transformation. As an example of this, consider the following model:

Yt = β1 + β2Xt + ut  (12.1.8)

where, say, Y = consumption expenditure and X = income. Since (12.1.8) holds true at every time period, it also holds true in the previous time period, (t − 1). So we can write

Yt−1 = β1 + β2Xt−1 + ut−1  (12.1.9)


Yt−1, Xt−1, and ut−1 are known as the lagged values of Y, X, and u, respectively, here lagged by one period. We will see the importance of the lagged values later in the chapter as well as in several places in the text. Now if we subtract (12.1.9) from (12.1.8), we obtain

ΔYt = β2ΔXt + Δut  (12.1.10)

where Δ, known as the first difference operator, tells us to take successive differences of the variables in question. Thus, ΔYt = (Yt − Yt−1), ΔXt = (Xt − Xt−1), and Δut = (ut − ut−1). For empirical purposes, we write (12.1.10) as

ΔYt = β2ΔXt + vt  (12.1.11)

where vt = Δut = (ut − ut−1).

Equation (12.1.8) is known as the level form and Eq. (12.1.10) is known as the (first) difference form. Both forms are often used in empirical analysis. For example, if in (12.1.8) Y and X represent the logarithms of consumption expenditure and income, then in (12.1.10) ΔY and ΔX will represent changes in the logs of consumption expenditure and income. But as we know, a change in the log of a variable is a relative change, or a percentage change if it is multiplied by 100. So, instead of studying relationships between variables in the level form, we may be interested in their relationships in the growth form.
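A quick numerical check of the growth-form interpretation, with hypothetical consumption levels: the change in the log, multiplied by 100, is close to the exact percentage change.

```python
import math

y_prev, y_now = 100.0, 103.0                 # hypothetical consumption levels
dlog = math.log(y_now) - math.log(y_prev)    # change in the log = relative change
pct = (y_now - y_prev) / y_prev * 100        # exact percentage change

print(round(100 * dlog, 2), round(pct, 2))   # 2.96 3.0
```

The approximation is close for small changes and deteriorates as the change grows.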

Now if the error term in (12.1.8) satisfies the standard OLS assumptions, particularly the assumption of no autocorrelation, it can be shown that the error term vt in (12.1.11) is autocorrelated. (The proof is given in Appendix 12A, Section 12A.1.) It may be noted here that models like (12.1.11) are known as dynamic regression models, that is, models involving lagged regressands. We will study such models in depth in Chapter 17.
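The autocorrelation of vt can also be verified by simulation: if ut is i.i.d. with variance σ², then vt = ut − ut−1 satisfies var(vt) = 2σ² and cov(vt, vt−1) = −σ², so its first-order autocorrelation is −0.5. A sketch (hypothetical seed and sample size):

```python
import random

random.seed(5)

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

u = [random.gauss(0, 1) for _ in range(10000)]    # i.i.d. disturbances u_t
v = [u[t] - u[t - 1] for t in range(1, len(u))]   # v_t = u_t - u_{t-1}

print(round(lag1_corr(v), 2))   # close to the theoretical value -0.5
```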

The point of the preceding example is that sometimes autocorrelation may be induced as a result of transforming the original model.

Nonstationarity. We mentioned in Chapter 1 that, while dealing with time series data, we may have to find out if a given time series is stationary. Although we will discuss the topic of nonstationary time series more thoroughly in the chapters on time series econometrics in Part V of the text, loosely speaking, a time series is stationary if its characteristics (e.g., mean, variance, and covariance) are time invariant; that is, they do not change over time. If that is not the case, we have a nonstationary time series.

As we will discuss in Part V, in a regression model such as (12.1.8), it is quite possible that both Y and X are nonstationary and therefore the error u is also nonstationary.7 In that case, the error term will exhibit autocorrelation.
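A random walk is the standard example of such a nonstationary series: its variance grows with t, and its sample autocorrelation is near one. The sketch below (not from the text) treats the error term as a hypothetical random walk.

```python
import random

random.seed(6)

def lag1_corr(e):
    """Sample first-order autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((x - m) ** 2 for x in e)
    return num / den

# u_t = u_{t-1} + e_t: a random walk, the classic nonstationary error
u = [0.0]
for _ in range(1999):
    u.append(u[-1] + random.gauss(0, 1))

print(round(lag1_corr(u), 2))   # near 1: very strong autocorrelation
```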

7As we will also see in Part V, even though Y and X are nonstationary, it is possible to find u to be stationary. We will explore the implication of that later on.

Gujarati: Basic Econometrics, Fourth Edition | II. Relaxing the Assumptions of the Classical Model | 12. Autocorrelation: What Happens if the Error Terms are Correlated? | © The McGraw-Hill Companies, 2004

CHAPTER TWELVE: AUTOCORRELATION 449
