## An Autoregressive Integrated Moving Average (ARIMA) Process

The time series models we have already discussed are based on the assumption that the time series involved are (weakly) stationary in the sense defined in Chapter 21. Briefly, the mean and variance for a weakly stationary time series are constant and its covariance is time-invariant. But we know that many economic time series are nonstationary, that is, they are integrated; for example, the economic time series in Table 21.1 are integrated.

But we also saw in Chapter 21 that if a time series is integrated of order 1 [i.e., it is I(1)], its first differences are I(0), that is, stationary. Similarly, if a time series is I(2), its second difference is I(0). In general, if a time series is I(d), after differencing it d times we obtain an I(0) series.
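The differencing rule can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes NumPy, builds artificial I(1) and I(2) series by cumulating simulated white noise once and twice, and confirms that differencing once and twice returns the original stationary shocks. The names `shocks`, `i1_series`, and `i2_series` are illustrative only.

```python
import numpy as np

# Simulated white noise: a stationary, I(0) series (illustrative data only).
rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)

# Cumulating once gives a random walk, which is I(1);
# cumulating twice gives an I(2) series.
i1_series = np.cumsum(shocks)
i2_series = np.cumsum(i1_series)

# Differencing d times undoes d rounds of cumulation.
# Each round of differencing drops one observation from the start.
first_diff = np.diff(i1_series)         # I(1) differenced once
second_diff = np.diff(i2_series, n=2)   # I(2) differenced twice

print(np.allclose(first_diff, shocks[1:]))
print(np.allclose(second_diff, shocks[2:]))
```

Both comparisons recover the underlying shocks, apart from the one or two observations lost at the start of the sample.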

Therefore, if we have to difference a time series d times to make it stationary and then apply the ARMA(p, q) model to it, we say that the original time series is ARIMA(p, d, q), that is, it is an autoregressive integrated moving average time series, where p denotes the number of autoregressive terms, d the number of times the series has to be differenced before it becomes stationary, and q the number of moving average terms. Thus, an ARIMA(2, 1, 2) time series has to be differenced once (d = 1) before it becomes stationary and the (first-differenced) stationary time series can be modeled as an ARMA(2, 2) process, that is, it has two AR and two MA terms. Of course, if d = 0 (i.e., a series is stationary to begin with), ARIMA(p, d = 0, q) = ARMA(p, q). Note that an ARIMA(p, 0, 0) process means a purely AR(p) stationary process; an ARIMA(0, 0, q) means a purely MA(q) stationary process. Given the values of p, d, and q, one can tell what process is being modeled.
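To make the (p, d, q) notation concrete, here is a hedged sketch that simulates an ARIMA(2, 1, 2) series with NumPy. The function `simulate_arima`, its coefficient values, and the use of zeros as pre-sample values are assumptions made for illustration; they do not come from the text. The function first generates a stationary ARMA(p, q) series and then cumulates it d times, so differencing the result d times gives back the ARMA series.

```python
import numpy as np

def simulate_arima(ar, ma, d, n, seed=0):
    """Simulate n observations of an ARIMA(len(ar), d, len(ma)) series.

    Hypothetical helper: returns (y, w), where w is the stationary
    ARMA(p, q) series and y is w cumulated d times.
    """
    rng = np.random.default_rng(seed)
    p, q = len(ar), len(ma)
    start = max(p, q)                      # pre-sample values, set to zero
    u = rng.standard_normal(n + start)     # white-noise error term
    w = np.zeros(n + start)
    for t in range(start, n + start):
        ar_part = sum(ar[i] * w[t - 1 - i] for i in range(p))
        ma_part = sum(ma[j] * u[t - 1 - j] for j in range(q))
        w[t] = ar_part + u[t] + ma_part
    w = w[start:]
    y = w.copy()
    for _ in range(d):                     # integrate d times
        y = np.cumsum(y)
    return y, w

# ARIMA(2, 1, 2): two AR terms, one difference, two MA terms.
y, w = simulate_arima(ar=[0.5, -0.2], ma=[0.4, 0.3], d=1, n=200)

# Differencing once (d = 1) recovers the stationary ARMA(2, 2) series.
print(np.allclose(np.diff(y), w[1:]))
```

Setting `d=0` returns the ARMA series itself, matching ARIMA(p, 0, q) = ARMA(p, q); passing an empty `ma` list gives a pure AR(p) process, and an empty `ar` list a pure MA(q) process.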

The important point to note is that to use the Box-Jenkins methodology, we must have either a stationary time series or a time series that is stationary after one or more differencings. The reason for assuming stationarity can be explained as follows:

The objective of B-J [Box-Jenkins] is to identify and estimate a statistical model which can be interpreted as having generated the sample data. If this estimated model is then to be used for forecasting we must assume that the features of this model are constant through time, and particularly over future time periods. Thus the simple reason for requiring stationary data is that any model which is inferred from these data can itself be interpreted as stationary or stable, therefore providing valid basis for forecasting.⁶

## 22.3 The Box-Jenkins (BJ) Methodology

The million-dollar question obviously is: Looking at a time series, such as the U.S. GDP series in Figure 21.1, how does one know whether it follows a purely AR process (and if so, what is the value of p) or a purely MA process (and if so, what is the value of q) or an ARMA process (and if so, what are the values of p and q) or an ARIMA process (and if so, what are the values of p, d, and q)? The BJ methodology comes in handy in answering this question. The method consists of four steps:

Step 1. Identification. That is, find out the appropriate values of p, d, and q. We will show shortly how the correlogram and partial correlogram aid in this task.

Step 2. Estimation. Having identified the appropriate p and q values, the next stage is to estimate the parameters of the autoregressive and moving average terms included in the model. Sometimes this calculation can be done by simple least squares but sometimes we will have to resort to nonlinear (in parameter) estimation methods. Since this task is now routinely handled by several statistical packages, we do not have to worry about the actual mathematics of estimation; the enterprising student may consult the references on that.

⁶ Michael Pokorny, An Introduction to Econometrics, Basil Blackwell, New York, 1987, p. 343.

FIGURE 22.1 The Box-Jenkins methodology.

Step 3. Diagnostic checking. Having chosen a particular ARIMA model, and having estimated its parameters, we next see whether the chosen model fits the data reasonably well, for it is possible that another ARIMA model might do the job as well. This is why Box-Jenkins ARIMA modeling is more an art than a science; considerable skill is required to choose the right ARIMA model. One simple test of the chosen model is to see if the residuals estimated from this model are white noise; if they are, we can accept the particular fit; if not, we must start over. Thus, the BJ methodology is an iterative process (see Figure 22.1).

Step 4. Forecasting. One of the reasons for the popularity of ARIMA modeling is its success in forecasting. In many cases, the forecasts obtained by this method are more reliable than those obtained from traditional econometric modeling, particularly for short-term forecasts. Of course, each case must be checked.
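For the identification step, the correlogram and partial correlogram mentioned in Step 1 can be computed directly. The sketch below is an illustration under stated assumptions, not the text's own procedure: it uses NumPy only, the functions `sample_acf` and `sample_pacf` are hypothetical helpers, the partial autocorrelations are obtained with the Durbin-Levinson recursion (one of several possible methods), and the data are a simulated AR(1) series rather than the GDP data.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 0..max_lag (the correlogram)."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.sum(dev[k:] * dev[:-k]) / denom)
    return np.array(acf)

def sample_pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(y, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.array([])
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = np.array([phi_kk])
        else:
            num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, rho[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
        phi_prev = phi
    return pacf

# Simulated stationary AR(1) series with coefficient 0.7 (illustrative).
rng = np.random.default_rng(1)
n, coef = 5000, 0.7
series = np.zeros(n)
for t in range(1, n):
    series[t] = coef * series[t - 1] + rng.standard_normal()

acf = sample_acf(series, max_lag=5)
pacf = sample_pacf(series, max_lag=5)
print(np.round(acf, 2))
print(np.round(pacf, 2))
```

For an AR(1) process the autocorrelations should decay gradually while the partial autocorrelations should be close to zero beyond lag 1; patterns of this kind are what the identification step looks for.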

With this general discussion, let us look at these four steps in some detail. Throughout, we will use the GDP data given in Table 21.1 to illustrate the various points. 
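Before turning to the details, the remaining steps (estimation, diagnostic checking, forecasting) can be sketched end to end on artificial data. Everything in this block is an assumption made for illustration: the data are a simulated ARIMA(1, 1, 0) series rather than the GDP data, estimation is by ordinary least squares on the first differences, the white-noise check uses the Ljung-Box Q statistic with 10 lags compared with the 5 percent chi-square critical value for 9 degrees of freedom (16.919), and the helper names are hypothetical.

```python
import numpy as np

def fit_ar1_ols(w):
    """Step 2: OLS regression of w_t on a constant and w_{t-1}."""
    X = np.column_stack([np.ones(len(w) - 1), w[:-1]])
    coef, *_ = np.linalg.lstsq(X, w[1:], rcond=None)
    resid = w[1:] - X @ coef
    return coef[0], coef[1], resid

def ljung_box_q(resid, max_lag):
    """Step 3: Ljung-Box Q statistic on the residual autocorrelations."""
    n = len(resid)
    e = resid - resid.mean()
    denom = np.sum(e ** 2)
    lags = np.arange(1, max_lag + 1)
    rho = np.array([np.sum(e[k:] * e[:-k]) / denom for k in lags])
    return n * (n + 2) * np.sum(rho ** 2 / (n - lags))

def forecast_levels(y_last, w_last, const, phi, steps):
    """Step 4: forecast the differences, then cumulate back to levels."""
    out, level, w = [], y_last, w_last
    for _ in range(steps):
        w = const + phi * w
        level = level + w
        out.append(level)
    return np.array(out)

# Simulated ARIMA(1, 1, 0) data (illustrative only).
rng = np.random.default_rng(42)
n, true_phi = 2000, 0.6
u = rng.standard_normal(n)
w = np.zeros(n)
for t in range(1, n):
    w[t] = true_phi * w[t - 1] + u[t]
y = np.cumsum(w)

dy = np.diff(y)                               # difference once (d = 1)
const_hat, phi_hat, resid = fit_ar1_ols(dy)   # estimate AR(1) on differences
q_stat = ljung_box_q(resid, max_lag=10)
CHI2_95_DF9 = 16.919                          # 5% critical value, 9 d.f.
residuals_look_white = q_stat < CHI2_95_DF9
forecasts = forecast_levels(y[-1], dy[-1], const_hat, phi_hat, steps=4)

print(round(phi_hat, 3), round(q_stat, 2), residuals_look_white)
print(np.round(forecasts, 2))
```

If the Q statistic exceeded the critical value, the residuals would not look like white noise and the iterative logic of the methodology would send us back to identification with a different (p, d, q).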