## Additional Topics In Econometric Modeling

As noted in the introduction to this chapter, the topic of econometric modeling and diagnostic testing is so vast and evolving that specialized books are written on it. In the previous section we touched on some major themes in this area. In this section we consider a few additional features that researchers may find useful in practice: (1) outliers, leverage, and influence; (2) recursive least squares; and (3) Chow's prediction failure test. Of necessity, the discussion of each topic will be brief.

### Outliers, Leverage, and Influence<sup>41</sup>

Recall that, in minimizing the residual sum of squares (RSS), OLS gives equal weight to every observation in the sample. But not every observation has an equal impact on the regression results, because of the presence of three types of special data points called outliers, leverage points, and influence points. It is important that we know what they are and how they influence regression analysis.

In the regression context, an outlier may be defined as an observation with a "large residual." Recall that $\hat{u}_i = (Y_i - \hat{Y}_i)$, that is, the residual represents the difference (positive or negative) between the actual value of the regressand and its value estimated from the regression model. When we say that a residual is large, we mean large in comparison with the other residuals; very often such a residual catches our attention immediately because of its rather large vertical distance from the estimated regression line. Note that a data set may contain more than one outlier. We have already encountered an example of this in exercise 11.22, where you were asked to regress percent change in stock prices (Y) on percent change in consumer prices (X) for a sample of 20 countries. One observation, that relating to Chile, was an outlier.
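To make the notion of a "large residual" concrete, here is a minimal Python sketch. It uses made-up data (not the data of exercise 11.22), and the helper name `ols_residuals` is hypothetical. Each residual is scaled by the estimated standard error of the regression so that it can be compared with the others; treating an absolute scaled residual above 2 as "large" is only an illustrative rule of thumb, not a formal test.

```python
import numpy as np

def ols_residuals(x, y):
    """Fit Y = b1 + b2*X by OLS and return (coefficients, residuals)."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# Hypothetical data: ten points exactly on Y = 1 + 2X ...
x = np.arange(1.0, 11.0)
y = 1.0 + 2.0 * x
# ... except that one observation (index 4) is shifted upward by 15.
y[4] += 15.0

beta, resid = ols_residuals(x, y)

# Estimated standard error of the regression, with n - 2 degrees of freedom.
sigma_hat = np.sqrt(resid @ resid / (len(x) - 2))

# Residuals in units of sigma_hat; the cutoff of 2 is an assumption.
scaled = resid / sigma_hat
outlier_index = int(np.argmax(np.abs(scaled)))
flagged = np.flatnonzero(np.abs(scaled) > 2.0)

print("largest scaled residual at index", outlier_index)
print("flagged observations:", flagged)
```

Because the shifted observation sits near the middle of the X range, it barely changes the slope; it shows up mainly as a large vertical distance from the fitted line, which is what the definition of an outlier describes.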

40. Wojciech W. Charemza and Derek F. Deadman, New Directions in Econometric Practice: A General to Specific Modelling, Cointegration and Vector Autoregression, 2d ed., Edward Elgar Publishers, 1997, p. 30. See also pp. 250-252 for their views on various model selection criteria.

41. The following discussion is influenced by Chandan Mukherjee, Howard White, and Marc Wuyts, Econometrics and Data Analysis for Developing Countries, Routledge, New York, 1998, pp. 137-148.


A data point is said to exert (high) leverage if it is disproportionately distant from the bulk of the values of one or more regressors. Why does a leverage point matter? It matters because it is capable of pulling the regression line toward itself, thus distorting the slope of the regression line. If this actually happens, we call such a leverage point an influential point: removing it from the sample can dramatically affect the regression line. Returning to exercise 11.22, if you regress Y on X including the observation for Chile, the slope coefficient is positive and "highly statistically significant." But if you drop the observation for Chile, the slope coefficient is practically zero. Thus the Chilean observation has leverage and is also an influential observation.
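The distinction between leverage and influence can be sketched numerically. The Python example below uses made-up data chosen to mimic the qualitative pattern just described (it is not the data of exercise 11.22), and the helper name `fit_slope` is hypothetical. Leverage is measured here by the diagonal elements of the hat matrix $X(X'X)^{-1}X'$, a standard measure that the text itself does not name; influence is checked in the most direct way, by refitting the regression without the suspect observation.

```python
import numpy as np

def fit_slope(x, y):
    """Return the OLS slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Hypothetical data: nine clustered points with almost no X-Y relationship,
# plus one point (the last) lying far from the others in the X direction.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)
y = np.array([5, 4, 6, 5, 4, 6, 5, 4, 6, 30], dtype=float)

# Leverage: diagonal of the hat matrix H = X (X'X)^{-1} X'.
X = np.column_stack([np.ones(len(x)), x])
hat = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
high_leverage = int(np.argmax(hat))

# Influence: compare the slope with and without the high-leverage point.
slope_all = fit_slope(x, y)
keep = np.arange(len(x)) != high_leverage
slope_drop = fit_slope(x[keep], y[keep])

print("leverage values:", np.round(hat, 3))
print("slope with all points:   ", round(slope_all, 3))
print("slope without the point: ", round(slope_drop, 3))
```

In this constructed sample the distant point has by far the largest leverage, and dropping it moves the slope from clearly positive to nearly zero, so it is an influential point as well. A high-leverage point that happened to lie on the line fitted to the other observations would leave the slope essentially unchanged and would therefore not be influential.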

To further clarify the nature of outliers, leverage points, and influence points, consider the diagram in Figure 13.4, which is self-explanatory.<sup>42</sup>

How do we handle such data points? Should we just drop them and confine our attention to the remaining data points? According to Draper and Smith:

> Automatic rejection of outliers is not always a wise procedure. Sometimes the outlier is providing information that other data points cannot due to the fact that it arises from an unusual combination of circumstances which may be of vital interest and requires further investigation rather than rejection. As a general rule, outliers should be rejected out of hand only if they can be traced to causes such as errors of recording the observations or setting up the apparatus [in a physical experiment]. Otherwise, careful investigation is in order.<sup>43</sup>