CHAPTER ELEVEN: HETEROSCEDASTICITY 401
economic investigations. In this respect the econometrician differs from scientists in fields such as agriculture and biology, where researchers have a good deal of control over their subjects. More often than not, in economic studies there is only one sample Y value corresponding to a particular value of X. And there is no way one can know $\sigma_i^2$ from just one Y observation. Therefore, in most cases involving econometric investigations, heteroscedasticity may be a matter of intuition, educated guesswork, prior empirical experience, or sheer speculation.
With the preceding caveat in mind, let us examine some of the informal and formal methods of detecting heteroscedasticity. As the following discussion will reveal, most of these methods are based on the examination of the OLS residuals $\hat{u}_i$, since they are the ones we observe, and not the disturbances $u_i$. One hopes that they are good estimates of $u_i$, a hope that may be fulfilled if the sample size is fairly large.
Nature of the Problem Very often the nature of the problem under consideration suggests whether heteroscedasticity is likely to be encountered. For example, following the pioneering work of Prais and Houthakker on family budget studies, where they found that the residual variance around the regression of consumption on income increased with income, one now generally assumes that in similar surveys one can expect unequal variances among the disturbances.9 As a matter of fact, in cross-sectional data involving heterogeneous units, heteroscedasticity may be the rule rather than the exception. Thus, in a cross-sectional analysis involving the investment expenditure in relation to sales, rate of interest, etc., heteroscedasticity is generally expected if small-, medium-, and large-size firms are sampled together.
As a matter of fact, we have already come across examples of this. In Chapter 2 we discussed the relationship between mean, or average, hourly wages in relation to years of schooling in the United States. In that chapter we also discussed the relationship between expenditure on food and total expenditure for 55 families in India (see exercise 11.16).
Graphical Method If there is no a priori or empirical information about the nature of heteroscedasticity, in practice one can do the regression analysis on the assumption that there is no heteroscedasticity and then do a postmortem examination of the squared residuals $\hat{u}_i^2$ to see if they exhibit any systematic pattern. Although the $\hat{u}_i^2$ are not the same thing as the $u_i^2$, they can be
9S. J. Prais and H. S. Houthakker, The Analysis of Family Budgets, Cambridge University Press, New York, 1955.
402 PART TWO: RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL
used as proxies especially if the sample size is sufficiently large.10 An examination of the $\hat{u}_i^2$ may reveal patterns such as those shown in Figure 11.8.
In Figure 11.8, $\hat{u}_i^2$ are plotted against $\hat{Y}_i$, the estimated $Y_i$ from the regression line, the idea being to find out whether the estimated mean value of Y is systematically related to the squared residual. In Figure 11.8a we see that there is no systematic pattern between the two variables, suggesting that perhaps no heteroscedasticity is present in the data. Figure 11.8b to e, however, exhibit definite patterns. For instance, Figure 11.8c suggests a linear relationship, whereas Figure 11.8d and e indicate a quadratic relationship between $\hat{u}_i^2$ and $\hat{Y}_i$. Using such knowledge, albeit informal, one may transform the data in such a manner that the transformed data do not exhibit heteroscedasticity. In Section 11.6 we shall examine several such transformations.
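The quantities plotted in such a figure can be computed in a few lines. The following is only a minimal sketch in Python with NumPy, not the procedure behind Figure 11.8: the simulated data, the function names, and the choice of four bins are all illustrative assumptions. It fits a two-variable OLS regression, forms the squared residuals, and averages them within groups ordered by fitted value, which gives a rough numerical counterpart to looking at the scatter.

```python
import numpy as np

def squared_residuals_vs_fitted(x, y):
    """Regress y on a constant and x by OLS; return fitted values and squared residuals."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    return fitted, resid ** 2

def binned_means(fitted, sq_resid, n_bins=4):
    """Average the squared residuals within n_bins groups ordered by fitted value."""
    order = np.argsort(fitted)
    groups = np.array_split(sq_resid[order], n_bins)
    return np.array([g.mean() for g in groups])

# Hypothetical data (an assumption for illustration only):
# the disturbance standard deviation grows in proportion to x.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 400)
u = rng.normal(0.0, 0.5 * x)
y = 2.0 + 3.0 * x + u

fitted, sq = squared_residuals_vs_fitted(x, y)
means = binned_means(fitted, sq)
print(means)  # group averages of squared residuals, lowest fitted values first
```

If the group averages rise or fall steadily across the bins, the scatter of squared residuals against fitted values would show a systematic pattern; roughly equal averages correspond to the patternless case. The arrays `fitted` and `sq` can be passed directly to any plotting library to draw the scatter itself.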
Instead of plotting $\hat{u}_i^2$ against $\hat{Y}_i$, one may plot them against one of the explanatory variables, especially if plotting $\hat{u}_i^2$ against $\hat{Y}_i$ results in the pattern shown in Figure 11.8a. Such a plot, which is shown in Figure 11.9, may reveal patterns similar to those given in Figure 11.8. (In the case of the two-variable model, plotting $\hat{u}_i^2$ against $\hat{Y}_i$ is equivalent to plotting it against
10For the relationship between $\hat{u}_i$ and $u_i$, see E. Malinvaud, Statistical Methods of Econometrics, North Holland Publishing Company, Amsterdam, 1970, pp. 88-89.
$X_i$, and therefore Figure 11.9 is similar to Figure 11.8. But this is not the situation when we consider a model involving two or more X variables; in this instance, $\hat{u}_i^2$ may be plotted against any X variable included in the model.)
A pattern such as that shown in Figure 11.9c, for instance, suggests that the variance of the disturbance term is linearly related to the X variable. Thus, if in the regression of savings on income one finds a pattern such as that shown in Figure 11.9c, it suggests that the heteroscedastic variance may be proportional to the value of the income variable. This knowledge may help us in transforming our data in such a manner that in the regression on the transformed data the variance of the disturbance is homoscedastic. We shall return to this topic in the next section.
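To see why such knowledge can help, consider a brief worked illustration. It assumes, purely for the sake of the example, a two-variable model with positive $X_i$ in which the disturbance variance is exactly proportional to $X_i$; the symbols $\beta_1$, $\beta_2$, and $\sigma^2$ are introduced here only for this sketch.

$$Y_i = \beta_1 + \beta_2 X_i + u_i, \qquad \operatorname{var}(u_i) = \sigma^2 X_i$$

Dividing through by $\sqrt{X_i}$ gives

$$\frac{Y_i}{\sqrt{X_i}} = \beta_1 \frac{1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + \frac{u_i}{\sqrt{X_i}}, \qquad \operatorname{var}\!\left(\frac{u_i}{\sqrt{X_i}}\right) = \frac{\sigma^2 X_i}{X_i} = \sigma^2$$

so that, under this assumption, the disturbance of the transformed regression has a constant variance.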
Park Test11 Park formalizes the graphical method by suggesting that $\sigma_i^2$ is some function of the explanatory variable $X_i$. The functional form he suggested was

$$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{v_i}$$

or

$$\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i \qquad (11.5.1)$$

where $v_i$ is the stochastic disturbance term.
11R. E. Park, "Estimation with Heteroscedastic Error Terms," Econometrica, vol. 34, no. 4, October 1966, p. 888. The Park test is a special case of the general test proposed by A. C. Harvey in "Estimating Regression Models with Multiplicative Heteroscedasticity," Econometrica, vol. 44, no. 3, 1976, pp. 461-465.
Since $\sigma_i^2$ is generally not known, Park suggests using $\hat{u}_i^2$ as a proxy and running the following regression:

$$\ln \hat{u}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i \qquad (11.5.2)$$
If $\beta$ turns out to be statistically significant, it would suggest that heteroscedasticity is present in the data. If it turns out to be insignificant, we may accept the assumption of homoscedasticity. The Park test is thus a two-stage procedure. In the first stage we run the OLS regression disregarding the heteroscedasticity question. We obtain $\hat{u}_i$ from this regression, and then in the second stage we run the regression (11.5.2).
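The two stages just described can be sketched as follows. This is a minimal illustration in Python with NumPy rather than a definitive implementation: the helper names `ols` and `park_test`, the simulated data, and the informal comparison of the t ratio with 2 are all assumptions made for the example. The explanatory variable must be positive for its logarithm to exist.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, their conventional standard errors, and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)          # estimate of the error variance
    cov = s2 * np.linalg.inv(X.T @ X)     # conventional OLS covariance matrix
    return beta, np.sqrt(np.diag(cov)), resid

def park_test(x, y):
    """Two-stage Park procedure for a two-variable model with positive x."""
    # Stage 1: regress y on a constant and x, ignoring heteroscedasticity.
    X = np.column_stack([np.ones_like(x), x])
    _, _, resid = ols(X, y)
    # Stage 2: regress ln(squared residual) on a constant and ln(x).
    Z = np.column_stack([np.ones_like(x), np.log(x)])
    coef, se, _ = ols(Z, np.log(resid ** 2))
    return coef[1], coef[1] / se[1]       # slope estimate and its t ratio

# Hypothetical data (an assumption for illustration only):
# the disturbance variance is proportional to x squared.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 400)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

beta_hat, t_ratio = park_test(x, y)
print(beta_hat, t_ratio)
```

A t ratio that is large in absolute value points toward heteroscedasticity of the assumed form, while a small one is consistent with homoscedasticity. Because the second-stage standard errors are the conventional OLS ones, they are subject to the same reservations that apply to the second-stage disturbance term itself.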
Although empirically appealing, the Park test has some problems. Goldfeld and Quandt have argued that the error term $v_i$ entering into (11.5.2) may not satisfy the OLS assumptions and may itself be heteroscedastic.12 Nonetheless, as a strictly exploratory method, one may use the Park test.