## Info

10The interested reader may refer to Kmenta, op. cit., pp. 625-630 for an accessible discussion.

650 PART THREE: TOPICS IN ECONOMETRICS

16.5 FIXED EFFECTS (LSDV) VERSUS RANDOM EFFECTS MODEL

The challenge facing a researcher is: Which model is better, FEM or ECM? The answer to this question hinges on the assumption one makes about the likely correlation between the individual, or cross-section specific, error component $\varepsilon_i$ and the X regressors.

If it is assumed that $\varepsilon_i$ and the X's are uncorrelated, ECM may be appropriate, whereas if $\varepsilon_i$ and the X's are correlated, FEM may be appropriate.

Why would one expect correlation between the individual error component $\varepsilon_i$ and one or more regressors? Consider an example. Suppose we have a random sample of a large number of individuals and we want to model their wage, or earnings, function. Suppose earnings are a function of education, work experience, etc. Now if we let $\varepsilon_i$ stand for innate ability, family background, etc., then when we model the earnings function including $\varepsilon_i$, it is very likely to be correlated with education, for innate ability and family background are often crucial determinants of education. As Wooldridge contends, "In many applications, the whole reason for using panel data is to allow the unobserved effect [i.e., $\varepsilon_i$] to be correlated with the explanatory variables."11

The assumption underlying ECM is that the $\varepsilon_i$ are a random drawing from a much larger population. But sometimes this may not be so. For example, suppose we want to study the crime rate across the 50 states in the United States. Obviously, in this case, the assumption that the 50 states are a random sample is not tenable.

Keeping this fundamental difference between the two approaches in mind, what more can we say about the choice between FEM and ECM? Here the observations made by Judge et al. may be helpful:12

1. If T (the number of time series observations) is large and N (the number of cross-sectional units) is small, there is likely to be little difference in the values of the parameters estimated by FEM and ECM. Hence the choice here is based on computational convenience. On this score, FEM may be preferable.

2. When N is large and T is small, the estimates obtained by the two methods can differ significantly. Recall that in ECM $\beta_{1i} = \beta_1 + \varepsilon_i$, where $\varepsilon_i$ is the cross-sectional random component, whereas in FEM we treat $\beta_{1i}$ as fixed and not random. In the latter case, statistical inference is conditional on the observed cross-sectional units in the sample. This is appropriate if we strongly believe that the individual, or cross-sectional, units in our sample are not random drawings from a larger population. In that case, FEM is appropriate. However, if the cross-sectional units in the sample are regarded as random drawings, then ECM is appropriate, for in that case statistical inference is unconditional.

3. If the individual error component $\varepsilon_i$ and one or more regressors are correlated, then the ECM estimators are biased, whereas those obtained from FEM are unbiased.

11Wooldridge, op. cit., p. 450.

12Judge et al., op. cit., pp. 489-491.

CHAPTER SIXTEEN: PANEL DATA REGRESSION MODELS 651

4. If N is large and T is small, and if the assumptions underlying ECM hold, ECM estimators are more efficient than FEM estimators.13
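Points 3 and 4 can be illustrated with a small simulation. The following is a minimal sketch, not part of the original discussion: it assumes a single regressor, a balanced panel, and variance components that are known and equal to 1 (in practice they must be estimated). All function names and parameter values are hypothetical. The ECM slope is obtained by the standard quasi-demeaning transformation with weight $\theta = 1 - \sqrt{\sigma_u^2 / (\sigma_u^2 + T\sigma_\varepsilon^2)}$, and the FEM slope is the special case $\theta = 1$ (full demeaning).

```python
import math
import random


def simulate_panel(n_units, n_periods, beta, corr, seed):
    """Simulate y_it = beta * x_it + eps_i + u_it with unit-variance eps_i and u_it.

    x_it = corr * eps_i + noise, so corr != 0 makes the regressor
    correlated with the individual effect eps_i.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eps_i = rng.gauss(0.0, 1.0)  # unobserved individual effect (e.g., ability)
        rows = []
        for _ in range(n_periods):
            x = corr * eps_i + rng.gauss(0.0, 1.0)
            y = beta * x + eps_i + rng.gauss(0.0, 1.0)
            rows.append((x, y))
        panel.append(rows)
    return panel


def ols_slope(xs, ys):
    """OLS slope of y on x in a regression that includes an intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def quasi_demeaned_slope(panel, theta):
    """Subtract theta times each unit's mean, then run OLS.

    theta = 1 gives the within (FEM) estimator; 0 < theta < 1 gives the
    ECM (GLS) estimator for a balanced panel.
    """
    xs, ys = [], []
    for rows in panel:
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            xs.append(x - theta * mean_x)
            ys.append(y - theta * mean_y)
    return ols_slope(xs, ys)


def compare_estimators(corr, n_units=2000, n_periods=3, beta=1.0, seed=42):
    """Return (FEM slope, ECM slope) for one simulated large-N, small-T panel."""
    panel = simulate_panel(n_units, n_periods, beta, corr, seed)
    # Assumption: both variance components are known and equal to 1.
    theta = 1.0 - math.sqrt(1.0 / (1.0 + n_periods * 1.0))
    fem = quasi_demeaned_slope(panel, 1.0)
    ecm = quasi_demeaned_slope(panel, theta)
    return fem, ecm


if __name__ == "__main__":
    for corr in (0.0, 0.8):
        fem, ecm = compare_estimators(corr)
        print(f"corr={corr}: FEM slope={fem:.3f}, ECM slope={ecm:.3f} (true slope 1.0)")
```

With `corr=0.0` both slopes should lie close to the true value, whereas with `corr=0.8` the ECM slope should drift away from it while the FEM slope does not. The efficiency claim in point 4 would require repeating the simulation over many seeds and comparing the sampling variances, which this sketch does not do.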

Is there a formal test that will help us to choose between FEM and ECM? Yes, a test was developed by Hausman in 1978.14 We will not discuss the details of this test, for they are beyond the scope of this book.15 The null hypothesis underlying the Hausman test is that the FEM and ECM estimators do not differ substantially. The test statistic developed by Hausman has an asymptotic $\chi^2$ distribution. If the null hypothesis is rejected, the conclusion is that ECM is not appropriate and that we may be better off using FEM, in which case statistical inferences will be conditional on the $\varepsilon_i$ in the sample.
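Although the details are left to the references, the mechanics of the test in the simplest case can be sketched briefly. The following assumes the usual textbook form of the statistic, the quadratic form $(b_{FEM} - b_{ECM})'[\mathrm{Var}(b_{FEM}) - \mathrm{Var}(b_{ECM})]^{-1}(b_{FEM} - b_{ECM})$, specialized to a single slope coefficient so that it is $\chi^2$ with 1 degree of freedom under the null. The function name and the numbers in the usage lines are hypothetical and are not taken from any data set.

```python
import math


def hausman_scalar(b_fem, var_fem, b_ecm, var_ecm):
    """Hausman statistic and p-value for a single slope coefficient.

    b_fem, b_ecm: slope estimates from FEM and ECM.
    var_fem, var_ecm: their estimated sampling variances.
    """
    diff_var = var_fem - var_ecm
    if diff_var <= 0:
        # Can happen in finite samples; the statistic is then undefined.
        raise ValueError("var_fem must exceed var_ecm for the statistic to be defined")
    h = (b_fem - b_ecm) ** 2 / diff_var
    # Upper-tail probability of a chi-square variable with 1 df:
    # P(chi2_1 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


if __name__ == "__main__":
    # Hypothetical estimates, for illustration only.
    h, p = hausman_scalar(b_fem=1.00, var_fem=0.002, b_ecm=1.20, var_ecm=0.001)
    print(f"Hausman statistic = {h:.2f}, p-value = {p:.4g}")
```

A small p-value leads to rejection of the null and hence favors FEM. With several regressors the scalar variances become covariance matrices and the degrees of freedom equal the number of coefficients compared.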

Despite the Hausman test, it is important to keep in mind the warning sounded by Johnston and DiNardo. In deciding between fixed effects and random effects models, they argue that "... there is no simple rule to help the researcher navigate past the Scylla of fixed effects and the Charybdis of measurement error and dynamic selection. Although they are an improvement over cross-section data, panel data do not provide a cure-all for all of an econometrician's problems."16

16.6 PANEL DATA REGRESSIONS: SOME CONCLUDING COMMENTS

As noted at the outset, the topic of panel data modeling is vast and complex. We have barely scratched the surface. Among the topics that we have not discussed, the following may be mentioned.

1. Hypothesis testing with panel data.

2. Heteroscedasticity and autocorrelation in ECM.

3. Unbalanced panel data.

4. Dynamic panel data models in which the lagged value(s) of the regressand ($Y_{it}$) appear as an explanatory variable.

5. Simultaneous equations involving panel data.

6. Qualitative dependent variables and panel data.

One or more of these topics can be found in the references cited in this chapter, and the reader is urged to consult them to learn more about this topic. These references also cite several empirical studies in various areas of business and economics that have used panel data regression models. The beginner is well advised to read some of these applications to get a feel for how researchers have actually implemented such models.

13Taylor has shown that for T > 3 and (N − K) > 9, where K is the number of regressors, the statement holds. See W. E. Taylor, "Small Sample Considerations in Estimation from Panel Data," Journal of Econometrics, vol. 13, 1980, pp. 203-223.

14J. A. Hausman, "Specification Tests in Econometrics," Econometrica, vol. 46, 1978, pp. 1251-1271.

15For the details, see Baltagi, op. cit., pp. 68-73.

16Jack Johnston and John DiNardo, Econometric Methods, 4th ed., McGraw-Hill, 1997, p. 403.
