## Info

| Statistic | Value |
|---|---|
| R-squared | 0.276707 |
| S.E. of regression | 4.395772 |
| Sum squared resid | 10086.51 |
| Log likelihood | -1527.962 |
| Durbin-Watson stat | 1.858629 |
| Mean dependent var | 9.047538 |
| S.D. dependent var | 5.144082 |
| Akaike info criterion | 5.810462 |
| Schwarz criterion | 5.858974 |
| F-statistic | 39.93978 |
| Prob(F-statistic) | 0.000000 |


straightforward. For example, the value of -0.8408 for the region dummy suggests that, holding all the other variables constant, workers in the South earn on average about 84 cents less per hour than their counterparts elsewhere, perhaps because of the low cost of living in the South and/or the fact that the South is less unionized. Similarly, women earn on average about \$2.13 per hour less than their male counterparts, holding all other factors constant. Whether this amounts to gender discrimination cannot be told from the statistical analysis alone.

As one would expect, the "short" regression (omitting the Hispanic, marital status, and race variables) has a lower adjusted $R^2$ than the "long" regression (i.e., the regression that includes all the variables). But notice the Akaike and Schwarz statistics: both are lower for the short regression than for the long regression, showing how they penalize the introduction of more regressors into the model. Since the values of both statistics are so close, one can choose either of them. The Durbin-Watson d value in both models is sufficiently close to 2 that it does not suggest any "autocorrelation" or specification errors.
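To make the penalty concrete, the following minimal sketch recomputes the Akaike and Schwarz criteria and the F-statistic from the summary statistics printed at the top of this section. It assumes the per-observation definitions $\text{AIC} = (-2\ell + 2k)/n$ and $\text{SC} = (-2\ell + k \ln n)/n$, where $\ell$ is the log likelihood. The sample size $n$ and the number of estimated parameters $k$ are not stated in the printout; here they are inferred from the reported figures, so treat them as assumptions rather than facts taken from the text.

```python
import math

# Values copied from the regression summary printed in this section.
log_likelihood = -1527.962
r_squared = 0.276707
sum_squared_resid = 10086.51
se_regression = 4.395772
sd_dependent = 5.144082

# ASSUMPTION: n and k are not printed, so they are inferred.
# S.E. of regression squared = SSR / (n - k), which gives n - k.
resid_dof = round(sum_squared_resid / se_regression ** 2)
# Total sum of squares = SSR / (1 - R^2) = (n - 1) * (S.D. dependent var)^2.
total_ss = sum_squared_resid / (1.0 - r_squared)
n = round(total_ss / sd_dependent ** 2) + 1
k = n - resid_dof  # number of estimated coefficients, intercept included

# Per-observation information criteria (assumed definitions, see lead-in).
aic = (-2.0 * log_likelihood + 2.0 * k) / n
sc = (-2.0 * log_likelihood + k * math.log(n)) / n

# Overall F-statistic and adjusted R^2 from R^2, n, and k.
f_stat = (r_squared / (k - 1)) / ((1.0 - r_squared) / resid_dof)
adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / resid_dof

print(f"n = {n}, k = {k}")
print(f"AIC = {aic:.6f}, SC = {sc:.6f}")
print(f"F = {f_stat:.4f}, adjusted R^2 = {adj_r_squared:.6f}")
```

Under these assumptions the recomputed criteria agree with the printed values up to rounding. Because $\ln n > 2$ for any sample of more than about eight observations, the Schwarz criterion charges more per additional regressor than the Akaike criterion does.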

Since the data underlying regression (13.11.1) are given on the data disk, you may want to "experiment" with the data. It is quite possible that there is some interaction between the gender and education dummies or between the gender and marital status dummies. It is also possible that the relationship between hourly wage and labor market experience is nonlinear, necessitating the introduction of the squared experience term in the regression model. As you can see, even with a given data set, there are several possibilities. This might sound like data mining, but we have already noted that data mining may have some role to play in econometric modeling. Of course, you should keep in mind the true level of significance in carrying out data mining.
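As a reminder of what the "true level of significance" can look like, here is a small sketch. It assumes the passage refers to Lovell's rule of thumb: if $k$ regressors are finally selected out of $c$ candidates, the true level is $\alpha^* = 1 - (1 - \alpha)^{c/k}$, or roughly $(c/k)\alpha$, where $\alpha$ is the nominal level. The function name and the numbers in the usage lines are illustrative only and are not taken from the wage regression.

```python
def true_significance(nominal_alpha: float, candidates: int, selected: int) -> float:
    """Lovell's rule of thumb: 1 - (1 - alpha)^(c/k).

    nominal_alpha -- the nominal level of significance (e.g. 0.05)
    candidates    -- c, the number of candidate regressors searched over
    selected      -- k, the number of regressors kept in the final model
    """
    if not 0.0 < nominal_alpha < 1.0:
        raise ValueError("nominal_alpha must lie strictly between 0 and 1")
    if selected < 1 or candidates < selected:
        raise ValueError("need 1 <= selected <= candidates")
    return 1.0 - (1.0 - nominal_alpha) ** (candidates / selected)


# Hypothetical illustration: 5 regressors kept out of 15 candidates at a nominal 5%.
exact = true_significance(0.05, candidates=15, selected=5)
approx = (15 / 5) * 0.05  # the simpler (c/k) * alpha approximation
print(f"exact: {exact:.6f}, approximate: {approx:.2f}")
```

In this hypothetical case the true level is roughly three times the nominal 5 percent, which is why a coefficient that looks significant after an extensive specification search deserves extra caution.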

We have covered a lot of ground in this chapter. There is no question that model building is an art as well as a science. A practical researcher may be bewildered by theoretical niceties and an array of diagnostic tools. But it is well to keep in mind Martin Feldstein's caution that "The applied econometrician, like the theorist, soon discovers from experience that a useful model is not one that is 'true' or 'realistic' but one that is parsimonious, plausible and informative."48

Peter Kennedy of Simon Fraser University in Canada advocates the following "Ten Commandments of Applied Econometrics"49:

1. Thou shalt use common sense and economic theory.

2. Thou shalt ask the right questions (i.e., put relevance before mathematical elegance).