## Info

as estimator of $\sigma^2$.

We now describe the various steps in estimating the logit regression (15.6.1):

1. For each income level $X_i$, compute the estimated probability of owning a house as $\hat{P}_i = n_i / N_i$.

2. For each $X_i$, obtain the logit as (see footnote 24)

   $$\hat{L}_i = \ln\left[\frac{\hat{P}_i}{1 - \hat{P}_i}\right]$$

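3. To resolve the problem of heteroscedasticity, transform (15.6.1) as follows (see footnote 25):

   $$\sqrt{w_i}\,L_i = \beta_1\sqrt{w_i} + \beta_2\sqrt{w_i}\,X_i + \sqrt{w_i}\,u_i \qquad (15.6.6)$$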

23. As shown in elementary probability theory, $\hat{P}_i$, the proportion of successes (here, owning a house), follows the binomial distribution with mean equal to the true $P_i$ and variance equal to $P_i(1 - P_i)/N_i$; as $N_i$ increases indefinitely, the binomial distribution approximates the normal distribution. The distributional properties of $u_i$ given in (15.6.4) follow from this basic theory. For details, see Henry Theil, "On the Relationships Involving Qualitative Variables," American Journal of Sociology, vol. 76, July 1970, pp. 103-154.

24. Since $\hat{P}_i = n_i/N_i$, $L_i$ can alternatively be expressed as $L_i = \ln[n_i/(N_i - n_i)]$. In passing, note that to avoid $\hat{P}_i$ taking the value 0 or 1, in practice $L_i$ is measured as $L_i = \ln\left[(n_i + \tfrac{1}{2})/(N_i - n_i + \tfrac{1}{2})\right] = \ln\left[(\hat{P}_i + 1/(2N_i))/(1 - \hat{P}_i + 1/(2N_i))\right]$. As a rule of thumb, it is recommended that $N_i$ be at least 5 at each value of $X_i$. For additional details, see D. R. Cox, Analysis of Binary Data, Methuen, London, 1970, p. 33.

25. If we estimate (15.6.1) disregarding heteroscedasticity, the estimators, although unbiased, will not be efficient, as we know from Chap. 11.


which we write as

$$L_i^* = \beta_1\sqrt{w_i} + \beta_2 X_i^* + v_i \qquad (15.6.7)$$

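where the weights $w_i = N_i\hat{P}_i(1 - \hat{P}_i)$; $L_i^*$ = transformed or weighted $L_i$; $X_i^*$ = transformed or weighted $X_i$; and $v_i$ = transformed error term. It is easy to verify that the transformed error term $v_i$ is homoscedastic: since $v_i = \sqrt{w_i}\,u_i$, $\operatorname{var}(v_i) = w_i\sigma_u^2 = 1$ once $\hat{P}_i$ is used in place of the unknown $P_i$, keeping in mind that the original error variance is $\sigma_u^2 = 1/[N_i P_i(1 - P_i)]$.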

4. Estimate (15.6.6) by OLS; recall that WLS is OLS on the transformed data. Notice that in (15.6.6) there is no intercept term introduced explicitly (why?). Therefore, one will have to use the regression-through-the-origin routine to estimate (15.6.6).

5. Establish confidence intervals and/or test hypotheses in the usual OLS framework, but keep in mind that, strictly speaking, all the conclusions will be valid only if the sample is reasonably large (why?). Therefore, in small samples, the estimated results should be interpreted carefully.
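The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `glogit_wls`, the `adjust` flag, and the small data set are assumptions made for the example (the numbers are invented and are not the data of Table 15.4), and the standard errors are computed in the usual OLS way on the transformed data.

```python
import numpy as np

def glogit_wls(X, N, n, adjust=False):
    """Grouped logit by WLS (hypothetical helper following steps 1 to 5)."""
    X = np.asarray(X, dtype=float)
    N = np.asarray(N, dtype=float)
    n = np.asarray(n, dtype=float)

    # Step 1: estimated probability at each income level.
    P_hat = n / N

    # Step 2: estimated logit. With adjust=True, add 1/2 to both counts
    # (the adjustment described in footnote 24) so that groups with
    # n = 0 or n = N still give a finite logit.
    if adjust:
        L_hat = np.log((n + 0.5) / (N - n + 0.5))
    else:
        L_hat = np.log(P_hat / (1.0 - P_hat))

    # Step 3: weights w_i = N_i * P_hat_i * (1 - P_hat_i); multiply
    # every term of the regression by sqrt(w_i).
    sqrt_w = np.sqrt(N * P_hat * (1.0 - P_hat))
    L_star = sqrt_w * L_hat
    X_star = sqrt_w * X

    # Step 4: OLS through the origin on the regressors sqrt(w_i), X_star.
    Z = np.column_stack([sqrt_w, X_star])
    beta, _, _, _ = np.linalg.lstsq(Z, L_star, rcond=None)

    # Step 5: usual OLS standard errors on the transformed data.
    resid = L_star - Z @ beta
    s2 = resid @ resid / (len(X) - Z.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(Z.T @ Z)))
    return beta, se

# Invented illustrative data: income level, families, home owners.
X = [5, 10, 15, 20, 25, 30]
N = [50, 60, 80, 70, 60, 40]
n = [10, 18, 32, 35, 39, 30]

beta, se = glogit_wls(X, N, n)
print("beta1, beta2:", beta)
print("standard errors:", se)
```

The coefficient on $\sqrt{w_i}$ plays the role of $\beta_1$, which is why no separate column of ones is added to the regressor matrix.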

## 15.7 THE GROUPED LOGIT (GLOGIT) MODEL: A NUMERICAL EXAMPLE

To illustrate the theory just discussed, we will use the data given in Table 15.4. Since the data in the table are grouped, the logit model based on these data will be called a grouped logit model, or glogit for short. The raw data and other calculations needed to implement glogit are given in Table 15.5. The results of the weighted least-squares regression (15.6.7) based on the data in Table 15.5 are reported as (15.7.1). Note that there is no intercept in (15.6.7); hence the regression-through-origin procedure is appropriate here.

The $R^2$ is the squared correlation coefficient between actual and estimated $L_i^*$. $L_i^*$ and $X_i^*$ are the weighted $L_i$ and $X_i$, as shown in (15.6.6).
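As a minimal sketch of this definition, the $R^2$ can be computed as the squared sample correlation between the actual and fitted values of the weighted logit. The helper name `squared_corr_r2` and the short arrays are assumptions made for illustration only; they are not taken from Table 15.5.

```python
import numpy as np

def squared_corr_r2(actual, fitted):
    """R^2 as the squared correlation between actual and fitted values."""
    r = np.corrcoef(actual, fitted)[0, 1]  # sample correlation coefficient
    return r ** 2

# Invented values standing in for actual and estimated L*.
actual_L_star = [1.0, 2.0, 3.0, 4.0]
fitted_L_star = [1.1, 1.9, 3.2, 3.8]

r2 = squared_corr_r2(actual_L_star, fitted_L_star)
print("R^2 (squared correlation):", r2)
```

Because this measure is a squared correlation, it always lies between 0 and 1, whereas the conventional $R^2$ formula is not guaranteed to do so in a regression through the origin.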

### Interpretation of the Estimated Logit Model

How do we interpret (15.7.1)? There are various ways, some intuitive and some not:

Logit Interpretation. As (15.7.1) shows, the estimated slope coefficient suggests that for a unit (\$1000) increase in weighted income, the weighted log of the odds in favor of owning a house goes up by 0.08 units. This mechanical interpretation, however, is not very appealing.
