## The Logit Model

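We will continue with our home ownership example to explain the basic ideas underlying the logit model. Recall that in explaining home ownership in relation to income, the LPM was

$$P_i = E(Y = 1 \mid X_i) = \beta_1 + \beta_2 X_i \tag{15.5.1}$$

where $X$ is income and $Y = 1$ means the family owns a house. But now consider the following representation of home ownership:

$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}} \tag{15.5.2}$$

For ease of exposition, we write (15.5.2) as

$$P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}} \tag{15.5.3}$$

where $Z_i = \beta_1 + \beta_2 X_i$.

Equation (15.5.3) represents what is known as the (cumulative) logistic distribution function.¹⁵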
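To get a feel for the shape of (15.5.3), the following is a minimal Python sketch that evaluates the logistic function at a few values of $Z$. The function name `logistic` and the sample values are illustrative choices, not part of the text.

```python
import math


def logistic(z: float) -> float:
    """Cumulative logistic distribution function, P = 1 / (1 + e^(-z))."""
    # Branch on the sign of z so math.exp never receives a large positive
    # argument, which would overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative values of Z: the probabilities stay between 0 and 1
# and trace out an S-shaped curve that passes through 0.5 at Z = 0.
for z in (-10, -2, 0, 2, 10):
    print(f"Z = {z:>3}  P = {logistic(z):.5f}")
```

The two branches are the two algebraically equivalent forms given in (15.5.3); either one alone would do for moderate values of $Z$.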

It is easy to verify that as $Z_i$ ranges from $-\infty$ to $+\infty$, $P_i$ ranges between 0 and 1 and that $P_i$ is nonlinearly related to $Z_i$ (i.e., $X_i$), thus satisfying the two requirements considered earlier.¹⁶ But it seems that in satisfying these requirements, we have created an estimation problem because $P_i$ is nonlinear not only in $X$ but also in the $\beta$'s, as can be seen clearly from (15.5.2). This means that we cannot use the familiar OLS procedure to estimate the parameters.¹⁷ But this problem is more apparent than real because (15.5.2) can be linearized, which can be shown as follows.

¹⁵ The logistic model has been used extensively in analyzing growth phenomena, such as population, GNP, money supply, etc. For theoretical and practical details of logit and probit models, see J. S. Kramer, The Logit Model for Economists, Edward Arnold Publishers, London, 1991; and G. S. Maddala, op. cit.

¹⁶ Note that as $Z_i \to +\infty$, $e^{-Z_i}$ tends to zero, and as $Z_i \to -\infty$, $e^{-Z_i}$ increases indefinitely. Recall that $e \approx 2.71828$.

¹⁷ Of course, one could use nonlinear estimation techniques discussed in Chap. 14. See also Sec. 15.8.


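If $P_i$, the probability of owning a house, is given by (15.5.3), then $(1 - P_i)$, the probability of not owning a house, is

$$1 - P_i = \frac{1}{1 + e^{Z_i}} \tag{15.5.4}$$

Therefore, we can write

$$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i} \tag{15.5.5}$$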

Now $P_i/(1 - P_i)$ is simply the odds ratio in favor of owning a house: the ratio of the probability that a family will own a house to the probability that it will not own a house. Thus, if $P_i = 0.8$, the odds are $0.8/0.2 = 4$ to 1 in favor of the family owning a house.
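The conversion from a probability to odds can be sketched in a few lines of Python. The helper name `odds` and the sample probabilities are assumptions made for illustration only.

```python
def odds(p: float) -> float:
    """Odds in favor of the event, P / (1 - P), for a probability 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


# Illustrative probabilities: below, at, and above even odds.
for p in (0.2, 0.5, 0.8):
    print(f"P = {p:.1f}  odds = {odds(p):.2f}")
```

At $P = 0.5$ the odds are even (1 to 1), and at $P = 0.8$ they come out at about 4 to 1, matching the example in the text.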

Now if we take the natural log of (15.5.5), we obtain a very interesting result, namely,

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_1 + \beta_2 X_i \tag{15.5.6}$$

that is, $L$, the log of the odds ratio, is not only linear in $X$, but also (from the estimation viewpoint) linear in the parameters.¹⁸ $L$ is called the logit, and hence the name logit model for models like (15.5.6).
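The linearization can be checked numerically. The sketch below assumes hypothetical coefficient values (`BETA1 = -3.0`, `BETA2 = 0.1`) and hypothetical income levels; none of these come from the text. It computes $P$ from (15.5.3), takes the log of the odds, and confirms that the result equals $\beta_1 + \beta_2 X$.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
BETA1, BETA2 = -3.0, 0.1


def prob(x: float) -> float:
    """P from (15.5.3) with Z = BETA1 + BETA2 * x."""
    z = BETA1 + BETA2 * x
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log of the odds ratio, ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))


incomes = [10, 20, 30, 40, 50]  # hypothetical income levels
probs = [prob(x) for x in incomes]
logits = [logit(p) for p in probs]

for x, p, l in zip(incomes, probs, logits):
    print(f"X = {x:>2}  P = {p:.4f}  L = {l:+.4f}")

# Equal steps in X give equal steps in L but unequal steps in P.
logit_steps = [b - a for a, b in zip(logits, logits[1:])]
prob_steps = [b - a for a, b in zip(probs, probs[1:])]
print("steps in L:", [round(s, 4) for s in logit_steps])
print("steps in P:", [round(s, 4) for s in prob_steps])
```

The constant step in $L$ is the linearity in $X$ described by (15.5.6); the varying step in $P$ is the nonlinearity of the probabilities themselves.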

### Notice these features of the logit model.

1. As $P$ goes from 0 to 1 (i.e., as $Z$ varies from $-\infty$ to $+\infty$), the logit $L$ goes from $-\infty$ to $+\infty$. That is, although the probabilities (of necessity) lie between 0 and 1, the logits are not so bounded.

2. Although $L$ is linear in $X$, the probabilities themselves are not. This property is in contrast with the LPM (15.5.1), where the probabilities increase linearly with $X$.¹⁹

3. Although we have included only a single X variable, or regressor, in the preceding model, one can add as many regressors as may be dictated by the underlying theory.

4. If the slope coefficient $\beta_2$ is positive, then when the value of the regressor increases, the logit $L$ increases, and so do the odds that the regressand equals 1 (meaning some event of interest happens). If $\beta_2$ is negative, the odds that the regressand equals 1 decrease as the value of $X$ increases. The sign of $L$ itself reflects the level of the odds: the logit becomes negative and increasingly large in magnitude as the odds ratio decreases from 1 to 0, and becomes increasingly large and positive as the odds ratio increases from 1 to infinity.²⁰

¹⁸ Recall that the linearity assumption of OLS does not require that the $X$ variable be necessarily linear. So we can have $X^2$, $X^3$, etc., as regressors in the model. For our purpose, it is linearity in the parameters that is crucial.

¹⁹ Using calculus, it can be shown that $dP/dX = \beta_2 P(1 - P)$, which shows that the rate of change in probability with respect to $X$ involves not only $\beta_2$ but also the level of probability from which the change is measured (but more on this in Sec. 15.7). In passing, note that the effect of a unit change in $X_i$ on $P$ is greatest when $P = 0.5$ and least when $P$ is close to 0 or 1.

5. More formally, the interpretation of the logit model given in (15.5.6) is as follows: $\beta_2$, the slope, measures the change in $L$ for a unit change in $X$, that is, it tells how the log-odds in favor of owning a house change as income changes by a unit, say, \$1000. The intercept $\beta_1$ is the value of the log-odds in favor of owning a house if income is zero. Like most interpretations of intercepts, this interpretation may not have any physical meaning.

6. Given a certain level of income, say, $X^*$, if we want to estimate not the odds in favor of owning a house but the probability of owning a house itself, this can be done directly from (15.5.3) once estimates of $\beta_1$ and $\beta_2$ are available. This, however, raises the most important question: How do we estimate $\beta_1$ and $\beta_2$ in the first place? The answer is given in the next section.

7. Whereas the LPM assumes that $P_i$ is linearly related to $X_i$, the logit model assumes that the log of the odds ratio is linearly related to $X_i$.
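Two of the features above lend themselves to a quick numerical illustration: computing the probability at a given income level $X^*$ from (15.5.3), and the result that $dP/dX = \beta_2 P(1 - P)$. The sketch below is only illustrative. The coefficient values `BETA1` and `BETA2` are hypothetical stand-ins for estimates (estimation is the subject of the next section), and the income levels are made up.

```python
import math

# Hypothetical stand-ins for estimated coefficients; not from the text.
BETA1, BETA2 = -3.0, 0.1


def prob(x: float) -> float:
    """Probability of owning a house at income x, from (15.5.3)."""
    z = BETA1 + BETA2 * x
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effect(x: float) -> float:
    """Analytic rate of change dP/dX = BETA2 * P * (1 - P)."""
    p = prob(x)
    return BETA2 * p * (1.0 - p)


def numeric_slope(x: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation to dP/dX."""
    return (prob(x + h) - prob(x - h)) / (2.0 * h)


# Hypothetical income levels X*; with these coefficients P = 0.5 at X* = 30.
for x_star in (10, 30, 50):
    print(
        f"X* = {x_star:>2}  P = {prob(x_star):.4f}  "
        f"dP/dX = {marginal_effect(x_star):.5f}  "
        f"numeric = {numeric_slope(x_star):.5f}"
    )
```

Under these assumed values the marginal effect is largest at the income level where $P = 0.5$ and smaller on either side, in line with the remark in the footnote on $dP/dX$.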