
5. Clearly, if $\varepsilon(i, t)$ is assumed to be logit distributed, or generated by any other distribution symmetric around zero, equation (3.11) applies with $\Phi$ as the relevant cumulative distribution function. The modification of (3.11) for asymmetric random variables is straightforward.

3.5 A Random Effect Bernoulli Model and One-Factor Schemes

Unobserved temporally correlated error components are now introduced into the analysis. Such components are often termed heterogeneity in the applied literature on stochastic processes. Initially it is assumed that all individuals have the same values of the time-invariant exogenous variables, so that $V(i, t) = V$. It is also assumed initially that $\varepsilon(i, t)$ has a components-of-variance structure:

$$\varepsilon(i, t) = \tau(i) + U(i, t),$$

where $U(i, t)$ is iid with mean zero and variance $\sigma_U^2$, and $\tau(i)$ is distributed independently of the $U(i, t)$.

Individual $i$ has a fixed component $\tau(i)$. Given $\tau(i)$, the probability that person $i$ experiences an event at time $t$ ($d(i, t) = 1$) is

$$\operatorname{Prob}[\varepsilon(i, t) > -V \mid \tau(i)] = \operatorname{Prob}[U(i, t) > -(\tau(i) + V)].$$

The mean probability in the population is

$$P = \operatorname{Prob}[\varepsilon(i, t) > -V] = \int \operatorname{Prob}[U(i, t) > -(\tau + V)]\, f(\tau)\, d\tau,$$

where $f(\tau)$ is the frequency distribution of $\tau$, and where $\operatorname{Prob}[U(i, t) > -(\tau(i) + V)]$ is shorthand for the probability that $U(i, t)$ exceeds $-(\tau(i) + V)$ given $\tau(i)$ and $V$, a notation used in the rest of this chapter. The mean probability in the population, $P$, and hence $V/(\sigma_U^2 + \sigma_\tau^2)^{1/2}$, can be estimated from a single cross section by ordinary probit analysis. At least two years of panel data must be obtained to estimate the correlation coefficient between $\varepsilon(i, t)$ and $\varepsilon(i, t')$, $t \neq t'$. This is known as the intraclass correlation coefficient, $\rho = \sigma_\tau^2/(\sigma_\tau^2 + \sigma_U^2)$, where $\sigma_\tau^2$ is the variance of $\tau(i)$. Using probit analysis, the expected number of periods in the state, $PT$, can be estimated, but at least two periods of panel data are required to estimate the population variance, $\left(\int P(\tau)(1 - P(\tau)) f(\tau)\, d\tau\right) T$, with $P(\tau) = \operatorname{Prob}[U(i, t) > -(\tau + V)]$, unless $f(\tau)$ is degenerate ($\sigma_\tau^2 = 0$).
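The relation between the mixture integral for $P$ and the single cross-section probit index can be checked numerically. The following is a minimal sketch, assuming that both $U(i, t)$ and $\tau(i)$ are normal (the probit case); the parameter values, the function names, and the trapezoid rule are illustrative choices, not taken from the text.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def mean_probability(V, sigma_u, sigma_tau, n=2001, width=8.0):
    """Trapezoid-rule value of the integral of Prob[U > -(tau + V)] f(tau) dtau,
    assuming U ~ N(0, sigma_u^2) and tau ~ N(0, sigma_tau^2)."""
    lo = -width * sigma_tau
    h = 2.0 * width * sigma_tau / (n - 1)
    total = 0.0
    for k in range(n):
        tau = lo + k * h
        w = 0.5 if k in (0, n - 1) else 1.0          # trapezoid end weights
        p_tau = norm_cdf((tau + V) / sigma_u)         # Prob[U > -(tau + V)]
        dens = math.exp(-0.5 * (tau / sigma_tau) ** 2) / (sigma_tau * math.sqrt(2.0 * math.pi))
        total += w * p_tau * dens
    return total * h

def intraclass_correlation(sigma_u, sigma_tau):
    """rho = sigma_tau^2 / (sigma_tau^2 + sigma_u^2)."""
    return sigma_tau ** 2 / (sigma_tau ** 2 + sigma_u ** 2)

# Illustrative (assumed) parameter values.
V, sigma_u, sigma_tau = 0.5, 1.0, 0.8
P_integral = mean_probability(V, sigma_u, sigma_tau)
P_probit = norm_cdf(V / math.sqrt(sigma_u ** 2 + sigma_tau ** 2))
rho = intraclass_correlation(sigma_u, sigma_tau)
print(P_integral, P_probit, rho)
```

Under these assumptions the two values of $P$ agree to numerical precision, which is why a cross section identifies only the ratio $V/(\sigma_U^2 + \sigma_\tau^2)^{1/2}$ and carries no information about $\rho$.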

Maximum likelihood estimators of $P$ based on a single cross section are consistent as the cross-section sample size $I$ becomes large.

Unlike the situation in the preceding model, maximum likelihood estimators based on a long time series on one person and those based on a large cross section at a point in time estimate different parameters. As both samples become large, the first estimates $(V + \tau(i))/\sigma_U$ while the second estimates $V/(\sigma_U^2 + \sigma_\tau^2)^{1/2}$. The first sample is conditioned on a specific value of $\tau(i)$, so that $\tau(i)$ is a fixed effect indistinguishable from $V$. The second sample is not conditioned on a specific value of $\tau(i)$.

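As a consequence of Jensen's inequality, the average duration in the state cannot be estimated from cross-section data, because expected continuous duration in the state satisfies the following inequality:

$$E_\tau\!\left[\frac{1}{1 - P(\tau)}\right] \geq \frac{1}{1 - E_\tau[P(\tau)]} = \frac{1}{1 - P},$$

where $E_\tau$ denotes expectation with respect to the density of $\tau$, $f(\tau)$, and $1/(1 - P(\tau))$ is the expected duration for a person with event probability $P(\tau)$. The inequality is strict unless $f(\tau)$ is degenerate. Estimates of the average duration based on an estimated cross-section probability (an estimate of $P$) therefore understate the average length of duration in the state.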
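A small numerical illustration of the Jensen gap may help. The sketch below assumes a hypothetical population with three values of $\tau$, a standard normal $U(i, t)$, and $V = 0$. It also assumes that a spell in the state lasts a geometrically distributed number of periods with mean $1/(1 - P(\tau))$. None of these numbers come from the text.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# Hypothetical three-type population (assumed values, for illustration only).
V, sigma_u = 0.0, 1.0
taus = [-1.0, 0.0, 1.0]
weights = [0.25, 0.5, 0.25]       # frequency distribution f(tau)

def event_probability(tau):
    """P(tau) = Prob[U > -(tau + V)] with U ~ N(0, sigma_u^2)."""
    return norm_cdf((tau + V) / sigma_u)

# Population mean probability P, which a cross section can estimate.
P = sum(w * event_probability(t) for t, w in zip(taus, weights))

# Duration implied by plugging P into the duration formula.
duration_from_cross_section = 1.0 / (1.0 - P)

# Population average of each person's expected duration.
mean_duration = sum(w / (1.0 - event_probability(t)) for t, w in zip(taus, weights))

print(P, duration_from_cross_section, mean_duration)
```

With these illustrative numbers the cross-section figure is 2 periods, while the population average of individual expected durations is roughly 2.87 periods. The gap grows as the dispersion of $\tau$ increases.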

Panel data can be used to estimate a separate $P(\tau(i))$ for each person by the method of maximum likelihood. This estimate is consistent as $T$ becomes large. The estimated probabilities can be used to generate consistent estimators of the average duration in a state for each person: insert the estimated $P(\tau(i))$ into the mathematical formula for average duration.6

The probability of $J$ successes ($\sum_{t=1}^{T} d(i, t) = J$) and $T - J$ failures is the same for any sequence with $J$ successes in any order. To see this, note that conditional on $\tau(i)$ the model in this section is the same as that in the preceding section. Removing the conditioning by integrating out $\tau(i)$ gives the probability of $J$ successes and $T - J$ failures in a particular sequence as

$$\int [P(\tau)]^{J} [1 - P(\tau)]^{T - J} f(\tau)\, d\tau,$$

where $P(\tau) = \operatorname{Prob}[U(i, t) > -(\tau + V)]$.

As in the case without heterogeneity, each of the $\binom{T}{J}$ sequences with $J$ successes has the same probability.
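The exchangeability property can be verified numerically. The following sketch assumes normal $U(i, t)$ and normal $\tau(i)$ and integrates out $\tau$ with a simple trapezoid rule; the function name and parameter values are hypothetical. The sequence probability is built period by period, so the order of events enters the computation explicitly.

```python
import math
from itertools import product

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def sequence_probability(d, V, sigma_u, sigma_tau, n=2001, width=8.0):
    """Probability of the ordered 0/1 sequence d after integrating out
    tau ~ N(0, sigma_tau^2), with U ~ N(0, sigma_u^2) (assumed normality)."""
    lo = -width * sigma_tau
    h = 2.0 * width * sigma_tau / (n - 1)
    total = 0.0
    for k in range(n):
        tau = lo + k * h
        w = 0.5 if k in (0, n - 1) else 1.0
        p = norm_cdf((tau + V) / sigma_u)              # P(tau)
        term = math.exp(-0.5 * (tau / sigma_tau) ** 2) / (sigma_tau * math.sqrt(2.0 * math.pi))
        for d_t in d:                                   # conditional independence given tau
            term *= p if d_t == 1 else (1.0 - p)
        total += w * term
    return total * h

# Illustrative (assumed) parameter values.
V, sigma_u, sigma_tau = 0.2, 1.0, 0.7

# Three orderings of two successes in three periods.
orderings = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
probs = [sequence_probability(d, V, sigma_u, sigma_tau) for d in orderings]

# The probabilities of all 2^T sequences should sum to one.
total_mass = sum(sequence_probability(d, V, sigma_u, sigma_tau)
                 for d in product((0, 1), repeat=3))
print(probs, total_mass)
```

The three orderings return the same value up to rounding, and the eight sequence probabilities sum to one.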

It is possible to account for measured differences in personal characteristics in exactly the same way as in the model presented in the preceding section. If $V(i, t)$ is assumed to be a linear function of known exogenous variables, one may write

$$V(i, t) = Z(i, t)\beta.$$

6. This example illustrates the point that panel data can be used to relax the ergodicity assumption maintained in much work in stationary time-series analysis.

Under the identification conditions specified in section 3.3, $\beta$ is estimable. This model has been estimated by Heckman and Willis (1975). Using maximum likelihood, they estimate $\beta$ and $\rho$ under the normalizing assumption that $\sigma_\tau^2 + \sigma_U^2 = 1$. This final assumption may be relaxed. Exactly as in the model of the preceding section, it is possible to permit the disturbance variances to differ among time periods and to estimate the ratios of disturbance variances in different periods. Thus a nonstationary version of the model can be estimated. If the $Z(i, t)$ are permitted to vary arbitrarily and the disturbance variances are permitted to assume a free structure, the exchangeability property of the random effects model disappears.

Defining the probability of a given sequence of events given $Z(i)$ for the random effect model is straightforward. It is convenient to work with the standardized value of $\tau$, $\tilde{\tau} = \tau/\sigma_\tau$, which has mean zero and variance one. Define $\tilde{\beta}$ as $\beta/\sigma_U$. In this notation the probability of sequence $d(i)$ given $Z(i)$ is

$$\int_{-\infty}^{\infty} \prod_{t=1}^{T} \Phi\!\left\{\left[Z(i, t)\tilde{\beta} + \left(\frac{\rho}{1 - \rho}\right)^{1/2} \tilde{\tau}\right]\left[2d(i, t) - 1\right]\right\} f(\tilde{\tau})\, d\tilde{\tau},$$

where $f(\tilde{\tau})$ is the density of the standard normal distribution and $\rho < 1$.7 Subject to the given identification conditions, maximum likelihood estimators of $\tilde{\beta}$ and $\rho$ are consistent and efficient. The likelihood formed from the product of these probabilities is relatively easy to compute, since it involves only one numerical integration per observation of products of cumulative normal error functions, which are available on most computers.
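The single numerical integration per observation can be sketched as follows. This is a minimal illustration, not the procedure used by Heckman and Willis: it assumes a standard normal density for the standardized random effect, takes the products $Z(i, t)\tilde{\beta}$ as already computed inputs, and uses a plain trapezoid rule where Gauss-Hermite quadrature would be a common alternative. The function names `sequence_likelihood` and `log_likelihood` are hypothetical.

```python
import math
from itertools import product

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def sequence_likelihood(d, index, rho, n=2001, width=8.0):
    """Probability of the 0/1 sequence d for one person.

    index[t] stands for Z(i,t) * beta_tilde (assumed precomputed);
    rho is the intraclass correlation, 0 <= rho < 1.
    """
    scale = math.sqrt(rho / (1.0 - rho))                # sigma_tau / sigma_U
    h = 2.0 * width / (n - 1)
    total = 0.0
    for k in range(n):
        x = -width + k * h                              # standardized random effect
        w = 0.5 if k in (0, n - 1) else 1.0
        term = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        for d_t, z_t in zip(d, index):
            # (2d - 1) flips the sign of the argument when d(i,t) = 0
            term *= norm_cdf((z_t + scale * x) * (2 * d_t - 1))
        total += w * term
    return total * h

def log_likelihood(data, rho):
    """Sum of log sequence probabilities over (d, index) pairs."""
    return sum(math.log(sequence_likelihood(d, idx, rho)) for d, idx in data)

# Illustrative (assumed) index values for one person over three periods.
index = [0.3, -0.2, 1.0]
no_heterogeneity = sequence_likelihood((1, 0, 1), index, rho=0.0)
total_mass = sum(sequence_likelihood(d, index, rho=0.4)
                 for d in product((0, 1), repeat=3))
print(no_heterogeneity, total_mass)
```

With $\rho = 0$ the integral collapses to the product of the period-by-period probit probabilities, and for any admissible $\rho$ the probabilities of all $2^T$ sequences sum to one. Both properties are useful checks before passing `log_likelihood` to an optimizer.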

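7. The probability that $d(i, t) = 1$ given $Z(i, t)$ and $\tau(i)$ is

$$\operatorname{Prob}[U(i, t) > -Z(i, t)\beta - \tau(i)] = \operatorname{Prob}\!\left[\frac{U(i, t)}{\sigma_U} > -\frac{Z(i, t)\beta}{\sigma_U} - \frac{\tau(i)}{\sigma_U}\right].$$

Since $\tilde{\beta} = \beta/\sigma_U$ and $\tilde{\tau} = \tau(i)/\sigma_\tau$, and since $\sigma_\tau/\sigma_U = (\rho/(1 - \rho))^{1/2}$, this probability is

$$\operatorname{Prob}\!\left[\frac{U(i, t)}{\sigma_U} > -Z(i, t)\tilde{\beta} - \left(\frac{\rho}{1 - \rho}\right)^{1/2} \tilde{\tau}\right]$$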