
Estimation of the parameters r is a complicated and computationally intensive exercise, much more so than estimation of typical static models. The search for the parameter vector that maximizes the likelihood function involves solving the dynamic decision problem (2.1) each time new parameter values are evaluated in the search; this is apparent from the presence of the value function v in the limits of integration in (2.2). Solving the decision problem (2.1) requires dynamic programming, and so a DP algorithm must be run each time new parameter values are evaluated in the search for the maximum likelihood value. Maximum likelihood estimation thus involves nesting an "inner" dynamic programming algorithm within an "outer" hill-climbing algorithm.

As a general matter, the inclusion of the unobserved state variables seriously impairs the tractability of the dynamic programming problem (2.1), as the expectation of the value function must be taken over the random components of the observed state vector $x$ and the S-dimensional vector $\varepsilon$. For all but the smallest problems this is not computationally feasible, given that (2.1) must be solved many times during the course of the search for r*, the likelihood-maximizing value of r. Making matters even more difficult is the multi-dimensional integration in (2.1) associated with each observation. The reader familiar with the literature on random utility models will recognize that the properties of the Gumbel distribution can be used to resolve the difficulty of the integration in (2.2). In a seminal paper Rust (1989) shows that these same properties of the Gumbel distribution can be used to resolve the difficulty of the integration implicit in the expectation in (2.1), as follows.

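We assume that $\varepsilon$ is iid Gumbel-distributed with location parameters $\theta^s$ ($s = 0, \ldots, S$) and common scale parameter $\eta_\varepsilon$. Moreover, for expositional reasons we define

$$V(x_t, s_t; r) = E_{x, \varepsilon \mid x_t, s_t}\left[ v(x_{t+1}, \varepsilon_{t+1}, y) \right], \tag{2.4}$$

in which case the decision problem (2.1) can be restated,

$$v(x_t, \varepsilon_t, y) = \max_{s} \left[ R^s(x_t, y) + \varepsilon_t^s + \beta V(x_t, s; r) \right]. \tag{2.5}$$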

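Then from the standard properties of the Gumbel distribution (see Ben-Akiva and Lerman (1985)), integrating both sides of (2.5) with respect to $\varepsilon$ on day $t + 1$ yields,

$$E_{\varepsilon}\left[ v(x_{t+1}, \varepsilon_{t+1}, y) \right] = \frac{1}{\eta_\varepsilon}\left[ \ln\left( \sum_{s=0}^{S} e^{\eta_\varepsilon \left( R^s(x_{t+1}, y) + \theta^s + \beta V(x_{t+1}, s; r) \right)} \right) + \gamma \right], \tag{2.6}$$

where $\gamma$ is Euler's constant ($\approx 0.577$). Substitution of (2.6) into (2.4) at time $t + 1$ yields,

$$V(x_t, s_t; r) = E_{x \mid x_t, s_t}\left[ \frac{1}{\eta_\varepsilon}\left( \ln\left( \sum_{s=0}^{S} e^{\eta_\varepsilon \left( R^s(x_{t+1}, y) + \theta^s + \beta V(x_{t+1}, s; r) \right)} \right) + \gamma \right) \right]. \tag{2.7}$$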
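The Gumbel property behind this step is that the expected maximum of choice values plus independent Gumbel errors (locations $\theta^s$, common scale $\eta_\varepsilon$) equals $\frac{1}{\eta_\varepsilon}\left[\ln \sum_s e^{\eta_\varepsilon(u^s + \theta^s)} + \gamma\right]$, where $u^s$ is the deterministic part of the value of choice $s$. The following is a minimal sketch that compares this closed form with a Monte Carlo average. The function names, the example values, and the inverse-CDF sampler are illustrative assumptions, not part of the original text.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def expected_max_gumbel(u, theta, eta):
    """Closed form for E[max_s (u[s] + eps[s])], where eps[s] are independent
    Gumbel draws with locations theta[s] and common scale eta."""
    z = [eta * (u_s + th_s) for u_s, th_s in zip(u, theta)]
    z_max = max(z)  # shift by the maximum so exp() cannot overflow
    log_sum = z_max + math.log(sum(math.exp(z_s - z_max) for z_s in z))
    return (log_sum + EULER_GAMMA) / eta


def draw_gumbel(rng, location, eta):
    """Inverse-CDF draw from F(e) = exp(-exp(-eta * (e - location)))."""
    uniform = rng.random()
    while uniform == 0.0:  # guard against log(0)
        uniform = rng.random()
    return location - math.log(-math.log(uniform)) / eta


def simulated_max(u, theta, eta, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the same expectation (illustrative check only)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += max(u_s + draw_gumbel(rng, th_s, eta)
                     for u_s, th_s in zip(u, theta))
    return total / n_draws
```

For instance, `expected_max_gumbel([1.0, 0.5, -0.2], [0.0, 0.3, 0.1], 2.0)` and `simulated_max` called with the same arguments should agree up to simulation error, which is what allows the integral over $\varepsilon$ to be replaced by a log-sum expression.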

If $v(\cdot)$ is known, the determination of $V(\cdot)$, which is necessary to solve the decision problem (2.5), is now a relatively simple affair involving integration over the random elements in the observable state vector $x_{t+1}$. In practice, though, $v(\cdot)$ is not known; but $V(\cdot)$ can be determined via successive approximation (backwards recursion) due to the contraction mapping properties of Bellman's form. Initially $V(\cdot)$ is set identically equal to zero, and the value function $v(\cdot)$ is approximated from (2.5) using standard techniques for approximating a function, such as linear or Chebyshev polynomial interpolation. Then the expectation of $v(\cdot)$ with respect to $\varepsilon$ is calculated from (2.6), and an approximation of $V(\cdot)$ is obtained from (2.7) using standard techniques of function interpolation and numerical integration. The approximation of $V(\cdot)$ is then used in (2.5) to obtain a new estimate of $v(\cdot)$, and so on. The iteration terminates when a convergence criterion is met, such as sufficiently small changes across iterations in the optimal decision rule, $s(x, \varepsilon, y; r)$.
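The successive approximation just described can be sketched for a simplified setting. The sketch below assumes a finite set of observable states with known transition probabilities, so that the expectation over $x_{t+1}$ is a weighted sum and no interpolation is needed. It also stops on the largest change in $V$ rather than on changes in the decision rule. All names (`reward`, `trans`, `solve_expected_value`) are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def bellman_update(V, reward, trans, theta, eta, beta):
    """One application of the mapping on the right-hand side of (2.7).

    V[x][s] and reward[x][s] are indexed by state x and choice s;
    trans[s][x][x2] is Pr(next state x2 | state x, choice s).
    """
    n_x, n_s = len(reward), len(theta)
    # Expected maximum over next-period choices at each next state, as in (2.6).
    emax = [
        (log_sum_exp([eta * (reward[x2][s2] + theta[s2] + beta * V[x2][s2])
                      for s2 in range(n_s)]) + EULER_GAMMA) / eta
        for x2 in range(n_x)
    ]
    # Expectation over the next observable state, conditional on (x, s).
    return [[sum(trans[s][x][x2] * emax[x2] for x2 in range(n_x))
             for s in range(n_s)]
            for x in range(n_x)]


def solve_expected_value(reward, trans, theta, eta, beta,
                         tol=1e-10, max_iter=100_000):
    """Iterate from V = 0 until the largest change in V falls below tol."""
    V = [[0.0] * len(theta) for _ in reward]
    for _ in range(max_iter):
        V_new = bellman_update(V, reward, trans, theta, eta, beta)
        gap = max(abs(a - b)
                  for row_new, row_old in zip(V_new, V)
                  for a, b in zip(row_new, row_old))
        V = V_new
        if gap < tol:
            return V
    raise RuntimeError("successive approximation did not converge")
```

Because the mapping is a contraction with modulus $\beta$, the iteration converges from any starting value; starting at zero simply mirrors the description above.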

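It deserves emphasis that the algorithm described above identifies the optimal decision rule and, more to the point, the associated expected value function $V(\cdot\,; r)$, conditional on a particular set of parameters r. With the expected value function in hand, the assumption that $\varepsilon$ is iid Gumbel-distributed allows a restatement of (2.2) in the analytical form,

$$\Pr(s_{jt} = 0 \mid x_{jt}, y_j; r) = \Pr\left( v^0 + \varepsilon^0 > v^1 + \varepsilon^1; \ldots; v^0 + \varepsilon^0 > v^S + \varepsilon^S \right) = \frac{e^{\eta_\varepsilon \left( R^0(x_{jt}, y_j) + \theta^0 + \beta V(x_{jt}, 0; r) \right)}}{\sum_{s=0}^{S} e^{\eta_\varepsilon \left( R^s(x_{jt}, y_j) + \theta^s + \beta V(x_{jt}, s; r) \right)}}. \tag{2.8}$$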

The upshot is that although it remains the case that a DP algorithm must be nested within the estimation algorithm used to find r*, assuming the unobserved state variables are iid Gumbel-distributed greatly simplifies the algorithm.
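Once $V$ has been computed for a trial parameter vector, the choice probabilities take the multinomial logit form and the log-likelihood is a sum of their logarithms. The following is a minimal sketch of that step. It assumes discrete state indices and a precomputed table `V[x][s]`, and the function and argument names are hypothetical.

```python
import math


def choice_probabilities(reward_x, V_x, theta, eta, beta):
    """Logit probabilities over choices at one state.

    reward_x[s], V_x[s] and theta[s] hold the current return, the expected
    value function and the Gumbel location parameter for choice s.
    """
    z = [eta * (r_s + th_s + beta * v_s)
         for r_s, th_s, v_s in zip(reward_x, theta, V_x)]
    z_max = max(z)  # subtract the maximum for numerical stability
    weights = [math.exp(z_s - z_max) for z_s in z]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(observations, reward, V, theta, eta, beta):
    """Sum of log choice probabilities over observed (state, choice) pairs."""
    return sum(
        math.log(choice_probabilities(reward[x], V[x], theta, eta, beta)[s])
        for x, s in observations
    )
```

In a nested estimation routine the outer search would call `log_likelihood` once per trial parameter vector, after the inner dynamic programming step has refreshed `V`.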

Examination of (2.8) also shows that a well-documented weakness of static multinomial logit models, the property of independence of irrelevant alternatives (IIA), does not hold in a dynamic model employing Gumbel-distributed random variables. The IIA property derives from the fact that, from the perspective of the analyst, the odds that one alternative is chosen over another in a static model depend only on the attributes of the two alternatives. So, for instance, in the example presented by Bockstael, the IIA property implies the unlikely result that the odds of visiting a saltwater beach instead of a freshwater lake do not depend on whether a third beach is itself a saltwater or freshwater site. However, the log odds ratio for the dynamic model can be stated,

$$\ln\left( \frac{\Pr(s_t = i \mid x_t, y; r)}{\Pr(s_t = k \mid x_t, y; r)} \right) = \eta_\varepsilon \left( R^i(x_t, y) + \theta^i + \beta V(x_t, i; r) \right) - \eta_\varepsilon \left( R^k(x_t, y) + \theta^k + \beta V(x_t, k; r) \right).$$

And so, by virtue of the presence of $x_t$ and $y$ in $V(\cdot)$, the odds of choosing alternative $i$ over alternative $k$ depend on the attributes (state of nature) of all the alternatives.

3. An Illustration: The Brazee-Mendelsohn Timber Harvest Problem

In this section we attempt to clarify the discussion above by examining a modification of the Brazee-Mendelsohn (BM) model of optimal harvesting when timber prices are stochastic (Brazee and Mendelsohn 1988). This is a simple but illuminating example. In the original BM model the forest owner faces, in each period, the binary decision to either harvest a timber stand or to postpone harvest. This decision depends on two state variables: the age of the timber stand $a$, and the price of timber $p$. Timber volume at stand age $a$ is given by a growth function $W(a)$ with two growth parameters.

Timber prices are independent and identically normally distributed with mean price $\mu_p$ and standard deviation $\sigma_p$; we denote the probability density function by $g(p; \mu_p, \sigma_p)$. The forest owner solves the problem,

$$v(a_t, p_t) = \max\left[ \beta E_p v(a_t + 1, p_{t+1}),\; p_t W(a_t) - c + \beta E_p v(1, p_{t+1}) \right], \tag{3.2}$$

where $c$ is the cost to harvest and replant. The problem is easily solved by ordinary dynamic programming techniques. The optimal harvest policy $s(p, a)$ is a reservation price policy in which, for a given stand age, the forest owner harvests if and only if the observed timber price is above a reservation price. Graphically this is represented by the harvest isocline shown in Figure 15.1; harvest occurs for all combinations of price and stand age above the isocline.

Figure 15.1. Brazee-Mendelsohn Optimal Harvest Policy

When applying this model to actual data the analyst must account for the possibility that observed harvest decisions deviate from the normative decision rule. This is accomplished by introducing the decision-specific state variables $\varepsilon = (\varepsilon^0, \varepsilon^1)$. This paradox, that the normative model does not fully describe the decision problem faced by "real world" dynamic optimizers and so is, as a practical matter, not normative at all, is almost invariably the case for dynamic problems of resource allocation, because normative models distill a rich decision environment to a world fully described by just a few state variables. The addition of the unobserved state variables can be viewed as a somewhat crude attempt to account for the richness of the real world.
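Before turning to the modified problem, the reservation-price policy of the original problem (3.2) can be sketched numerically. The sketch below iterates on the expected value $E_p v(a, p)$ and uses the closed-form expectation of a maximum under normally distributed prices. The growth curve `volume`, its parameter values, and the cap on stand age (volume is held fixed beyond `max_age`) are assumptions introduced only for illustration.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def volume(age, phi1=3.0, phi2=20.0):
    """Hypothetical growth curve; functional form and values are assumptions."""
    return math.exp(phi1 - phi2 / age)


def solve_bm(beta, mu_p, sigma_p, cost, max_age, volume_fn=volume,
             tol=1e-10, max_iter=100_000):
    """Iterate on ev[a] = E_p v(a, p) for (3.2); return ev and reservation prices."""
    ev = [0.0] * (max_age + 1)  # index 0 is unused; ages run 1..max_age
    for _ in range(max_iter):
        new_ev = [0.0] * (max_age + 1)
        for age in range(1, max_age + 1):
            wait = beta * ev[min(age + 1, max_age)]  # value of postponing
            replant = beta * ev[1] - cost            # harvest value net of revenue
            w = volume_fn(age)
            # Standardized reservation price at this stand age.
            z = ((wait - replant) / w - mu_p) / sigma_p
            # E max[wait, p * w + replant] for normally distributed p.
            new_ev[age] = (wait * normal_cdf(z)
                           + (mu_p * w + replant) * (1.0 - normal_cdf(z))
                           + w * sigma_p * normal_pdf(z))
        gap = max(abs(a - b) for a, b in zip(new_ev, ev))
        ev = new_ev
        if gap < tol:
            break
    # Price at which harvesting and postponing are equally valuable.
    reservation = {
        age: (beta * ev[min(age + 1, max_age)] - beta * ev[1] + cost) / volume_fn(age)
        for age in range(1, max_age + 1)
    }
    return ev, reservation
```

A call such as `solve_bm(beta=0.95, mu_p=1.0, sigma_p=0.3, cost=0.5, max_age=60)` returns the reservation price for each stand age; plotting these against age would trace out a harvest isocline of the kind described above, under the assumed growth curve.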

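The modified decision problem is,

$$\begin{aligned} v(a_t, p_t, \varepsilon_t) &= \max\left[ \varepsilon_t^0 + \beta E_{p,\varepsilon} v(a_t + 1, p_{t+1}, \varepsilon_{t+1}),\; p_t W(a_t) - c + \varepsilon_t^1 + \beta E_{p,\varepsilon} v(1, p_{t+1}, \varepsilon_{t+1}) \right] \\ &= \max\left[ \varepsilon_t^0 + \beta V(a_t, 0; r),\; p_t W(a_t) - c + \varepsilon_t^1 + \beta V(a_t, 1; r) \right], \end{aligned} \tag{3.3}$$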

where the control variable takes a value of 1 if the stand is harvested and 0 otherwise. Note that (3.3) is a special case of (2.1) in which only one of the observed state variables is stochastic ($p$), and the distribution of that state variable is not conditional on its current value. In this framework, one possible interpretation of $\varepsilon$ is that it is the utility received from the standing forest (though in this simple case the utility received is not conditional on stand age, which is generally an unrealistic specification).

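Assuming that $\varepsilon$ is iid Gumbel-distributed with choice-specific location parameters $\theta^i$, $i = 0, 1$, and common scale parameter $\eta_\varepsilon$, the expected value function can be stated (using a modified version of (2.7)):

$$V(a_t, s; r) = \int \frac{1}{\eta_\varepsilon}\left[ \ln M^s + \gamma \right] g(p_{t+1}; \mu_p, \sigma_p)\, dp_{t+1}, \quad s = 0, 1, \tag{3.4}$$

where

$$M^0 = e^{\eta_\varepsilon \left( \theta^0 + \beta V(a_t + 1, 0; r) \right)} + e^{\eta_\varepsilon \left( p_{t+1} W(a_t + 1) - c + \theta^1 + \beta V(a_t + 1, 1; r) \right)}$$

and:

$$M^1 = e^{\eta_\varepsilon \left( \theta^0 + \beta V(1, 0; r) \right)} + e^{\eta_\varepsilon \left( p_{t+1} W(1) - c + \theta^1 + \beta V(1, 1; r) \right)}.$$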

Given parameters $r = \{\beta, \mu_p, \sigma_p, \eta_\varepsilon, \theta^0, \theta^1\}$, $V(\cdot)$ can be approximated by a simple iterative recursion. In the first iteration, $V(\cdot)$ in $M^0$ and $M^1$ is set to an arbitrary value, and an update of $V(\cdot)$ is found by solving (3.4) for each stand age $a$ (this requires numerical approximation of the integral taken over timber prices). The update is then used on the right-hand side of (3.4) in the second iteration, and so on, until a convergence criterion is met.
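The recursion on (3.4) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the price integral is approximated by a trapezoid rule over $\mu_p \pm 6\sigma_p$, stand age is capped at `max_age`, the growth function `volume_fn` is supplied by the user, and iteration stops on the largest change in $V$. Because the stand age resets to 1 after harvest, $V(a, 1; r)$ is the same for every $a$ and is stored as a single number. The last function evaluates a binary logit harvest probability of the kind that follows from the Gumbel assumption once $V$ is in hand.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_sum_exp2(x, y):
    """Stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def price_nodes(mu_p, sigma_p, n_nodes=61, width=6.0):
    """Trapezoid nodes and normalized weights for the normal price density."""
    step = 2.0 * width * sigma_p / (n_nodes - 1)
    nodes = [mu_p - width * sigma_p + i * step for i in range(n_nodes)]
    dens = [math.exp(-0.5 * ((p - mu_p) / sigma_p) ** 2) for p in nodes]
    dens[0] *= 0.5
    dens[-1] *= 0.5
    total = sum(dens)
    return nodes, [d / total for d in dens]


def solve_v(beta, mu_p, sigma_p, eta, theta0, theta1, cost, max_age, volume_fn,
            tol=1e-10, max_iter=100_000):
    """Return (v_wait, v_cut): v_wait[a] = V(a, 0; r), v_cut = V(a, 1; r)."""
    nodes, weights = price_nodes(mu_p, sigma_p)

    def integral(next_age, v_wait, v_cut):
        # Right-hand side of (3.4) when next period's stand age is next_age.
        u0 = eta * (theta0 + beta * v_wait[next_age])
        w = volume_fn(next_age)
        return sum(
            wt * (log_sum_exp2(u0, eta * (p * w - cost + theta1 + beta * v_cut))
                  + EULER_GAMMA) / eta
            for p, wt in zip(nodes, weights)
        )

    v_wait = [0.0] * (max_age + 1)  # index 0 is unused
    v_cut = 0.0
    for _ in range(max_iter):
        new_wait = [0.0] + [integral(min(a + 1, max_age), v_wait, v_cut)
                            for a in range(1, max_age + 1)]
        new_cut = integral(1, v_wait, v_cut)
        gap = max(abs(new_cut - v_cut),
                  max(abs(x - y) for x, y in zip(new_wait, v_wait)))
        v_wait, v_cut = new_wait, new_cut
        if gap < tol:
            break
    return v_wait, v_cut


def harvest_probability(age, price, v_wait, v_cut, beta, eta,
                        theta0, theta1, cost, volume_fn):
    """Binary logit probability of harvest at the given stand age and price."""
    u0 = eta * (theta0 + beta * v_wait[age])
    u1 = eta * (price * volume_fn(age) - cost + theta1 + beta * v_cut)
    return math.exp(u1 - log_sum_exp2(u0, u1))
```

Each call to `solve_v` corresponds to one pass of the inner algorithm for a single trial value of the parameters; an outer search over the parameters would repeat it many times, which is why the closed-form treatment of $\varepsilon$ matters for tractability.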

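For the analyst with observations on the state variables $p$ and $a$, as well as the actual harvest decision $s_{jt} \in \{0, 1\}$, the probability of the observed harvest decision by forest owner $j$ at time $t$ is:

$$\Pr(s_{jt} \mid a_{jt}, p_t; r) = \frac{(1 - s_{jt})\, e^{\eta_\varepsilon \left( \theta^0 + \beta V(a_{jt}, 0; r) \right)} + s_{jt}\, e^{\eta_\varepsilon \left( p_t W(a_{jt}) - c + \theta^1 + \beta V(a_{jt}, 1; r) \right)}}{e^{\eta_\varepsilon \left( \theta^0 + \beta V(a_{jt}, 0; r) \right)} + e^{\eta_\varepsilon \left( p_t W(a_{jt}) - c + \theta^1 + \beta V(a_{jt}, 1; r) \right)}}, \tag{3.5}$$

and the likelihood function takes the form,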