Chapter 25

Optimal Control Theory

In this chapter we take up the problem of optimization over time. Such problems are common in economics. For example, in the theory of investment, firms are assumed to choose the time path of investment expenditures to maximize the (discounted) sum of profits over time. In the theory of savings, individuals are assumed to choose the time path of consumption and saving that maximizes the (discounted) sum of lifetime utility. These are examples of dynamic optimization problems. In this chapter, we study a new technique, optimal control theory, which is used to solve dynamic optimization problems.

It is fundamental in economics to assume optimizing behavior by economic agents such as firms or consumers. Techniques for solving static optimization problems have already been covered in chapters 6, 12, and 13. Why do we need to learn a new mathematical theory (optimal control theory) for handling dynamic optimization problems? To demonstrate the need, we consider the following economic example.

Static versus Dynamic Optimization: An Investment Example

Suppose that a firm's output depends only on the amount of capital it employs. Let

$$Q = q(K)$$

where $Q$ is the firm's output level, $q$ is the production function, and $K$ is the amount of capital employed. Assume that there is a well-functioning rental market for the kind of capital the firm uses and that the firm is able to rent as much capital as it wants at the price $R$ per unit, which it takes as given. To make this example more concrete, imagine that the firm is a fishing company that rents fully equipped units of fishing capital on a daily basis. (A unit of fishing capital would include boat, nets, fuel, crew, etc.) $Q$ is the number of fish caught per day and $K$ is the number of units of fishing capital employed per day. If $p$ is the price of fish, then current profit depends on the amount of fish caught, which in turn depends on the amount of $K$ used and is given by the function $\pi(K)$:

$$\pi(K) = p\,q(K) - RK$$

If the firm's objective is to choose $K$ to maximize current profit, the optimal amount of $K$ is given implicitly by the usual first-order condition:

$$\pi'(K) = p\,q'(K) - R = 0$$

But why should the firm care only about current profit? Why would it not take a longer-term view and also care about future profits? A more realistic assumption is that the firm's objective is to maximize the discounted sum of profits over an interval of time running from the present time ($t = 0$) to a given time horizon, $T$. This is given by the functional $J[K(t)]$:

$$J[K(t)] = \int_0^T \pi[K(t)]\, e^{-\rho t}\, dt$$

where $\rho$ is the firm's discount rate and $e^{-\rho t}$ is the continuous-time discounting factor. $J[K(t)]$ is called a functional to distinguish it from a function. A function maps a single value for a variable like $K$ (or a finite number of values if $K$ is a vector of different types of capital) into a single number, like the amount of current profit. A functional maps a function like $K(t)$, or a finite number of functions if there is more than one type of capital, into a single number, like the discounted sum of profits.
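The distinction between a function and a functional can be made concrete with a small numerical sketch. The production function $q(K) = \sqrt{K}$ and all parameter values below are assumptions chosen only for illustration; they do not come from the text. The function `profit` takes a number, whereas `discounted_profit` takes an entire path $K(t)$ and returns one number by approximating the integral.

```python
import math

def profit(K, p=2.0, R=0.5):
    """A function: one capital level -> one profit number.

    q(K) = sqrt(K) is an assumed production function.
    """
    return p * math.sqrt(K) - R * K

def discounted_profit(K_path, T=10.0, rho=0.05, steps=10_000):
    """A functional: a whole path K(t) -> one number (midpoint rule)."""
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += profit(K_path(t)) * math.exp(-rho * t) * dt
    return total

# The function is evaluated at a number; the functional at a function of time.
current_profit = profit(4.0)
J_constant = discounted_profit(lambda t: 4.0)
J_growing = discounted_profit(lambda t: 4.0 + 0.5 * t)
```

With these assumed values, $K = 4$ happens to maximize current profit, so the constant path yields a larger discounted sum than the growing path.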

It appears we now have a dynamic optimization problem. The difference between this and the static optimization problem is that we now have to choose a path of $K$ values, or in other words we have to choose a function of time, $K(t)$, to maximize $J$, rather than having to choose a single value for $K$ to maximize $\pi(K)$. This is the main reason that we require a new mathematical theory. Calculus helps us find the value of $K$ that maximizes a function $\pi(K)$ because we can differentiate $\pi(K)$ with respect to $K$ to find the maximum of $\pi(K)$. However, calculus is not, in general, suited to helping us find the function of time $K(t)$ that maximizes the functional $J[K(t)]$ because we cannot differentiate a functional $J[K(t)]$ with respect to a function $K(t)$.

It turns out, however, that we do not have a truly dynamic optimization problem in this example. As a result, calculus works well in solving this particular problem. The reason is that the amount of $K$ rented in any period $t$ affects only profits in that period and not in any other period. Thus it is fairly obvious that the maximum of the discounted sum of profits occurs by maximizing profits at each point in time. As a result, this dynamic problem is really just a sequence of static optimization problems. The solution therefore is just a sequence of solutions to a sequence of static optimization problems. Indeed, this is the justification for spending as much time as we do in economics on static optimization problems.
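The claim that the rental problem collapses into a sequence of static problems can be checked with a rough discrete-time sketch. The production function $q(K) = \sqrt{K}$, the parameter values, and the grid of candidate capital levels are all assumptions made for illustration. Because the discount factor is a positive constant within each period, the capital level that maximizes discounted profit in a period is the same one that maximizes undiscounted profit.

```python
import math

P, R, RHO = 2.0, 0.5, 0.05   # hypothetical price, rental rate, discount rate

def profit(K):
    # q(K) = sqrt(K) is an assumed production function
    return P * math.sqrt(K) - R * K

def static_optimum():
    # p q'(K) = R with q(K) = sqrt(K) gives K* = (p / (2R))^2
    return (P / (2.0 * R)) ** 2

def best_K_by_period(periods, grid):
    """For each period separately, pick the grid value with the highest discounted profit."""
    choices = []
    for t in range(periods):
        discount = math.exp(-RHO * t)
        choices.append(max(grid, key=lambda K: discount * profit(K)))
    return choices

grid = [0.25 * i for i in range(1, 41)]   # candidate K values 0.25, 0.50, ..., 10.0
choices = best_K_by_period(periods=5, grid=grid)
```

Every entry of `choices` coincides with `static_optimum()`, which is the sense in which this problem is not truly dynamic.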

An optimization problem becomes truly dynamic only when the economic choices made in the current period affect not only current payoffs (profits) but also payoffs (profits) at a later date. The intuition is straightforward: if current output affects only current profit, then in choosing current output, we need only be concerned with its effect on current profit. Hence we choose current output to maximize current profit. But if current output affects current profit and profit at a later date, then in choosing current output, we need to be concerned about its effect on current and future profit. This is a dynamic problem.

To turn our fishing firm example into a truly dynamic optimization problem, let us drop the assumption that a rental market for fishing capital exists. Instead, we suppose that the firm must purchase its own capital. Once purchased, the capital lasts for a long time. Let $I(t)$ be the amount of capital purchased (investment) at time $t$, and assume that capital depreciates at the rate $\delta$. The amount (stock) of capital owned by the firm at time $t$ is $K(t)$ and changes according to the differential equation

$$\dot{K}(t) = I(t) - \delta K(t)$$

which says that, at each point in time, the firm's capital stock increases by the amount of investment and decreases by the amount of depreciation.

Let $c[I(t)]$ be a function that gives the cost of purchasing (investing) the amount $I(t)$ of capital at time $t$; then profit at time $t$ is

$$\pi[K(t), I(t)] = p\,q[K(t)] - c[I(t)]$$

The problem facing the fishing firm at each point in time is to decide how much capital to purchase. This is a truly dynamic problem because current investment affects current profit, since it is a current expense, and also affects future profits, since it affects the amount of capital available for future production. If the firm's objective is to maximize the discounted sum of profits from zero to $T$, it maximizes

$$J[I(t)] = \int_0^T \pi[K(t), I(t)]\, e^{-\rho t}\, dt$$

subject to the differential equation for the capital stock and the given initial capital stock, $K(0) = K_0$.

Once a path for $I(t)$ is chosen, the path of $K(t)$ is completely determined because the initial condition for the capital stock is given at $K_0$. Thus, the functional $J$ depends on the particular path chosen for $I(t)$.

There is an infinite number of paths, $I(t)$, from which to choose. A few examples of feasible paths are as follows:

(i) $I(t) = \delta K_0$. This is a constant amount of investment, just enough to cover depreciation so that the capital stock remains intact at its initial level.

(iii) $I(t) = Ae^{at}$. This is a path of investment that starts with $I(0) = A$ and then increases over time at the rate $a$ if $a > 0$, or decreases at the rate $|a|$ if $a < 0$.

These are just a few arbitrary paths that we mention for illustration. In fact, any function of $t$ is a feasible path. The problem is to choose the path that maximizes $J[I(t)]$. Since we know absolutely nothing about what this function of time might look like, choosing the right path would seem to be a formidable task.
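To see how a chosen investment path pins down the capital path, the accumulation equation $\dot{K} = I(t) - \delta K(t)$ can be integrated numerically. The sketch below uses a simple forward-Euler step; the values of $K_0$, $\delta$, $T$, $A$, and $a$ are hypothetical. It simulates path (i), replacement investment, and path (iii), exponential investment.

```python
import math

def simulate_capital(investment, K0, delta, T, steps=10_000):
    """Euler integration of dK/dt = I(t) - delta*K(t) from K(0) = K0; returns K(T)."""
    dt = T / steps
    K = K0
    for i in range(steps):
        t = i * dt
        K += (investment(t) - delta * K) * dt
    return K

K0, DELTA, T = 10.0, 0.1, 20.0          # hypothetical values

# Path (i): constant investment that exactly replaces depreciation.
replacement = lambda t: DELTA * K0
# Path (iii): I(t) = A * exp(a*t), with A and a chosen arbitrarily here.
A, a = 0.5, 0.05
exponential = lambda t: A * math.exp(a * t)

K_T_replacement = simulate_capital(replacement, K0, DELTA, T)
K_T_exponential = simulate_capital(exponential, K0, DELTA, T)
```

Under path (i) the capital stock stays at $K_0$, as the text says. Under path (iii) it follows whatever the exponential investment profile implies, which is why each candidate path yields a different value of $J$.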

It turns out that in the special case in which $T = \infty$ and the function $\pi[K(t), I(t)]$ takes the quadratic form

$$\pi[K(t), I(t)] = K - aK^2 - I^2$$

the solution to the above problem is

$$I(t) = (r_1 + \delta)(K_0 - \bar{K})\,e^{r_1 t} + \delta\bar{K}$$

where $r_1$ is the negative root of the characteristic equation of the differential equation system that, as we shall see, results from solving this dynamic optimization problem, and $\bar{K}$ is the steady-state level of the capital stock that the firm desires, and is given by

$$\bar{K} = \frac{1}{2[a + \delta(\rho + \delta)]}$$

Figure 25.1 displays the optimal path of investment for the case in which $K_0 < \bar{K}$. Along the optimal path, investment declines. In the limit as $t \to \infty$, investment converges to a constant amount equal to $\delta\bar{K}$ (since $r_1 < 0$) so that in the long run the firm's investment is just replacement of depreciation.

Figure 25.1 Optimal path of investment over time
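The properties claimed for the optimal path can be checked numerically. The sketch below rests on several assumptions that should be read as such: the quadratic profit function is taken to be $K - aK^2 - I^2$, the steady state is taken to be $\bar{K} = 1/(2[a + \delta(\rho + \delta)])$, the negative root is taken to be $r_1 = \rho/2 - \sqrt{\rho^2/4 + \delta(\rho + \delta) + a}$, and all parameter values are hypothetical. The infinite horizon is truncated at a large finite $T$, which is harmless here because of discounting.

```python
import math

A_COEF, DELTA, RHO, K0 = 0.5, 0.1, 0.05, 0.2   # hypothetical parameter values

# Steady state and negative characteristic root (assumed forms, see lead-in)
K_BAR = 1.0 / (2.0 * (A_COEF + DELTA * (RHO + DELTA)))
R1 = RHO / 2.0 - math.sqrt(RHO ** 2 / 4.0 + DELTA * (RHO + DELTA) + A_COEF)

def K_opt(t):
    # Capital path implied by I_opt and dK/dt = I - delta*K with K(0) = K0
    return K_BAR + (K0 - K_BAR) * math.exp(R1 * t)

def I_opt(t):
    return (R1 + DELTA) * (K0 - K_BAR) * math.exp(R1 * t) + DELTA * K_BAR

def discounted_profit(K, I, T=300.0, steps=300_000):
    """Midpoint-rule value of the integral of (K - a K^2 - I^2) e^{-rho t} on [0, T]."""
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        k, inv = K(t), I(t)
        total += (k - A_COEF * k * k - inv * inv) * math.exp(-RHO * t) * dt
    return total

J_opt = discounted_profit(K_opt, I_opt)
# Comparison path (i): replacement investment only, so K(t) stays at K0.
J_replacement = discounted_profit(lambda t: K0, lambda t: DELTA * K0)
```

With $K_0 < \bar{K}$, `I_opt` starts above $\delta\bar{K}$ and falls toward it, and `J_opt` exceeds the value obtained from simply replacing depreciation.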

How did we find this path? We found it using optimal control theory, which is the topic we turn to now.

25.1 The Maximum Principle

Optimal control theory relies heavily on the maximum principle, which amounts to a set of necessary conditions that hold only on optimal paths. Once you know how to apply these necessary conditions, then a knowledge of basic calculus and differential equations is all that is required to solve dynamic optimization problems like the one outlined above. In this section we provide a statement of the necessary conditions of the maximum principle and then provide a justification. In addition, we provide examples to illustrate the use of the maximum principle.

We begin with a definition of the general form of the dynamic optimization problem that we shall study in this section.

Definition 25.1

The general form of the dynamic optimization problem with a finite time horizon and a free endpoint in continuous-time models is

$$\max J = \int_0^T f[x(t), y(t), t]\, dt \tag{25.1}$$

subject to

$$\dot{x} = g[x(t), y(t), t], \qquad x(0) = x_0 \ \text{given}$$

The term free endpoint means that x(T) is unrestricted, and hence is free to be chosen optimally. The significance of this is explored in more detail below.

In this general formulation, $J$ is the value of the functional which is to be maximized, $x(t)$ is referred to as the state variable, and $y(t)$ is referred to as the control variable. As the name suggests, the control variable is the one directly chosen or controlled. Since the control and state variables are linked by a differential equation that is given, the state variable is indirectly influenced by the choice of the control variable.

In the fishing firm example posed above, the state variable is the amount of capital held by the firm; the control variable is investment. The example was a free-endpoint problem because there was no constraint placed on the final amount of the capital stock. As well, the integrand function, $f[x(t), y(t), t]$, was equal to $\pi[K(t), I(t)]e^{-\rho t}$, and the differential equation for the state variable, $g[x(t), y(t), t]$, was simply equal to $I(t) - \delta K(t)$.
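The mapping from the fishing firm example to the general form can be sketched in code. The helper `evaluate_J` below is hypothetical: it approximates $J$ for any integrand $f$, state equation $g$, control path $y(t)$, and initial state $x_0$, using a forward-Euler step. The production function $\sqrt{K}$, the cost function $I^2$, and the parameter values are assumptions made for illustration only.

```python
import math

def evaluate_J(f, g, y, x0, T, steps=20_000):
    """Approximate J = integral_0^T f(x, y, t) dt with dx/dt = g(x, y, t), x(0) = x0.

    Uses a simple forward-Euler step for the state; returns (J, x(T)).
    """
    dt = T / steps
    x, J = x0, 0.0
    for i in range(steps):
        t = i * dt
        control = y(t)
        J += f(x, control, t) * dt
        x += g(x, control, t) * dt
    return J, x

# Fishing-firm instance: state x = K, control y = I.
P, RHO, DELTA, K0 = 2.0, 0.05, 0.1, 4.0                   # hypothetical values
q = math.sqrt                                              # assumed production function
cost = lambda I: I ** 2                                    # assumed investment cost
f = lambda K, I, t: (P * q(K) - cost(I)) * math.exp(-RHO * t)
g = lambda K, I, t: I - DELTA * K

J, K_T = evaluate_J(f, g, y=lambda t: DELTA * K0, x0=K0, T=10.0)
```

Only the control path is supplied by the caller; the state path is generated inside the loop from $g$ and $x_0$, which mirrors the statement that the state variable is influenced only indirectly through the control.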

We will examine a number of important variations of this general specification in later sections. In section 25.3 we examine the fixed endpoint version of this problem. This means that $x(T)$, the final value of the state variable, is specified as an equality constraint to be satisfied. In section 25.4 we consider the case in which $T$ is infinity. Finally, in section 25.6 we consider the case in which the time horizon, $T$, is also a free variable to be chosen optimally.

Suppose that a unique solution to the dynamic optimization problem in definition 25.1 exists. The solution is a path for the control variable, $y(t)$. Once this is specified, the path for the state variable is automatically determined through the differential equation for the state variable, combined with its given initial condition. We assume that the control variable is a continuous function of time (we relax this assumption in section 25.5), as is the state variable. The necessary conditions that constitute the maximum principle are stated in terms of a Hamiltonian function, which is akin to the Lagrangean function used to solve constrained optimization problems. We begin by defining this function.

Definition 25.2

The Hamiltonian function, $H$, for the dynamic optimization problem in definition 25.1 is

$$H[x(t), y(t), \lambda(t), t] = f[x(t), y(t), t] + \lambda(t)\, g[x(t), y(t), t]$$

where $\lambda(t)$, referred to as the costate variable, is akin to the Lagrange multiplier in constrained optimization problems.
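As an illustration of the definition, the Hamiltonian for the fishing firm example can be written out by substituting the integrand $\pi[K(t), I(t)]e^{-\rho t}$ for $f$ and the capital accumulation equation $I(t) - \delta K(t)$ for $g$, with $K$ as the state, $I$ as the control, and $\lambda(t)$ as the costate variable. This is only a direct substitution into the definition, not a derivation of the necessary conditions.

```latex
H[K(t), I(t), \lambda(t), t]
  = \pi[K(t), I(t)]\, e^{-\rho t} + \lambda(t)\,\bigl[ I(t) - \delta K(t) \bigr]
```

The first term is the discounted profit earned at time $t$; the second term weights the change in the capital stock by $\lambda(t)$, in the same way that a Lagrange multiplier weights a constraint.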