002 2012 Intro To Optimal Control
I. Overview of optimization
Optimization is the unifying paradigm in almost all economic analysis. So before we
start, let’s think about optimization. The tree below provides a very nice general
representation of the range of optimization problems that you might encounter. There
are two things to take from this. First, all optimization problems have a great deal in
common: an objective function, constraints, and choice variables. Second, there are
lots of different types of optimization problems and how you solve them will depend
on the branch on which you find yourself.
In this part of the course we will use both analytical and numerical methods to solve a
certain class of optimization problems. This class focuses on a set of optimization problems
that have two common features: the objective function is a linear aggregation over
time, and a set of variables, called the state variables, are constrained across time.
And so we begin …
Static Optimization: yields a single optimal magnitude for each choice variable and does not
entail a schedule or sequence of actions over time.
Dynamic Optimization: the solution takes the form of an optimal time path for every choice
variable (today, tomorrow, etc.), and thereby determines the optimal magnitude at each point in time.
Consider a consumer's one-period problem:
max_z U(z_a, z_b) s.t. p_a z_a + p_b z_b ≤ x
[pay attention to the notation: z is the vector of choice variables and x is the
consumer's exogenously determined income.]
Solving the one-period problem should be familiar to you. What happens if the
consumer lives for two periods, but has to survive off of the income endowment
provided at the beginning of the first period? That is, what happens if her problem is
max_z U(z_{1a}, z_{1b}, z_{2a}, z_{2b}) = U(z_1, z_2)
We're going to make a huge (though common) assumption and maintain that
assumption throughout the course: utility is additively separable across time:
u(z) = u(z1) + u(z2)
Clearly one way to solve this problem would be just as we would a standard static
problem: set up a Lagrangian and solve for all optimal choices simultaneously. This
may work here, where there are only 2 periods, but if we have 100 periods (or even an
infinite number of periods) then this could get really messy. This course will develop
methods to solve such problems.
In sum, the problems that we will study will have the following features. In each
period or moment in time the decision maker looks at the state variables (xt), then
chooses the control variables (zt). The combination of xt and zt generates immediate
benefits and costs. They also determine the probability distribution over x in the next
period or moment.
Instead of using brute force to find the solutions of all the z’s in one step, we
reformulate the problem. Let x1 be the endowment which is available in period 1, and
x2 be the endowment that remains in period 2. Following from the budget constraint,
we can see that x_2 = x_1 − p'z_1, with x_2 ≥ 0. In this problem x_2 defines the state that the
decision maker faces at the start of period 2. The equation which describes the change
in x from period 1 to period 2, x_2 − x_1 = −p'z_1, is called the state equation. This
equation is also sometimes referred to as the equation of motion or the transition
equation.
We now rewrite our consumer’s problem, this time making use of the state equation:
max_{z_t} Σ_{t=1}^{2} u_t(z_t) s.t.
x_{t+1} − x_t = −p'z_t and x_{t+1} ≥ 0, for t = 1, 2   (1.1)
x_1 fixed.
We now have a nasty little optimization problem with four constraints, two of them
inequality constraints – not fun. This course will help you solve and understand these
kinds of problems. Note that this formulation is quite general in that you could easily
write the n-period problem by simply replacing the 2's in (1.1) with n.
To see this approach, first note that for most specifications, economic intuition tells us
that x_2 > 0 and x_3 = 0. Hence, for t = 1 (t+1 = 2), we can suppress the inequality constraint in
(1.1). We'll use the fact that x_3 = 0 at the very end to solve the problem.
More terminology
In optimal control theory, the variable λt is called the co-state variable and,
following the standard interpretation of Lagrange multipliers, at its optimal value λt
is equal to the marginal value of relaxing the constraint. In this case, that means it is
the marginal value of the state variable, xt. The co-state variable plays a critical role in
dynamic optimization.
We now use a little notation change that simplifies this problem and adds some
intuition (we'll see how the intuition arises in later lectures). That is, we define a
function known as the Hamiltonian where
H_t = u_t(z_t) + λ_t(−p'z_t).
In the left column of the table below we present the first-order conditions of the
Lagrangian specification. On the right we present the derivatives of the
Hamiltonian with respect to the same variables. Comparing the two columns shows what
we would have to place on the right-hand side of each Hamiltonian derivative in order
to reach the same optimum that we would reach using the Lagrangian approach.
Hence, we see that for the solution using the Hamiltonian to yield the same maximum
the following conditions must hold
1. ∂H/∂z_t = 0 => The Hamiltonian should be maximized w.r.t. the control variable
at every point in time.
2. ∂H/∂x_t = λ_{t−1} − λ_t, for t > 1 => The co-state variable changes over time at a rate
equal to minus the marginal value of the state variable to the Hamiltonian.
3. ∂H/∂λ_t = x_{t+1} − x_t => The state equation must always be satisfied.
When we combine these with a 4th condition, called the transversality condition
(how we transverse over to the world beyond t=1,2) we're able to solve the problem.
In this case the condition that x3 =0 (which for now we will assume to hold without
proof) serves that purpose. We'll discuss the transversality condition in more detail in
a few lectures.
These four conditions are the starting points for solving most optimal control
problems and sometimes the FOCs alone are sufficient to understand the economics
of a problem. However, if we want an explicit solution, then we would solve this
system of equations.
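To see what "solving the system" looks like in the simplest case, here is a minimal brute-force sketch, assuming (purely for illustration) log utility and a single good with price p; the variable names are ours, not part of the notes:

```python
import sympy as sp

# Two-period problem with (assumed) log utility and a single good:
# max ln(z1) + ln(z2)  s.t.  p*z1 + p*z2 = x1   (x3 = 0: income is exhausted)
z1, z2, lam, p, x1 = sp.symbols('z1 z2 lam p x1', positive=True)
L = sp.log(z1) + sp.log(z2) + lam * (x1 - p * z1 - p * z2)

# First-order conditions and their solution
foc = [sp.diff(L, v) for v in (z1, z2, lam)]
sol = sp.solve(foc, [z1, z2, lam], dict=True)[0]
print(sol)   # {z1: x1/(2*p), z2: x1/(2*p), lam: 2/x1}
```

Note how λ = 2/x_1 is exactly the marginal value of the income endowment, consistent with the interpretation of the co-state variable above.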
Although in this class most of the OC problems we’ll face are in continuous time, the
parallels should be obvious when we get there.
Imagine that the decision-maker is now in period 2, having already used up part of her
endowment in period 1, leaving x2 to be spent. In period 2, her problem is simply
V_2(x_2) = max_{z_2} u_2(z_2) s.t. p'z_2 ≤ x_2
If we solve this problem, we can easily obtain the function V_2(x_2), which tells us the
maximum utility that can be obtained if she arrives in period 2 with x_2 dollars
remaining. The function V_2(·) is equivalent to the indirect utility function with p_a and
pb suppressed. The period 1 problem can then be written
max_{z_1} u(z_1) + V_2(x_2) s.t. x_2 = x_1 − p'z_1   (1.3)
Note that we've implicitly assumed an interior solution, so that the constraint requiring
that x_3 ≥ 0 holds with equality and can be suppressed. Once we know
the functional form of V_2(·), (1.3) becomes a simple static optimization problem and its
solution is straightforward. Assume for a moment that the functional form of V_2(x_2)
has been found. We can then write out the Lagrangian of the first-period problem,
L = u(z_1) + V_2(x_2) + λ_1(x_1 − p'z_1 − x_2).
Again, we see that the economic meaning of the costate variable, λ_1, is just as in the
OC setup, i.e., it is equal to the marginal value of a unit of x_1.
Of course the problem is that we do not have an explicit functional form for V(.) and
as the problem becomes more complicated, obtaining a functional form becomes
more difficult, even impossible for many problems. Hence, the trick to solving DP
problems is to find the function V(.).
V. Summary
• OC problems are solved using the vehicle of the Hamiltonian, which must be
maximized at each point in time.
• DP is about backward induction.
• Both techniques are equivalent to standard Lagrangian techniques and the
interpretation of the shadow price, λ, is the same.
The maximum principle, due to Pontryagin, states that the following conditions, if satisfied,
guarantee a solution to the problem (you should commit these conditions to memory)
1. max_z H(t, x, z, λ) for all t ∈ [0, T]
2. ∂H/∂x = −λ̇, where λ̇ = ∂λ/∂t
3. ∂H/∂λ = ẋ
4. Transversality condition (such as λ(T) = 0)
Points to note:
• the maximization condition, 1, is not equivalent to ∂H/∂z = 0, since corner solutions are
admissible and non-differentiable problems can be considered.
• the maximum criteria include 2 sets of differential equations (2 & 3), so there's one set of
differential equations that was not present in the original problem.
• ∂H/∂λ = the state equation, by the definition of H.
• There are no second-order partial differential equations.
The transversality condition λ_T = 0 applies to a problem in which there is no binding constraint on the terminal value of the state variable(s).
This condition makes intuitive sense: since λt is the marginal value of the state variable at
time t, if you have complete flexibility in choosing xT, you would want to choose that level so
that its marginal value is zero, i.e., λT=0. We will spend more time discussing the meaning
and derivation of transversality conditions in the next lecture.
III. The Solution of an optimal control problem (An example from Chiang (1991) with slight
notation changes).
max_{z_t} ∫_0^T −(1 + z_t^2)^{1/2} dt
s.t. ẋ_t = z_t
and x_0 = A, x_T free
Note that we can use the standard interior solution for the maximization of the Hamiltonian
since the benefit function is concave and continuously differentiable. Hence, our
maximization equations are
1. ∂H/∂z = −(1/2)(1 + z^2)^{−1/2}·2z + λ = 0
(if you check the 2nd-order conditions you can verify we've got a maximum)
2. ∂H/∂x = 0 = −λ̇
3. ∂H/∂λ = z = ẋ
4. λ_T = 0, the transversality condition of this problem (because of the free value for x_T).
Working these out: condition 2 implies λ is constant, and condition 4 then forces λ_t = 0 for all t;
condition 1 in turn gives z_t = 0, so the optimal path is simply x_t = A throughout.
Now that was easy, but not very interesting. Let's try something a little more challenging:
max_{z_t} ∫_0^1 ln(x_t z_t) dt
s.t. ẋ_t = 4x_t(1 − z_t)
and x_0 = 1, x_1 = e^2
With H = ln(x_t z_t) + λ_t·4x_t(1 − z_t), the maximum conditions are:
1. ∂H/∂z = 1/z_t − 4λx_t = 0 (check the 2nd-order condition)
2. λ̇ = −∂H/∂x = −1/x_t − 4λ(1 − z_t)
3. ẋ_t = ∂H/∂λ = 4x_t(1 − z_t)
4. x_1 = e^2
Solving 1 for the control gives
5. z_t = 1/(4λ_t x_t).
Substituting for z_t in 2:
λ̇_t = −1/x_t − 4λ_t(1 − 1/(4λ_t x_t))
λ̇_t = −1/x_t − 4λ_t + 1/x_t
6. λ̇_t = −4λ_t
Substituting 5 into the state equation:
7. ẋ_t = 4x_t − 1/λ_t
Is there an equilibrium where both λ̇ and ẋ equal zero?
Notice that 6 involves one variable, 7 involves two variables and 5 involves three variables.
This suggests an order in which we might want to solve the problem – start with 6.
We are close to the solution, but we aren’t finished until the values for all constants of
integration have been identified. To do this we use the initial and terminal conditions (a.k.a.
transversality condition).
Solving 6 gives λ_t = λ_0 e^{−4t}, and substituting this into 7 and solving gives
x_t = −(t e^{4t})/λ_0 + A e^{4t}. Substituting in x_0 = 1 and t = 0 yields
1 = −(0·e^{4·0})/λ_0 + A·e^{4·0} = A
so A = 1. The terminal condition x_1 = e^2 then pins down λ_0: e^2 = −e^4/λ_0 + e^4, so
λ_0 = 1/(1 − e^{−2}).
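As a numerical cross-check (a sketch, not part of the original notes), we can integrate equations 6 and 7 forward and confirm that the resulting path hits the terminal condition x_1 = e^2:

```python
import numpy as np
from scipy.integrate import solve_ivp

# From x_t = -t*e^{4t}/lambda_0 + e^{4t} and x_1 = e^2: lambda_0 = 1/(1 - e^-2)
lam0 = 1.0 / (1.0 - np.exp(-2.0))

def rhs(t, y):
    x, lam = y
    return [4.0 * x - 1.0 / lam,   # eq. 7: x-dot = 4x - 1/lambda
            -4.0 * lam]            # eq. 6: lambda-dot = -4*lambda

sol = solve_ivp(rhs, (0.0, 1.0), [1.0, lam0], rtol=1e-10, atol=1e-12)
print(sol.y[0, -1], np.exp(2.0))   # both approximately 7.389
```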
[Figure: time paths of x_t, z_t, and λ_t over t ∈ [0, 1].]
The stock of fish, x_t, evolves according to the state equation ẋ_t = ax_t − bx_t^2 − z_t,
where a > 0 and b > 0 are parameters of the species' biological growth function and z_t is the rate
of harvest. Society's utility comes from fish consumption at the rate ln(z_t), and the goal is to
maximize the discounted present value of its utility over an infinite horizon, discounting at the
rate r.
In this case, let’s jump directly to the phase diagram exploring the dynamics of the system.
The state equation tells us the dynamic relationship between x_t and z_t. We can use FOCs
1 and 2 to uncover the dynamics of z_t. Using 2 we see that
−λ̇_t/λ_t = a − 2bx_t
We can then use 1 to identify the 1:1 relationship between λ̇_t and ż_t:
λ_t = e^{−rt}/z_t
ln(λ_t) = −rt − ln(z_t)
λ̇_t/λ_t = −r − ż_t/z_t
Hence we can write
r + ż_t/z_t = a − 2bx_t ⇒ ż_t = (a − r − 2bx_t)z_t.
[Phase diagram in (x_t, z_t) space: the ż_t = 0 isocline is the vertical line x_t = (a − r)/(2b), and
the ẋ_t = 0 isocline is the curve z_t = ax_t − bx_t^2; together they divide the plane into
quadrants I–IV.]
It is clear from the diagram that we have a saddlepath equilibrium with paths in quadrants II
and IV, but all of the dynamics presented in the phase diagram are consistent with the first
order conditions 1 – 3. However, we can now use the constraint xt≥0 and the transversality
condition to show that only points that are actually on the saddlepaths are optimal by ruling
out all other points.
First, in quadrant I all paths lead to decreasing values of x and increasing values of z. Along
such paths ẋ_t = ax_t − bx_t^2 − z_t is negative and growing in absolute value; eventually x would
have to become negative. But this violates the constraint on x, so such paths are not
admissible in the optimum.
In quadrant III, harvests are declining and the stock is increasing. Eventually this will lead to
a point where x reaches the biological steady state where natural growth is zero, so harvests z_t
must also be zero. This will occur in finite time. But that means at such a point λ_t = ∞, which
cannot hold at an optimum, so paths in quadrant III can also be ruled out.
Finally, we can also rule out any point in quadrants II or IV that are not on the saddle path
because if the path does not lead to the equilibrium it will cross over to quadrant I or III.
Hence, only points on the separatrices are optimal.
VI. References
Chiang, Alpha C. 1991. Elements of Dynamic Optimization. New York: McGraw-Hill.
[Figure: vertical terminal line: T fixed, x_T free to take on any value.]
By vertical end point, we mean that T is fixed and xT can take on any value. This would be
appropriate if you are managing an asset or set of assets over a fixed horizon and it doesn't
matter what condition the assets are in when you reach T. This case we have considered
previously. When looked at from the perspective of the beginning of the planning horizon,
the value that x takes on at T is free and, moreover, it has no effect on what happens in the
future. So it is a fully free variable and we would maximize V over x_T. Hence, it follows that
the shadow price of x_T must equal zero, giving us our transversality condition, λ_T = 0.
We will now confirm this intuition by deriving the transversality condition for this particular
problem and at the same time giving a more formal presentation of Pontryagin’s maximum
principle.
The objective function is
V ≡ ∫_0^T F(t, x, z) dt.
Now, setting up an equation as a Lagrangian with the state-equation constraint, we have
L = ∫_0^T [F(t, x, z) + λ_t(f(t, x, z) − ẋ_t)] dt.
We put the constraint inside the integral because it must hold at every point in time. Note that
the shadow price variable, λt, is actually not a single variable, but is instead defined at every
point in time in the interval 0 to T. Since the state equation must be satisfied at each point in
time, at the optimum, it follows that λ_t(f(t, x, z) − ẋ_t) = 0 at each instant t, so that the value
of L must equal the value of V. Hence, we might write instead
V = ∫_0^T [F(t, x, z) + λ_t(f(t, x, z) − ẋ_t)] dt
or
V = ∫_0^T [{F(t, x, z) + λ_t f(t, x, z)} − λ_t ẋ_t] dt
V = ∫_0^T [H(t, x, z, λ) − λ_t ẋ_t] dt.
Using the integration-by-parts formula ∫u dv = vu − ∫v du
with λ = u and x = v, so that dv = ẋ dt, we get
−∫_0^T λ_t ẋ_t dt = −[λ_t x_t]_0^T + ∫_0^T λ̇_t x_t dt
= ∫_0^T λ̇_t x_t dt + λ_0 x_0 − λ_T x_T
so, we can rewrite V as
1.  V = ∫_0^T [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − λ_T x_T
If the terminal condition is that x_T can take on any value, then it must be that the marginal
value of a change in x_T must equal zero, i.e., ∂V/∂x_T = 0. Hence, the first-order condition
with respect to x_T is
∂V/∂x_T = ∫_0^T [H_t ∂t/∂x_T + H_x ∂x_t/∂x_T + H_z ∂z_t/∂x_T + H_λ ∂λ_t/∂x_T + λ̇_t ∂x_t/∂x_T + x_t ∂λ̇_t/∂x_T] dt − λ_T = 0
Several terms in this derivative must equal zero. First, clearly it holds that ∂t/∂x_T = 0, so
H_t ∂t/∂x_T = 0.
Second, as stated above when we converted from L to V, λ_t will have no effect on V as long
as the constraint is satisfied, i.e., as long as the state equation is satisfied. Hence, the terms
that involve ∂V/∂λ_t or ∂V/∂λ̇_t can be ignored. Hence,
∂V/∂x_T = ∫_0^T [H_x ∂x_t/∂x_T + H_z ∂z_t/∂x_T + λ̇_t ∂x_t/∂x_T] dt − λ_T = 0
or
∂V/∂x_T = ∫_0^T [(H_x + λ̇_t) ∂x_t/∂x_T + H_z ∂z_t/∂x_T] dt − λ_T = 0
As we derived above, the maximum principle requires that H_x = −λ̇_t and H_z = 0, so both of the
terms inside the integral equal zero at the optimum. Hence, we are left with
∂V/∂x_T = −λ_T = 0.
The minus sign on the LHS is there because it reflects the marginal cost of leaving a marginal
unit of the stock at time T. In general, we can show that λt is the value of an additional unit of
the stock at time t. Setting this FOC equal to zero, we obtain the transversality condition,
λT=0.
This confirms our intuition that since we're attempting to maximize V over our planning
horizon, from the perspective of the beginning of that horizon xT is a variable to be chosen, it
must hold that λT, the marginal value of an additional unit of xT, must equal zero. Note that
this is the marginal value to V, i.e., to the sum of all benefits over time from 0 to T, not the value
to the benefit function, F(⋅). Although an additional unit may add value if it arrived at time T,
i.e., ∂F(⋅)/∂x_T > 0, the costs that are necessary for that marginal unit of x to arrive at T must
exactly balance the marginal benefit.
[Figure: horizontal terminal line: x_T fixed, T free.]
In this case there is no fixed endpoint, but the ending state variables must have a given level.
For example, you can keep an asset as long as you wish, but at the end of your use it must be
in a certain state. Again, we will use equation 1:
V = ∫_0^T [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − λ_T x_T.
Now, if we have the right terminal time, it must be the case that ∂V/∂T = 0, for otherwise it
would certainly be the case that a change in T would increase V; if ∂V/∂T > 0 we would want to
increase the time horizon, and if ∂V/∂T < 0 it should be shortened. (Note that this is a necessary,
but not sufficient condition -- for the sufficient condition we'll have to wait until we introduce
an infinite horizon framework). Evaluating this derivative (remember Leibniz's rule), we get
∂V/∂T = H(T, x_T, z_T, λ_T) + λ̇_T x_T − (λ̇_T x_T + λ_T ẋ_T) = 0
The second and third terms cancel and, since we are restricted to have x_T equal to a specific
value, it follows that ẋ_T = 0. Hence, the condition reduces to H(T, x_T, z_T, λ_T) = 0, i.e.,
H = F(T, x_T, z_T) + λ_T f(T, x_T, z_T) = 0
[Figure: the state path rising to the fixed terminal level x_T at the endogenously chosen time T.]
When added to the other optimum criteria, this transversality equation gives you enough
equations to solve the system and identify the optimal path.
D. Terminal Curve
[Figure: terminal curve x_T = ϕ(T); the state path ends where it meets the curve.]
In this case the terminal condition is a function, x_T = ϕ(T). Again, we use equation 1:
V = ∫_0^T [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − λ_T x_T.
[Figure: truncated vertical terminal line: T fixed, x_T restricted to x_T ≥ x.]
In this case the terminal time is fixed, but xT can only take on a set of values, e.g. xT≥x. This
would hold, for example, in a situation where you are using a stock of inputs that must be
used before you reach time T and xT≥0. You can use the input from 0 to T, but xt can never be
negative.
For such problems there are two possible transversality conditions. If xT>x, then the
transversality condition λT=0 applies. On the other hand, if the optimal path is to reach the
constraint on x, then the terminal condition would be xT=x. In general, the Kuhn-Tucker
specification is what we want. That is, our maximization objective is the same, but we now
have an inequality constraint, i.e., we're seeking to maximize
V = ∫_0^T [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − λ_T x_T s.t. x_T ≥ x.
The Kuhn-Tucker conditions for the optimum then are:
λT≥0, xT≥x, and (xT−x)λT=0
where the last of these is the complementary slackness condition of the Kuhn-Tucker
conditions.
As a practical matter, rather than burying the problem in calculus and algebra, I suggest that
you typically take a guess: is x_T going to be greater than x? If you think it is, then
solve the problem first using λ_T = 0. If your solution leads to x_T ≥ x, you're done. If not,
substitute in x_T = x and solve again. This will usually work. When would this approach not
work?
[Figure: truncated horizontal terminal line: x_T fixed, T ≤ Tmax.]
In this case the time is flexible up to a point, e.g., T≤Tmax, but the state is fixed at a given
level, say xT is fixed. Again there are two possibilities, T=Tmax or T<Tmax. Using the
horizontal terminal line results from above, the transversality condition takes on a form
similar to the Kuhn-Tucker conditions above,
T ≤ Tmax, H(T, x_T, z_T, λ_T) ≥ 0, and (T − Tmax)·H(T, x_T, z_T, λ_T) = 0.
Values that accrue to the planner outside of the planning horizon are referred to as salvage
values. The general optimization problem with salvage value becomes
max_z ∫_0^T F(t, x, z) dt + S(x_T, T) s.t.
ẋ_t = f(t, x, z)
x(0) = x_0
Following the same derivation as for the vertical end-point problem above, we can obtain
λ_T = ∂S(T, x_T)/∂x_T.
Intuitively, this makes sense: λ_T is the marginal value of the stock and ∂S(T, x_T)/∂x_T is the
marginal value of the stock outside the planning horizon.
marginal value of the stock outside the planning horizon. When these are equal, it means that
the marginal value of the stock over the planning horizon is equal to zero and all of the value
is captured by the salvage value.
Note that the addition of the salvage value does not affect the Hamiltonian, nor will it affect
the first 3 of the criteria that must be satisfied. What would be the transversality condition for
a horizontal end-point problem with a salvage value?
IV. Infinite horizon problems
A. Fixed finite x
If we have a value of x to which we must arrive, i.e., x_∞ ≡ lim_{t→∞} x_t = k, then the problem is
identical to the horizontal terminal line case considered above.
B. Flexible xT
Recall from above that for the finite horizon problem we used equation 1:
V = ∫_0^T [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − λ_T x_T.
In the infinite horizon case this equation is rewritten:
V = ∫_0^∞ [H(t, x, z, λ) + λ̇_t x_t] dt + λ_0 x_0 − lim_{t→∞} λ_t x_t
and, for problems in which x_∞ is free, the condition analogous to the transversality condition in
the finite horizon case is lim_{t→∞} λ_t = 0. Note that if our objective is to maximize the present-
value of benefits, this means that the present value of the marginal value of an additional unit
of x must go to zero as t goes to infinity. Hence, the current value (at time t) of an additional
unit of x must either be finite or grow at a rate slower than r so that the discount factor, e-rt,
pushes the present value to zero.
One way that we frequently present the results of infinite horizon problems is to evaluate the
equilibrium where λ̇ = ẋ = 0. Using these equations (and evaluating convergence and
stability via a phase diagram) we can then solve the problem. See the fishery problem in
Lecture 3.
V. Summary
The central idea behind all transversality conditions is that if there is any flexibility at the end
of the time horizon, then the marginal benefit from taking advantage of that flexibility must
be zero at the optimum. You can apply this general principle to problems with more than one
variable, to problems with constraints and, as we have seen, to problems with a salvage value.
The purpose of this lecture and the next is to help us understand the intuition behind the
optimal control framework. We draw first on Dorfman's seminal article in which he
explained OC to economists.
(For this lecture, I will use Dorfman's notation, so k is the state variable and x is the
choice variable.)
A. The problem
Dorfman's problem is to maximize
(1) W(k_t, x) = ∫_t^T u(k, x, τ) dτ
where x is the stream of all choices made between t and T.
The state equation is
k̇ = ∂k/∂t = f(k, x, t)
Let V*(k_t, t) denote the value of (1) when it is solved optimally from t onward, i.e., along
k* and x*, where k* and x* are the optimal paths of the state and control variables.
Following a policy of x_t constant for the initial period from t to t+Δ, and then optimizing
beyond that point, can then be written
(2) V(k_t, x_t, t) = u(k_t, x_t, t)Δ + V*(k_{t+Δ}, t + Δ).
(Note that the V on the LHS does not have a *, i.e., it is not necessarily at the optimum.)
Maximizing (2) with respect to x_t gives the first-order condition
(3) Δ ∂u(k, x_t, t)/∂x_t + ∂V*(k_{t+Δ}, t + Δ)/∂x_t = 0.
We can then rewrite the second term
(4) ∂V*/∂x_t = (∂V*/∂k_{t+Δ})(∂k_{t+Δ}/∂x_t)
Since we assume that the interval Δ is quite short, we can approximate the state equation
k_{t+Δ} = k_t + k̇Δ = k_t + f(k, x_t, t)Δ
so that
(5) ∂k_{t+Δ}/∂x_t = 0 + (∂f/∂x_t)Δ
Dorfman then substitutes (5) into (4), and also writes V' = λ, so that (3) can be rewritten
(∂u/∂x_t)Δ + λ_{t+Δ}(∂f/∂x_t)Δ = 0.
Note: we can get the same results if we start with a Lagrangian, i.e.,
L = u(k_t, x_t, t)Δ + V*(k_{t+Δ}, t + Δ) − λ_{t+Δ}(k_{t+Δ} − (k_t + f(k, x_t, t)Δ))
and then the FOCs would be
(∂u/∂x_t)Δ + λ_{t+Δ}(∂f/∂x_t)Δ = 0, and
∂V(⋅)/∂k_{t+Δ} = λ_{t+Δ}.
In the context of the Lagrangian we know that λ is the value of marginally
relaxing the constraint, i.e., the change in V that would be achieved by an extra
unit k. Hence, V' and λ are equivalent.
So now we've got a nice intuitive explanation for the first of the maximum conditions.
The central principle of dynamic optimization is that optimal choices are made when a
balance is struck between the immediate and future marginal consequences of our
choices.
Now consider the first-order effect of a marginal change in k_t, using λ_t = ∂V*/∂k_t:
λ_t = (∂u/∂k)Δ + (∂V*/∂k_{t+Δ})(∂k_{t+Δ}/∂k_t)
λ_t = (∂u/∂k)Δ + λ_{t+Δ}(∂k_{t+Δ}/∂k_t)
Since this is over a short period, we can approximate
λ_{t+Δ} = λ_t + λ̇Δ and k_{t+Δ} = k_t + Δk̇, so that ∂k_{t+Δ}/∂k_t = 1 + (∂f/∂k)Δ
Hence,
λ_t = (∂u/∂k)Δ + (λ_t + λ̇Δ)(1 + (∂f/∂k)Δ)
λ_t = (∂u/∂k)Δ + λ_t + λ̇Δ + λ_t(∂f/∂k)Δ + λ̇(∂f/∂k)Δ^2
0 = (∂u/∂k)Δ + λ̇Δ + λ_t(∂f/∂k)Δ + λ̇(∂f/∂k)Δ^2
or,
−λ̇ = ∂u/∂k + λ_t(∂f/∂k) + λ̇(∂f/∂k)Δ.
Taking the limit as Δ→0, the last term falls out and we're left with
(7) −λ̇ = ∂u/∂k + λ(∂f/∂k)
which is the second maximum condition, −λ̇ = ∂H/∂k.
What does Dorfman (p. 821) tell us about the economic intuition behind this equation?
To an economist, λ̇ is the rate at which the capital is appreciating.
−λ̇ is therefore the rate at which a unit of capital depreciates at time t. …
In other words, [1] a unit of capital loses value or depreciates as time
passes at the rate at which its potential contribution to profits becomes its
past contribution. … [or] [2] Each unit of the capital good is gradually
decreasing in value at precisely the same rate at which it is giving rise to
valuable outputs. [3] We can also interpret −λ̇ as the loss that would be
incurred if the acquisition of a unit of capital were postponed for a short
time [which at the optimum must be equal to the instantaneous marginal
value of that unit of capital].
So we see that the value of the capital stock at the beginning of the problem is equal
to the sum of the contributions of the capital stock across time. As we move across time,
therefore, the capital stock's ability to contribute to V is "used up".
E. Step 4. Summing up
Hence, each of the optimality conditions associated with the Hamiltonian has a clear
economic interpretation.
Let H = u(k, x, t) + λ_t f(k, x, t)
Hence, the value of λ_t is influenced by two effects: the current (in period t) marginal
value of k, which could either be increasing or decreasing, and the discounting effect,
which is always falling. Hence, even if the marginal value of capital is increasing over
time (in current dollars), λ might be falling. Because of these two factors, it often
happens that the economic meaning of λ_t is not easily seen. An alternative way to specify
discounted optimal control problems that leads to a more helpful solution is called the
current value Hamiltonian.
Obviously the third condition, that the state equation must hold, remains unchanged. The
transversality condition might change by a discount factor, but in many cases analogous
conditions hold. For example, if the TC is λ_T = 0, and λ_T = µ_T e^{−rT}, then it must also hold
that µ_T = 0. (Note that if T = ∞, then for r > 0, this would be satisfied if µ_t does not go to
infinity as t→∞.)
Hence, we can use the current value Hamiltonian, but it is important to use the correct
optimality conditions.
¹ Manseung Han, who took my class in 2002, greatly helped me in figuring out a clear presentation of this
part of the problem.
marginal value of the capital increases over time. The sum of these three tells us the
benefit of holding a marginal unit of capital for one more instant. The RHS, rµ, can
be thought of as the opportunity cost of holding capital. For example, suppose that our
capital good can be easily transformed into dollars and we discount at the rate r because it
is the market interest rate. Then rµ is the immediate opportunity cost of holding capital,
since we could sell it and earn interest at the rate r. Hence, at the optimum, we will hold
our state variable up to the point where its marginal value is equal to the marginal cost.
C. Summary
The current value formulation is very attractive for economic analysis because current
values are usually more interesting than discounted values. For example, in a simple
economy, the market price of a capital stock will equal the current-value co-state
variable. As economists we are usually more interested in such actual prices than we are
in their discounted present value. Hence, very often the current-value Hamiltonian is
more helpful than the present-value variety.
Also, as a practical matter, for analysis it is often the case that the differential equation
for µ will be autonomous (independent of t) while that for λ will not be. Hence, the
dynamics of a system involving µ can be interpreted using phase-diagram and steady-
state analysis, while this does not hold for λ.
One note of caution: we have stated and derived many of the basic results for the present-
value formulation (e.g., transversality conditions). When you are using the current-value
formulation, you need to be careful to ensure that everything is modified consistently.
III. Reference
Dorfman, Robert. 1969. An Economic Interpretation of Optimal Control Theory. American
Economic Review 59(5):817-31.
Let xt be the stock of the resource remaining at time t and let zt be the rate at which the
stock is being depleted. For simplicity, first assume that extraction costs are zero, and
that the market is perfectly competitive. In this case, the representative owner of the
resource will receive ptzt from the extraction of zt in period t and this will be pure profit
or, more accurately, quasi-rents.
Definitions (from http://www.bized.ac.uk/)
Economic rent: A surplus paid to any factor of production over its supply
price. Economic rent is the difference between what a factor of production is
earning (its return) and what it would need to be earning to keep it in its
present use. It is, in other words, the amount a factor is earning over and above
what it could be earning in its next best alternative use (its transfer earnings).
Quasi-rent: Short-term economic rent arising from a temporary inelasticity of
supply.
[Figure: inverse demand curve D for extraction z_t at price p_t; the area under the curve up to
z_t is u(x_t, z_t, t) = ∫_0^{z_t} p(z) dz, split into consumer surplus (CS) and producer surplus
(PS = quasi-rent).]
We consider the problem of a social planner who wants to maximize the present value of
consumer surplus plus rents (= producer surplus in this case). CS + PS at any instant in
time is equal to the area under the inverse demand curve, i.e., u(x_t, z_t, t) = ∫_0^{z_t} p(z) dz,
where p(z) is the inverse demand curve for extractions of the resource.
The problem is constrained by the fact that the original supply of the resource is finite,
x(t=0) = x_0, and any extraction of the resource will reduce the available stock, ẋ = −z. We
know that in any period x_t ≥ 0 and simple intuition assures us that x_T = 0. Do you see why
x_T = 0?
The present-value Hamiltonian is H = e^{−rt} u(x_t, z_t, t) + λ_t(−z_t), and the maximum
conditions are
1. H_z = 0: e^{−rt} p(z_t) − λ_t = 0
2. H_x = −λ̇: −λ̇ = 0
3. H_λ = ẋ: ẋ_t = −z_t
The transversality condition in this case is found by the terminal point condition,
4. xT=0
Looking at 1 and using the intuition developed by Dorfman, we see that the marginal
benefit of extraction in t, e-rtp(zt), must be equal to the marginal cost in terms of foregone
future net benefits, λt.
From 2 we see that λ is constant at, say, λ0 so we can drop the subscript. This is true in
any dynamic optimization problem in which neither the benefit function nor the state
equation depend on the state variable. This too is consistent with the intuition of
Dorfman – since the state variable does not give rise to benefits at t, its marginal value
does not change over time.
Another thing that is interesting in this model is that the value of λ does not change over
time. That means that the marginal increment to the objective function (the whole
integral) of a unit of the resource stock never changes. In other words, looking at the
entire time horizon, the planner would be completely indifferent between receiving a
marginal unit of the resource at time 0 and the instant before T, as long as it is known in
advance that at some point the unit will be arriving. However, note that this is the present
value co-state variable, λ. What would the path of the current-value costate variable look
like? How does the economic meaning of µ differ from that of λ?
If we want to proceed further, it is necessary to define a particular functional form for our
demand equation. Suppose that p(z)=e-γz so that the inverse demand curve looks like the
figure above.
Hence, from 1, H_z = 0 ⇒ e^{−rt} e^{−γz_t} = λ, or e^{−γz_t} = λe^{rt}, so that
−γz_t = ln λ + rt
or
5. z_t = −(ln λ + rt)/γ
At any point in time it will always hold that x_t = x_0 + ∫_0^t ẋ_τ dτ. Hence, from our
transversality condition, 4,
x_T = 0 ⇒ x_0 = −∫_0^T ẋ_τ dτ.
From 3 and 5 this can be rewritten ∫_0^T z_t dt = x_0 or ∫_0^T −[(ln λ + rt)/γ] dt = x_0.
Evaluating this integral leads to
−(1/γ)[ln(λ)t + (r/2)t^2]_0^T = x_0
−(T/γ)[ln λ + (r/2)T] = x_0
−ln λ = (γ/T)x_0 + (r/2)T
Hence, we can solve for the unknown value of λ,
λ = e^{−(γ/T)x_0 − (r/2)T}.
In this case we can then solve explicitly for z by substituting into 5, yielding
z_t = −(ln λ + rt)/γ
z_t = −[ln(e^{−(γ/T)x_0 − (r/2)T}) + rt]/γ
z_t = (γ/(γT))x_0 + (r/(2γ))T − (r/γ)t
6. z_t = x_0/T + (r/(2γ))T − (r/γ)t
To verify that this is correct, check the integral of this from 0 to T:
∫_0^T z_t dt = (x_0/T)T + (r/(2γ))T^2 − (r/(2γ))T^2 = x_0.
Looking at 6, we see that the rate of consumption at any point in time is determined by
two parts: a constant portion of the total stock, x_0/T, plus a portion, (r/γ)(T/2 − t), that
declines linearly over time. This second portion is greater than zero until t = T/2, and is then
less than zero for the remainder of the period.
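A quick numerical check of 6 under illustrative parameter values (ours, not the notes'); notice that with these numbers z_t turns negative before T, which foreshadows the need for the z_t ≥ 0 constraint taken up in a later lecture:

```python
import numpy as np

x0, T, r, gamma = 50.0, 20.0, 0.05, 0.1   # illustrative values

t = np.linspace(0.0, T, 100001)
z = x0 / T + (r / (2 * gamma)) * T - (r / gamma) * t   # eq. 6

print(np.trapz(z, t))   # ~50.0: extraction integrates to exactly x0
print(z[0], z[-1])      # z declines linearly and is negative at T here
```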
Q: Consider again the question, “What would happen if we used the current-value instead
of the present-value Hamiltonian?”
A: Well, you can be sure that the current value co-state variable, µt, would not be
constant over time – how would the change in the shadow price of capital evolve?
What’s the economic interpretation of µ?
Q: What if there are costs to extraction c(zt) so that the planner's problem is to maximize
the area under the demand curve minus the area under the marginal cost curve?
A: First recognize that if we define ũ(⋅) = ∫_0^{z_t} [p(z) − c′(z)] dz, where c′ is the marginal cost
function, then the general results will be exactly the same as in the original case after
substituting "marginal quasi-rents" for "price". That is, in this case the marginal surplus
will rise at the rate of interest. Obviously getting a nice clean closed-form solution for z* will
not be as easy as it was in the first case, but the economic intuition does not change. This
economic principle is central to a wide body of economic analysis.
Now, let's look at this question a little more intuitively. We know that one of the basic
results is that the price (or marginal quasi-rents) grows at the rate of interest. Is this likely
to occur in a competitive economy as well? In the words of Hotelling, "it is a matter of
indifference to the owner of a mine whether he receives for a unit of his product a price
p0 now or a price p0e^{γt} after time t" (p. 140). That is, price takers will look at the future
and decide whether to extract a unit today or a unit tomorrow at a higher price. The price
must increase by at least the rate of interest in this simple model because, if not, the market
would face a glut today. If the price rose faster than the rate of interest, then the owners would
choose to extract none today. Assuming that the inverse-demand curve is downward
sloping, supply and demand can be equal only if each individual is completely indifferent
as to when he or she extracts, which also explains the constancy of λ.
This also gets at an important difference between profit and rents. We all know that in a
perfectly competitive economy with free entry, profits are pushed to zero -- so why do the
holders of the resource still make money in this case? Because there is not free entry.
The total resource endowment is fixed at x0. An owner of a portion of that stock is able
to make resource rents because he or she has access to a restricted profitable input.
Further, the owner is able to exploit the tradeoffs between current and future use to make
economic gains. This is what is meant by Hotelling rents.
II. Hartwick's model of national accounting and the general interpretation of the
Hamiltonian
Hartwick (1990) has a very nice presentation of the Hamiltonian's intuitive appeal as a
measure of welfare in a growth economy. The analogies to microeconomic problems will
be considered at the end of this section. Hartwick’s paper builds on Weitzman (1976)
and is a generalization of his more often cited 1977 paper.
The economy's objective is to maximize
∫_0^∞ U(C) e^{−ρt} dt
subject to a state equation for a malleable capital stock, x0, that can either be consumed or
saved for next period
ẋ_0 = g_0(x, z) − C
and n additional state equations for the n other assets in the economy (e.g., infrastructure,
human capital, environmental quality, etc.).
ẋ_i = g_i(x, z), i = 1, …, n.
Please excuse the possibly confusing notation. Here the subscript is an index
of the good and the time subscript is suppressed.
This is our first exposure to the problem of optimal control with multiple state and
control variables, but the maximization conditions are the simple analogues of the single
variable case:
∂H/∂C = ∂H/∂z_i = 0 for all i [or in general, maximize H with respect to C and all the z_i's]
∂H/∂x_j = ρµ_j − µ̇_j for all j
∂H/∂µ_j = ẋ_j for all j
∂H
Given the specification of utility, = U '− µ 0 = 0 ⇒ µ0=U'.
∂C
(remember, µ 0 is the costate variable on the numeraire good, not the costate variable at
t=0.)
¹ Again, to write more concisely, H is the current value Hamiltonian, which we typically write H_c.
H/U′ = C + ẋ_0 + Σ_{j=1}^n (µ_j/µ_0) ẋ_j
If you look at the RHS of this equation, you will see that this is equivalent to net national
product in a closed economy without government. NNP is equal to the value of goods
and services (C) plus the net change in the value of the assets of the economy,
ẋ_0 + Σ_{j=1}^n (µ_j/µ_0) ẋ_j.
The first lesson from this model, therefore, is a general one and, as we will discuss below,
it carries over quite nicely to microeconomic problems: maximizing the Hamiltonian is
equivalent to maximizing NNP, which seems like a pretty reasonable goal.
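As a toy illustration of this accounting identity (the numbers and asset names are hypothetical, not from Hartwick):

```python
# Green NNP = consumption + sum of asset changes valued at mu_j / mu_0
C = 100.0                                                  # consumption
xdot = {'capital': 12.0, 'forest': -3.0, 'human_cap': 5.0} # net asset changes
price = {'capital': 1.0, 'forest': 2.5, 'human_cap': 1.8}  # shadow prices mu_j/mu_0

nnp = C + sum(price[k] * xdot[k] for k in xdot)
print(nnp)   # 100 + 12.0 - 7.5 + 9.0 = 113.5
```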
Using some simplistic economies, Hartwick helps us understand what the appropriate
shadow prices on changes in an economy's assets should be, i.e., what are the µ_j/µ_0?
² This is a refinement of the specification in Hartwick (1990) as proposed by Hamilton (1994).
H_D = 0: −µ_K v_D + µ_S + µ_D g′ = 0.
Production is assumed to be affected by pollution, i.e., F(K,L,X) so, for example, more
pollution makes production more difficult. The pollution stock is assumed to increase
³ This result differs from that presented in Hamilton (1994). I have not attempted to determine where the
difference comes from.
with production at the rate γ, and decrease with choices made regarding the level of
cleanup, b, which costs f(b), i.e., Ẋ = −bX + γF(K, L, X), and the evolution of the
numeraire capital stock follows K̇ = F(K, L, X) − C − f(b).
The current value Hamiltonian with this stock change incorporated in the utility function,
therefore is
H = U(C, −bX + γF(K, L, X)) + µ_K[F(K, L, X) − C − f(b)] + µ_X[−bX + γF(K, L, X)]
Hicks (1939, Value and Capital) defined income as, to paraphrase, the maximum amount
that an individual can consume in a week without diminishing his or her ability to
consume next week. Clearly, just as for a national account, farmers and managers also
need to be aware of the distinction between investment, capital consumption, and true
income. Hartwick's Hamiltonian formulation of NNP, therefore, with its useful
presentation of the correct prices for use in the calculation of income, might readily be
applied to a host of microeconomic problems of concern to applied economists.
III. References
Hartwick, John M. 1977. Intergenerational Equity and the Investing of Rents from
Exhaustible Resources. American Economic Review 67(5):972-74.
We now return to an optimal control approach to dynamic optimization. This means that
our problem will be characterized by continuous time and will be deterministic.
It is usually the case that we are not Free to Choose.1 The choice set faced by decision
makers is almost always constrained in some way and the nature of the constraint
frequently changes over time. For example, a binding budget constraint or production
function might determine the options that are available to the decision maker at any point
in time. In general, this implies that we will need to reformulate the simple Hamiltonian
problem to take account of the constraints. Fortunately, in many cases, economic
intuition will tell us that the constraint will not bind (except for example at t=T), in which
case our life is much simplified. We consider here cases where we're not so lucky, where
the constraints cannot be ruled out ex ante.
We will assume throughout that a feasible solution exists to the problem. Obviously, this
is something that needs to be confirmed before proceeding to waste a lot of time trying to
solve an infeasible problem.
In this lecture we cover constrained optimal control problems rather quickly looking at
the important conceptual issues. For technical details I refer you to Kamien & Schwartz,
which covers the technical details of solving constrained optimal control problems in
various chapters. We then go on to consider a class of problems where the constraints
play a particularly central role in the solution.
The problem we now consider takes the form
max_z ∫_0^T e^{−rt} u(z, x, t) dt s.t.
ẋ = g(z, x, t)
h(z, x, t) = c
x(0) = x_0
In this case we cannot use the Hamiltonian alone, because this would not take account of
the constraint h(z, x, t) = c. Rather, we need to maximize the Hamiltonian subject to a
constraint ⇒ so we use a Lagrangian² in which H_c is the objective function, i.e.,
L = H_c + φ(c − h(z, x, t))
= u(z, x, t) + µ g(z, x, t) + φ(c − h(z, x, t)).
Equivalently, you can think about embedding a Lagrangian within a Hamiltonian, i.e.
¹ This is an obtuse reference to the first popular book on economics I ever read, Free to Choose by Milton
and Rose Friedman.
² This Lagrangian is given a variety of names in the literature. Some call it an augmented Hamiltonian,
some a Lagrangian, some just a Hamiltonian. As long as you know what you're talking about, you can
pretty much call it whatever you like.
[Figure: marginal conditions for the choice of z with and without the constraint: du/dz set
against µ dg/dz and, when the constraint binds, φ dh/dz.]
In principle, the problem can then be solved based on these equations. It is important to
note that φ will be a function of time and will typically change over time. What is the
economic significance of φ?
Consider a politician choosing the paths of unemployment, U, and inflation, p, in the run-up
to an election. Following standard Phillips-curve logic, there is an assumed trade-off between
these two objectives,
p = γ(U) + απ
where π is the expected rate of inflation. Expectations evolve according to the
differential equation
π̇ = b(p − π)
We assume that the votes obtained at time T are a weighted sum of the support that is
obtained from 0 to T, with support nearer to the voting date being more important. Votes
obtained at T are equal to ∫_0^T v(U, p) e^{rt} dt.
The optimization problem then is
max_{U,p} ∫_0^T v(U, p) e^{rt} dt s.t.
p = γ(U) + απ
π̇ = b(p − π)
π(0) = π_0, and π(T) free.
Now clearly the first constraint could be used to substitute out for p and convert the
problem to a single control problem, but let’s consider the alternative, explicitly including
the constraint.
The Lagrangian for this optimal control problem would be
L = v(U, p) e^{rt} + λ b(p − π) + φ[γ(U) + απ − p]
The optimum conditions would then be
∂L/∂p = (∂v/∂p) e^{rt} + λb − φ = 0
∂L/∂U = (∂v/∂U) e^{rt} + φγ′ = 0
∂L/∂φ = γ(U) + απ − p = 0
λ̇ = λb − φα
π̇ = b(p − π)
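These conditions can be generated mechanically. Below is a sketch using sympy, with v and γ left as abstract functions; the symbol names are ours, not Chiang's:

```python
import sympy as sp

t, r, alpha, b = sp.symbols('t r alpha b', positive=True)
U, p, pi, lam, phi = sp.symbols('U p pi lam phi')
v = sp.Function('v')(U, p)          # vote/support function
g = sp.Function('gamma')(U)         # Phillips-curve trade-off

L = v * sp.exp(r * t) + lam * b * (p - pi) + phi * (g + alpha * pi - p)

print(sp.diff(L, p))     # dv/dp * e^{rt} + lam*b - phi
print(sp.diff(L, U))     # dv/dU * e^{rt} + phi * gamma'(U)
print(sp.diff(L, phi))   # gamma(U) + alpha*pi - p
print(-sp.diff(L, pi))   # lam*b - phi*alpha, i.e., lambda-dot
```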
If we specify a functional form (see Chiang chapter 7) we can find the optimal path
for policy, which shows that the political process creates a business cycle. In most
problems, however, it is easier to find the solution by using equality constraints to
eliminate variables before getting started.
A. Theory
Suppose now that the problem we face is one in which we have inequality constraints,
h_i(t, x, z) ≤ c_i, with i = 1, …, n
for n constraints, where x and z are assumed to be vectors of the state and control variables
respectively. For each x_j ∈ x, the state equation takes the form ẋ_j = g_j(t, x, z).
As with standard constrained optimization problems, the Kuhn-Tucker conditions will
yield a global maximum if any one of the Arrow-Hurwicz-Uzawa constraint
qualifications is met (see Chiang p. 278). The way this is typically satisfied in most
economic problems is for the gj to be concave or linear in the control variables.
Assuming that the constraint qualification is met, we can then proceed to use the
Lagrangian specification using a Hamiltonian which takes the form
H = u(t, x, z) + Σ_{j=1}^m λ_t^j g_j(t, x, z)
Note: For maximization problems I always write the constraint term of the
Lagrangian so that the argument inside the parentheses is greater than zero, or for
minimization problems you write it so that the argument is less than zero. If you
follow this rule, your Lagrange multiplier will always be positive.
As with all such problems, the appropriate transversality conditions must be used and, if
you choose to use a current-value Hamiltonian, the necessary adjustments must be made.
Note that in the current value specification, the interpretation of both the costate variable
and the shadow price on the intratemporal constraint would be altered.
Economic intuition tells us that x_T = 0, and x_t ≥ 0 will hold for all t as long as z_t ≥ 0. Hence,
we can convert the problem to one having a constraint on the control variable. The associated
Lagrangian would then be
L = e^{−rt}u(⋅) + λ(−z_t) + φ_t z_t.
From 0 to T_1,
z_t = −[ln(λ − 0) + rt]/γ = −[ln(λ) + rt]/γ
and from T_1 to T,
0 = −[ln(λ − φ_t) + rt]/γ ⇒ ln(λ − φ_t) = −rt
10. φ_t = λ − e^{−rt}.
Now, we can speculate about the solution. It seems likely that at T_1, φ will equal zero and
will then increase over time from that point onward. If not, then the paths of z and φ will
be discontinuous at T_1. So let's use this possibility and then later confirm that it holds.
If φ_{T_1} = 0, then
11. λ = e^{−rT_1}.
Furthermore, we must exhaust the resource by T_1 so that
∫_0^{T_1} z_t dt = x_0 or ∫_0^{T_1} −[(ln λ + rt)/γ] dt = x_0
which we solved in lecture 6 to obtain
λ = e^{−(γ/T_1)x_0 − (r/2)T_1}.
Now, substituting in from 11, we obtain
e^{−rT_1} = e^{−(γ/T_1)x_0 − (r/2)T_1}
−rT_1 = −(γ/T_1)x_0 − (r/2)T_1
(r/2)T_1 = (γ/T_1)x_0
T_1^2 = (2γ/r)x_0
T_1 = sqrt(2γx_0/r)
Hence, if our assumption that φ = 0 at T1 is valid, the optimal solution is for consumption
to decline from 0 to T1 and then stay constant at zero from that point onward.
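A numerical sketch of this solution, with illustrative parameters (ours) chosen so that T_1 = 20:

```python
import numpy as np

x0, r, gamma = 100.0, 0.05, 0.1             # illustrative values

T1 = np.sqrt(2.0 * gamma * x0 / r)          # optimal exhaustion date (= 20 here)
lam = np.exp(-(gamma / T1) * x0 - (r / 2.0) * T1)   # equals e^{-r*T1}, as in 11

t = np.linspace(0.0, T1, 100001)
z = -(np.log(lam) + r * t) / gamma          # eq. 5, valid on [0, T1]

print(z[0], z[-1])      # consumption declines linearly and hits zero at T1
print(np.trapz(z, t))   # ~100: the stock is exactly exhausted
```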
Is the assumption correct? Without a formal proof, we can see using economic intuition
that it is. Suppose z_{T_1} > 0. A feasible option would be to reduce z_{T_1} and consume for a
little longer. Since u(⋅) is concave (u″ < 0), it will hold that the marginal cost of a slight
reduction in z at T_1 will be less than the marginal benefit of a slight increase in z a
moment later. Hence, it will never be optimal to consume a strictly positive amount of z
at T_1, so the assumption that φ = 0 at T_1 is valid and our solution is the optimum.
A. Theory
Suppose now that we have constraints on the state variables which define a feasible
range. This is likely to be common in economic problems. You may, for example, have
limited storage space so that you cannot accumulate your inventory forever. Or if you
were dealing with a biological problem, you might be constrained to keep your stock of a
species above a lower bound where reproduction begins to fail, and an upper bound
where epidemics are common.
ẋ = g(t, x, z), x(0) = x_0 and
h(t, x) ≥ 0.
The augmented Hamiltonian for this problem is
L = u(t, x, z) + λ g(t, x, z) + φ h(t, x)
and the necessary conditions for optimality include the constraints plus
∂L/∂z = 0
λ̇ = −∂L/∂x
φ ≥ 0 and φh = 0
and the transversality condition.
Solving problems like this by hand can be quite difficult, even for very simple problems.
(See K&S p.232 if you want to convince yourself). (An alternative approach presented in
Chiang (p. 300) is often easier and we follow this approach below). For much applied
analysis, however, there may be no alternative to setting a computer to the problem to
find a numerical solution.
We won't solve this problem in all its detail, but the solution method would follow a
similar path. We divide time into two portions: from 0 to T_1, where φ = 0 and λ is constant,
and from T_1 to T, where x_t = 0 and λ falls with the increase in φ. To solve the problem we
note that φ_{T_1} = 0 and then solve for T_1.
One thing that is interesting in this specification is that the costate variable is no longer
constant over time. This makes sense: between 0 and T1 we’re indifferent about when we
get the extra unit of the resource. But after T1 it clearly makes a difference – the sooner
we get the additional unit the more valuable (in PV terms) it will be. When t > T_1, we
know that z_t = 0 ⇒ p = 1 and λ_t = e^{−rt}. A marginal increase in the stock over this range
would allow the immediate sale of that stock at a price of 1 and the present value of this
marginal change in stock would, therefore, be e^{−rt}.
V. Bang-bang OC problems
There are some problems for which the optimal path does not involve a smooth approach
to the steady state or gradual changes over time. Two important classes of such problems
are known as "bang-bang" problems and most rapid approach problems. In such
problems the constraints play a central role in the solution.
Consider the problem
max_z ∫_0^T e^{−rt} z_t dt s.t.
ẋ = −z
x(t) ≥ 0
x(0) = x_0
What does intuition suggest about the solution to the problem? Will we want to consume
the resource stock x gradually? Why or why not? Let's check our intuition.
Following the framework from above, we set up the Lagrangian by adding the constraint
on the state variable to the Hamiltonian, i.e., L = H + φ(constraint). Using the current-value
specification, this gives us
L = z_t − µ_t z_t + φ_t x_t
The first-order conditions are
i. ∂L/∂z_t = 1 − µ_t = 0
ii. µ̇_t = rµ_t − ∂L/∂x_t = rµ_t − φ_t
The first of these implies that µ_t = 1. Since this holds no matter the value of t, we know
that µ̇_t = 0 for all t. Conditions i and ii together then indicate that
µ_t = 1 and φ_t = r.
The second of these is most interesting. It shows us that φt, the Lagrange multiplier, is
always positive. From the complementary slackness condition, it follows that xt must
equal 0 always. But wait! We know this isn't actually true at t=0. However, at t=0, xt is
not variable – it is parametric to our problem, so that point in time doesn’t count. But at
every instant except the immediate starting value, xt=0.
So how big is z at t = 0? The first thought is that it must equal x_0, but this isn't quite right.
To see this, suppose that we found that the constraint started to bind, not immediately, but
after 10 seconds. To get x to zero in 10 seconds, z per second would have to equal
x_0/10. Now take the limit of this as the denominator goes to zero ⇒ z goes to infinity.
Hence, what happens is that for one instant there is a spike of zt of infinite height and zero
length that pushes x exactly to zero. This type of solution is known as a bang-bang
problem because the state variable jumps discontinuously at a single point – BANG-
BANG! Since, in the real world it's pretty difficult to push anything to infinity, we would
typically interpret this solution as "consume it as fast as you can." This is formalized in
the framework of most-rapid-approach path problems below.
Consider the problem³
max_z ∫_0^T (1 − z_t) r x_t dt s.t.
ẋ_t = z_t r x_t
0 ≤ z_t ≤ 1
x(0) = x_0
This time we have two constraints: z_t ≤ 1 and z_t ≥ 0. Hence, our Lagrangian is
L = (1 − z_t) r x_t + λ z_t r x_t + φ_{1t}(1 − z_t) + φ_{2t} z_t
so that the necessary conditions are
∂L/∂z = 0 ⇔ −r x_t + λ r x_t − φ_1 + φ_2 = 0
∂L/∂x = −λ̇ ⇔ −λ̇ = (1 − z_t) r + λ z_t r
The transversality condition in this problem is λ_T = 0, since x_T is unconstrained. Along with
it we have the Kuhn-Tucker conditions,
KT1: φ1≥0 & φ1(1−zt)=0, and
KT2: φ2≥0 & φ2z=0.
From the KT1, we know that if φ1>0, then the first constraint binds and zt=1. Similarly,
from KT2, if φ2>0, then the second constraint binds and z=0. i.e.
φ1>0 ⇒ z = 1 φ2>0 ⇒ z = 0.
φ1=0 ⇐ z < 1 φ2=0 ⇐ z > 0.
Clearly, it is not possible for both φ1 and φ2 to be positive at the same time.
³ This problem is very similar to one looked at in Lecture 3. Comparing the two you'll see one key
difference is that here utility is linear, while in lecture 3 utility was logarithmic.
By the transversality condition we know that eventually λ must hit λT=0. Hence,
eventually we'll reach case 3 where, λt<1 and zt=0 and we sell all of our output. But
when do we start selling – right away or after x has grown for a while? We know from
equation 2 that at λt=1 neither constraint binds.
• Suppose that at t=n λt=1.
• For t<n λt>1 and z=1.
• For t>n λt<1 and z=0.
An important question then is: when is n? We can figure this out by working backwards
from λ_T = 0. From the second FOC, we know that in the final period (when λ_t < 1), z = 0, in
which case
λ̇ = −r.
Solving this differential equation yields
λ_t = −rt + A.
Using the transversality condition,
λ_T = −rT + A = 0
A = rT
λ_t = −rt + rT = r(T − t)
Hence, λ_n = 1 if
r(T − n) = 1
n = (rT − 1)/r
Hence, we find that the optimal strategy is to invest everything from t = 0 until
t = n = (rT − 1)/r. After t = n, consume all of the interest. If (rT − 1)/r < 0 then it would
be optimal to sell everything from the very outset.
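A sketch of the resulting paths under illustrative values of r, T, and x_0; the closed forms for x_t and λ_t in the comments follow from the conditions above:

```python
import numpy as np

r, T, x0 = 0.1, 30.0, 1.0
n = max((r * T - 1.0) / r, 0.0)    # switch time; n = 20 with these values

t = np.linspace(0.0, T, 601)
z = np.where(t < n, 1.0, 0.0)                                # invest, then consume
x = np.where(t < n, x0 * np.exp(r * t), x0 * np.exp(r * n))  # x grows at r while z = 1
lam = np.where(t < n, np.exp(r * (n - t)), r * (T - t))      # lam(n) = 1, lam(T) = 0

print(n, x[-1], lam[0], lam[-1])
```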
[Figure: time paths of z, x, and λ, with the switch from z = 1 to z = 0 at t = n and terminal time T.]
What would be the solution as T→∞? Does this make intuitive sense? What is it about
the specification of the problem that makes it inconsistent with our economic intuition?
π_t = p·f(x_t) − c·z_t
where x is the firm's capital stock and z is investment, and p and c are the exogenously
determined unit price and unit cost respectively. The capital stock, which starts at x(0) = x_0,
depreciates at the rate b, so that
ẋ_t = z_t − bx_t.
The firm's problem, therefore, is to maximize the present value of its profits,
∫_0^∞ e^{−rt} [p·f(x_t) − c·z_t] dt subject to
ẋ_t = z_t − bx_t,
with three additional constraints:
i) x(t)≥0
ii) zt≥0
iii) p ⋅ f ( xt ) − c ⋅ zt ≥ 0
Let's use economic intuition to help us decide whether we need to explicitly include all the
constraints in solving the problem.
⁴ Sometimes the term "bang-bang" is also used to describe MRAP problems.
• The constraint on x almost certainly does not need to be imposed because as long as f'
gets big as x→0, the optimal solution will always avoid zero.
• The constraints on z, on the other hand, might be relevant. But we'll start by
assuming that neither constraint binds, and then see if we can figure out the actual
solution based on the assumed interior solution; if not, we'll need to use the Kuhn-
Tucker specification. Note that if there does exist a steady state in x, then, as long as
b > 0, z must be greater than zero. Hence, we anticipate that much might be learned
from the interior solution.
• Similarly, the profit constraint might also bind, but we would expect that in the long
run, profits would be positive. So again, we start by solving for an interior solution,
assuming π>0 where π = p ⋅ f ( xt ) − c ⋅ zt .
We see, therefore, that the optimum conditions tell us about the optimal level of x, say x*.
We can then use the state equation to find the value of z that maintains this relation.
Since c and p are constant, this means that the capital stock will be held at a constant
level and 18 reduces to
p f′(x)/(r + b) = c.
This is known as the modified golden rule.
If p and c are not constant but grow in a deterministic way (e.g., constant and equal
inflation) then we could de-trend the values and find a real steady state. If p and c both
grow at a constant rate, say w, then there will be a unique and steady optimal value of x
for all z>0.
C. Corner solutions
All of the discussion above assumed that we are at an interior solution, where
0 < z_t < p·f(x_t)/c. But we ended up finding that the interior solution only holds when
the state variable x is at the point defined by equation 18. Hence, if we're not at x* at t = 0,
then it must be that we're at a corner solution, either z_t = 0 or p·f(x_t) − c·z_t = 0.
If x_0 > x*, then it will follow that z will equal zero until x_t depreciates to x*. If x_0 < x*, then z
will be as large as possible, z_t = (p/c)·f(x_t), until x* is reached.
Hence, economic intuition and a good understanding of the steady state can tell us where
we want to get and how we're going to get there – in the most rapid approach possible.
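A simulation sketch of that most-rapid-approach logic, reusing the illustrative f(x) = x**0.5 and parameter values from the sketch above (all assumptions ours):

```python
import numpy as np

p, c, r, b = 2.0, 1.0, 0.05, 0.10
xstar = (0.5 * p / ((r + b) * c))**2       # modified golden rule with f(x) = x**0.5

def z_policy(x):
    # Most rapid approach: max investment below x*, none above, hold at x*
    if x < xstar:
        return (p / c) * np.sqrt(x)        # spend all revenue on investment
    return b * x if x == xstar else 0.0    # hold the steady state, else disinvest

x, dt = 5.0, 0.01
for _ in range(20000):
    step = (z_policy(x) - b * x) * dt
    x = min(x + step, xstar) if x < xstar else max(x + step, xstar)
print(xstar, x)   # the stock converges to x* and stays there
```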
How does this rule apply here? The integrand is p_t f(x_t) − c_t z_t. Using the state equation
bx_t + ẋ_t = z_t, the integrand can be written
p_t f(x_t) − c_t(bx_t + ẋ_t) = p_t f(x_t) − c_t b x_t − c_t ẋ_t.
Converting this to the notation used by Wilen,
M(K) = p_t f(x_t) − c_t b x_t
and
N(K)K̇ = c_t ẋ_t.
Hence this problem fits into the general class of MRAP problems.
VII. References
Kamien, Morton I. and Nancy Lou Schwartz. 1991. Dynamic Optimization: The
Calculus of Variations and Optimal Control in Economics and Management. New
York: Elsevier.
Wilen, James E. 1985. Bioeconomics of Renewable Resource Use. In A.V. Kneese and
J.L. Sweeney (eds.), Handbook of Natural Resource and Energy Economics, vol. I.
New York: Elsevier Science Publishers B.V.
Spence, Michael and David Starrett. 1975. Most Rapid Approach Paths in Accumulation
Problems. International Economic Review 16(2):388-403.