Notes: Value Function Iteration
subject to

C + K' = zF(K, 1) + (1 − δ)K,
C ≥ 0,
K' ≥ 0,
K_0 given,
where U (·) and F (·, ·) satisfy the standard assumptions, 0 < β < 1 and 0 ≤ δ ≤ 1, and z is
assumed to be a constant.
We wish to solve for the value function, V(K), and the optimal capital accumulation policy,
K' = g(K). The model can be solved analytically in the case of full depreciation (δ = 1)
and log utility. However, when we deviate from this often unrealistic case we can no longer
solve the model analytically, so instead we use numerical methods to compute approximations to the
value function and the policy for capital. A large number of such numerical methods exist. The
most straightforward, and also the most popular, is value function iteration. As the name
suggests, this is an iterative method. The iteration rule is as follows. At iteration n, we have
some estimate of the value function, V^(n). We then use the Bellman equation to compute
an updated estimate of the value function, V^(n+1), as follows:
V^(n+1)(K) = max_{K'} { U(zF(K, 1) + (1 − δ)K − K') + βV^(n)(K') }        (1)
We know from Stokey and Lucas that (1) a value function V* that solves the
fixed point problem above exists and is unique, and (2) if we repeat this procedure many
times (i.e., start with an initial guess V^(0) and use equation (1) to obtain V^(1), and again to
obtain V^(2), and so on) our value function V^(n) will converge monotonically to V* for any
continuous initial function V^(0). This means that after many iterations, each subsequent iteration
no longer changes our guess, i.e. lim_{n→∞} ||V^(n+1) − V^(n)|| = 0. The fact that value function
iteration eventually converges for virtually any initial guess makes it a very stable algorithm.
The next task is to decide how to represent V , a function defined over a continuous state
space, by a finite number of numbers. In other words, we need to decide how we want to
approximate V . It is the approximation that we will solve for since it is not possible to solve
for V itself numerically. We will approximate the function over the continuous domain by
a discrete set of function values at a discrete set of points in the domain. Suppose we are
interested in V on the region [K, K̄]; this could be, for example, an interval around the non-trivial
steady state level of capital K* > 0, or an interval starting below the non-trivial steady state
(but above the trivial steady state K* = 0) and ending above it.
To approximate V on this domain we will choose a discrete state space, or grid, of n_k values of K,
{K_1, K_2, K_3, ..., K_{n_k}}, where K_i ∈ [K, K̄] for all i. Instead of solving for V itself, we will solve
for the approximation to V, which consists of the values of V at each point on the capital
grid.
The next step is to determine the grid. We would ideally like to select a very large n_k
to obtain as good an approximation to the value function as possible. However, the usual
tradeoff between efficiency and accuracy arises, since as n_k increases so does the time required
to compute the approximation at each iteration. With only one state variable the rate of
increase of the time cost with n_k is not outrageous, but in more complicated models the
processing time grows exponentially with the number of state variables, the so-called curse
of dimensionality.
The simplest way to discretize the state space is to space the n_k grid points evenly
along the real line between the chosen lower and upper bounds, but alternative methods
that place more grid points in areas where the value function is more non-linear may yield
better approximations for the same computation time. The value function is in
general more non-linear at lower values of capital, so it is often better to make the grid
points evenly spaced in logs.
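As a concrete illustration, the two grid choices can be constructed as follows. This is a minimal Python/NumPy sketch; the bounds and the value of n_k are arbitrary placeholders rather than recommendations.

    import numpy as np

    # Placeholder bounds and grid size (assumptions for illustration only)
    K_lo, K_hi, nk = 0.5, 20.0, 201

    # Evenly spaced grid on [K_lo, K_hi]
    K_grid_lin = np.linspace(K_lo, K_hi, nk)

    # Grid evenly spaced in logs: relatively more points at low K,
    # where the value function is more non-linear
    K_grid_log = np.exp(np.linspace(np.log(K_lo), np.log(K_hi), nk))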
c) Find the maximizer and the maximum of T_i. Store the maximum
as the ith element of the updated value function, V^(1)(K). Store the location
of the maximizer as the ith element in the policy vector g.
d) Choose a new grid point for the current capital stock in step (a) and repeat
steps (b) and (c). Once we have completed steps (a) through (c) for each value
of K_i, i.e. for i = 1, ..., n_k, we will have updated value and policy functions and
can continue to the next step.
Step 5. Compute the distance, d, between V^(0)(K) and V^(1)(K). A common definition
of d is the sup norm, i.e.

d = max_{i ∈ {1,...,n_k}} |V_i^(1) − V_i^(0)|
Step 6. If the distance is within the error tolerance level, d ≤ ε||V^(1)||, the value
function has converged, so we have obtained the numerical estimates of the value
and policy functions. If not, return to Step 4, setting the initial guess to the updated
value function, i.e. V^(0)(K) = V^(1)(K). Keep iterating until the value function has
converged.
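For concreteness, here is a minimal sketch of Steps 4 through 6 in Python/NumPy. The functional forms (log utility, Cobb-Douglas production with labour fixed at one) and all parameter values are illustrative assumptions, not part of these notes, and an absolute tolerance is used for simplicity.

    import numpy as np

    # Illustrative parameters (assumptions)
    alpha, beta, delta, z = 0.36, 0.96, 0.08, 1.0
    nk = 201
    K_grid = np.exp(np.linspace(np.log(0.5), np.log(20.0), nk))

    def U(c):
        return np.log(c)                        # log utility, for illustration

    # Resources available at each K_i: zF(K_i, 1) + (1 - delta) K_i
    resources = z * K_grid**alpha + (1.0 - delta) * K_grid

    V0 = np.zeros(nk)                           # initial guess V^(0)
    tol, dist = 1e-6, np.inf

    while dist > tol:
        V1 = np.empty(nk)
        g = np.empty(nk, dtype=int)
        for i in range(nk):                     # Step 4: loop over current K_i
            c = resources[i] - K_grid           # consumption for every choice K_j
            T = np.where(c > 0, U(np.maximum(c, 1e-12)) + beta * V0, -1e10)
            g[i] = np.argmax(T)                 # location of the maximizer
            V1[i] = T[g[i]]                     # maximum
        dist = np.max(np.abs(V1 - V0))          # Step 5: sup-norm distance
        V0 = V1                                 # Step 6: update the guess and repeat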
1. Check that the policy function isn't constrained by the bounds of the discrete state space. If K_{g_i} is
equal to the highest or lowest value of capital in the grid for some i, relax the bounds
of the grid for K and redo the value function iteration.
2. Check that the error tolerance is small enough. If a small reduction in ε results in
large changes in the value or policy functions, the tolerance is too high. Reduce ε until
the value and policy functions are insensitive to further reductions.
4. A good initial guess of the value function can reduce computation time substantially.
Solving the model using a small number of grid points for capital will give you an
approximate solution from which to develop a good initial guess of the value function
when using a larger n_k (see the sketch below).
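A sketch of this last suggestion, again in Python/NumPy: solve on a coarse grid first and interpolate the result onto the finer grid as the initial guess. Here solve_vfi is a hypothetical helper that wraps the iteration loop sketched above and returns the converged value function and policy.

    import numpy as np

    # Coarse-grid solve (solve_vfi is a hypothetical wrapper around the loop above)
    K_coarse = np.exp(np.linspace(np.log(0.5), np.log(20.0), 25))
    V_coarse, g_coarse = solve_vfi(K_coarse)

    # Interpolate the coarse solution onto a finer grid to form V^(0)
    K_fine = np.exp(np.linspace(np.log(0.5), np.log(20.0), 501))
    V0_fine = np.interp(K_fine, K_coarse, V_coarse)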
2 Speed Improvements
Value function iteration is stable, in that it converges to the true solution; however, it is
also very slow. Fortunately, there are some methods that we can use to reduce computation
time without sacrificing stability.
The simplest way to speed up the algorithm is to ensure that you are not redoing
costly computations over and over again inside the iteration loop. For example, notice that
U(zF(K_i, 1) + (1 − δ)K_i − K_j) does not depend on the updated value of V or g, and hence
can be computed once for all i and j at the beginning, stored in an array, and the values
retrieved when needed.
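A sketch of this precomputation in Python/NumPy, using the same illustrative functional forms and parameter values as above (these are assumptions, not part of the notes):

    import numpy as np

    alpha, delta, z = 0.36, 0.08, 1.0
    K_grid = np.exp(np.linspace(np.log(0.5), np.log(20.0), 201))

    # Precompute U(zF(K_i, 1) + (1 - delta) K_i - K_j) once, before the loop
    resources = z * K_grid**alpha + (1.0 - delta) * K_grid
    C = resources[:, None] - K_grid[None, :]     # C[i, j]: consumption when K = K_i, K' = K_j
    Umat = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -1e10)

    # Inside the iteration loop, the update then reduces to array operations:
    #   T  = Umat + beta * V0[None, :]
    #   g  = T.argmax(axis=1)
    #   V1 = T.max(axis=1)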
2.1 Howard’s Improvement
Selecting the maximizer is the most time-consuming step in the value function iteration.
Howard’s improvement reduces the number of times we update the policy function relative
to the number of times we update the value function. Essentially, on some iterations, we
simply use the current approximation to the policy function to update the value function
without updating the policy function. Updating the value function using the current
estimate of the policy function will still bring the value function closer to the true one, since
the policy function tends to converge faster than the value function.
To implement Howard’s Improvement, we need to select an integer, nh , which is the
number of iterations using the existing policy function we will perform after each policy
function update. Some experimentation is usually required to determine the best n_h.
Too large an n_h may result in the value function moving further from the true one, since the
current policy function is not the optimal policy. We also add an additional step to the value function
iteration algorithm between the existing Step 4 and Step 5, that we’ll name Step 4.5, as
follows.
Step 4.5 Complete the following steps nh times.
a) Set V^(0)(K) = V^(1)(K).
b) For i = 1, ..., n_k, update the value function using the following equation:

V_i^(1) = U(zF(K_i, 1) + (1 − δ)K_i − K_{g_i}) + βV_{g_i}^(0)
c) Repeat parts (a) and (b). Note that the value of K_j has been replaced by the
optimal decision rule K_{g_i}. We now need just one computation for each value of
K_i, rather than n_k computations, one for each possible choice of capital stock
next period. Also note that there is no optimization, since we do not update
the policy function.
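A minimal sketch of Step 4.5, assuming the objects from the earlier sketches (the precomputed utility matrix Umat, the policy indices g, the current guess V0, and the discount factor beta) are already in scope; n_h is a tuning choice.

    import numpy as np

    nh = 20                              # number of policy-fixed updates (tuning choice)
    nk = len(V0)
    u_g = Umat[np.arange(nk), g]         # U(zF(K_i,1) + (1-delta)K_i - K_{g_i}) for each i
    for _ in range(nh):
        V0 = u_g + beta * V0[g]          # update V with the policy held fixed; no maximization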
Note that as n_h → ∞, the difference between V^(0)(K) and V^(1)(K) goes to zero and they
converge to the fixed point of

V_i^∞ = U(zF(K_i, 1) + (1 − δ)K_i − K_{g_i}) + βV_{g_i}^∞

Instead of iterating on the Bellman equation, you could solve for V_i^∞ directly using matrix
operations; this requires the solution to a system of n_k linear equations, however.
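A sketch of this direct alternative: with the policy g fixed, V^∞ solves the linear system (I − βQ_g)V = u_g, where Q_g is the 0/1 matrix whose row i selects the continuation value V_{g_i}. As before, Umat, g and beta are taken from the earlier sketches.

    import numpy as np

    nk = len(g)
    u_g = Umat[np.arange(nk), g]                   # one-period return under the fixed policy
    Q = np.zeros((nk, nk))
    Q[np.arange(nk), g] = 1.0                      # Q[i, g_i] = 1 picks out V_{g_i}
    V_inf = np.linalg.solve(np.eye(nk) - beta * Q, u_g)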
2.2 Concavity of the Value Function
The value function that solves the neoclassical growth model is strictly concave, so the objective
on the right-hand side of the Bellman equation that we compute in Step 4 is strictly concave in the
choice of K' and has a unique maximum. For a given i, if we start with j = 1 and sequentially
calculate T_{i,j} for each j = 1, ..., n_k, then the first time the value falls, say T_{i,t} < T_{i,t−1},
we know that K_{t−1} is the maximizer. To exploit this and reduce computation time, we can stop
looking as soon as we find the maximizer, i.e. we do not need to compute T_{i,j} for j = t + 1, ..., n_k.
We can modify Step 4 as follows:
Step 4 Update the value function using equation (1). Specifically,
a) Fix the current capital stock at Ki , beginning with i = 1.
b) Fix next period's capital stock at K_j, beginning with j = 1. Compute T_{i,j} using:

T_{i,j} = U(zF(K_i, 1) + (1 − δ)K_i − K_j) + βV_j^(0)
The policy function that solves the neoclassical growth model is monotone (non-decreasing)
in K_i, i.e. if K_i < K_{i+1}, then g(K_i) ≤ g(K_{i+1}). Once we have found the optimal decision rule
for K_i, we know that the optimal decision rule for K_{i+1} cannot be any of the grid points
below K_{g_i}. Therefore, we can begin our search for g_{i+1} at g_i, reducing the number of
computations we need to perform.
We can modify Step 4 as follows:
Step 4 Update the value function using equation (1). Specifically,
a) Fix the current capital stock at Ki , beginning with i = 1.
b) Compute T_{i,j} for j from g_{i−1} (or from 1 if i = 1) to n_k using:

T(j) = U(zF(K_i, 1) + (1 − δ)K_i − K_j) + βV_j^(0)
c) Find the maximizer and the maximum of T. Store the maximum as the ith ele-
ment of the updated value function, V^(1)(K). Store the location of the maximizer as
the ith element in the policy function, g.
d) Choose the next grid point for K_i in step (a) and repeat steps (b) and (c). Once
steps (a) through (c) have been completed for i = 1, ..., n_k, we will have updated
value and policy functions and can continue to the next step.
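A sketch of Step 4 exploiting both properties at once: monotonicity (start the search for g_i at g_{i−1}) and concavity (stop at the first decrease in T). It again assumes Umat, beta and V0 from the earlier sketches.

    import numpy as np

    nk = len(V0)
    V1 = np.empty(nk)
    g = np.empty(nk, dtype=int)
    j_start = 0
    for i in range(nk):
        best_j = j_start
        best_T = Umat[i, j_start] + beta * V0[j_start]
        for j in range(j_start + 1, nk):
            Tij = Umat[i, j] + beta * V0[j]
            if Tij < best_T:             # concavity: first decrease means the peak is behind us
                break
            best_j, best_T = j, Tij
        g[i] = best_j
        V1[i] = best_T
        j_start = best_j                 # monotonicity: the next search starts here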
3 Linear Interpolation
Consider beginning the value function iteration procedure as normal. Start with an initial
capital level, Ki , and choose the level of capital next period that maximizes the value func-
tion, K_J. We know that the value function is higher when we choose K_J than when we choose
either K_{J−1} or K_{J+1}; however, no other level of capital in the range (K_{J−1}, K_{J+1}) was available. It
is almost certain that the true maximizer in a continuous state space would be a point other
than KJ in this range, so our approximation to the true optimal policy and value functions
almost certainly contains some error. How can we reduce this error?
One solution is to increase the number of grid points. As n_k → ∞, the discrete state
space approaches a continuous state space over the range of the grid. Computing power,
however, is not infinite, and increasing the number of grid points quickly raises computation
time (and exponentially so in the number of state variables). In the single state variable case,
we need to compute the modified Bellman equation n_k^2 times, so increasing the grid size from
9 to 10 increases the number of computations from 81 to 100. If there were 2 state variables,
the number of computations would increase from 9^4 to 10^4, i.e. by 3,439. The memory
requirements to store the value and policy functions also grow quickly with the grid size.
An alternative to increasing n_k that improves accuracy without substantially increasing
computation time or memory requirements is linear interpolation. This involves approximating
the initial guess of the value function at points off the main capital grid by straight lines.
Once we have chosen the optimal capital next period, K_J, we construct a new grid for capital
of length m_k over the range [K_{J−1}, K_J]. We want to check whether any of the
points on this sub-grid is preferred to K_J.
Recall equation (5).

T(j) = U(zF(K_i, 1) + (1 − δ)K_i − K_j) + βV^(0)(K_j)

For points on the sub-grid, we can calculate U(zF(K_i, 1) + (1 − δ)K_i − K_j); however, we
cannot calculate V^(0)(K_j), since V^(0) is defined only at points on the main grid. We need
to interpolate for points between K_{J−1} and K_J, and will do so by constructing a straight line
between V^(0)(K_{J−1}) and V^(0)(K_J).
Recall that the line that joins two points, (x_1, y_1) and (x_2, y_2), is given by the following.

y − y_1 = [(y_2 − y_1)/(x_2 − x_1)] (x − x_1)        (3)
We can therefore approximate the value function for intermediate guesses, K_t, by the following.

V^(0)(K_t) − V^(0)(K_{J−1}) = [(V^(0)(K_J) − V^(0)(K_{J−1}))/(K_J − K_{J−1})] (K_t − K_{J−1})        (4)
We should also check whether any point in the range [K_J, K_{J+1}] would be chosen over K_J.
We therefore construct a second sub-grid over this range and use an analogous approxima-
tion to that in equation (4) for the value function.
Implementing linear interpolation requires systematically calculating the value of T(j)
for each point in the two sub-grids and selecting the maximum. Note that we use the linear
function to approximate values in the initial guess of the value function, V^(0)(K_j), not the
updated value function, V^(1)(K_j). Note also that if K_J is either K_1 or K_{n_k}, we can only
construct one sub-grid for the interpolation.
To add linear interpolation to the value function iteration, replace the original Step 4
with the following.
a) Fix the current capital stock at one of the grid points, K_i, beginning with i = 1.

b) For each possible choice of capital next period, K_j, compute T(j) using:

T(j) = U(zF(K_i, 1) + (1 − δ)K_i − K_j) + βV^(0)(K_j)        (5)
If consumption is negative for a particular K_j, assign a large negative number to
T(j). Note that T is a vector of length n_k, with each element, T(j), representing
the value of the right-hand side of equation (5) conditional upon the respective
choice of capital next period, K_j, i.e. T = {T(j)}_{j=1}^{n_k}.
c) Find the maximizer of T on the main grid; call it K_J.

d) If K_J > K_1: construct a sub-grid of m_k points over [K_{J−1}, K_J], approximate V^(0) at
each sub-grid point using the straight line in equation (4), and compute the corresponding
values T^(l)(t).
e) If K_J < K_{n_k}: construct a second sub-grid of m_k points over [K_J, K_{J+1}], approximate
V^(0) analogously, and compute the corresponding values T^(r)(t).
f) Find the maximum over all points in the two sub-grids, T^(l)(t) and T^(r)(t), combined.
Store the maximum as the ith element of the updated value function, V^(1)(K).
Store the maximizer, which lies in [K_{J−1}, K_{J+1}], as the ith element in the policy
function, g(K).
g) Choose a new grid point for the current capital stock in step (a) and repeat steps
(b) through (f). Once we have completed steps (a) through (f) for each value of
Ki , i.e. for i = 1, .., nk , we will have updated value and policy functions and can
continue to the next step.
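A sketch of this refinement for a single state K_i, assuming K_grid, resources, V0 and beta from the earlier sketches (with the same illustrative log utility). For brevity the two sub-grids are merged into a single grid spanning [K_{J−1}, K_{J+1}]; refine and mk are hypothetical names, and np.interp performs exactly the straight-line approximation in equation (4).

    import numpy as np

    def refine(i, J, mk=20):
        lo = K_grid[max(J - 1, 0)]                  # only one side exists if J is 0 or nk - 1
        hi = K_grid[min(J + 1, len(K_grid) - 1)]
        K_sub = np.union1d(np.linspace(lo, hi, mk), [K_grid[J]])  # keep K_J itself as a candidate
        V0_sub = np.interp(K_sub, K_grid, V0)       # linear interpolation of V^(0) off the main grid
        c = resources[i] - K_sub
        T_sub = np.where(c > 0, np.log(np.maximum(c, 1e-12)) + beta * V0_sub, -1e10)
        t = np.argmax(T_sub)
        return K_sub[t], T_sub[t]                   # refined policy and value for state i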
4 Adding Leisure Choice
Consider the standard neoclassical growth model with leisure choice.
V(K) = max_{C, K', L} { U(C, L) + βV(K') }
subject to
C + K' = zF(K, N) + (1 − δ)K
C ≥ 0
K' ≥ 0
L + N = 1