# Stochastic dynamic programming I.

Dynamic programming is a technique that is frequently used in macroeconomics. It can help us solve models with many (actually an infinite number of) interlinked equations. In this post I will introduce a very basic application of stochastic dynamic programming: a model of the economy that includes random shocks. I will then show how dynamic programming can be used to solve this model.

The stochastic nature of the model will come in the form of random shocks to technology. This means that output will not only depend on capital and labor but also on a random term. Using dynamic programming we will be able to solve the model for capital and consumption, and thus we will be able to model how the economy is supposed to evolve over time.

Since this is an introductory model in stochastic dynamic programming, you should not expect it to predict the real-world evolution of economic variables. However, as we shall see, it does fit some features of the post-war U.S. economy surprisingly well. Models of this kind also serve as a basis for real business cycle (RBC) models, which are more modern models of the macroeconomy. The model introduced here is based on section 5.2 of Williamson (2006).

Suppose the representative agent has a utility function

$E_0[\sum\limits_{t=0}^{\infty} \beta^t u(c_t)],$

where E_0 is the expectation operator conditional on information available at time 0, 0 < beta < 1 is the discount factor, c is consumption and u is the period utility function.

Firms produce using the technology

$y_t = z_t F(k_t, n_t),$

where y is output, z is a random disturbance/shock term, k is capital and 0 <= n <= 1 is labor. Since there is no disutility of labor, the agent always supplies the maximum amount, n = 1. Therefore, the technology can be rewritten as

$y_t = z_t F(k_t, 1) = z_t f(k_t),$

where f(k) is just F(k, 1). Capital stock moves according to

$k_{t+1} = i_t + (1 - \delta) k_t,$

where i is investment and 0 < delta < 1 is the depreciation rate. Finally, the resource constraint is

$c_t + i_t = y_t.$

So the social planner’s problem is

$\max_{\left\{c_t, k_{t+1}\right\}_{t=0}^{\infty}} E_0[\sum\limits_{t=0}^{\infty} \beta^t u(c_t)]$

subject to

$c_t + k_{t+1} = z_t f(k_t) + (1 - \delta) k_t.$

The constraint is obtained as follows: plug in for i using the equation of capital stock movements, plug in for y using the production function, rearrange.
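To see the pieces of the constraint fit together, here is a minimal Python sketch that simulates the economy forward and checks the combined constraint numerically. Everything specific in it is an assumption made for illustration and is not part of the model above: the Cobb-Douglas form f(k) = k^alpha, the parameter values, the two-valued shock, and above all the fixed savings rate, which is a rule of thumb and not the optimal policy.

```python
import random


def simulate(k0, periods, alpha=0.3, delta=0.1, savings_rate=0.2, seed=0):
    """Simulate capital and consumption under a fixed (non-optimal) savings rule."""
    rng = random.Random(seed)
    k = k0
    path = []
    for _ in range(periods):
        z = rng.choice([0.9, 1.1])       # technology shock z_t (assumed two-valued)
        y = z * k ** alpha               # y_t = z_t f(k_t), with f(k) = k**alpha assumed
        i = savings_rate * y             # investment i_t from the rule of thumb
        c = y - i                        # resource constraint: c_t + i_t = y_t
        k_next = i + (1 - delta) * k     # k_{t+1} = i_t + (1 - delta) k_t
        path.append({"k": k, "z": z, "c": c, "k_next": k_next})
        k = k_next
    return path


path = simulate(k0=1.0, periods=5)
for row in path:
    # Both sides of c_t + k_{t+1} = z_t f(k_t) + (1 - delta) k_t
    lhs = row["c"] + row["k_next"]
    rhs = row["z"] * row["k"] ** 0.3 + (1 - 0.1) * row["k"]
    print(round(lhs, 6), round(rhs, 6))
```

The two printed columns agree in every period, which is just the combined constraint at work. The planner's problem replaces the arbitrary savings rule with the choice of k_{t+1} that maximizes expected utility.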

Now comes dynamic programming. Recognize that we can write the above problem recursively as:

$v(k_t, z_t) = \max_{c_t, k_{t+1}} u(c_t) + \beta E_t[v(k_{t+1}, z_{t+1})]$

subject to the same constraint as above. Let’s go through this term by term. The function v() is the value function of the consumer: the highest expected discounted utility attainable from the state (k_t, z_t). At time t the consumer chooses c_t and k_{t+1}. The former is consumption, while the latter is the capital stock of the next period, which is equivalent to choosing investment i_t. So essentially, the consumer chooses how to allocate resources between consumption and saving. That is why we are maximizing over these two terms.

Moving on, first we have the utility of the chosen consumption. This is straightforward. Afterwards we have the discounted value of everything that happens from the next period on. Beta is the discount factor there (think of beta as 1/(1+r), with r being the rate at which the consumer discounts the future, much like an interest rate), so it discounts the term that it multiplies. That term is the expectation of the value function in the next period, conditional on time t information: at time t the consumer knows z_t but not yet z_{t+1}.
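To make the expectation term concrete, consider one simple case that the model above does not require: suppose the shock can take only finitely many values z^1, ..., z^J, and write \pi_j(z_t) (notation introduced here purely for illustration) for the probability that z_{t+1} = z^j given today's z_t. Then the expectation is just a probability-weighted sum:

$E_t[v(k_{t+1}, z_{t+1})] = \sum\limits_{j=1}^{J} \pi_j(z_t) \, v(k_{t+1}, z^j).$

With a continuous shock, the sum becomes an integral against the conditional density of z_{t+1}.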

Let me show with a quick example what the recursive form means. Consider the following value functions:

$v(k_1, z_1) = \max_{c_1, k_{2}} u(c_1) + \beta E_1[v(k_{2}, z_{2})].$

$v(k_2, z_2) = \max_{c_2, k_{3}} u(c_2) + \beta E_2[v(k_{3}, z_{3})].$

$v(k_3, z_3) = \max_{c_3, k_{4}} u(c_3) + \beta E_3[v(k_{4}, z_{4})],$ etc.

You can see that the first value function contains the second one, the second one contains the third one, the third contains the fourth one and so on. So basically, the first equation alone summarizes the whole infinite system of equations. This is what’s captured by the general form shown above, and that form is called the Bellman equation.
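Because the same function v appears on both sides, one common way to solve a Bellman equation numerically is to start from a guess for v and apply the right-hand side repeatedly until v stops changing (value function iteration). The sketch below does this on a grid in plain Python. It is only an illustration under assumptions that are not part of the general model above: log utility, f(k) = k^alpha, full depreciation (delta = 1), and an i.i.d. shock taking two values with equal probability. The parameter values and the grid are arbitrary.

```python
import math

# Illustrative assumptions (not from the text above): log utility,
# f(k) = k**ALPHA, full depreciation, i.i.d. two-valued shock.
ALPHA, BETA, DELTA = 0.3, 0.9, 1.0
SHOCKS = [0.9, 1.1]
PROBS = [0.5, 0.5]
GRID = [0.05 + 0.005 * j for j in range(61)]  # capital grid on [0.05, 0.35]


def resources(k, z):
    """Right-hand side of the constraint: z f(k) + (1 - delta) k."""
    return z * k ** ALPHA + (1.0 - DELTA) * k


def solve_bellman(tol=1e-7, max_iter=2000):
    """Iterate on the Bellman equation until the value function settles."""
    n, m = len(GRID), len(SHOCKS)
    v = [[0.0] * m for _ in range(n)]      # v[i][s] approximates v(k_i, z_s)
    policy = [[0] * m for _ in range(n)]   # grid index of the chosen k_{t+1}
    for _ in range(max_iter):
        # E[v(k', z')] for each candidate k'. The shock is i.i.d., so this
        # expectation does not depend on the current shock.
        ev = [sum(p * v[j][s] for s, p in enumerate(PROBS)) for j in range(n)]
        new_v = [[0.0] * m for _ in range(n)]
        for i, k in enumerate(GRID):
            for s, z in enumerate(SHOCKS):
                y = resources(k, z)
                best, best_j = -math.inf, 0
                for j, k_next in enumerate(GRID):
                    c = y - k_next             # consumption implied by k'
                    if c <= 0:
                        break                  # larger k' only lowers c further
                    value = math.log(c) + BETA * ev[j]
                    if value > best:
                        best, best_j = value, j
                new_v[i][s], policy[i][s] = best, best_j
        gap = max(abs(new_v[i][s] - v[i][s]) for i in range(n) for s in range(m))
        v = new_v
        if gap < tol:
            break
    return v, policy


v, policy = solve_bellman()
k = GRID[20]
for s, z in enumerate(SHOCKS):
    # Grid policy next to the known closed form for this special case
    print(k, z, GRID[policy[20][s]], ALPHA * BETA * z * k ** ALPHA)
```

For this particular combination of assumptions the optimal policy is known in closed form, k_{t+1} = alpha * beta * z_t * k_t^alpha, so the last column gives a benchmark for the grid solution. The two should agree up to the coarseness of the grid.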

To proceed, we must find the maximum of the right-hand side of the Bellman equation. To do this, we first solve the constraint for c_t and substitute the result into the Bellman equation. Then only one choice variable remains, k_{t+1}. To maximize, we differentiate the right-hand side with respect to k_{t+1} and set the derivative equal to zero to get the first-order condition (FOC). The FOC contains a term v'(k_{t+1}, z_{t+1}), where v' denotes the derivative of v with respect to capital, and v itself is unknown. But using the envelope theorem we can calculate v'(k_t, z_t), and the resulting expression does not contain any v’s anymore. We can then just “update” the indices of that expression by one period to obtain v'(k_{t+1}, z_{t+1}). Plugging this into the FOC gives an equation free of v, from which we can derive results about, for instance, the growth rate of the economy or f'(k).
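For reference, the steps just described can be written out as follows. This is a sketch that assumes u, f and v are differentiable and that the optimum is interior. I write v_k for the derivative of v with respect to capital (the v' of the text). Substituting the constraint into the Bellman equation gives

$v(k_t, z_t) = \max_{k_{t+1}} u(z_t f(k_t) + (1 - \delta) k_t - k_{t+1}) + \beta E_t[v(k_{t+1}, z_{t+1})].$

The first-order condition with respect to k_{t+1} is

$-u'(c_t) + \beta E_t[v_k(k_{t+1}, z_{t+1})] = 0.$

The envelope theorem gives

$v_k(k_t, z_t) = u'(c_t) [z_t f'(k_t) + 1 - \delta],$

and moving this forward one period and plugging it into the first-order condition yields

$u'(c_t) = \beta E_t[u'(c_{t+1}) (z_{t+1} f'(k_{t+1}) + 1 - \delta)].$

The last line is usually called the Euler equation: the utility cost of giving up a unit of consumption today equals the expected discounted utility gain from the extra capital tomorrow.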

I am not going to pursue the general case any further, as the implications are rather uninteresting. Instead, I will move on from this step using an example, which we can solve for c and k. But as this post is already quite long, I put this in a separate one. Click here to continue.

Other posts in my series on dynamic programming: Stochastic dynamic programming II., Stochastic dynamic programming III.