Skip to content

Added parameter derivatives of perturbation solution up to 3 order

Preliminary comments

I finished the identification toolbox at orders two and three using the pruned state space system, but before I merge request this, I decided to first merge the new functionality to compute parameter derivatives of perturbation solution matrices at orders two and three. So after this is approved, I would merge the improved identification toolbox. I guess @rattoma, @sebastien, and @MichelJuillard are best choices to review this. Note that @rattoma already implemented the parameter derivatives of g_x and g_u analytically (and numerically), and I rely heavily on his work in get_first_order_solution_params_derivs.m (previously getH.m). My additions are mainly to this function and thus it is renamed to get_perturbation_params_derivs.m.

I outline the main idea first and then provide some more detailed changes I made to the functions.

Main idea

This merge request is concerned with the analytical computation of the parameter derivatives of first, second and third order perturbation solution matrices, i.e. using closed-form expressions to efficiently compute the derivative of g_x , g_u, g_{xx}, g_{xu}, g_{uu}, g_{\sigma\sigma}, g_{xxx}, g_{xxu}, g_{xuu}, g_{uuu}, g_{x\sigma\sigma}, g_{u\sigma\sigma} with respect to model parameters \theta. Note that \theta contains model parameters, stderr and corr parameters of shocks. stderr and corr parameters of measurement errors are not yet supported, (they can easily be included as exogenous shocks). The availability of such derivatives is beneficial in terms of more reliable analysis of model sensitivity and parameter identifiability as well as more efficient estimation methods, in particular for models solved up to third order, as it is well-known that numerical derivatives are a tricky business, especially for large models.

References for my approach are:

  • Iskrev (2008, 2010) and Schmitt-Grohé and Uribe (2012, Appendix) who were the first to compute the parameter derivatives analytically at first order, however, using inefficient (sparse) Kronecker products.
  • Mutschler (2015) who provides the expressions for a second-order, but again using inefficient (sparse) Kronecker products.
  • Ratto and Iskrev (2012) who show how the first-order system can be solved accurately, fast and efficiently using existing numerical algorithms for generalized Sylvester equations by taking the parameter derivative with respect to each parameter separately.
  • Julliard and Kamenik (2004) who provide the perturbation solution equation system in tensor notation at any order k.
  • Levintal (2017) who introduces permutation matrices to express the perturbation solution equation system in matrix notation up to fifth order.

The basic idea of this merge request is to take the second- and third-order perturbation solution systems in Julliard and Kamenik (2004), unfold these into an equivalent matrix representation using permutation matrices as in Levintal (2017). Then extending Ratto and Iskrev (2012) one takes the derivative with respect to each parameter separately and gets a computational problem that is linear, albeit large, as it involves either solving generalized Sylvester equations or taking inverses of highly sparse matrices. I will now briefly summarize the perturbation solution system at third order and the system that results when taking the derivative with respect to parameters.

Perturbation Solution

The following systems arise at first, second, and third order:

(ghx): f_{x} z_{x} = f_{y_{-}^*} + f_{y_0} g_{x} + f_{y_{+}^{**}} g^{**}_{x} g^{*}_{x}= A g_{x} + f_{y_{-}^*}=0

(ghu): f_{z} z_{u} = f_{y_0} g_{u} + f_{y_{+}^{**}} g^{**}_{x} g^{*}_{u} + f_{u}= A g_u + f_u = 0

(ghxx) : A g_{xx} + B g_{xx} \left(g^{*}_{x} \otimes g^{*}_{x}\right) + f_{zz} \left( z_{x} \otimes z_{x} \right) = 0

(ghxu) : A g_{xu} + B g_{xx} \left(g^{*}_{x} \otimes g^{*}_{u}\right) + f_{zz} \left( z_{x} \otimes z_{u} \right) = 0

(ghuu) : A g_{uu} + B g_{xx} \left(g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zz} \left( z_{u} \otimes z_{u} \right) = 0

(ghs2) : (A+B) g_{\sigma\sigma} + \left( f_{y^{**}_{+}y^{**}_{+}} \left(g^{**}_{u} \otimes g^{**}_{u}\right) + f_{y^{**}_{+}} g^{**}_{uu}\right)vec(\Sigma) = 0

(ghxxx) : A g_{xxx} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{x} \otimes g^{*}_{x}\right) + f_{y_{+}}g^{**}_{xx} \left(g^{*}_x \otimes g^{*}_{xx}\right)P_{x\_xx} + f_{zz} \left( z_{x} \otimes z_{xx} \right)P_{x\_xx} + f_{zzz} \left( z_{x} \otimes z_{x} \otimes z_{x} \right) = 0

(ghxxu) : A g_{xxu} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{x} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{x} \otimes z_{x} \otimes z_{u} \right) + f_{zz} \left( \left( z_{x} \otimes z_{xu} \right)P_{x\_xu} + \left(z_{xx} \otimes z_{u}\right) \right) + f_{y_{+}}g^{**}_{xx} \left( \left(g^{*}_{x} \otimes g^{*}_{xu}\right)P_{x\_xu} + \left(g^{*}_{xx} \otimes g^{*}_{u}\right) \right) = 0

(ghxuu) : A g_{xuu} + B g_{xxx} \left(g^{*}_{x} \otimes g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{x} \otimes z_{u} \otimes z_{u} \right)+ f_{zz} \left( \left( z_{xu} \otimes z_{u} \right)P_{xu\_u} + \left(z_{x} \otimes z_{uu}\right) \right) + f_{y_{+}}g^{**}_{xx} \left( \left(g^{*}_{xu} \otimes g^{*}_{u}\right)P_{xu\_u} + \left(g^{*}_{x} \otimes g^{*}_{uu}\right) \right) = 0

(ghuuu) : A g_{uuu} + B g_{xxx} \left(g^{*}_{u} \otimes g^{*}_{u} \otimes g^{*}_{u}\right) + f_{zzz} \left( z_{u} \otimes z_{u} \otimes z_{u} \right)+ f_{zz} \left( z_{u} \otimes z_{uu} \right)P_{u\_uu} + f_{y_{+}}g^{**}_{xx} \left(g^{*}_{u} \otimes g^{*}_{uu}\right)P_{u\_uu} = 0

(ghx\sigma\sigma) : A g_{x\sigma\sigma} + B g_{x\sigma\sigma} g^{*}_x + f_{y_{+}} g^{**}_{xx}\left(g^{*}_{x} \otimes g^{*}_{\sigma\sigma}\right) + f_{zz} \left(z_{x} \otimes z_{\sigma\sigma}\right) + F_{xu_{+}u_{+}}\left(I_{n_x} \otimes vec(\Sigma)\right) = 0

F_{xu_{+}u_{+}} = f_{y_{+}^{\ast\ast}} g_{xuu}^{\ast\ast} (g_x^{\ast} \otimes I_{n_u^2}) + f_{zz} \left( \left( z_{xu_{+}} \otimes z_{u_{+}} \right)P_{xu\_u} + \left(z_{x} \otimes z_{u_{+}u_{+}}\right) \right) + f_{zzz}\left(z_{x} \otimes z_{u_{+}} \otimes z_{u_{+}}\right)

(ghu\sigma\sigma) : A g_{u\sigma\sigma} + B g_{x\sigma\sigma} g^{*}_{u} + f_{y_{+}} g^{**}_{xx}\left(g^{*}_{u} \otimes g^{*}_{\sigma\sigma}\right) + f_{zz} \left(z_{u} \otimes z_{\sigma\sigma}\right) + F_{uu_{+}u_{+}}\left(I_{n_u} \otimes vec(\Sigma_u)\right) = 0

F_{uu_{+}u_{+}} = f_{y_{+}^{\ast\ast}} g_{xuu}^{\ast\ast} (g_u^{\ast} \otimes I_{n_u^2}) + f_{zz} \left( \left( z_{uu_{+}} \otimes z_{u_{+}} \right)P_{uu\_u} + \left(z_{u} \otimes z_{u_{+}u_{+}}\right) \right) + f_{zzz}\left(z_{u} \otimes z_{u_{+}} \otimes z_{u_{+}}\right)

A and B are the common perturbation matrices: A = f_{y_0} + \begin{pmatrix} \underbrace{0}_{n\times n_{static}} &\vdots& \underbrace{f_{y^{**}_{+}} \cdot g^{**}_{x}}_{n \times n_{spred}} &\vdots& \underbrace{0}_{n\times n_{frwd}} \end{pmatrix} and B = \begin{pmatrix} \underbrace{0}_{n \times n_{static}}&\vdots & \underbrace{0}_{n \times n_{pred}} & \vdots & \underbrace{f_{y^{**}_{+}}}_{n \times n_{sfwrd}} \end{pmatrix} and z=(y_{-}^{\ast}; y; y_{+}^{\ast\ast}; u) denotes the dynamic model variables as in M_.lead_lag_incidence, y^\ast denote state variables, y^{\ast\ast} denote forward looking variables, y_+ denote the variables with a lead, y_{-} denote variables with a lag, y_0 denote variables at period t, f the model equations, and f_z the first-order dynamic model derivatives g1, f_{zz} the second-order dynamic derivatives g2, and f_{zzz} the third-order dynamic model derivatives g3. Then:

z_{x} = \begin{pmatrix}I\\g_{x}\\g^{**}_{x} g^{*}_{x}\\0\end{pmatrix}, z_{u} =\begin{pmatrix}0\\g_{u}\\g^{**}_{x} \cdot g^{*}_{u}\\I\end{pmatrix}, z_{u_{+}} =\begin{pmatrix}0\\0\\g^{**}_{u}\\0\end{pmatrix} z_{xx} = \begin{pmatrix} 0\\g_{xx}\\g^{**}_{x} \left( g^{*}_x \otimes g^{*}_{x} \right) + g^{**}_{x} g^{*}_{x}\\0\end{pmatrix}, z_{xu} =\begin{pmatrix}0\\g_{xu}\\g^{**}_{xx} \left( g^{*}_x \otimes g^{*}_{u} \right) + g^{**}_{x} g^{*}_{xu}\\0\end{pmatrix}, z_{uu} =\begin{pmatrix}0\\g_{uu}\\g^{**}_{xx} \left( g^{*}_u \otimes g^{*}_{u} \right) + g^{**}_{x} g^{*}_{uu}\\0\end{pmatrix}, z_{xu_{+}} =\begin{pmatrix}0\\0\\g^{**}_{xu} \left( g^{*}_x \otimes I \right)\\0\end{pmatrix}, z_{uu_{+}} =\begin{pmatrix}0\\0\\g^{**}_{xu} \left( g^{*}_{u} \otimes I \right)\\0\end{pmatrix}, $z_{u_{+}u_{+}} =\begin{pmatrix}0\\0\\g^{\ast\ast}_{uu}\\0\end{pmatrix}$, $z_{\sigma\sigma} = \begin{pmatrix}0\\ g_{\sigma\sigma}\\ g^{\ast\ast}_{x}g^{\ast}_{\sigma\sigma} + g^{\ast\ast}_{\sigma\sigma}\\0 \end{pmatrix}$

Note that $P$ are permutation matrices that can be computed using Matlab's ipermute function.

Parameter derivatives of perturbation solutions

First, we need the parameter derivatives of first, second, third, and fourth derivatives of the dynamic model (i.e. g1,g2,g3,g4 in dynamic files), I make use of the implicit function theorem: Let $f_{z^k}$ denote the kth derivative (wrt all dynamic variables) of the dynamic model, then let $df_{z^k}$ denote the first-derivative (wrt all model parameters) of $f_{z^k}$ evaluated at the steady state. Note that $f_{z^k}$ is a function of both the model parameters $\theta$ and of the steady state of all dynamic variables $\bar{z}$, which also depend on the parameters. Hence, implicitly $f_{z^k}=f_{z^k}(\theta,\bar{z}(\theta))$ and $df_{z^k}$ consists of two parts:

  1. direct derivative wrt to all model parameters given by the preprocessor in the _params_derivs.m files
  2. contribution of derivative of steady state of dynamic variables (wrt all model parameters): $f_{z^{k+1}} \cdot d\bar{z}$ Note that we already have functionality to compute $d\bar{z}$ analytically.

Having this, the above perturbation systems are basically equations of the following types

$AX +BXC = RHS$ or $AX = RHS$

Now when taking the derivative (wrt to one single parameter $\theta_j$), we get

$A\mathrm{d}\{X\} + B\mathrm{d}\{X\}C = \mathrm{d}\{RHS\} - \mathrm{d}\{A\}X - \mathrm{d}\{B\}XC - BX\mathrm{d}\{C\}$ or $A\mathrm{d}\{X\} = \mathrm{d}\{RHS\} - \mathrm{d}\{A\}X$

The first one is a Sylvester type equation, the second one can be solved by taking the inverse of $A$. The only difficulty and tedious work arises in computing (the highly sparse) derivatives of $RHS$.

New functions:

get_perturbation_params_derivs.mand get_perturbation_params_derivs_numerical_objective.m

  • The parameter derivatives up to third order are computed in the new functionget_perturbation_params_derivs.m both analytically and numerically. For numerical derivatives get_perturbation_params_derivs_numerical_objective.m is the objective for fjaco.m or hessian_sparse.m or hessian.m.
  • get_perturbation_params_derivs.m is basically an extended version of the previous get_first_order_solution_params_derivs.m function.
  • get_perturbation_params_derivs_numerical_objective.mbuilds upon identification_numerical_objective.m. It is used for numerical derivatives, whenever analytic_derivation_mode=-1|-2. It takes from identification_numerical_objective.m the parts that compute numerical parameter Jacobians of steady state, dynamic model equations, and perturbation solution matrices. Hence, these parts are removed in identification_numerical_objective.m and it only computes numerical parameter Jacobian of moments and spectrum which are needed for identification analysis in get_identification_jacobians.m, when analytic_derivation_mode=-1 only.
  • Detailed changes:
    • Most important: notation of this function is now in accordance to the k_order_solver or Dynare++, i.e. we do not compute derivatives of Kalman transition matrices A and B, but rather the solution matrices ghx,ghu,ghxx,ghxu,ghuu,ghs2,ghxxx,ghxxu,ghxuu,ghuuu,ghxss,ghuss in accordance with notation used in oo_.dr. As a byproduct at first-order, focusing on ghx and ghu instead of Kalman transition matrices A and B makes the computations slightly faster for large models (e.g. for Quest the computations were faster by a couple of seconds, not much, but okay).
    • Removed use of kstate, see also #1653 (closed) and !1656 (merged)
    • Output arguments are stored in a structure DERIVS, there is also a flag d2flag that computes parameter hessians needed only in dsge_likelihood.m.
    • Removed kronflag as input. options_.analytic_derivation_mode is now used instead of kronflag.
    • Removed indvar, an index that was used to selected specific variables in the derivatives. This is not needed, as we always compute the parameter derivatives for all variables first and then select a subset of variables. The selection now takes place in other functions, like in dsge_likelihood.m.
    • Some additional checks: (i) deterministic exogenous variables are not supported, (ii) Kronecker method only compatible with first-order approximation so reset to sylvester method, (iii) for purely backward or forward models we need to be careful with the rows in M_.lead_lag_incidence, (iv) if _params_derivs.m files are missing an error is thrown.
    • For numerical derivatives, if mod file does not contain an estimated_params_block, a temporary one with the most important parameter information is created.

unfold_g4.m

  • When evaluating g3 and g4 one needs to take into account that these do not contain symmetric elements, so one needs to use unfold_g3.m and the new function unfold_g4.m. This returns an unfolded version of the same matrix (i.e. with symmetric elements).

New test models

  • .gitignore and Makefile.am are changed accordingly. Also now it is possible to run test suite on analytic_derivatives, i.e. run make check m/analytic_derivatives
  • Removed analytic_derivatives/ls2003.mod as this mod file is neither in the testsuite nor does it work.

analytic_derivatives/BrockMirman_PertParamsDerivs.mod

  • This is the Brock Mirman model, where we know the exact policy function $g$ for capital and consumption. As this does not imply a nonzero $g_{\sigma\sigma}$, $g_{x\sigma\sigma}$, $g_{u\sigma\sigma}$ I added some artificial equations to get nonzero solution matrices with respect to $\sigma$. The true perturbation solution matrices $g_x$ , $g_u$, $g_{xx}$, $g_{xu}$, $g_{uu}$, $g_{\sigma\sigma}$, $g_{xxx}$, $g_{xxu}$, $g_{xuu}$, $g_{uuu}$, $g_{x\sigma\sigma}$, $g_{u\sigma\sigma}$ are then computed analytically with Matlab's symbolic toolbox and saved in nBrockMirmanSYM.mat. There is a preprocessor flag that recreates these analytical computations if changes are needed (and to check whether I made some errors here ;-) )
  • Then solution matrices up to third order and their parameter Jacobians are then compared to the ones computed by Dynare's k_order_solver and by get_perturbation_params_derivs for all analytic_derivation_mode's. There will be an error if the maximum absolute deviation is too large, i.e. for numerical derivatives (analytic_derivation_mode=-1|-2) the tolerance is choosen lower (around 1e-5); for analytical methods we are stricter: around 1e-13 for first-order, 1e-12 for second order, and 1e-11 for third-order.
  • As a side note, this mod file also checks Dynare's k_order_solver algorithm and throws an error if something is wrong.
  • This test model shows that the new functionality works well. And analytical derivatives perform way better and accurate than numerical ones, even for this small model.

analytic_derivatives/burnside_3_order_PertParamsDerivs.mod

  • This builds upon tests/k_order_perturbation/burnside_k_order.mod and computes the true parameter derivatives analytically by hand.
  • This test model also shows that the new functionality works well.

analytic_derivatives/LindeTrabandt2019.mod

  • Shows that the new functionality also works for medium-sized models, i.e. a SW type model solved at third order with 35 variables (11 states). 2 shocks and 20 parameters.
  • This mod file can be used to tweak the speed of the computations in the future.
  • Compares numerical versus analytical parameter derivatives (for first, second and third order). Note that this model clearly shows that numerical ones are quite different than analytical ones even at first order!

identification/LindeTrabandt2019_xfail.mod

  • This model is a check for (confidential) issue Dynare/dynare#1595, see fjaco.m below for a description, and will fail.

Detailed changes in other functions

get_first_order_solution_params_derivs.m

  • Deleted, or actually, renamed to get_perturbation_params_derivs.m, as this function now is able to compute the derivatives up to third order

identification_numerical_objective.m

  • get_perturbation_params_derivs_numerical_objective.mbuilds upon identification_numerical_objective.m. It takes from identification_numerical_objective.m the parts that compute numerical parameter Jacobians of steady state, dynamic model equations, and perturbation solution matrices. Hence, these parts are removed in identification_numerical_objective.m and it only computes numerical parameter Jacobian of moments and spectrum which are needed for identification analysis in get_identification_jacobians.m, when analytic_derivation_mode=-1 only.

dsge_likelihood.m

  • As get_first_order_solution_params_derivs.mis renamed to get_perturbation_params_derivs.m, the call is adapted. That is,get_perturbation_params_derivs does not compute the derivatives of the Kalman transition T matrix anymore, but instead of the Dynare solution matrix ghx. So we recreate T here as this amounts to adding some zeros and focusing on selected variables only.
  • Added some checks to make sure the first-order approximation is selected.
  • Removed kron_flag as input, as get_perturbation_params_derivs looks into options_.analytic_derivation_mode for kron_flag.

dynare_identification.m

  • make sure that setting analytic_derivation_mode is set both in options_ident and options_. Note that at the end of the function we restore the options_ structure, so all changes are local. In a next merge request, I will remove the global variables to make all variables local.

get_identification_jacobians.m

  • As get_first_order_solution_params_derivs.mis renamed to get_perturbation_params_derivs.m, the call is adapted. That is,get_perturbation_params_derivs does not compute the derivatives of the Kalman transition A and B matrix anymore, but instead of the dynare solution matrix ghx and ghu. So we recreate these matrices here instead of in get_perturbation_params_derivs.m.
  • Added str2func for better function handles in fjaco.m.

fjaco.m

  • make tolan option, which can be adjusted by changing options_.dynatol.xfor identification and parameter derivatives purposes.
  • include a check and an informative error message, if numerical derivatives (two-sided finite difference method) yield errors in resol.m for identification and parameter derivatives purposes. This closes issue Dynare/dynare#1595.
  • Changed year of copyright to 2010-2017,2019

Further suggestions and questions

  • Once this is merged, I will merge an improvement of the identification toolbox, which will work up to third order using the pruned state space. This will also remove some issues and bugs, and also I will remove global variables in this commit.
  • The third-order derivatives can be further improved by taking sparsity into account and use mex versions for Kronecker products etc. I leave this for further testing (and if anybody actually uses this ;-) )

Merge request reports