\documentclass[11pt,a4paper]{article}

\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{hyperref}
\hypersetup{breaklinks=true,pagecolor=white,colorlinks=true,linkcolor=blue,citecolor=blue,urlcolor=blue}
\usepackage{fullpage}
\usepackage{textcomp}

\newcommand{\df}{\text{df}}

\begin{document}

\title{BVAR models ``\`a la Sims'' in Dynare\thanks{Copyright \copyright~2007--2015 S\'ebastien
    Villemot; \copyright~2016--2017 S\'ebastien
    Villemot and Johannes Pfeifer. Permission is granted to copy, distribute and/or modify
    this document under the terms of the GNU Free Documentation
    License, Version 1.3 or any later version published by the Free
    Software Foundation; with no Invariant Sections, no Front-Cover
    Texts, and no Back-Cover Texts. A copy of the license can be found
    at: \url{http://www.gnu.org/licenses/fdl.txt}
    \newline
    \indent Many thanks to Christopher Sims for providing his BVAR
    MATLAB\textregistered~routines, to St\'ephane Adjemian and Michel Juillard
    for their helpful support, and to Marek Jaroci\'nski for reporting a bug.
  }}

\author{S\'ebastien Villemot\thanks{Paris School of Economics and
    CEPREMAP.} \and Johannes Pfeifer\thanks{University of Cologne. E-mail: \href{mailto:jpfeifer@uni-koeln.de}{\texttt{jpfeifer@uni-koeln.de}}.}}
\date{First version: September 2007 \hspace{1cm} This version: May 2017}

\maketitle

\begin{abstract}
  Dynare incorporates routines for the estimation of Bayesian VAR models,
  using a flavor of the so-called ``Minnesota priors.'' These routines can be
  used alone or in parallel with a DSGE estimation. This document describes
  their implementation and usage.
\end{abstract}

If you are impatient to try the software and wish to skip mathematical details, jump to section \ref{dynare-commands}.

\section{Model setting}

Consider the following VAR($p$) model:

$$y'_t = y'_{t-1}\beta_1 + y'_{t-2}\beta_2 + \ldots + y'_{t-p}\beta_p + x'_t\alpha + u'_t$$

where:
\begin{itemize}
\item $t = 1\ldots T$ is the time index
\item $y_t$ is a column vector of $ny$ endogenous variables
\item $x_t$ is a column vector of $nx$ exogenous variables
\item the residuals $u_t \sim \mathcal{N}(0, \Sigma)$ are i.i.d. (with $\Sigma$ an $ny\times ny$ matrix)
\item $\beta_1,\beta_2,\ldots,\beta_p$ are $ny\times ny$ matrices
\item $\alpha$ is an $nx\times ny$ matrix
\end{itemize}

In the actual implementation, the exogenous variables $x_t$ are either empty ($nx = 0$), or reduced to a constant ($nx = 1$ and $x_t = 1$ for all $t$). This choice is controlled by the options \texttt{constant} (the default) and \texttt{noconstant} (see section \ref{sec-model-prior-options}).

The matrix form of the model is:
$$Y = X\Phi + U$$

where:
\begin{itemize}
\item $Y$ and $U$ are $T\times ny$
\item $X$ is $T\times k$ where $k = ny\cdot p + nx$
\item $\Phi$ is $k \times ny$
\end{itemize}

In other words:
$$Y = \left[
\begin{array}{c}
y'_1 \\
\vdots \\
y'_T \\
\end{array}
\right]
\; X = \left[
\begin{array}{cccc}
y'_0 & \ldots & y'_{1-p} & x'_1 \\
\vdots & \ddots & \vdots & \vdots \\
y'_{T-1} & \ldots & y'_{T-p} & x'_T
\end{array}
\right]
\; \Phi = \left[
\begin{array}{c}
\beta_1 \\
\vdots \\
\beta_p \\
\alpha \\
\end{array}
\right]$$
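
To make the stacking concrete, here is a minimal sketch of how $Y$ and $X$ can be assembled from a raw data matrix (a Python sketch with illustrative names; the actual Dynare routines are written in MATLAB):

\begin{verbatim}
import numpy as np

def build_regression_matrices(data, p, constant=True):
    # data: (T + p) x ny array; the first p rows initialize the lags
    T = data.shape[0] - p
    Y = data[p:, :]                               # rows y'_1 ... y'_T
    # lag blocks [y'_{t-1}, ..., y'_{t-p}] for t = 1..T
    lags = [data[p - j:p - j + T, :] for j in range(1, p + 1)]
    X = np.hstack(lags)
    if constant:                                  # nx = 1: column of ones
        X = np.hstack([X, np.ones((T, 1))])
    return Y, X                                   # Y is T x ny, X is T x k
\end{verbatim}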

\section{Constructing the prior}
\label{sec-prior}

We need a prior distribution over the parameters $(\Phi, \Sigma)$ before moving to Bayesian estimation. This section describes the construction of the prior used in the Dynare implementation.

The prior is made of three components, which are described in the following subsections.

\subsection{Diffuse prior}

The first component of the prior is, by default, Jeffreys' improper prior:

$$p_1(\Phi,\Sigma) \propto |\Sigma|^{-(ny+1)/2}$$

However, it is possible to choose a flat prior by specifying option \texttt{bvar\_prior\_flat}. In that case:
$$p_1(\Phi, \Sigma) = \text{const}$$

\subsection{Dummy observations prior}

The second component of the prior is constructed from the likelihood of $T^*$ dummy observations $(Y^*,X^*)$:

$$p_2(\Phi, \Sigma) \propto |\Sigma|^{-T^*/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(Y^*-X^*\Phi)'(Y^*-X^*\Phi))\right\}$$

The dummy observations are constructed according to Sims' version of the Minnesota prior.\footnote{See Doan, Litterman and Sims (1984).}

Before constructing the dummy observations, one needs to choose values for the following parameters:
\begin{itemize}
\item $\tau$: the overall tightness of the prior. Large values imply a small prior covariance matrix. Controlled by option \texttt{bvar\_prior\_tau}, with a default of 3
\item $d$: the decay factor for scaling down the coefficients of lagged values. Controlled by option \texttt{bvar\_prior\_decay}, with a default of 0.5
\item $\omega$ controls the tightness for the prior on $\Sigma$. Must be an integer. Controlled by option \texttt{bvar\_prior\_omega}, with a default of 1
\item $\lambda$ and $\mu$: additional tuning parameters, respectively controlled by option \texttt{bvar\_prior\_lambda} (with a default of 5) and option \texttt{bvar\_prior\_mu} (with a default of 2)
\item based on a short presample $Y^0$ (in the Dynare implementation, this
  presample consists of the $p$ observations used to initialize the VAR, plus
  one extra observation at the beginning of the sample\footnote{In Dynare 4.2.1
    and older versions, only $p$ observations were used. As a consequence the
    case $p=1$ was buggy, since the standard error of a one-observation sample
    is undefined.}), one also calculates $\sigma = \text{std}(Y^0)$ and
  $\bar{y} = \text{mean}(Y^0)$
\end{itemize}

Below is a description of the different dummy observations. For the sake of simplicity, we shall assume that $ny = 2$, $nx = 1$ and $p = 3$; the generalization is straightforward.

\begin{itemize}
\item Dummies for the coefficients on the first lag:
$$\left[
\begin{array}{cc}
\tau\sigma_1 & 0 \\
0 & \tau\sigma_2
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
\tau\sigma_1 & 0 & 0&0& 0&0& 0 \\
0 & \tau\sigma_2 & 0&0& 0&0& 0
\end{array}
\right]\Phi + U$$

\item Dummies for the coefficients on the second lag:
$$\left[
\begin{array}{cc}
0 & 0 \\
0 & 0
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
0&0& \tau\sigma_1 2^d & 0 & 0&0& 0 \\
0&0& 0 & \tau\sigma_2 2^d & 0&0& 0
\end{array}
\right]\Phi + U$$

\item Dummies for the coefficients on the third lag:
$$\left[
\begin{array}{cc}
0 & 0 \\
0 & 0
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
0&0& 0&0& \tau\sigma_1 3^d & 0 & 0 \\
0&0& 0&0& 0 & \tau\sigma_2 3^d & 0 \\
\end{array}
\right]\Phi + U$$

\item The prior for the covariance matrix is implemented by:
$$\left[
\begin{array}{cc}
\sigma_1 & 0 \\
0 & \sigma_2
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
0&0& 0&0& 0&0& 0 \\
0&0& 0&0& 0&0& 0
\end{array}
\right]\Phi + U$$

These observations are replicated $\omega$ times.

\item Co-persistence prior dummy observation, reflecting the belief that when data on all $y$'s are stable at their initial levels, they will tend to persist at that level:

$$\left[
\begin{array}{cc}
\lambda\bar{y}_1 & \lambda\bar{y}_2
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
\lambda\bar{y}_1 & \lambda\bar{y}_2 & \lambda\bar{y}_1 & \lambda\bar{y}_2 & \lambda\bar{y}_1 & \lambda\bar{y}_2 & \lambda
\end{array}
\right]\Phi + U$$

\textit{Note:} in the implementation, if $\lambda < 0$, the exogenous variables will not be included in the dummy. In that case, the dummy observation becomes:
$$\left[
\begin{array}{cc}
-\lambda\bar{y}_1 & -\lambda\bar{y}_2
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
-\lambda\bar{y}_1 & -\lambda\bar{y}_2 & -\lambda\bar{y}_1 & -\lambda\bar{y}_2 & -\lambda\bar{y}_1 & -\lambda\bar{y}_2 & 0
\end{array}
\right]\Phi + U$$

\item Own-persistence prior dummy observations, reflecting the belief that when $y_i$ has been
stable at its initial level, it will tend to persist at that level, regardless of the value of
other variables:

$$\left[
\begin{array}{cc}
\mu\bar{y}_1 & 0 \\
0 & \mu\bar{y}_2
\end{array}
\right]
=
\left[
\begin{array}{ccccccc}
\mu\bar{y}_1 & 0 & \mu\bar{y}_1 &0 & \mu\bar{y}_1 & 0 & 0 \\
0 & \mu\bar{y}_2 & 0 & \mu\bar{y}_2 & 0 &\mu\bar{y}_2 & 0
\end{array}
\right]\Phi + U$$


\end{itemize}

This makes a total of $T^* = ny\cdot p + ny\cdot\omega + 1 + ny = ny\cdot(p+\omega+1)+1$ dummy observations.
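
For concreteness, the following sketch builds the full set of dummy observations $(Y^*, X^*)$ for arbitrary $ny$ and $p$ (Python, with hypothetical names; Dynare's actual MATLAB code may differ in its details):

\begin{verbatim}
import numpy as np

def sims_dummies(sigma, ybar, p, tau, d, omega, lam, mu, constant=True):
    # sigma, ybar: presample standard deviations and means
    # (length-ny numpy arrays)
    ny = len(sigma)
    k = ny * p + (1 if constant else 0)
    Ys, Xs = [], []
    # dummies for the coefficients on lag j, scaled down by j^d
    for j in range(1, p + 1):
        Y = np.diag(tau * sigma) if j == 1 else np.zeros((ny, ny))
        X = np.zeros((ny, k))
        X[:, (j - 1) * ny:j * ny] = np.diag(tau * sigma * j ** d)
        Ys.append(Y); Xs.append(X)
    # covariance prior dummies, replicated omega times
    for _ in range(omega):
        Ys.append(np.diag(sigma)); Xs.append(np.zeros((ny, k)))
    # co-persistence dummy (constant excluded when lam < 0)
    y_cop = abs(lam) * ybar
    x_cop = np.tile(y_cop, p)
    if constant:
        x_cop = np.append(x_cop, max(lam, 0.0))
    Ys.append(y_cop[None, :]); Xs.append(x_cop[None, :])
    # own-persistence dummies
    Ys.append(np.diag(mu * ybar))
    Xs.append(np.hstack([np.diag(mu * ybar)] * p
                        + ([np.zeros((ny, 1))] if constant else [])))
    return np.vstack(Ys), np.vstack(Xs)           # T* rows each
\end{verbatim}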

\subsection{Training sample prior}

The third component of the prior is constructed from the likelihood of $T^-$ observations $(Y^-,X^-)$ extracted from the beginning of the sample:

$$p_3(\Phi, \Sigma) \propto |\Sigma|^{-T^-/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(Y^--X^-\Phi)'(Y^--X^-\Phi))\right\}$$

In other words, the complete sample is divided into two parts such that $T = T^- + T^+$,
$Y = \left[
\begin{array}{c}
Y^- \\
Y^+
\end{array}
\right]$ and
$X = \left[
\begin{array}{c}
X^- \\
X^+
\end{array}
\right]$.

The size of the training sample $T^-$ is controlled by option \texttt{bvar\_prior\_train}. It is equal to zero by default.
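
In code, this split is a simple partition of the matrices built earlier (a sketch, reusing the \texttt{Y} and \texttt{X} arrays from the first snippet):

\begin{verbatim}
T_train = 10                                 # value of bvar_prior_train
Y_minus, Y_plus = Y[:T_train], Y[T_train:]   # training / estimation parts
X_minus, X_plus = X[:T_train], X[T_train:]
\end{verbatim}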

\section{Characterization of the prior and posterior distributions}

\textit{Notation:} in the following, we will use a small ``$p$'' as superscript for notations related to the prior, and a capital ``$P$'' for notations related to the posterior.

\subsection{Prior distribution}
\label{prior-distrib}

We define the following notations:
\begin{itemize}
\item $T^p = T^* + T^-$
\item
$Y^p = \left[
\begin{array}{c}
Y^* \\
Y^-
\end{array}
\right]$
\item
$X^p = \left[
\begin{array}{c}
X^* \\
X^-
\end{array}
\right]$
\item $\df^p = T^p - k$ if $p_1$ is Jeffreys' prior, or $\df^p = T^p - k - ny - 1$ if $p_1$ is a constant
\end{itemize}

With these notations, one can see that the prior is:
\begin{eqnarray*}
p(\Phi, \Sigma) & = & p_1(\Phi, \Sigma)\cdot p_2(\Phi, \Sigma)\cdot p_3(\Phi, \Sigma) \\
& \propto & |\Sigma|^{-(\df^p + ny + 1 + k)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(Y^p-X^p\Phi)'(Y^p-X^p\Phi))\right\}
\end{eqnarray*}

We define the following notations:
\begin{itemize}
\item $\hat{\Phi^p} = ({X^p}'X^p)^{-1} {X^p}' Y^p$, the coefficients of the linear regression of $Y^p$ on $X^p$
\item $S^p = (Y^p - X^p\hat{\Phi^p})'(Y^p - X^p\hat{\Phi^p})$
\end{itemize}

After some manipulations, one obtains:

\begin{eqnarray*}
p(\Phi, \Sigma) & \propto & |\Sigma|^{-(\df^p + ny + 1 + k)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(S^p + (\Phi-\hat{\Phi^p})'{X^p}'X^p(\Phi-\hat{\Phi^p})))\right\} \\
& \propto & |\Sigma|^{-(\df^p + ny + 1)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}S^p)\right\} \times \\
& & |\Sigma|^{-k/2}\exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(\Phi-\hat{\Phi^p})'{X^p}'X^p(\Phi-\hat{\Phi^p}))\right\}
\end{eqnarray*}

From the above decomposition, one can see that the prior distribution is such that:
\begin{itemize}
\item $\Sigma$ is distributed according to an inverse-Wishart distribution, with $\df^p$ degrees of freedom and parameter $S^p$
\item conditional on $\Sigma$, the matrix $\Phi$ is distributed according to a matrix-normal distribution, with mean $\hat{\Phi^p}$ and variance-covariance parameters $\Sigma$ and $({X^p}'X^p)^{-1}$
\end{itemize}
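
Computationally, $\hat{\Phi^p}$ and $S^p$ are just the OLS coefficients and the residual sum-of-squares matrix of the stacked dummy-plus-training sample, as in the following sketch (the helper name is illustrative):

\begin{verbatim}
import numpy as np

def ols_moments(Y, X):
    # Phi_hat = (X'X)^{-1} X'Y and S = (Y - X Phi_hat)'(Y - X Phi_hat)
    Phi_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ Phi_hat
    return Phi_hat, resid.T @ resid
\end{verbatim}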

\emph{Remark concerning the degrees of freedom of the inverse-Wishart:} the inverse-Wishart distribution requires the number of degrees of freedom to be greater than or equal to the number of variables, i.e. $\df^p \geq ny$. When the \texttt{bvar\_prior\_flat} option is not specified, we have:
$$\df^p = T^p - k = ny\cdot(p+\omega+1)+1+T^--ny\cdot p-nx = ny\cdot(\omega+1)+T^-$$
so that the condition is always fulfilled. When the \texttt{bvar\_prior\_flat} option is specified, we have:
$$\df^p = ny\cdot \omega + T^- - 1$$
so that with the defaults ($\omega = 1$ and $T^- = 0$) the condition is not met. The user needs to increase either \texttt{bvar\_prior\_omega} or \texttt{bvar\_prior\_train}.

\subsection{Posterior distribution}

Using Bayes' formula, the posterior density is given by:

\begin{equation}
\label{bayes-formula}
p(\Phi, \Sigma | Y^+, X^+) = \frac{p(Y^+ | \Phi, \Sigma, X^+) \cdot p(\Phi, \Sigma)}{p(Y^+ | X^+)}
\end{equation}

The posterior kernel is:

$$p(\Phi, \Sigma | Y^+, X^+) \propto p(Y^+ | \Phi, \Sigma, X^+) \cdot p(\Phi, \Sigma)$$

The likelihood is given by:

$$p(Y^+ | \Phi, \Sigma, X^+) = (2\pi)^{-\frac{T^+ \cdot ny}{2}} |\Sigma|^{-\frac{T^+}{2}} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(Y^+-X^+\Phi)'(Y^+-X^+\Phi))\right\}$$

Combining it with the prior, we obtain the following posterior kernel:

$$p(\Phi, \Sigma | Y^+, X^+) \propto  |\Sigma|^{-(\df^P + ny + 1 + k)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(Y^P-X^P\Phi)'(Y^P-X^P\Phi))\right\}$$

where:
\begin{itemize}
\item $T^P = T^+ + T^p = T^+ + T^- + T^*$
\item $Y^P = \left[
\begin{array}{c}
Y^p \\
Y^+
\end{array}
\right]
= \left[
\begin{array}{c}
Y^* \\
Y^- \\
Y^+
\end{array}
\right]$
\item $X^P = \left[
\begin{array}{c}
X^p \\
X^+
\end{array}
\right]
= \left[
\begin{array}{c}
X^* \\
X^- \\
X^+
\end{array}
\right]$
\item $\df^P = \df^p + T^+$. If $p_1$ is Jeffreys' prior, then $\df^P = T^P - k$. If $p_1$ is a constant, $\df^P = T^P - k - ny - 1$.
\end{itemize}

Using the same manipulations as for the prior, the posterior density can be rewritten as:
\begin{eqnarray*}
p(\Phi, \Sigma | Y^+, X^+) & \propto & |\Sigma|^{-(\df^P + ny + 1)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}S^P)\right\} \times \\
& & |\Sigma|^{-k/2}\exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(\Phi-\hat{\Phi^P})'{X^P}'X^P(\Phi-\hat{\Phi^P}))\right\}
\end{eqnarray*}
where:
\begin{itemize}
\item $\hat{\Phi^P} = ({X^P}'X^P)^{-1} {X^P}' Y^P$, the coefficients of the linear regression of $Y^P$ on $X^P$
\item $S^P = (Y^P - X^P\hat{\Phi^P})'(Y^P - X^P\hat{\Phi^P})$
\end{itemize}

From the above decomposition, one can see that the posterior distribution is such that:
\begin{itemize}
\item $\Sigma$ is distributed according to an inverse-Wishart distribution, with $\df^P$ degrees of freedom and parameter $S^P$
\item conditional on $\Sigma$, the matrix $\Phi$ is distributed according to a matrix-normal distribution, with mean $\hat{\Phi^P}$ and variance-covariance parameters $\Sigma$ and $({X^P}'X^P)^{-1}$
\end{itemize}

\emph{Remark concerning the degrees of freedom of the inverse-Wishart:} in theory, the condition on the degrees of freedom of the inverse-Wishart may not be satisfied. In practice this is not a problem, since $T^+$ is large.
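
The characterization above is what makes posterior simulation straightforward: one first draws $\Sigma$ from the inverse-Wishart, then $\Phi$ from the conditional matrix-normal distribution. Below is a minimal sketch (Python with SciPy; Dynare's own MATLAB sampler may differ in implementation details):

\begin{verbatim}
import numpy as np
from scipy.stats import invwishart

def draw_posterior(Phi_hat, S, XtX_inv, df, rng):
    # Sigma ~ IW(df, S)
    Sigma = invwishart.rvs(df=df, scale=S, random_state=rng)
    # Phi | Sigma is matrix-normal: vec(Phi) is Gaussian with mean
    # vec(Phi_hat) and covariance kron(Sigma, XtX_inv)
    k, ny = Phi_hat.shape
    L = np.linalg.cholesky(XtX_inv)
    C = np.linalg.cholesky(Sigma)
    Phi = Phi_hat + L @ rng.standard_normal((k, ny)) @ C.T
    return Phi, Sigma
\end{verbatim}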

\section{Marginal density}

By integrating equation (\ref{bayes-formula}) over $(\Phi, \Sigma)$, one gets:

$$p(Y^+ | X^+) = \int p(Y^+ | \Phi, \Sigma, X^+) \cdot p(\Phi, \Sigma) d\Phi d\Sigma$$

We define the following notation for the unnormalized density of a matrix-normal-inverse-Wishart:
\begin{eqnarray*}
f(\Phi,\Sigma | \df,S,\hat{\Phi},\Omega) & = & |\Sigma|^{-(\df + ny + 1)/2} \exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}S)\right\} \times \\
& & |\Sigma|^{-k/2}\exp\left\{-\frac{1}{2}Tr(\Sigma^{-1}(\Phi-\hat{\Phi})'\Omega^{-1}(\Phi-\hat{\Phi}))\right\}
\end{eqnarray*}

We also define:
$$F(\df,S,\hat{\Phi},\Omega) = \int f(\Phi,\Sigma | \df,S,\hat{\Phi},\Omega)d\Phi d\Sigma$$

The function $F$ has an analytical form, which is given by the normalization constants of matrix-normal and inverse-Wishart densities:\footnote{Function \texttt{matricint} of file \texttt{bvar\_density.m} implements the calculation of the log of $F$.}

$$F(\df,S,\hat{\Phi},\Omega) = (2\pi)^{\frac{ny\cdot k}{2}} |\Omega|^{\frac{ny}{2}} \cdot 2^{\frac{ny\cdot \df}{2}} \pi^{\frac{ny(ny-1)}{4}} |S|^{-\frac{\df}{2}} \prod_{i=1}^{ny} \Gamma\left(\frac{\df + 1 - i}{2}\right) $$

The prior density is:
$$p(\Phi, \Sigma) = c^p \cdot f(\Phi,\Sigma | \df^p,S^p,\hat{\Phi^p},({X^p}'X^p)^{-1})$$
where the normalization constant is $c^p = F(\df^p,S^p,\hat{\Phi^p},({X^p}'X^p)^{-1})$.



Combining the prior with the likelihood, one can see that the marginal density is:

\begin{eqnarray*}
p(Y^+ | X^+) & = & \frac{\int (2\pi)^{-\frac{T^+\cdot ny}{2}} f(\Phi,\Sigma | \df^P,S^P,\hat{\Phi^P},({X^P}'X^P)^{-1})d\Phi d\Sigma}{F(\df^p,S^p,\hat{\Phi^p},({X^p}'X^p)^{-1})} \\
& = & \frac{(2\pi)^{-\frac{T^+\cdot ny}{2}} F(\df^P,S^P,\hat{\Phi^P},({X^P}'X^P)^{-1})}{F(\df^p,S^p,\hat{\Phi^p},({X^p}'X^p)^{-1})}
\end{eqnarray*}
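
These formulas translate directly into code, working in logs to avoid numerical overflow. The sketch below mirrors the calculation performed by \texttt{bvar\_density.m}, though the function names are illustrative:

\begin{verbatim}
import numpy as np
from scipy.special import gammaln

def log_F(df, S, Omega, ny, k):
    # log of the normalization constant F defined above
    _, logdet_Omega = np.linalg.slogdet(Omega)
    _, logdet_S = np.linalg.slogdet(S)
    return (0.5 * ny * k * np.log(2 * np.pi)
            + 0.5 * ny * logdet_Omega
            + 0.5 * ny * df * np.log(2.0)
            + 0.25 * ny * (ny - 1) * np.log(np.pi)
            - 0.5 * df * logdet_S
            + sum(gammaln(0.5 * (df + 1 - i)) for i in range(1, ny + 1)))

def log_marginal_density(T_plus, ny, logF_posterior, logF_prior):
    # log p(Y+ | X+) as the ratio of normalization constants
    return (-0.5 * T_plus * ny * np.log(2 * np.pi)
            + logF_posterior - logF_prior)
\end{verbatim}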

\section{Dynare commands}
\label{dynare-commands}

Dynare incorporates three commands related to BVAR models \`a la Sims:
\begin{itemize}
\item \texttt{bvar\_density} for computing the marginal density,
\item \texttt{bvar\_forecast} for forecasting (and RMSE computation),
\item \texttt{bvar\_irf} for computing Impulse Response Functions.
\end{itemize}

\subsection{Common options}

The \texttt{bvar\_density} and \texttt{bvar\_forecast} commands share a set of common options, which can be divided into two groups. They are described in the following subsections.

\emph{An important remark concerning options:} in Dynare, all options are global. This means that, if you have set an option in a given command, Dynare will remember this setting for subsequent commands (unless you change it again). For example, if you call \texttt{bvar\_density} with option \texttt{bvar\_prior\_tau = 2}, then all subsequent \texttt{bvar\_density} and \texttt{bvar\_forecast} commands will assume a value of 2 for \texttt{bvar\_prior\_tau}, unless you redeclare it. This remark also applies to \texttt{datafile} and similar options, which means that you can run a BVAR estimation after a Dynare estimation without having to respecify the datafile.

\subsubsection{Options related to model and prior specifications}
\label{sec-model-prior-options}

The options related to the prior are:
\begin{itemize}
\item \texttt{bvar\_prior\_tau} (default: 3)
\item \texttt{bvar\_prior\_decay} (default: 0.5)
\item \texttt{bvar\_prior\_lambda} (default: 5)
\item \texttt{bvar\_prior\_mu} (default: 2)
\item \texttt{bvar\_prior\_omega} (default: 1)
\item \texttt{bvar\_prior\_flat} (not enabled by default)
\item \texttt{bvar\_prior\_train} (default: 0)
\end{itemize}
Please refer to section \ref{sec-prior} for a discussion of their meaning.

\emph{Remark:} when option \texttt{bvar\_prior\_flat} is specified, the condition on the degrees of freedom of the inverse-Wishart distribution is not necessarily satisfied (see section \ref{prior-distrib}). One needs to increase either \texttt{bvar\_prior\_omega} or \texttt{bvar\_prior\_train} in that case.

It is also possible to use either option \texttt{constant} or \texttt{noconstant}, to specify whether a constant term should be included in the BVAR model. The default is to include one.

\subsubsection{Options related to the estimated dataset}

The list of (endogenous) variables of the BVAR model has to be declared through a \texttt{varobs} statement (see Dynare reference manual).

The options related to the estimated dataset are the same as for the \texttt{estimation} command (please refer to the Dynare reference manual for more details):
\begin{itemize}
\item \texttt{datafile}
\item \texttt{first\_obs}
\item \texttt{presample}
\item \texttt{nobs}
\item \texttt{prefilter}
\item \texttt{xls\_sheet}
\item \texttt{xls\_range}
\end{itemize}

Note that option \texttt{prefilter} implies option \texttt{noconstant}.

Please also note that if option \texttt{loglinear} had been specified in a previous \texttt{estimation} statement, without option \texttt{logdata}, then the BVAR model will be estimated on the log of the provided dataset, in order to remain consistent with the DSGE estimation procedure.

\emph{Restrictions related to the initialization of lags:} in DSGE estimation routines, the likelihood (and therefore the marginal density) is evaluated starting from the observation numbered \texttt{first\_obs + presample} in the datafile.\footnote{\texttt{first\_obs} points to the first observation to be used in the datafile (defaults to 1), and \texttt{presample} indicates how many observations after \texttt{first\_obs} will be used to initialize the Kalman filter (defaults to 0).} The BVAR estimation routines use the same convention (i.e. the first observation of $Y^+$ will be \texttt{first\_obs + presample}). Since we need $p$ observations to initialize the lags, and since we may also use a training sample, the user must ensure that the following condition holds (estimation will fail otherwise):
$$\texttt{first\_obs} + \texttt{presample} > \texttt{bvar\_prior\_train} + \textit{number\_of\_lags}$$


\subsection{Marginal density}

The syntax for computing the marginal density is:

\medskip
\texttt{bvar\_density(}\textit{options\_list}\texttt{) }\textit{max\_number\_of\_lags}\texttt{;}
\medskip

The options are those described above.

The command will actually compute the marginal density for several models: first for the model with one lag, then with two lags, and so on up to \textit{max\_number\_of\_lags} lags. Results will be stored in a \textit{max\_number\_of\_lags} by 1 vector \texttt{oo\_.bvar.log\_marginal\_data\_density}. The command will also store the prior and posterior information into \textit{max\_number\_of\_lags} by 1 cell arrays \texttt{oo\_.bvar.prior} and \texttt{oo\_.bvar.posterior}.

\subsection{Forecasting}

The syntax for computing (out-of-sample) forecasts is:

\medskip
\texttt{bvar\_forecast(}\textit{options\_list}\texttt{) }\textit{max\_number\_of\_lags}\texttt{;}
\medskip

The options are those described above, plus a few additional ones:
\begin{itemize}
\item \texttt{forecast}: the number of periods over which to compute forecasts after the end of the sample (no default)
\item \texttt{bvar\_replic}: the number of replications for Monte Carlo simulations (default: 2000)
\item \texttt{conf\_sig}: confidence interval for graphs (default: 0.9)
\end{itemize}

The \texttt{forecast} option is mandatory.

The command will draw \texttt{bvar\_replic} random samples from the posterior distribution. For each draw, it will simulate one path without shocks, and one path with shocks.
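
A sketch of one such simulation step (Python; \texttt{Phi} and \texttt{Sigma} are a posterior draw, and the function name is illustrative):

\begin{verbatim}
import numpy as np

def simulate_path(Phi, Sigma, y_init, horizon, rng=None, constant=True):
    # y_init: the last p observations, most recent last (p x ny);
    # rng=None produces the path without shocks
    ny = Sigma.shape[0]
    p = (Phi.shape[0] - (1 if constant else 0)) // ny
    hist = [y for y in y_init[::-1]]       # most recent observation first
    C = np.linalg.cholesky(Sigma)
    path = []
    for _ in range(horizon):
        x = np.concatenate(hist[:p] + ([np.ones(1)] if constant else []))
        y = x @ Phi
        if rng is not None:                # add a shock u_t ~ N(0, Sigma)
            y = y + C @ rng.standard_normal(ny)
        path.append(y)
        hist.insert(0, y)
    return np.array(path)                  # horizon x ny
\end{verbatim}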

% \emph{Note:} during the random sampling process, every draw such that the associated companion matrix has eigenvalues outside the unit circle will be discarded. This is meant to avoid explosive time series, especially when using a distant prediction horizon. Since this behaviour induces a distortion of the prior distribution, a message will be displayed if draws are thus discarded, indicating how many have been (knowing that the number of accepted draws is equal to \texttt{bvar\_replic}).

The command will produce one graph per observed variable. Each graph displays:
\begin{itemize}
\item a blue line for the posterior median forecast,% (equal to the mean of the simulated paths by linearity),
\item two green lines giving the confidence interval for the forecasts without shocks,
\item two red lines giving the confidence interval for the forecasts with shocks.
\end{itemize}

Moreover, if option \texttt{nobs} is specified, the command will also compute the root mean squared error (RMSE) for all variables between the end of the sample and the end of the datafile.

Most results are stored for future use:
\begin{itemize}
\item mean, median, variance and confidence intervals for forecasts (with shocks) are stored in \texttt{oo\_.bvar.forecast.with\_shocks} (in time series form),
\item \textit{idem} for forecasts without shocks in \texttt{oo\_.bvar.forecast.no\_shock},
\item all simulated samples are stored in variables \texttt{sims\_no\_shock} and \texttt{sims\_with\_shocks} in file \textit{mod\_file}\texttt{/bvar\_forecast/simulations.mat}. Those variables are 3-dimensional arrays: first dimension is time, second dimension is variable (in the order of the \texttt{varobs} declaration), third dimension indexes the sample number,
\item if RMSE has been computed, results are in \texttt{oo\_.bvar.forecast.rmse}.
\end{itemize}

\subsection{Impulse Response Functions}

The syntax for computing impulse response functions is:

\medskip
\texttt{bvar\_irf(}\textit{number\_of\_lags},\textit{identification\_scheme}\texttt{);}
\medskip

The \textit{identification\_scheme} argument can take two values:
\begin{itemize}
\item \texttt{'Cholesky'}: uses a lower triangular factorization of the covariance matrix (default),
\item \texttt{'SquareRoot'}: uses the matrix square root of the covariance matrix (MATLAB's \verb+sqrtm+ routine).
\end{itemize}

Keep in mind that the first factorization is sensitive to the ordering of the variables (as declared in the mod file with \verb+var+). This is not the case for the second factorization, but its structural interpretation is, at best, unclear (the matrix square root of a covariance matrix $\Sigma$ is the unique symmetric positive semidefinite matrix $A$ such that $\Sigma = AA$).
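
The difference between the two factorizations can be checked in a few lines (a Python sketch with an arbitrary example covariance matrix):

\begin{verbatim}
import numpy as np
from scipy.linalg import sqrtm

Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])         # example covariance matrix

A_chol = np.linalg.cholesky(Sigma)     # lower triangular: Sigma = A A'
A_sqrt = sqrtm(Sigma)                  # symmetric:        Sigma = A A

assert np.allclose(A_chol @ A_chol.T, Sigma)
assert np.allclose(A_sqrt @ A_sqrt, Sigma)
# Reordering the variables permutes the matrix square root consistently,
# but produces a genuinely different Cholesky factor.
\end{verbatim}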

If you want to change the length of the IRFs plotted by the command, put

\medskip
\texttt{options\_.irf=40;}
\medskip

before the \texttt{bvar\_irf} command. Similarly, to change the coverage of the highest posterior density intervals to, e.g., 60\%, put

\medskip
\texttt{options\_.bvar.conf\_sig=0.6;}
\medskip

before the command.


The mean, median, variance, and confidence intervals for IRFs are saved in \texttt{oo\_.bvar.irf}.




\section{Examples}

This section presents two short examples of BVAR estimations. These examples and the associated datafile (\texttt{bvar\_sample.m}) can be found in the \texttt{tests/bvar\_a\_la\_sims} directory of the Dynare v4 subversion tree.

\subsection{Standalone BVAR estimation}

Here is a simple \texttt{mod} file example for a standalone BVAR estimation:

\begin{verbatim}
var dx dy;
varobs dx dy;

bvar_density(datafile = bvar_sample, first_obs = 20, bvar_prior_flat,
             bvar_prior_train = 10) 8;

bvar_forecast(forecast = 10, bvar_replic = 10000, nobs = 200) 8;

bvar_irf(8,'Cholesky');
\end{verbatim}

Note that you must declare the variables used in the estimation twice: first in a \texttt{var} statement, then in a \texttt{varobs} statement. This is necessary to have a syntactically correct \texttt{mod} file.

The first component of the prior is flat. The prior also incorporates a training sample. Note that the \texttt{bvar\_prior\_*} options also apply to the second command since all options are global.

The \texttt{bvar\_density} command will compute marginal density for all models from 1 up to 8 lags.

The \texttt{bvar\_forecast} command will compute forecasts for a BVAR model with 8 lags, for 10 periods in the future, and with 10000 replications. Since \texttt{nobs} is specified and is such that \texttt{first\_obs + nobs - 1} is strictly less than the number of observations in the datafile, the command will also compute RMSE.

\subsection{In parallel with a DSGE estimation}

Here follows an example \texttt{mod} file, which performs both a DSGE and a BVAR estimation:

\begin{verbatim}
var dx dy;
varexo e_x e_y;
parameters rho_x rho_y;

rho_x = 0.5;
rho_y = -0.3;

model;
dx = rho_x*dx(-1)+e_x;
dy = rho_y*dy(-1)+e_y;
end;

estimated_params;
rho_x,NORMAL_PDF,0.5,0.1;
rho_y,NORMAL_PDF,-0.3,0.1;
stderr e_x,INV_GAMMA_PDF,0.01,inf;
stderr e_y,INV_GAMMA_PDF,0.01,inf;
end;

varobs dx dy;

check;

estimation(datafile = bvar_sample, mh_replic = 1200, mh_jscale = 1.3,
           first_obs = 20);

bvar_density(bvar_prior_train = 10) 8;

bvar_forecast(forecast = 10, bvar_replic = 2000, nobs = 200) 8;
\end{verbatim}

Note that the BVAR commands take their \texttt{datafile} and \texttt{first\_obs} options from the \texttt{estimation} command.



\section*{References}

\noindent Doan, Thomas, Robert Litterman, and Christopher Sims (1984), ``\textit{Forecasting and Conditional Projections Using Realistic Prior Distributions}'', Econometric Reviews, \textbf{3}, 1--100

Schorfheide, Frank (2004), ``\textit{Notes on Model Evaluation}'', Department of Economics, University of Pennsylvania

Sims, Christopher (2003), ``\textit{Matlab Procedures to Compute Marginal Data Densities for VARs with Minnesota and Training Sample Priors}'', Department of Economics, Princeton University

\end{document}