\chapter{Multivariate time series models} \label{chap:var} \app{Gretl} provides a standard set of procedures for dealing with the multivariate time-series models known as VARs (\emph{Vector AutoRegression}). More general models --- such as VARMAs, nonlinear models or multivariate GARCH models --- are not provided as of now, although it is entirely possible to estimate them by writing custom procedures in the \app{gretl} scripting language. In this chapter, we will briefly review \app{gretl}'s VAR toolbox. \section{Notation} \label{sec:var-def} A VAR is a structure whose aim is to model the time persistence of a vector of $n$ time series, $y_t$, via a multivariate autoregression, as in \begin{equation} \label{eq:VAR} y_t = A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + B x_t + \epsilon_t \end{equation} The number of lags $p$ is called the \emph{order} of the VAR. The vector $x_t$, if present, contains a set of exogenous variables, often including a constant, possibly with a time trend and seasonal dummies. The vector $\epsilon_t$ is typically assumed to be a vector white noise, with covariance matrix $\Sigma$. Equation \eqref{eq:VAR} can be written more compactly as \begin{equation} \label{eq:VARpoly} A(L) y_t = B x_t + \epsilon_t \end{equation} where $A(L)$ is a matrix polynomial in the lag operator, or as \begin{equation} \label{eq:VARcompan} \left[\begin{array}{c} y_t \\ y_{t-1} \\ \cdots \\ y_{t-p-1} \end{array} \right] = \mathbf{A} \left[\begin{array}{c} y_{t-1} \\ y_{t-2} \\ \cdots \\ y_{t-p} \end{array} \right] + \left[\begin{array}{c} B \\ 0 \\ \cdots \\ 0 \end{array} \right] x_t + \left[\begin{array}{c} \epsilon_t \\ 0 \\ \cdots \\ 0 \end{array} \right] \end{equation} The matrix $\mathbf{A}$ is known as the ``companion matrix'' and equals \[ \mathbf{A} = \left[\begin{array}{ccccc} A_1 & A_2 & \cdots & A_p \\ I & 0 & \cdots & 0 \\ 0 & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \end{array} \right] \] Equation \eqref{eq:VARcompan} is known as the ``companion form'' of the VAR. Another representation of interest is the so-called ``VMA representation'', which is written in terms of an infinite series of matrices $\Theta_i$ defined as \begin{equation} \label{eq:VMA} \Theta_i = \frac{\partial y_t}{\partial \epsilon_{t-i}} \end{equation} The $\Theta_i$ matrices may be derived by recursive substitution in equation \eqref{eq:VAR}: for example, assuming for simplicity that $B=0$ and $p=1$, equation \eqref{eq:VAR} would become \[ y_t = A y_{t-1} + \epsilon_t \] which could be rewritten as \[ y_t = A^{n+1} y_{t-n-1} + \epsilon_t + A \epsilon_{t-1} + A^2 \epsilon_{t-2} + \cdots + A^n \epsilon_{t-n} \] In this case $\Theta_i = A^i$. In general, it is possible to compute $\Theta_i$ as the $n \times n$ north-west block of the $i$-th power of the companion matrix $\mathbf{A}$ (so $\Theta_0$ is always an identity matrix). The VAR is said to be \emph{stable} if all the eigenvalues of the companion matrix $\mathbf{A}$ are smaller than 1 in absolute value, or equivalently, if the matrix polynomial $A(L)$ in equation \eqref{eq:VARpoly} is such that $|A(z)| = 0$ implies $|z|>1$. If this is the case, $\lim_{n \to \infty} \Theta_n = 0$ and the vector $y_t$ is stationary; as a consequence, the equation \begin{equation} \label{eq:VMArep} y_t - E(y_t) = \sum_{i=0}^{\infty} \Theta_i \epsilon_{t-i} \end{equation} is a legitimate Wold representation. If the VAR is not stable, then the inferential procedures that are called for become somewhat more specialized, except for some simple cases. In particular, if the number of eigenvalues of $\mathbf{A}$ with modulus 1 is between 1 and $n-1$, the canonical tool to deal with these models is the cointegrated VAR model, discussed in chapter \ref{chap:vecm}. \section{Estimation} \label{sec:var-estim} \begin{script}[htbp] \caption{Estimation of a VAR via OLS} \label{script:var-ols} Input: \begin{scodebit} open sw_ch14.gdt genr infl = 400*sdiff(log(PUNEW)) scalar p = 2 list X = LHUR infl list Xlag = lags(p,X) loop foreach i X ols $i const Xlag end loop var p X \end{scodebit} %$ Output (selected portions): \begin{scodebit} Model 1: OLS, using observations 1960:3-1999:4 (T = 158) Dependent variable: LHUR coefficient std. error t-ratio p-value -------------------------------------------------------- const 0.113673 0.0875210 1.299 0.1960 LHUR_1 1.54297 0.0680518 22.67 8.78e-51 *** LHUR_2 -0.583104 0.0645879 -9.028 7.00e-16 *** infl_1 0.0219040 0.00874581 2.505 0.0133 ** infl_2 -0.0148408 0.00920536 -1.612 0.1090 Mean dependent var 6.019198 S.D. dependent var 1.502549 Sum squared resid 8.654176 S.E. of regression 0.237830 ... VAR system, lag order 2 OLS estimates, observations 1960:3-1999:4 (T = 158) Log-likelihood = -322.73663 Determinant of covariance matrix = 0.20382769 AIC = 4.2119 BIC = 4.4057 HQC = 4.2906 Portmanteau test: LB(39) = 226.984, df = 148 [0.0000] Equation 1: LHUR coefficient std. error t-ratio p-value -------------------------------------------------------- const 0.113673 0.0875210 1.299 0.1960 LHUR_1 1.54297 0.0680518 22.67 8.78e-51 *** LHUR_2 -0.583104 0.0645879 -9.028 7.00e-16 *** infl_1 0.0219040 0.00874581 2.505 0.0133 ** infl_2 -0.0148408 0.00920536 -1.612 0.1090 Mean dependent var 6.019198 S.D. dependent var 1.502549 Sum squared resid 8.654176 S.E. of regression 0.237830 \end{scodebit} \end{script} The \app{gretl} command for estimating a VAR is \cmd{var} which, in the command line interface, is invoked in the following manner: \begin{flushleft} \texttt{[ \emph{modelname} \textless - ] var \emph{p} \emph{Ylist} [; \emph{Xlist}]} \end{flushleft} where \texttt{p} is a scalar (the VAR order) and \texttt{Ylist} is a list of variables describing the content of $y_t$. If the list \texttt{Xlist} is absent, the vector $x_t$ is understood to contain a constant only; if present, must be separated from \texttt{Ylist} by a semi-colon and contains the other exogenous variables. Note, however, that a few common choices can be obtained in a simpler way via options: the options \app{gretl} provides are \option{trend}, \option{seasonals} and \option{nc} (no constant). Either \texttt{Ylist} and \texttt{Xlist} may be named lists (see section \ref{named-lists}). The ``\texttt{\textless -}'' construct can be used to store the model under a name (see section \ref{sect-script-objects}), if so desired. To estimate a VAR using the graphical interface, choose ``Time Series, Vector Autoregression'', under the Model menu. The parameters in eq. \eqref{eq:VAR} are typically free from restrictions, which implies that multivariate OLS provides a consistent and asymptotically efficient estimator of all the parameters.\footnote{In fact, under normality of $\epsilon_t$ OLS is indeed the conditional ML estimator. You may want to use other methods if you need to estimate a VAR in which some parameters are constrained.} Given the simplicity of OLS, this is what every software package, including \app{gretl}, uses: example script \ref{script:var-ols} exemplifies the fact that the \cmd{var} command gives you exactly the output you would have from a battery of OLS regressions. The advantage of using the dedicated command is that, after estimation is done, it makes it much easier to access certain quantities and manage certain tasks. For example, the \dollar{coeff} accessor returns the estimated coefficients as a matrix with $n$ columns and \dollar{sigma} returns an estimate of the matrix $\Sigma$, the covariance matrix of $\epsilon_t$. Moreover, for each variable in the system an F test is automatically performed, in which the null hypothesis is that no lags of variable $j$ are significant in the equation for variable $i$. This is commonly known as a \textbf{Granger causality} test. \begin{table}[htbp] \centering \begin{tabular}{rl} \hline Periodicity & horizon \\ \hline Quarterly & 20 (5 years) \\ Monthly & 24 (2 years) \\ Daily & 3 weeks \\ All other cases & 10 \\ \hline \end{tabular} \caption{VMA horizon as a function of the dataset periodicity} \label{tab:var-horizon} \end{table} In addition, two accessors become available for the companion matrix (\dollar{compan}) and the VMA representation (\dollar{vma}). The latter deserves a detailed description: since the VMA representation \eqref{eq:VMArep} is of infinite order, \app{gretl} defines a \emph{horizon} up to which the $\Theta_i$ matrices are computed automatically. By default, this is a function of the periodicity of the data (see table \ref{tab:var-horizon}), but it can be set by the user to any desired value via the \cmd{set} command with the \cmd{horizon} parameter, as in \begin{code} set horizon 30 \end{code} Calling the horizon $h$, the \dollar{vma} accessor returns an $(h+1) \times n^2$ matrix, in which the $(i+1)$-th row is the vectorized form of $\Theta_i$. \subsection{VAR order selection} In order to help the user choose the most appropriate VAR order, \app{gretl} provides a special syntax construct to the \texttt{var} command: \begin{flushleft} \texttt{var \emph{p} \emph{Ylist} [; \emph{Xlist}]} \option{lagselect} \end{flushleft} When the command is invoked with the \option{lagselect} option, estimation is performed for all lags up to \texttt{p} and a table is printed: it displays, for each order, a LR test for the order $p$ versus $p-1$, plus an array of information criteria (see chapter \ref{select-criteria}). For each information criterion in the table, a star indicates what appears to be the ``best'' choice. The same output can be obtained through the graphical interface via the ``Time Series, VAR lag selection'' entry under the Model menu. \begin{script}[htbp] \caption{VAR lag selection via Information Criteria} \label{script:var-lagselect} Input: \begin{scodebit} open denmark list Y = 1 2 3 4 var 4 Y --lagselect var 6 Y --lagselect \end{scodebit} %$ Output (selected portions): \begin{scodebit} VAR system, maximum lag order 4 The asterisks below indicate the best (that is, minimized) values of the respective information criteria, AIC = Akaike criterion, BIC = Schwarz Bayesian criterion and HQC = Hannan-Quinn criterion. lags loglik p(LR) AIC BIC HQC 1 609.15315 -23.104045 -22.346466* -22.814552 2 631.70153 0.00013 -23.360844* -21.997203 -22.839757* 3 642.38574 0.16478 -23.152382 -21.182677 -22.399699 4 653.22564 0.15383 -22.950025 -20.374257 -21.965748 VAR system, maximum lag order 6 The asterisks below indicate the best (that is, minimized) values of the respective information criteria, AIC = Akaike criterion, BIC = Schwarz Bayesian criterion and HQC = Hannan-Quinn criterion. lags loglik p(LR) AIC BIC HQC 1 594.38410 -23.444249 -22.672078* -23.151288* 2 615.43480 0.00038 -23.650400* -22.260491 -23.123070 3 624.97613 0.26440 -23.386781 -21.379135 -22.625083 4 636.03766 0.13926 -23.185210 -20.559827 -22.189144 5 658.36014 0.00016 -23.443271 -20.200150 -22.212836 6 669.88472 0.11243 -23.260601 -19.399743 -21.795797 \end{scodebit} \end{script} Warning: in finite samples the choice of $p$ may affect the outcome of the procedure. \textbf{This is not a bug}, but rather a nasty but unavoidable side effect of the way these comparisons should be made: if your sample contains $T$ observations, the lag selection procedure, if invoked with parameter $p$, examines all VARs of order ranging form 1 to $p$, estimated on a sample of $T-p$ observations. In other words, the comparison procedure does not use all the data available when estimating VARs of order less than $p$ to make sure that all the models compared are estimated on the same data range. Under these circumstances, choosing a different value of $p$ may alter the results, although this is unlikely to happen if your sample size is reasonably large. An example of this unpleasant phenomenon is given in example script \ref{script:var-lagselect}. As can be seen, according to the Hannan-Quinn criterion, order 2 seems preferable to order 1 if the maximum tested order is 4, but the situation is reversed if the maximum tested order is 6. \section{Structural VARs} \label{sec:svar} As of today, \app{gretl} does not provide a native implementation for the class of models known as ``Structural VARs''; however, it provides an implementation of the Cholesky deconposition-based approach, which is the most classic, and certainly most popular SVAR version. \subsection{IRF and FEVD} Assume that the disturbance in equation \eqref{eq:VAR} can be thought of as a linear function of a vector of \emph{structural shocks} $u_t$, which are assumed to have unit variance and to be incorrelated to one another, so $V(u_t) = I$. If $\epsilon_t = K u_t$, it follows that $\Sigma = V(\epsilon_t) = KK'$. The main object of interest in this setting the sequence of matrices \begin{equation} \label{eq:svma} C_k = \frac{\partial y_t}{\partial u_{t-i}} = \Theta_k K, \end{equation} known as the structural VMA representation. From the $C_k$ matrices defined in equation \eqref{eq:svma} two quantities of interest may be derived: the \textbf{Impulse Response Function} (IRF) and the \textbf{Forecast Error Variance Decomposition} (FEVD). The IRF of variable $i$ to shock $j$ is simply the sequence of the elements in row $i$ and column $j$ of the $C_k$ matrices. In formulae: \[ \mathcal{I}_{i,j,k} = \pder{y_{i,t}}{u_{j, t-k}} \] As a rule, Impulse Response Functions are plotted as a function of $k$, and are interpreted as the effect that a shock has on an observable variable through time. Of course, what we observe are the estimated IRFs, so it is natural to endow them with confidence intervals: following common practice among econometric software, \app{gretl} computes the confidence intervals by using the bootstrap\footnote{It is possible, in principle, to compute analytical confidence intervals via an asymptotic approximation, but this is not a very popular choice: asymptotic formulae are known to often give a very poor approximation of the finite-sample properties.}; details are later in this section. Another quantity of interest that may be computed from the structural VMA representation is the Forecast Error Variance Decomposition (FEVD). The forecast error variance after $h$ steps is given by \[ \Omega_h = \sum_{k=0}^h C_k C_k' \] hence the variance for variable $i$ is \[ \omega^2_i = \left[ \Omega_h \right]_{i,i} = \sum_{k=0}^h \mathrm{diag}(C_k C_k')_i = \sum_{k=0}^h \sum_{l=1}^n ({}_kc_{i.l})^2 \] where ${}_kc_{i.l}$ is, trivially, the $i,l$ element of $C_k$. As a consequence, the share of uncertainty on variable $i$ that can be attributed to the $j$-th shock after $h$ periods equals \[ \mathcal{VD}_{i,j,h} = \frac{\sum_{k=0}^h ({}_kc_{i.j})^2 }{ \sum_{k=0}^h \sum_{l=1}^n ({}_kc_{i.l})^2 } . \] This makes it possible to quantify which shocks are most important to determine a certain variable in the short and/or in the long run. \subsection{Triangularization} The formula \ref{eq:svma} takes $K$ as known, while of course it has to be estimated. The estimation problem has been the subject of an enormous body of literature we will not even attempt to summarize here: see for example \cite[chapter 9]{LKBook05}. Suffice it to say that the most popular choice dates back to \cite{sims80}, and consists in assuming that $K$ is lower triangular, so its estimate is simply the Cholesky deconposition of the estimate of $\Sigma$. The main consequence of this choice is that the ordering of variables within the vector $y_t$ becomes meaningful: since $K$ is also the matrix of Impulse Response Functions at lag 0, the triangularity assumption means that the first variable in the ordering responds instantaneously only to shock number 1, the second one only to shocks 1 and 2, and so forth. For this reason, each variable is thought to ``own'' one shock: variable 1 owns shock number 1, etcetera. This is the reason why in this sort of exercises the ordering of the variables is important and the applied literature has developed the ``most exogenous first'' mantra. Where, in this setting, ``exogenous'' really means ``instantaneously insensitive to structural shocks''\footnote{The word ``exogenous'' has caught on in this context, but it's a rather unfortunate choice: for a start, each shock impacts on every variable after one lag, so nothing is really exogenous here. A much better choice of words would probably have been something like ``sturdy'', but it's too late now.}. To put it differently, if variable \texttt{foo} comes before variable \texttt{bar} in the $Y$ list, it follows that the shock owned by \texttt{foo} affects \texttt{bar} instantaneously, but the reverse does not happen. Impulse Response Functions and the FEVD can be printed out via the command line interface by using the \option{impulse-response} and \option{variance-decomp} options, respectively. If you need to store them into matrices, you can compute the structural VMA and proceed from there. For example, the following code snippet shows you how to compute a matrix containing the IRFs: \begin{code} open denmark list Y = 1 2 3 4 scalar n = nelem(Y) var 2 Y --quiet --impulse matrix K = cholesky($sigma) matrix V = $vma matrix IRF = V * (K ** I(n)) print IRF \end{code} in which the equality \[ \mathrm{vec}(C_k) = \mathrm{vec}(\Theta_k K) = (K' \otimes I) \mathrm{vec} (\Theta_k) \] was used. FIXME: show all the nice stuff we have under the GUI. \subsection{IRF bootstrap} FIXME: todo %%% Local Variables: %%% mode: latex %%% TeX-master: "gretl-guide" %%% End: