Sophie: gretl-1.8.6-2mdv2010.1 x86

gretl-1.8.6-2mdv2010.1.x86_64.rpm

\newcommand{\obsvec}{\mathbf{y}}
\newcommand{\obsmat}{\mathbf{H}}
\newcommand{\obsx}{\mathbf{x}}
\newcommand{\obsxmat}{\mathbf{A}}
\newcommand{\obsdist}{\mathbf{w}}
\newcommand{\obsvar}{\mathbf{R}}

\newcommand{\statevec}{\bm{\xi}}
\newcommand{\statevar}{\mathbf{P}}
\newcommand{\statemat}{\mathbf{F}}
\newcommand{\strdist}{\mathbf{v}}
\newcommand{\strvar}{\mathbf{Q}}
\newcommand{\gain}{\mathbf{K}}

\newcommand{\altstrvar}{\mathbf{B}}
\newcommand{\altobsvar}{\mathbf{C}}
\newcommand{\alldist}{\bm{\varepsilon}}

\newcommand{\prederr}{\mathbf{e}}
\newcommand{\predvar}{\bm{\Sigma}}

\newcommand{\myvec}{\mbox{vec}}
\newcommand{\myvech}{\mbox{vech}}

\makeatletter
\def\pdots{\vbox{\baselineskip2.5\p@ 
  \lineskiplimit\z@ \kern2\p@\hbox{.}\hbox{.}\hbox{.}\hbox{.}}}
\makeatother

\chapter{The Kalman Filter}
\label{chap:kalman}

\section{Preamble}
\label{sec:amble}

The Kalman filter has been used ``behind the scenes'' in gretl for
quite some time, in computing ARMA estimates.  But user access to the
Kalman filter is new and it has not yet been tested to any great
extent.  We have run some tests of relatively simple cases against the
benchmark of \textsf{SsfPack Basic}.  This is state-space software
written by Koopman, Shephard and Doornik and documented in Koopman
\textit{et al} (1999).  It requires Doornik's \textsf{ox} program.
Both \textsf{ox} and \textsf{SsfPack} are available as free downloads
for academic use but neither is open-source; see
\url{http://www.ssfpack.com}.  Since Koopman is one of the leading
researchers in this area, presumably the results from \textsf{SsfPack}
are generally reliable.  To date we have been able to replicate the
\textsf{SsfPack} results in gretl with a high degree of precision.

We welcome both success reports and bug reports.

\section{Notation}

It seems that in econometrics everyone is happy with $y = X \beta +
u$, but we can't, as a community, make up our minds on a standard
notation for state-space models. Harvey (1989), Hamilton (1994),
Harvey and Proietti (2005) and Pollock (1999) all use different
conventions. The notation used here is based on James Hamilton's, with
slight variations.

A state-space model can be written as
%
\begin{align}
  \label{eq:state}
  \statevec_{t+1} &=  \statemat_t \statevec_t + \strdist_t \\
  \label{eq:obs}
  \obsvec_t &= \obsxmat_t'\obsx_t + \obsmat_t' \statevec_t +
  \obsdist_t 
\end{align}
%
where (\ref{eq:state}) is the state transition equation and
(\ref{eq:obs}) is the observation or measurement equation.  The state
vector, $\statevec_{t}$, is ($r \times 1$) and the vector of
observables, $\obsvec_t$, is ($n \times 1$); $\obsx_t$ is a ($k
\times 1$) vector of exogenous variables.  The ($r \times 1$) vector
$\strdist_t$ and the ($n \times 1$) vector $\obsdist_t$ are assumed to
be vector white noise:
%
\begin{align*}
E(\strdist_t \strdist_s') &= \strvar_t \mbox{ for } t = s, 
    \mbox{ otherwise } \mathbf{0} \\
E(\obsdist_t \obsdist_s') &= \obsvar_t \mbox{ for } t = s, 
    \mbox{ otherwise } \mathbf{0}
\end{align*}

The number of time-series observations will be denoted by $T$.
In the special case when $\statemat_t = \statemat$, $\obsmat_t = \obsmat$,
$\obsxmat_t = \obsxmat$, $\strvar_t = \strvar$ and $\obsvar_t =
\obsvar$, the model is said to be \emph{time-invariant}.

\subsection{The Kalman recursions}
\label{sec:kalman-recursions}

Using this notation, and assuming for the moment that $\strdist_t$ and
$\obsdist_t$ are mutually independent, the Kalman recursions can be
written as follows.

Initialization is via the unconditional mean and variance of
$\statevec_1$:
%
\begin{align*}
\hat{\statevec}_{1|0} &= E(\statevec_1) \\
\statevar_{1|0} &= E\left\{\left[\statevec_1 - E(\statevec_1)\right]
   \left[\statevec_1 - E(\statevec_1)\right]' \right\}
\end{align*}
%
Usually these are given by $\hat{\statevec}_{1|0} = \mathbf{0}$ and
%
\begin{equation}
\label{eq:inivar}
\myvec(\statevar_{1|0}) = \left[\mathbf{I}_{r^2} - \statemat \otimes
  \statemat\right]^{-1} \cdot \myvec(\strvar)
\end{equation}
but see below for further discussion of the initial variance.

Iteration then proceeds in two steps.\footnote{For a justification of
  the following formulae see the classic book by Anderson and Moore
  (1979) or, for a more modern treatment, Pollock (1999) or Hamilton
  (1994).  A transcription of R. E. Kalman's original paper of 1960 is
  available at
  \url{http://www.cs.unc.edu/~welch/kalman/kalmanPaper.html}.}  First
we update the estimate of the state
%
\begin{equation}
\hat{\statevec}_{t+1|t} = \statemat_t\hat{\statevec}_{t|t-1} + 
  \gain_t \prederr_t
\end{equation}
%
where $\prederr_t$ is the prediction error for the observable:
\[
\prederr_t = \obsvec_t - \obsxmat_t'\obsx_t - \obsmat_t' \hat{\statevec}_{t|t-1}
\]
%
and $\gain_t$ is the gain matrix, given by
%
\begin{equation}
\label{eq:gain}
\gain_t = \statemat_t\statevar_{t|t-1}\obsmat_t \predvar_t^{-1}
\end{equation}
%
with
%
\[
\predvar_t = \obsmat_t'\statevar_{t|t-1}\obsmat_t + \obsvar_t
\]

The second step then updates the estimate of the variance of the state
using
%
\begin{equation}
\statevar_{t+1|t} = \statemat_t\statevar_{t|t-1}\statemat_t' -
 \gain_t \predvar_t \gain_t' + \strvar_t
\end{equation}

\subsection{Cross-correlated disturbances}

The formulation given above assumes mutual independence of the
disturbances in the state and observation equations, $\strdist_t$ and
$\obsdist_t$.  This assumption holds good in many practical
applications, but a more general formulation allows for
cross-correlation.  In place of (\ref{eq:state})--(\ref{eq:obs}) we
may write
%
\begin{align*}
  \statevec_{t+1} &= \statemat_t \statevec_t + 
     \altstrvar_t \alldist_t \\
  \obsvec_t &= \obsxmat_t'\obsx_t + \obsmat_t' \statevec_t + 
     \altobsvar_t \alldist_t 
\end{align*}
%
where $\alldist_t$ is a ($p \times 1$) disturbance vector, all the
elements of which have unit variance, $\altstrvar_t$ is ($r \times p$)
and $\altobsvar_t$ is ($n \times p$).

The no-correlation case is nested thus: define $\strdist^*_t$ and
$\obsdist^*_t$ as modified versions of $\strdist_t$ and $\obsdist_t$,
scaled such that each element has unit variance, and let
%
\[
\alldist_t =
\left[
\begin{array}{c}
  \strdist^*_t \\ \obsdist^*_t
\end{array}
\right]
\]
%
so that $p = r+n$.  Then (suppressing time subscripts for simplicity)
let 
%
\begin{align*}
  \altstrvar &= \left[
   \begin{array}{ccc} \bm{\Gamma}_{r\times r} & \pdots & 
     \mathbf{0}_{r\times n}\end{array}
  \right] \\
  \altobsvar &= \left[ 
   \begin{array}{ccc} \mathbf{0}_{n\times r} & \pdots & 
   \bm{\Lambda}_{n\times n}\end{array}
  \right]
\end{align*}
%
where $\bm{\Gamma}$ and $\bm{\Lambda}$ are lower triangular matrices
satisfying $\strvar = \bm{\Gamma} \bm{\Gamma}'$ and $\obsvar =
\bm{\Lambda} \bm{\Lambda}'$ respectively.  The zero
sub-matrices in the above expressions for $\altstrvar$ and
$\altobsvar$ produce the case of mutual independence; this corresponds
to the condition $\altstrvar \altobsvar' = \mathbf{0}$.

In the general case $p$ is not necessarily equal to $r+n$, and
$\altstrvar \altobsvar'$ may be non-zero.  This means that the
Kalman gain equation (\ref{eq:gain}) must be modified as
%
\begin{equation}
\label{eq:altgain}
\gain_t = (\statemat_t\statevar_{t|t-1}\obsmat_t + \altstrvar_t\altobsvar_t')
  \predvar_t^{-1}
\end{equation}
%
Otherwise, the equations given earlier hold good, if we write
$\altstrvar\altstrvar'$ in place of $\strvar$ and
$\altobsvar\altobsvar'$ in place of $\obsvar$.

In the account of gretl's Kalman facility below we take the
uncorrelated case as the baseline, but add remarks on how to
handle the correlated case where applicable.

\section{Intended usage}

The Kalman filter can be used in three ways: two of these are the
classic forward and backward pass, or filtering and smoothing
respectively; the third use is simulation.  In the
filtering/smoothing case you have the data $\obsvec_t$ and you want to
reconstruct the states $\statevec_t$ (and the forecast errors as a
by-product), but we may also have a computational apparatus that does
the reverse: given artificially-generated series $\obsdist_t$ and
$\strdist_t$, generate the states $\statevec_t$ (and the observables
$\obsvec_t$ as a by-product).

The usefulness of the classical filter is well known; the usefulness
of the Kalman filter as a simulation tool may be huge too. Think for
instance of Monte Carlo experiments, simulation-based inference---see
Gourieroux--Monfort (1996)---or Bayesian methods, especially in the
context of the estimation of DSGE models.

\section{Overview of syntax}

Using the Kalman filter in gretl is a two-step process.  First you set
up your filter, using a block of commands starting with
\texttt{kalman} and ending with \texttt{end kalman}---much like the
\texttt{gmm} command.  Then you invoke the functions \texttt{kfilter},
\texttt{ksmooth} or \texttt{ksimul} to do the actual work.  The next
two sections expand on these points.

\section{Defining the filter}

Each line within the \texttt{kalman} \dots{} \texttt{end kalman} block 
takes the form

\vspace{1ex}
\textsl{keyword} \textsl{value}
\vspace{1ex}

\noindent where \textsl{keyword} represents a matrix, as shown 
below.
% % in Table~\ref{tab:keywords}.

%% \begin{table}[htbp]
\begin{center}
\begin{tabular}{lcc}
Keyword & Symbol & Dimensions \\[6pt]
\texttt{obsy}     & $\obsvec$         & $T \times n$ \\
\texttt{obsymat}  & $\obsmat$         & $r \times n$ \\
\texttt{obsx}     & $\obsx$           & $T \times k$ \\
\texttt{obsxmat}  & $\obsxmat$        & $k \times n$ \\ 
\texttt{obsvar}   & $\obsvar$         & $n \times n$ \\
\texttt{statemat} & $\statemat$       & $r \times r$ \\
\texttt{statevar} & $\strvar$         & $r \times r$ \\
\texttt{inistate} & $\hat{\statevec}_{1|0}$  & $r \times 1$ \\
\texttt{inivar}   & $\statevar_{1|0}$ & $r \times r$ \\
\end{tabular}
%% \caption{Kalman block keywords}
%% \label{tab:keywords}
\end{center}
%% \end{table}

For the data matrices $\obsvec$ and $\obsx$ the corresponding
\textsl{value} may be the name of a predefined matrix, the name of a
data series, or the name of a list of series.\footnote{Note that the
  data matrices \texttt{obsy} and \texttt{obsx} have $T$ rows.  That
  is, the column vectors $\obsvec_t$ and $\obsx_t$ in (\ref{eq:state})
  and (\ref{eq:obs}) are in fact the transposes of the $t$-dated rows
  of the full matrices.}  

For the other inputs, \textsl{value} may be the name of a predefined
matrix or, if the input in question happens to be ($1 \times $1), the
name of a scalar variable or a numerical constant.  If the
\textsl{value} of a coefficient matrix is given as the name of a
matrix or scalar variable, the input is not ``hard-wired'' into the
Kalman structure, rather a record is made of the \textit{name} of the
variable and on each run of a Kalman function (as described below) its
value is re-read.  It is therefore possible to write one
\texttt{kalman} block and then do several filtering or smoothing
passes using different sets of coefficients.\footnote{Note, however,
  that the dimensions of the various input matrices are defined via
  the initial \texttt{kalman} set-up and it is an error if any of the
  matrices are changed in size.}  An example of this technique is
provided later, in the example scripts \ref{script:armaest} and
\ref{script:loclev}.  This facility to alter the values of the
coefficients between runs of the filter is to be distinguished from
the case of \emph{time-varying} matrices, which is discussed below.

Not all of the above-mentioned inputs need be specified in every case;
some are optional. (In addition, you can specify the matrices in any
order.)  The mandatory elements are $\obsvec$, $\obsmat$, $\statemat$
and $\strvar$, so the minimal \texttt{kalman} block looks like this:

\begin{code}
kalman 
  obsy y
  obsymat H
  statemat F
  statevar Q
end kalman
\end{code} 

The optional matrices are listed below, along with the implication
of omitting the given matrix.

\begin{center}
\begin{tabular}{ll}
Keyword & If omitted\dots \\ [6pt]
\texttt{obsx} & no exogenous variables in observation equation\\
\texttt{obsxmat} & no exogenous variables in observation equation\\
\texttt{obsvar} & no disturbance term in observation equation\\
\texttt{inistate} & $\hat{\statevec}_{1|0}$ is set to a zero vector\\
\texttt{inivar} & $\statevar_{1|0}$ is set automatically\\
\end{tabular}
\end{center}

It might appear that the \texttt{obsx} ($\obsx$) and \texttt{obsxmat}
($\obsxmat$) matrices must go together---either both are given or
neither is given.  But an exception is granted for convenience.  If
the observation equation includes a constant but no additional
exogenous variables, you can give a ($1 \times n$) value for
$\obsxmat$ without having to specify \texttt{obsx}.  More generally,
if the row dimension of $\obsxmat$ is 1 greater than the column
dimension of $\obsx$, it is assumed that the first element of
$\obsxmat$ is associated with an implicit column of 1s.

Regarding the automatic initialization of $\statevar_{1|0}$ (in case
no \texttt{inivar} input is given): by default this is done as in
equation (\ref{eq:inivar}).  However, this method is applicable only
if all the eigenvalues of $\statemat$ lie inside the unit circle.  If
this condition is not satisfied we instead apply a diffuse prior,
setting $\statevar_{1|0} = \kappa \mathbf{I}_r$ with $\kappa = 10^7$.
If you wish to impose this diffuse prior from the outset, append the
option flag \verb|--diffuse| to the \texttt{end kalman}
statement.\footnote{ Initialization of the Kalman filter outside of
  the case where equation (\ref{eq:inivar}) applies has been the
  subject of much discussion in the literature---see for example de
  Jong (1991), Koopman (1997).  At present gretl does not implement
  any of the more elaborate proposals that have been made.}

\subsection{Time-varying matrices}

Any or all of the matrices \texttt{obsymat}, \texttt{obsxmat},
\texttt{obsvar}, \texttt{statemat} and \texttt{statevar} may be
time-varying.  In that case the \textsl{value} corresponding to the
matrix keyword should be given in a special form: the name of an
existing matrix plus a function call which modifies that matrix,
separated by a semicolon.  Note that in this case you must use
a matrix variable, even if the matrix in question happens to be
$1 \times 1$. 

For example, suppose the matrix $\obsmat$ is time-varying.  Then we
might write
%
\begin{code}
obsymat H ; modify_H(&H, theta)
\end{code}
%
where \texttt{modify\_H} is a user-defined function which modifies
matrix \texttt{H} (and \texttt{theta} is a suitable additional
argument to that function, if required).

The above is just an illustration: the matrix argument does not have
to come first, and the function can have as many arguments as you
like.  The essential point is that the function must modify the
specified matrix, which requires that it be given as an argument in
``pointer'' form (preceded by \verb|&|).  The function need not return
any value directly; if it does, that value is ignored.

Such matrix-modifying functions will be called at each time-step of
the filter operation, prior to performing any calculations.  They have
access to the current time-step of the Kalman filter via the internal
variable \verb+$kalman_t+, which has value 1 on the first step, 2 on
the second, and so on, up to step $T$.  They also have access to the
previous $n$-vector of forecast errors, $\prederr_{t-1}$, under the
name \verb+$kalman_uhat+.  When $t=1$ this will be a zero vector.

\subsection{Correlated disturbances}

Defining a filter in which the disturbances $\strdist_t$ and
$\obsdist_t$ are correlated involves one modification to the account
given above.  If you append the \verb|--cross| option flag to the
\texttt{end kalman} statement, then the matrices corresponding to
the keywords \texttt{statevar} and \texttt{obsvar} are interpreted
not as $\strvar$ and $\obsvar$ but rather as $\altstrvar$ and
$\altobsvar$ as discussed in section~\ref{sec:kalman-recursions}.
Gretl then computes $\strvar = \altstrvar\altstrvar'$ and
$\obsvar = \altobsvar\altobsvar'$ as well as the cross-product
$\altstrvar\altobsvar'$ and utilizes the modified expression for
the gain as given in equation (\ref{eq:altgain}).  As mentioned
above, $\altstrvar$ should be ($r \times p$) and $\altobsvar$
should be ($n \times p$), where $p$ is the number of elements
in the combined disturbance vector $\alldist_t$.

\subsection{Handling of missing values}

It is acceptable for the data matrices, \texttt{obsy} and
\texttt{obsx}, to contain missing values.  In this case the filtering
operation will work around the missing values, and the \texttt{ksmooth}
function can be used to obtain estimates of these values.  However,
there are two points to note.

First, gretl's default behavior is to skip missing observations when
constructing matrices from data series.  To change this, use the \texttt{set} 
command thus:
%
\begin{code}
set skip_missing off
\end{code}

Second, the handling of missing values is not yet quite right for the
case where the observable vector $\obsvec_t$ contains more than one
element.  At present, if any of the elements of $\obsvec_t$ are
missing the entire observation is ignored.  Clearly it should be
possible to make use of any non-missing elements, and this is not
very difficult in principle, it's just awkward and is not
implemented yet.

\subsection{Persistence and identity of the filter}

At present there is no facility to create a ``named filter''.  Only
one filter can exist at any point in time, namely the one created by
the last \texttt{kalman} block.\footnote{This is not quite true: more
  precisely, there can be no more than one Kalman filter \textit{at
    each level of function execution}.  That is, if a gretl script
  creates a Kalman filter, a user-defined function called from that
  script may also create a filter, without interfering with the
  original one.}  If a filter is already defined, and you give a new
\texttt{kalman} block, the old filter is over-written.  Otherwise the
existing filter persists (and remains available for the
\texttt{kfilter}, \texttt{ksmooth} and \texttt{ksimul} functions)
until either (a) the gretl session is terminated or (b) the command
\texttt{delete kalman} is given.


\section{The \texttt{kfilter} function}

Once a filter is established, as discussed in the previous section,
\texttt{kfilter} can be used to run a forward, forecasting pass.
This function returns a scalar code: 0 for successful completion, or 1
if numerical problems were encountered.  On successful completion, two
scalar accessor variables become available: \verb+$kalman_lnl+, which
gives the overall log-likelihood under the joint normality assumption,
%
\[
  \ell = -\frac{1}{2} \left[nT \log(2 \pi) + \sum_{t=1}^T\left|\predvar_t\right| + 
    \sum_{t=1}^T\prederr_t'\predvar_t^{-1} \prederr_t
  \right]
\]
%
and \verb+$kalman_s2+, which gives the estimated variance,
%
\[
\hat{\sigma}^2 = \frac{1}{nT} 
   \sum_{t=1}^T\prederr_t'\predvar_t^{-1} \prederr_t
\]
(but see below for modifications to these formulae for the case of a
diffuse prior).  In addition the accessor \verb+$kalman_llt+ gives a
($T \times 1$) vector, element $t$ of which is
%
\[
  \ell_t = -\frac{1}{2} \left[n \log(2 \pi) + \left|\predvar_t\right| + 
    \prederr_t'\predvar_t^{-1} \prederr_t
  \right]
\]
%

The \texttt{kfilter} function does not require any arguments, but up
to five matrix quantities may be retrieved via optional pointer
arguments.  Each of these matrices has $T$ rows, one for each
time-step; the contents of the rows are shown in the following
listing.
%
\begin{enumerate}
\item Forecast errors for the observable variables: $\prederr_t'$, $n$
  columns.
\item Variance matrix for the forecast errors: $\myvech(\predvar_t)'$,
  $n(n+1)/2$ columns.
\item Estimate of the state vector: $\hat{\statevec}_{t|t-1}'$, $r$ columns.
\item MSE of estimate of the state vector:
  $\myvech(\statevar_{t|t-1})'$, $r(r+1)/2$ columns.
\item Kalman gain: $\myvec(\gain_t)'$, $rn$ columns.
\end{enumerate}

Unwanted trailing arguments can be omitted, otherwise unwanted
arguments can be skipped by using the keyword \texttt{null}.  For
example, the following call retrieves the forecast errors in the
matrix \texttt{E} and the estimate of the state vector in \texttt{S}:
%
\begin{code}
matrix E S
kfilter(&E, null, &S)
\end{code}

Matrices given as pointer arguments do not have to be correctly
dimensioned in advance; they will be resized to receive the specified
content.

Further note: in general, the arguments to \texttt{kfilter} should all
be matrix-pointers, but under two conditions you can give a pointer to
a series variable instead.  The conditions are: (i) the matrix in
question has just one column in context (for example, the first two
matrices will have a single column if the length of the observables
vector, $n$, equals 1) and (ii) the time-series length of the filter
is equal to the current gretl sample size.

\subsection{Likelihood under the diffuse prior}

There seems to be general agreement in the literature that the
log-likelihood calculation should be modified in the case of a diffuse
prior for $\statevar_{1|0}$.  However, it is not clear to us that
there is a well-defined ``correct'' method for this.  At present we
emulate \textsf{SsfPack} (see Koopman \textit{et al} (1999) and
section~\ref{sec:amble}).  In case $\statevar_{1|0} = \kappa
\mathbf{I}_r$, we set $d = r$ and calculate
%
\[
  \ell = -\frac{1}{2} \left[(nT-d) \log(2 \pi) + 
    \sum_{t=1}^T\left|\predvar_t\right| + 
    \sum_{t=1}^T\prederr_t'\predvar_t^{-1} \prederr_t
    - d \log(\kappa)
  \right]
\]
%
and
%
\[
\hat{\sigma}^2 = \frac{1}{nT-d} 
   \sum_{t=1}^T\prederr_t'\predvar_t^{-1} \prederr_t
\]

\section{The \texttt{ksmooth} function}

This function returns the ($T \times r$) matrix of smoothed estimates
of the state vector---that is, estimates based on all $T$
observations: row $t$ of this matrix holds $\hat{\statevec}_{t|T}'$.  This
function has no required arguments but it offers one optional
matrix-pointer argument, which retrieves the variance of the smoothed
state estimate, $\statevar_{t|T}$.  The latter matrix is ($T \times
r(r+1)/2$); each row is in transposed vech form.  Examples:
%
\begin{code}
matrix S = ksmooth()  # smoothed state only
matrix P
S = ksmooth(&P)       # the variance is wanted
\end{code}

These values are computed via a backward pass of the filter, from
$t = T$ to $t = 1$, as follows:
%
\begin{align*}
\mathbf{L}_t &= \statemat_t - \gain_t \obsmat_t' \\
\mathbf{u}_{t-1} &= \obsmat_t \predvar_t^{-1} \prederr_t 
 + \mathbf{L}_t' \mathbf{u}_t \\
\mathbf{U}_{t-1} &= \obsmat_t \predvar_t^{-1} \obsmat_t' + 
  \mathbf{L}_t' \mathbf{U}_t \mathbf{L}_t \\
\hat{\statevec}_{t|T} &= \hat{\statevec}_{t|t-1} + 
  \statevar_{t|t-1} \mathbf{u}_{t-1} \\
\statevar_{t|T} &= \statevar_{t|t-1} - 
  \statevar_{t|t-1} \mathbf{U}_{t-1} \statevar_{t|t-1}
\end{align*}
%
with initial values $\mathbf{u}_T = 0$ and $\mathbf{U}_T =
0$.\footnote{See I. Karibzhanov's exposition at
\url{http://www.econ.umn.edu/~karib003/help/kalcvs.htm}.}

This iteration is preceded by a special forward pass in which the
matrices $\gain_t$, $\predvar_t^{-1}$, $\hat{\statevec}_{t|t-1}$ and
$\statevar_{t|t-1}$ are stored for all $t$.  If $\statemat$
is time-varying, its values for all $t$ are stored on the forward
pass, and similarly for $\obsmat$.


\section{The \texttt{ksimul} function}

This simulation function takes up to three arguments.  The first,
mandatory, argument is a ($T \times r$) matrix containing artificial
disturbances for the state transition equation: row $t$ of this matrix
represents $\strdist_t'$.  If the current filter has a non-null
$\obsvar$ (\texttt{obsvar}) matrix, then the second argument should be
a ($T \times n$) matrix containing artificial disturbances for the
observation equation, on the same pattern.  Otherwise the second
argument should be given as \texttt{null}.  If $r=1$ you may give a
series for the first argument, and if $n=1$ a series is acceptable for
the second argument.

Provided that the current filter does not include exogenous variables
in the observation equation (\texttt{obsx}), the $T$ for simulation
need not equal that defined by the original \texttt{obsy} data matrix:
in effect $T$ is temporarily redefined by the row dimension of the
first argument to \texttt{ksimul}.  Once the simulation is completed,
the $T$ value associated with the original data is restored.

The value returned by \texttt{ksimul} is a ($T \times n$) matrix
holding simulated values for the observables at each time step.  A
third optional matrix-pointer argument allows you to retrieve a ($T
\times r$) matrix holding the simulated state vector.  Examples:
%
\begin{code}
matrix Y = ksimul(V)    # obsvar is null
Y = ksimul(V, W)        # obsvar is non-null
matrix S
Y = ksimul(V, null, &S) # the simulated state is wanted
\end{code}

The initial value $\statevec_1$ is calculated thus: we find the matrix
$\mathbf{T}$ such that $\mathbf{T}\mathbf{T}' = \statevar_{1|0}$ (as
given by the \texttt{inivar} element in the \texttt{kalman} block),
multiply it into $\strdist_1$, and add the result to $\statevec_{1|0}$
(as given by \texttt{inistate}).

If the disturbances are correlated across the two equations the
arguments to \texttt{ksimul} must be revised: the first argument
should be a ($T \times p$) matrix, each row of which represents
$\alldist_t'$ (see section~\ref{sec:kalman-recursions}), and the
second argument should be given as \texttt{null}.



\section{Example 1: ARMA estimation}
\label{sec:example_arma}

As is well known, the Kalman filter provides a very efficient way to
compute the likelihood of ARMA models; as an example, take an
ARMA(1,1) model
\[
  y_t = \phi y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}
\]
One of the ways the above equation can be cast in state-space form is
by defining a latent process $\xi_t = (1 - \phi L)^{-1}
\varepsilon_t$.   The observation equation corresponding to (\ref{eq:obs})
is then
%
\begin{equation}
y_t = \xi_t + \theta \xi_{t-1} \label{eq:arma-meas}
\end{equation}
%
and the state transition equation corresponding to (\ref{eq:state}) is
%
\[
  \left[ \begin{array}{c} \xi_t \\ \xi_{t-1} \end{array} \right] =
  \left[ \begin{array}{cc} \phi & 0 \\ 1 & 0 \end{array} \right]
  \left[ \begin{array}{c} \xi_{t-1} \\ \xi_{t-2} \end{array} \right] +
  \left[ \begin{array}{c} \varepsilon_t \\ 0 \end{array} \right] 
\]

The gretl syntax for a corresponding \texttt{kalman} block would be
\begin{code}
matrix H = {1; theta}
matrix F = {phi, 0; 1, 0}
matrix Q = {s^2, 0; 0, 0}

kalman
    obsy y
    obsymat H
    statemat F
    statevar Q
end kalman
\end{code}
Note that the observation equation (\ref{eq:arma-meas}) does not
include an ``error term''; this is equivalent to saying that
$V(\obsdist_t) = 0$ and, as a consequence, the \texttt{kalman} block
does not include an \texttt{obsvar} keyword.

Once the filter is set up, all it takes to compute the loglikelihood
for given values of $\phi$, $\theta$ and $\sigma^2$ is to execute the
\texttt{kfilter()} function and use the \verb+$kalman_lnl+ accessor
(which returns the total log-likelihood) or, more appropriately if the
likelihood has to be maximized through \texttt{mle}, the
\verb+$kalman_llt+ accessor, which returns the series of individual
contribution to the log-likelihood for each observation. An example
is shown in script~\ref{script:armaest}.

\begin{script}[htbp]
  \caption{ARMA estimation}
  \label{script:armaest}
\begin{scode}
function void arma11_via_kalman(series y)
    /* parameter initalization */
    phi = 0
    theta = 0
    sigma = 1

    /* Kalman filter setup */
    matrix H = {1; theta}
    matrix F = {phi, 0; 1, 0}
    matrix Q = {sigma^2, 0; 0, 0}

    kalman
        obsy y
        obsymat H
        statemat F
        statevar Q
    end kalman

    /* maximum likelihood estimation */
    mle logl = ERR ? NA : $kalman_llt
        H[2] = theta
        F[1,1] = phi
        Q[1,1] = sigma^2
        ERR = kfilter()
        params phi theta sigma
    end mle -h
end function

# ------------------------ main ---------------------------

open arma.gdt        # open the "arma" example dataset
arma11_via_kalman(y) # estimate an arma(1,1) model
arma 1 1 ; y --nc    # check via native command
\end{scode}
\end{script}

\section{Example 2: local level model}
\label{sec:example_loclev}

Suppose we have a series $y_t = \mu_t + \varepsilon_t$, where $\mu_t$
is a random walk with normal increments of variance $\sigma^2_1$ and $
\varepsilon_t$ is a normal white noise with variance $\sigma^2_2$,
independent of $\mu_t$. This is known as the ``local level'' model in
Harvey's (1991) terminology, and it can be cast in state-space form as
equations (\ref{eq:state})-(\ref{eq:obs}) with
$\statemat = 1$, $\strdist_t \sim N(0,\, \sigma^2_1)$, $\obsmat = 1$
and $\obsdist_t \sim N(0,\, \sigma^2_2)$.  The translation to a
\texttt{kalman} block is
\begin{code}
kalman
   obsy y
   obsymat 1
   statemat 1
   statevar s2
   obsvar s1
end kalman --diffuse
\end{code}

The two unknown parameters $\sigma^2_1$ and $\sigma^2_2$ can be
estimated via maximum likelihood.  Script~\ref{script:loclev} provides
an example of simulation and estimation of such a model. For the sake
of brevity, simulation is carried out via ordinary gretl commands,
rather than the state-space apparatus described above.

The example contains two functions: the first one carries out the
estimation of the unknown parameters $\sigma^2_1$ and $\sigma^2_2$ via
maximum likelihood; the second one uses these estimates to compute a
smoothed estimate of the unobservable series $\mu_t$ calles
\texttt{muhat}. A plot of $\mu_t$ and its estimate is presented in
Figure~\ref{fig:loclev}.

\begin{script}[htbp]
  \caption{Local level model}
  \label{script:loclev}
\begin{scode}
function matrix local_level (series y)
    /* starting values */
    scalar s1 = 1
    scalar s2 = 1

    /* Kalman filter set-up */
    kalman
       obsy y
       obsymat 1
       statemat 1
       statevar s2
       obsvar s1
    end kalman --diffuse

    /* ML estimation */
    mle ll = ERR ? NA : $kalman_llt
        ERR = kfilter()
        params s1 s2
    end mle

    return s1 ~ s2
end function

function series loclev_sm (series y, scalar s1, scalar s2)
    /* return the smoothed estimate of \mu_t */
    kalman
       obsy y
       obsymat 1
       statemat 1
       statevar s2
       obsvar s1
    end kalman --diffuse
    series ret = ksmooth()
    return ret
end function

/* -------------------- main script -------------------- */

nulldata 200
set seed 202020
setobs 1 1 --special
true_s1 = 0.25
true_s2 = 0.5
v = normal() * sqrt(true_s1)
w = normal() * sqrt(true_s2)
mu = 2 + cum(w)
y = mu + v

matrix Vars = local_level(y)           # estimate the variances
muhat = loclev_sm(y, Vars[1], Vars[2]) # compute the smoothed state
\end{scode}
\end{script}

\begin{figure}[htbp]
  \centering
  \includegraphics{figures/local_level}
  \caption{Local level model: $\mu_t$ and its smoothed estimate}
  \label{fig:loclev}
\end{figure}
By appending the following code snippet to the example in
Table~\ref{script:loclev}, one may check the results against the
\textsf{R} command \texttt{StructTS}.

\begin{code}
foreign language=R --send-data
  y <- gretldata[,"y"]
  a <- StructTS(y, type="level")
  a
  StateFromR <- as.ts(tsSmooth(a))
  gretl.export(StateFromR)
end foreign

append @dotdir/StateFromR.csv

ols Uhat 0 StateFromR --simple
\end{code}