\chapter{Special functions in genr}
\label{chap-genr}

\section{Introduction}
\label{genr-intro}

The \verb+genr+ command provides a flexible means of defining new
variables. It is documented in the \GCR. This chapter offers a more
expansive discussion of some of the special functions available via
\verb+genr+ and some of the finer points of the command.

\section{Long-run variance}
\label{sec:lrvar}

As is well known, the variance of the average of $T$ random variables
$x_1, x_2, \ldots, x_T$ with equal variance $\sigma^2$ equals
$\sigma^2/T$ if the data are uncorrelated. In this case, the sample
variance of $x_t$ divided by $T$ provides a consistent estimator of
the variance of the average.

If, however, there is serial correlation among the $x_t$s, the
variance of $\bar{X} = T^{-1} \sum_{t=1}^T x_t$ must be estimated
differently. One of the most widely used statistics for this purpose
is a nonparametric kernel estimator with the Bartlett kernel, defined
as
\begin{equation}
  \label{eq:scalar-lrvar}
  \hat{\omega}^2(k) = T^{-1} \sum_{t=k}^{T-k} \left[ \sum_{i=-k}^k
    w_i (x_t - \bar{X}) (x_{t-i} - \bar{X}) \right] ,
\end{equation}
where the integer $k$ is known as the window size and the $w_i$ terms
are the so-called \emph{Bartlett weights}, defined as $w_i = 1 -
\frac{|i|}{k + 1}$. It can be shown that, for $k$ large enough,
$\hat{\omega}^2(k)/T$ yields a consistent estimator of the variance
of $\bar{X}$.

\app{Gretl} implements this estimator by means of the function
\texttt{lrvar()}, which takes two arguments: the series whose
long-run variance is to be estimated and the scalar $k$. If $k$ is
negative, the popular choice $T^{1/3}$ is used.

\section{Time-series filters}
\label{sec:filters}

One sort of specialized function in \verb+genr+ is time-series
filtering. In addition to the usual application of lags and
differences, \app{gretl} provides fractional differencing and two
filters commonly used in macroeconomics for trend-cycle
decomposition: the Hodrick--Prescott filter (Hodrick and Prescott,
1997) and the Baxter--King bandpass filter (Baxter and King, 1999).

\subsection{Fractional differencing}
\label{sec:fracdiff}

The concept of differencing a time series $d$ times is pretty obvious
when $d$ is an integer; it may seem odd when $d$ is fractional.
However, this idea has a well-defined mathematical content: consider
the function
\[
  f(z) = (1 - z)^{-d},
\]
where $z$ and $d$ are real numbers. By taking a Taylor series
expansion around $z=0$, we see that
\[
  f(z) = 1 + dz + \frac{d (d+1)}{2} z^2 + \cdots
\]
or, more compactly,
\[
  f(z) = 1 + \sum_{i=1}^{\infty} \psi_i z^i
\]
with
\[
  \psi_k = \frac{\prod_{i=1}^{k} (d+i-1) }{k!} = \psi_{k-1}
  \frac{d+k-1}{k} .
\]
The same expansion can be used with the lag operator, so that if we
defined
\[
  Y_t = (1-L)^{0.5} X_t
\]
this could be considered shorthand for
\[
  Y_t = X_t - 0.5 X_{t-1} - 0.125 X_{t-2} - 0.0625 X_{t-3} - \cdots
\]
where the coefficients are those of the expansion above, with $d =
-0.5$. In \app{gretl} this transformation can be accomplished by the
syntax
\begin{code}
genr Y = fracdiff(X,0.5)
\end{code}
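As a quick sanity check (a sketch only, with \texttt{X} standing for
any series), one can compare the output of \texttt{fracdiff} against
a low-order truncation of the expansion above:
\begin{code}
genr Y1 = fracdiff(X,0.5)
# manual truncation at the third lag
genr Y2 = X - 0.5*X(-1) - 0.125*X(-2) - 0.0625*X(-3)
# Y1 and Y2 should be close; any discrepancy reflects the
# higher-order lags omitted from Y2
\end{code}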
\subsection{The Hodrick--Prescott filter}
\label{sec:hodrick-prescott}

This filter is accessed using the \verb+hpfilt()+ function, which
takes one argument, the name of the variable to be processed. A time
series $y_t$ may be decomposed into a trend or growth component $g_t$
and a cyclical component $c_t$.
%
\[
  y_t = g_t + c_t, \quad t = 1,2,\dots,T
\]
%
The Hodrick--Prescott filter effects such a decomposition by
minimizing the following:
%
\[
  \sum_{t = 1}^T {(y_t - g_t )^2 } + \lambda \sum_{t = 2}^{T - 1}
  \left((g_{t+1} - g_t) - (g_t - g_{t - 1} )\right)^2 .
\]
%
The first term above is the sum of squared cyclical components $c_t =
y_t - g_t$. The second term is a multiple $\lambda$ of the sum of
squares of the trend component's second differences. This second term
penalizes variations in the growth rate of the trend component: the
larger the value of $\lambda$, the higher is the penalty and hence
the smoother the trend series.

Note that the \cmd{hpfilt} function in \app{gretl} produces the
cyclical component, $c_t$, of the original series. If you want the
smoothed trend you can subtract the cycle from the original:
\begin{code}
genr ct = hpfilt(yt)
genr gt = yt - ct
\end{code}

Hodrick and Prescott (1997) suggest that a value of $\lambda = 1600$
is reasonable for quarterly data. The default value in \app{gretl} is
100 times the square of the data frequency (which, of course, yields
1600 for quarterly data). The value can be adjusted using the
\cmd{set} command with the parameter \cmd{hp\_lambda}. For example,
\cmd{set hp\_lambda 1200}.

\subsection{The Baxter and King filter}
\label{sec:baxter-king}

This filter is accessed using the \verb+bkfilt()+ function, which
again takes the name of the variable to be processed as its single
argument. Consider the spectral representation of a time series
$y_t$:
%
\[
  y_t = \int_{-\pi}^{\pi} e^{i\omega t} \mathrm{d} Z(\omega)
\]
%
To extract the component of $y_t$ that lies between the frequencies
$\underline{\omega}$ and $\overline{\omega}$ one could apply a
bandpass filter:
%
\[
  c^*_t = \int_{-\pi}^{\pi} F^*(\omega) e^{i\omega t} \mathrm{d}
  Z(\omega)
\]
%
where $F^*(\omega) = 1$ for $\underline{\omega} < |\omega| <
\overline{\omega}$ and 0 elsewhere. This would imply, in the time
domain, applying to the series a filter with an infinite number of
coefficients, which is undesirable. The Baxter and King bandpass
filter applies to $y_t$ a finite polynomial in the lag operator
$A(L)$:
%
\[
  c_t = A(L) y_t
\]
%
where $A(L)$ is defined as
%
\[
  A(L) = \sum_{i=-k}^{k} a_i L^i
\]

The coefficients $a_i$ are chosen such that $F(\omega) =
A(e^{i\omega})A(e^{-i\omega})$ is the best approximation to
$F^*(\omega)$ for a given $k$. Clearly, the higher $k$, the better
the approximation, but since $2k$ observations have to be discarded,
a compromise is usually sought. Moreover, the filter has other
appealing theoretical properties, among them that $A(1) = 0$, so a
series with a single unit root is made stationary by application of
the filter.

In practice, the filter is normally used with monthly or quarterly
data to extract the ``business cycle'' component, namely the
component between 6 and 36 quarters. Usual choices for $k$ are 8 or
12 (maybe higher for monthly series). The default values for the
frequency bounds are 8 and 32, and the default value for the
approximation order, $k$, is 8. You can adjust these values using the
\cmd{set} command. The keyword for setting the frequency limits is
\verb+bkbp_limits+ and the keyword for $k$ is \verb+bkbp_k+.
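Under the default settings, a basic call is then simply (a sketch,
with \texttt{y} standing for any quarterly series):
\begin{code}
# cyclical component, using the default band and order
genr cy = bkfilt(y)
\end{code}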
If, for example, you were using monthly data and wanted to adjust the
frequency bounds to 18 and 96, and $k$ to 24, you could do
\begin{code}
set bkbp_limits 18 96
set bkbp_k 24
\end{code}
These values would then remain in force for calls to the
\verb+bkfilt+ function until changed by a further use of \verb+set+.

\section{Panel data specifics}
\label{panel-genr}

\subsection{Dummy variables}
\label{dummies}

In a panel study you may wish to construct dummy variables of one or
both of the following sorts: (a) dummies as unique identifiers for
the units or groups, and (b) dummies as unique identifiers for the
time periods. The former may be used to allow the intercept of the
regression to differ across the units, the latter to allow the
intercept to differ across periods.

Two special functions are available to create such dummies. These
are found under the ``Add'' menu in the GUI, or under the \cmd{genr}
command in script mode or \app{gretlcli}.

\begin{enumerate}
\item ``unit dummies'' (script command \cmd{genr unitdum}). This
  command creates a set of dummy variables identifying the
  cross-sectional units. The variable \verb+du_1+ will have value 1
  in each row corresponding to a unit 1 observation, 0 otherwise;
  \verb+du_2+ will have value 1 in each row corresponding to a unit 2
  observation, 0 otherwise; and so on.
\item ``time dummies'' (script command \cmd{genr timedum}). This
  command creates a set of dummy variables identifying the periods.
  The variable \verb+dt_1+ will have value 1 in each row
  corresponding to a period 1 observation, 0 otherwise; \verb+dt_2+
  will have value 1 in each row corresponding to a period 2
  observation, 0 otherwise; and so on.
\end{enumerate}

If a panel data set has the \verb+YEAR+ of the observation entered as
one of the variables you can create a periodic dummy to pick out a
particular year, e.g.\ \cmd{genr dum = (YEAR=1960)}. You can also
create periodic dummy variables using the modulus operator,
\verb+%+. For instance, to create a dummy with value 1 for the first
observation and every thirtieth observation thereafter, 0 otherwise,
do
%
\begin{code}
genr index
genr dum = ((index-1) % 30) = 0
\end{code}

\subsection{Lags, differences, trends}
\label{panel-lagged}

If the time periods are evenly spaced you may want to use lagged
values of variables in a panel regression (but see
section~\ref{panel-dyn} below); you may also wish to construct first
differences of variables of interest.

Once a dataset is identified as a panel, \app{gretl} will handle the
generation of such variables correctly. For example the command
\verb+genr x1_1 = x1(-1)+ will create a variable that contains the
first lag of \verb+x1+ where available, and the missing value code
where the lag is not available (e.g.\ at the start of the time series
for each group). When you run a regression using such variables, the
program will automatically skip the missing observations.

When a panel data set has a fairly substantial time dimension, you
may wish to include a trend in the analysis. The command \cmd{genr
  time} creates a variable named \varname{time} which runs from 1 to
$T$ for each unit, where $T$ is the length of the time-series
dimension of the panel. If you want to create an index that runs
consecutively from 1 to $m\times T$, where $m$ is the number of units
in the panel, use \cmd{genr index}.
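The following fragment pulls these pieces together (a sketch: it
assumes the dataset is already identified as a panel and that
\texttt{x1} is one of its series):
\begin{code}
# first lag: missing at the start of each unit's time series
genr x1_1 = x1(-1)
# first difference, likewise missing at each unit's first period
genr dx1 = x1 - x1(-1)
# per-unit trend, running from 1 to T for each unit
genr time
\end{code}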
\subsection{Basic statistics by unit}
\label{panel-stats}

\app{Gretl} contains functions which can be used to generate basic
descriptive statistics for a given variable, on a per-unit basis;
these are \texttt{pnobs()} (number of valid cases), \texttt{pmin()}
and \texttt{pmax()} (minimum and maximum) and \texttt{pmean()} and
\texttt{psd()} (mean and standard deviation).

As a brief illustration, suppose we have a panel data set comprising
8 time-series observations on each of $N$ units or groups. Then the
command
%
\begin{code}
genr pmx = pmean(x)
\end{code}
%
creates a series of this form: the first 8 values (corresponding to
unit 1) contain the mean of \varname{x} for unit 1, the next 8 values
contain the mean for unit 2, and so on. The \texttt{psd()} function
works in a similar manner. The sample standard deviation for group
$i$ is computed as
\[
  s_i = \sqrt{\frac{\sum(x-\bar{x}_i)^2}{T_i-1}}
\]
where $T_i$ denotes the number of valid observations on \varname{x}
for the given unit, $\bar{x}_i$ denotes the group mean, and the
summation is across valid observations for the group. If $T_i < 2$,
however, the standard deviation is recorded as 0.

One particular use of \texttt{psd()} may be worth noting. If you want
to form a sub-sample of a panel that contains only those units for
which the variable \varname{x} is time-varying, you can either use
%
\begin{code}
smpl (pmin(x) < pmax(x)) --restrict
\end{code}
or
%
\begin{code}
smpl (psd(x) > 0) --restrict
\end{code}

\subsection{Special functions for data manipulation}
\label{panel-manip}

Besides the functions discussed above, there are some facilities in
\texttt{genr} designed specifically for manipulating panel data ---
in particular, for the case where the data have been read into the
program from a third-party source and they are not in the correct
form for panel analysis. These facilities are explained in
Chapter~\ref{datafiles}.

\section{Resampling and bootstrapping}
\label{sec:genr-resample}

Another specialized function is the resampling, with replacement, of
a series. Given an original data series \varname{x}, the command
%
\begin{code}
genr xr = resample(x)
\end{code}
%
creates a new series each of whose elements is drawn at random from
the elements of \varname{x}. If the original series has 100
observations, each element of \varname{x} is selected with
probability $1/100$ at each drawing. Thus the effect is to
``shuffle'' the elements of \varname{x}, with the twist that each
element of \varname{x} may appear more than once, or not at all, in
\varname{xr}.

The primary use of this function is in the construction of bootstrap
confidence intervals or p-values. Here is a simple example. Suppose
we estimate a simple regression of $y$ on $x$ via OLS and find that
the slope coefficient has a reported $t$-ratio of 2.5 with 40 degrees
of freedom. The two-tailed p-value for the null hypothesis that the
slope parameter equals zero is then 0.0166, using the $t(40)$
distribution. Depending on the context, however, we may doubt whether
the ratio of coefficient to standard error truly follows the $t(40)$
distribution. In that case we could derive a bootstrap p-value as
shown in Example~\ref{resampling-loop}.

Under the null hypothesis that the slope with respect to $x$ is zero,
$y$ is simply equal to its mean plus an error term. We simulate $y$
by resampling the residuals from the initial OLS and re-estimate the
model.
We repeat this procedure a large number of times, and record the
number of cases where the absolute value of the $t$-ratio is greater
than 2.5: the proportion of such cases is our bootstrap p-value. For
a good discussion of simulation-based tests and bootstrapping, see
Davidson and MacKinnon (2004, chapter 4).

\begin{script}[htbp]
  \caption{Calculation of bootstrap p-value}
  \label{resampling-loop}
\begin{scode}
ols y 0 x
# save the residuals
genr ui = $uhat
scalar ybar = mean(y)
# number of replications for bootstrap
scalar replics = 10000
scalar tcount = 0
series ysim = 0
loop replics --quiet
  # generate simulated y by resampling
  ysim = ybar + resample(ui)
  ols ysim 0 x
  scalar tsim = abs($coeff(x) / $stderr(x))
  tcount += (tsim > 2.5)
endloop
printf "proportion of cases with |t| > 2.5 = %g\n", tcount / replics
\end{scode}
%$
\end{script}

\section{Cumulative densities and p-values}
\label{sec:genr-cdf}

The two functions \cmd{cdf} and \cmd{pvalue} provide complementary
means of examining values from several probability distributions: the
standard normal, Student's $t$, $\chi^2$, $F$, gamma, and binomial.
The syntax of these functions is set out in the \GCR; here we expand
on some subtleties.

The cumulative density function or CDF for a random variable is the
integral of the variable's density from its lower limit (typically
either $-\infty$ or 0) to any specified value $x$. The p-value (at
least the one-tailed, right-hand p-value as returned by the
\cmd{pvalue} function) is the complementary probability, the integral
from $x$ to the upper limit of the distribution, typically $+\infty$.

In principle, therefore, there is no need for two distinct functions:
given a CDF value $p_0$ you could easily find the corresponding
p-value as $1-p_0$ (or vice versa). In practice, with
finite-precision computer arithmetic, the two functions are not
redundant. This requires a little explanation. In \app{gretl}, as in
most statistical programs, floating point numbers are represented as
``doubles'' --- double-precision values that typically have a storage
size of eight bytes or 64 bits. Since there are only so many bits
available, only so many floating-point numbers can be represented:
\textit{doubles do not model the real line}. Typically doubles can
represent numbers over the range (roughly) $\pm 1.7977 \times
10^{308}$, but only to about 15 digits of precision.

Suppose you're interested in the left tail of the $\chi^2$
distribution with 50 degrees of freedom: you'd like to know the CDF
value for $x = 0.9$. Take a look at the following interactive
session:
\begin{code}
? genr p1 = cdf(X, 50, 0.9)
Generated scalar p1 (ID 2) = 8.94977e-35
? genr p2 = pvalue(X, 50, 0.9)
Generated scalar p2 (ID 3) = 1
? genr test = 1 - p2
Generated scalar test (ID 4) = 0
\end{code}

The \cmd{cdf} function has produced an accurate value, but the
\cmd{pvalue} function gives an answer of 1, from which it is not
possible to retrieve the answer to the CDF question. This may seem
surprising at first, but consider: if the value of \texttt{p1} above
is correct, then the correct value for \texttt{p2} is $1 - 8.94977
\times 10^{-35}$. But there's no way that value can be represented as
a double: that would require over 30 digits of precision.

Of course this is an extreme example. If the $x$ in question is not
too far off into one or other tail of the distribution, the \cmd{cdf}
and \cmd{pvalue} functions will in fact produce complementary
answers, as shown below:
\begin{code}
? genr p1 = cdf(X, 50, 30)
Generated scalar p1 (ID 2) = 0.0111648
? genr p2 = pvalue(X, 50, 30)
Generated scalar p2 (ID 3) = 0.988835
? genr test = 1 - p2
Generated scalar test (ID 4) = 0.0111648
\end{code}
But the moral is that if you want to examine extreme values you
should be careful in selecting the function you need, in the
knowledge that values very close to zero can be represented as
doubles while values very close to 1 cannot.

\section{Handling missing values}
\label{sec:genr-missing}

Four special functions are available for the handling of missing
values. The boolean function \verb+missing()+ takes the name of a
variable as its single argument; it returns a series with value 1 for
each observation at which the given variable has a missing value, and
value 0 otherwise (that is, if the given variable has a valid value
at that observation). The function \verb+ok()+ is complementary to
\verb+missing+; it is just a shorthand for \verb+!missing+ (where
\verb+!+ is the boolean NOT operator). For example, one can count the
missing values for variable \verb+x+ using
\begin{code}
genr nmiss_x = sum(missing(x))
\end{code}

The function \verb+zeromiss()+, which again takes a single series as
its argument, returns a series where all zero values are set to the
missing code. This should be used with caution --- one does not want
to confuse missing values and zeros --- but it can be useful in some
contexts. For example, one can determine the first valid observation
for a variable \verb+x+ using
\begin{code}
genr time
genr x0 = min(zeromiss(time * ok(x)))
\end{code}

The function \verb+misszero()+ does the opposite of \verb+zeromiss+,
that is, it converts all missing values to zero.

It may be worth commenting on the propagation of missing values
within \verb+genr+ formulae. The general rule is that in arithmetical
operations involving two variables, if either of the variables has a
missing value at observation $t$ then the resulting series will also
have a missing value at $t$. The one exception to this rule is
multiplication by zero: zero times a missing value produces zero
(since this is mathematically valid regardless of the unknown value).

\section{Retrieving internal variables}
\label{sec:genr-internal}

The \verb+genr+ command provides a means of retrieving various values
calculated by the program in the course of estimating models or
testing hypotheses. The variables that can be retrieved in this way
are listed in the \GCR; here we say a bit more about the special
variables \dollar{test} and \dollar{pvalue}. These variables hold,
respectively, the value of the last test statistic calculated using
an explicit testing command and the p-value for that test statistic.
If no such test has been performed at the time when these variables
are referenced, they will produce the missing value code.
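For example, one can pick up the statistic and p-value right after
the test command in question. The following sketch (with arbitrary
variable names, and assuming quarterly data for the break date given
to \cmd{chow}) saves the results of a Chow test:
\begin{code}
ols y 0 x
chow 1984:1
genr ch = $test
genr ch_pv = $pvalue
\end{code}
%$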
The ``explicit testing commands'' that work in this way are as
follows: \cmd{add} (joint test for the significance of variables
added to a model); \cmd{adf} (Augmented Dickey--Fuller test, see
below); \cmd{arch} (test for ARCH); \cmd{chow} (Chow test for a
structural break); \cmd{coeffsum} (test for the sum of specified
coefficients); \cmd{cusum} (the Harvey--Collier $t$-statistic);
\cmd{kpss} (KPSS stationarity test, no p-value available);
\cmd{lmtest} (see below); \cmd{meantest} (test for difference of
means); \cmd{omit} (joint test for the significance of variables
omitted from a model); \cmd{reset} (Ramsey's RESET); \cmd{restrict}
(general linear restriction); \cmd{runs} (runs test for randomness);
\cmd{testuhat} (test for normality of residual); and \cmd{vartest}
(test for difference of variances). In most cases both a
\dollar{test} and a \dollar{pvalue} are stored; the exception is the
KPSS test, for which a p-value is not currently available.

An important point to notice about this mechanism is that the
internal variables \dollar{test} and \dollar{pvalue} are over-written
each time one of the tests listed above is performed. If you want to
reference these values, you must do so at the correct point in the
sequence of \app{gretl} commands.

A related point is that some of the test commands generate, by
default, more than one test statistic and p-value; in these cases
only the last values are stored. To get proper control over the
retrieval of values via \dollar{test} and \dollar{pvalue} you should
formulate the test command in such a way that the result is
unambiguous. This comment applies in particular to the \verb+adf+ and
\verb+lmtest+ commands.

\begin{itemize}
\item By default, the \cmd{adf} command generates three variants of
  the Dickey--Fuller test: one based on a regression including a
  constant, one using a constant and linear trend, and one using a
  constant and a quadratic trend. When you wish to reference
  \dollar{test} or \dollar{pvalue} in connection with this command,
  you can control the variant that is recorded by using one of the
  flags \option{nc}, \option{c}, \option{ct} or \option{ctt} with
  \verb+adf+.
\item By default, the \cmd{lmtest} command (which must follow an OLS
  regression) performs several diagnostic tests on the regression in
  question. To control what is recorded in \dollar{test} and
  \dollar{pvalue} you should limit the test using one of the flags
  \option{logs}, \option{autocorr}, \option{squares} or
  \option{white}.
\end{itemize}

As an aid in working with values retrieved using \dollar{test} and
\dollar{pvalue}, the nature of the test to which these values relate
is written into the descriptive label for the generated variable. You
can read the label for the variable using the \cmd{label} command
(with just one argument, the name of the variable), to check that you
have retrieved the right value. The following interactive session
illustrates this point.
\begin{code}
? adf 4 x1 --c

Augmented Dickey-Fuller tests, order 4, for x1
sample size 59
unit-root null hypothesis: a = 1

  test with constant
  model: (1 - L)y = b0 + (a-1)*y(-1) + ... + e
  estimated value of (a - 1): -0.216889
  test statistic: t = -1.83491
  asymptotic p-value 0.3638

P-values based on MacKinnon (JAE, 1996)

? genr pv = $pvalue
Generated scalar pv (ID 13) = 0.363844
? label pv
 pv=Dickey-Fuller pvalue (scalar)
\end{code}
%$

\section{Numerical procedures}
\label{sec:genr-numerical}

Two special functions are available to aid in the construction of
special-purpose estimators, namely \texttt{BFGSmax} (the BFGS
maximizer, discussed in Chapter~\ref{chap:mle}) and \texttt{fdjac},
which produces a forward-difference approximation to the Jacobian.

\subsection{The BFGS maximizer}
\label{sec:BFGSmax}

The \texttt{BFGSmax} function takes two arguments: a vector holding
the initial values of a set of parameters, and a call to a function
that calculates the (scalar) criterion to be maximized, given the
current parameter values and any other relevant data. If the object
is in fact minimization, this function should return the negative of
the criterion. On successful completion, \texttt{BFGSmax} returns the
maximized value of the criterion, and the matrix given via the first
argument holds the parameter values which produce the maximum. Here
is an example:
%
\begin{code}
matrix X = { dataset }
matrix theta = { 1, 100 }'
scalar J = BFGSmax(theta, ObjFunc(&theta, &X))
\end{code}
%
It is assumed here that \texttt{ObjFunc} is a user-defined function
(see Chapter~\ref{chap:functions}) with the following general set-up:
%
\begin{code}
function scalar ObjFunc (matrix *theta, matrix *X)
  scalar val = ... # do some computation
  return val
end function
\end{code}

\begin{script}[htbp]
  \caption{Finding the minimum of the Rosenbrock function}
  \label{rosenbrock}
\begin{scode}
function scalar Rosenbrock(matrix *param)
  scalar x = param[1]
  scalar y = param[2]
  return -(1-x)^2 - 100 * (y - x^2)^2
end function

matrix theta = { 0 , 0 }

set max_verbose 1
M = BFGSmax(theta, Rosenbrock(&theta))

print theta
\end{scode}
\end{script}

The operation of the BFGS maximizer can be adjusted using the
\texttt{set} variables \verb+bfgs_maxiter+ and \verb+bfgs_toler+ (see
Chapter~\ref{chap:mle}). In addition you can provoke verbose output
from the maximizer by assigning a positive value to
\verb|max_verbose|, again via the \texttt{set} command.

The Rosenbrock function is often used as a test problem for
optimization algorithms. It is also known as ``Rosenbrock's Valley''
or ``Rosenbrock's Banana Function'', on account of the fact that its
contour lines are banana-shaped. It is defined by:
%
\[
  f(x,y) = (1 - x)^2 + 100(y - x^2)^2
\]
%
The function has a global minimum at $(x,y) = (1,1)$ where $f(x,y) =
0$. Example~\ref{rosenbrock} shows a \app{gretl} script that
discovers the minimum using \texttt{BFGSmax} (giving a verbose
account of progress). Note that the script codes the \emph{negative}
of $f(x,y)$, since \texttt{BFGSmax} maximizes and we seek a minimum.

\subsection{Computing a Jacobian}
\label{sec:fdjac}

\app{Gretl} offers the possibility of differentiating numerically a
user-defined function via the \texttt{fdjac} function. This function
again takes two arguments: an $n \times 1$ matrix holding initial
parameter values and a function call that calculates and returns an
$m \times 1$ matrix, given the current parameter values and any other
relevant data. On successful completion it returns an $m \times n$
matrix holding the Jacobian. For example,
%
\begin{code}
matrix Jac = fdjac(theta, SumOC(&theta, &X))
\end{code}
%
where we assume that \texttt{SumOC} is a user-defined function with
the following structure:
%
\begin{code}
function matrix SumOC (matrix *theta, matrix *X)
  matrix V = ... # do some computation
  return V
end function
\end{code}
This may come in handy in several cases: for example, if you use
\texttt{BFGSmax} to estimate a model, you may wish to calculate a
numerical approximation to the relevant Jacobian to construct a
covariance matrix for your estimates.

Another example is the delta method: if you have a consistent
estimator of a vector of parameters $\hat{\theta}$, and a consistent
estimate of its covariance matrix $\Sigma$, you may need to compute
estimates for a nonlinear continuous transformation $\psi =
g(\theta)$. In this case, a standard result in asymptotic theory is
that
\[
  \left\{
    \begin{array}{c}
      \hat{\theta} \convp \theta \\
      \sqrt{T} \left( \hat{\theta} - \theta \right) \convd N(0, \Sigma)
    \end{array}
  \right\}
  \Longrightarrow
  \left\{
    \begin{array}{c}
      \hat{\psi} = g(\hat{\theta}) \convp \psi = g(\theta) \\
      \sqrt{T} \left( \hat{\psi} - \psi \right) \convd N(0, J \Sigma J')
    \end{array}
  \right\}
\]
where $T$ is the sample size and $J$ is the Jacobian
$\left.\pder{g(x)}{x}\right|_{x = \theta}$.

\begin{script}[htbp]
  \caption{Delta Method}
  \label{delta-method}
\begin{scode}
function matrix MPC(matrix *param, matrix *Y)
  beta = param[2]
  gamma = param[3]
  y = Y[1]
  return beta*gamma*y^(gamma-1)
end function

# William Greene, Econometric Analysis, 5e, Chapter 9
set echo off
set messages off
open greene5_1.gdt

# Use OLS to initialize the parameters
ols realcons 0 realdpi --quiet
genr a = $coeff(0)
genr b = $coeff(realdpi)
genr g = 1.0

# Run NLS with analytical derivatives
nls realcons = a + b * (realdpi^g)
  deriv a = 1
  deriv b = realdpi^g
  deriv g = b * realdpi^g * log(realdpi)
end nls

matrix Y = realdpi[2000:4]
matrix theta = $coeff
matrix V = $vcv

mpc = MPC(&theta, &Y)
matrix Jac = fdjac(theta, MPC(&theta, &Y))
Sigma = qform(Jac, V)

printf "\nmpc = %g, std.err = %g\n", mpc, sqrt(Sigma)
scalar teststat = (mpc-1)/sqrt(Sigma)
printf "\nTest for MPC = 1: %g (p-value = %g)\n", \
  teststat, pvalue(n,abs(teststat))
\end{scode}
\end{script}

Script \ref{delta-method} exemplifies such a case: the example is
taken from Greene (2003), section 9.3.1. The slight differences
between the results reported in the original source and what
\app{gretl} returns are due to the fact that the Jacobian is computed
numerically, rather than analytically as in the book.

\section{The discrete Fourier transform}
\label{sec:genr-fft}

The discrete Fourier transform can be best thought of as a linear,
invertible transform of a complex vector. Hence, if $\mathbf{x}$ is
an $n$-dimensional vector whose $k$-th element is $x_k = a_k + i
b_k$, then the output of the discrete Fourier transform is a vector
$\mathbf{f} = \mathcal{F}(\mathbf{x})$ whose $k$-th element is
\[
  f_k = \sum_{j=0}^{n-1} e^{-i \omega(j,k) } x_j
\]
where $\omega(j,k) = 2 \pi \frac{j k}{n}$. Since the transformation
is invertible, the vector $\mathbf{x}$ can be recovered from
$\mathbf{f}$ via the so-called inverse transform
\[
  x_k = \frac{1}{n} \sum_{j=0}^{n-1} e^{i \omega(j,k) } f_j .
\]

The Fourier transform is used in many diverse situations on account
of this key property: the convolution of two vectors can be performed
efficiently by multiplying the elements of their Fourier transforms
and inverting the result. If
\[
  z_k = \sum_{j=1}^n x_j y_{k-j} ,
\]
then
\[
  \mathcal{F}(\mathbf{z}) = \mathcal{F}(\mathbf{x}) \odot
  \mathcal{F}(\mathbf{y}) .
\]
That is, $\mathcal{F}(\mathbf{z})_k = \mathcal{F}(\mathbf{x})_k
\mathcal{F}(\mathbf{y})_k$.
For computing the Fourier transform, \app{gretl} uses the external
library \texttt{fftw3}: see Frigo and Johnson (2003). This guarantees
extreme speed and accuracy. In fact, the CPU time needed to perform
the transform is $O(n \log n)$ for any $n$. This is why the array of
numerical techniques employed in \texttt{fftw3} is commonly known as
the \emph{Fast} Fourier Transform.

\app{Gretl} provides two matrix functions\footnote{See chapter
  \ref{chap:matrices}.} for performing the Fourier transform and its
inverse: \texttt{fft} and \texttt{ffti}. In fact, \app{gretl}'s
implementation of the Fourier transform is somewhat more specialized:
the input to the \texttt{fft} function is understood to be real.
Conversely, \texttt{ffti} takes a complex argument and delivers a
real result. For example:
\begin{code}
x1 = { 1 ; 2 ; 3 }
# perform the transform
f = fft(x1)
# perform the inverse transform
x2 = ffti(f)
\end{code}
yields
\[
  x_1 = \left[ \begin{array}{c} 1 \\ 2 \\ 3 \end{array} \right]
  \qquad
  f = \left[ \begin{array}{rr}
    6 & 0 \\ -1.5 & 0.866 \\ -1.5 & -0.866
  \end{array} \right]
  \qquad
  x_2 = \left[ \begin{array}{c} 1 \\ 2 \\ 3 \end{array} \right]
\]
where the first column of \emph{f} holds the real part and the second
holds the imaginary part. In general, if the input to \texttt{fft}
has $n$ columns, the output has $2n$ columns, where the real parts
are stored in the odd columns and the imaginary parts in the even
ones. Should it be necessary to compute the Fourier transform on
several vectors with the same number of elements, it is numerically
more efficient to group them into a matrix rather than invoking
\texttt{fft} for each vector separately.

As an example, consider the multiplication of two polynomials:
\begin{eqnarray*}
  a(x) & = & 1 + 0.5 x \\
  b(x) & = & 1 + 0.3 x - 0.8 x^2 \\
  c(x) = a(x) \cdot b(x) & = & 1 + 0.8 x - 0.65 x^2 - 0.4 x^3
\end{eqnarray*}
The coefficients of the polynomial $c(x)$ are the convolution of the
coefficients of $a(x)$ and $b(x)$; the following \app{gretl} code
fragment illustrates how to compute the coefficients of $c(x)$:
\begin{code}
# define the two polynomials
a = { 1, 0.5, 0, 0 }'
b = { 1, 0.3, -0.8, 0 }'
# perform the transforms
fa = fft(a)
fb = fft(b)
# complex-multiply the two transforms
fc = cmult(fa, fb)
# compute the coefficients of c via the inverse transform
c = ffti(fc)
\end{code}

Maximum efficiency would have been achieved by grouping \texttt{a}
and \texttt{b} into a matrix. The computational advantage is so small
in this case that the exercise is a bit silly, but the following
alternative may be preferable for a large number of rows/columns:
\begin{code}
# define the two polynomials
a = { 1 ; 0.5 ; 0 ; 0 }
b = { 1 ; 0.3 ; -0.8 ; 0 }
# perform the transforms jointly
f = fft(a ~ b)
# complex-multiply the two transforms
fc = cmult(f[,1:2], f[,3:4])
# compute the coefficients of c via the inverse transform
c = ffti(fc)
\end{code}

Traditionally, the Fourier transform in econometrics has been mostly
used in time-series analysis, the periodogram being the best known
example. Example script \ref{scr:pergm-fft} shows how to compute the
periodogram of a time series via the \texttt{fft} function.
\begin{script}[htbp]
  \caption{Periodogram via the Fourier transform}
  \label{scr:pergm-fft}
\begin{scode}
nulldata 50
# generate an AR(1) process
series e = normal()
series x = 0
x = 0.9*x(-1) + e
# compute the periodogram
scale = 2*pi*$nobs
X = { x }
F = fft(X)
S = sumr(F.^2)
S = S[2:($nobs/2)+1]/scale
omega = seq(1,($nobs/2))' .* (2*pi/$nobs)
omega = omega ~ S
# compare the built-in command
pergm x
print omega
\end{scode}
\end{script}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "gretl-guide"
%%% End: