\chapter{Gretl and R} \label{chap:gretlR} \section{Introduction} \label{R-intro} \app{R} is, by far, the largest free statistical project.\footnote{\app{R}'s homepage is at \url{http://www.r-project.org/}.} Like \app{gretl}, it is a GNU project and the two have a lot in common; however, \app{gretl}'s approach focuses on ease of use much more than \app{R}, which instead aims to encompass the widest possible range of statistical procedures. As is natural in the free software ecosystem, we don't view ourselves as competitors to \app{R},\footnote{OK, who are we kidding? But it's \emph{friendly} competition!} but rather as projects sharing a common goal who should support each other whenever possible. For this reason, \app{gretl} provides a way to interact with \app{R} and thus enable users to pool the capabilities of the two packages. In this chapter, we will explain how to exploit \app{R}'s power from within \app{gretl}. We assume that the reader has a working installation of \app{R} available and a basic grasp of \app{R}'s syntax.\footnote{The main reference for \app{R} documentation is \url{http://cran.r-project.org/manuals.html}. In addition, \app{R} tutorials abound on the Net; as always, Google is your friend.} Despite several valiant attempts, no graphical shell has gained wide acceptance in the \app{R} community: by and large, the standard method of working with \app{R} is by writing scripts, or by typing commands at the \app{R} prompt, much in the same way as one would write \app{gretl} scripts or work with the \app{gretl} console. In this chapter, the focus will be on the methods available to execute \app{R} commands without leaving \app{gretl}. \section{Starting an interactive \app{R} session} \label{sec:R-interactive} The easiest way to use \app{R} from \app{gretl} is in interactive mode. Once you have your data loaded in \app{gretl}, you can select the menu item ``Tools, Start GNU R'' and an interactive \app{R} session will be started, with your dataset automatically pre-loaded. \subsection{A simple example: OLS on cross-section data} \label{sec:R-ols-ex} For this example we use Ramanathan's dataset \texttt{data4-1}, one of the sample files supplied with \app{gretl}. We first run, in \app{gretl}, an OLS regression of \texttt{price} on \texttt{sqft}, \texttt{bedrms} and \texttt{baths}. The basic results are shown in Table \ref{tab:data4-1-gretlOLS}. \begin{table}[htbp] \caption{OLS house price regression via \app{gretl}} \label{tab:data4-1-gretlOLS} \begin{center} \begin{tabular*}{0.75\textwidth}{@{\extracolsep{\fill}} l% col 1: varname D{.}{.}{-1}% col 2: coeff D{.}{.}{-1}% col 3: sderr D{.}{.}{-1}% col 4: t-stat D{.}{.}{4}}% col 5: p-value (or slope) Variable & \multicolumn{1}{c}{Coefficient} & \multicolumn{1}{c}{Std.\ Error} & \multicolumn{1}{c}{$t$-statistic} & \multicolumn{1}{c}{p-value} \\[1ex] const & 129.062 & 88.3033 & 1.4616 & 0.1746 \\ sqft & 0.154800 & 0.0319404 & 4.8465 & 0.0007 \\ bedrms & -21.587 & 27.0293 & -0.7987 & 0.4430 \\ baths & -12.192 & 43.2500 & -0.2819 & 0.7838 \\ \end{tabular*} \end{center} \end{table} We will now replicate the above results using \app{R}. Select the menu item ``Tools, Start GNU R''. A window similar to the one shown in figure \ref{fig:Rwind1} should appear. \begin{figure}[htbp] \centering \includegraphics[scale=0.7]{figures/Rwindow-1} \caption{\app{R} window} \label{fig:Rwind1} \end{figure} The actual look of the \app{R} window may be somewhat different from what you see in Figure~\ref{fig:Rwind1} (especially for Windows users), but this is immaterial. The important point is that you have a window where you can type commands to \app{R}. If the above procedure doesn't work and no \app{R} window opens, it means that \app{gretl} was unable to launch \app{R}. You should ensure that \app{R} is installed and working on your system and that \app{gretl} knows where it is. The relevant settings can be found by selecting the ``Tools, Preferences, General'' menu entry, under the ``Programs'' tab. Assuming \app{R} was launched successfully, you will notice that two commands have been executed automatically: \begin{code} gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE) attach(gretldata) \end{code} These commands have the effect of loading our dataset into the \app{R} workspace in the form of a \emph{data frame} (one of several forms in which \app{R} can store data). Use of a data frame enables the subsequent \texttt{attach()} command, which sets things up so that the variable names defined in the \app{gretl} workspace are available as valid identifiers within \app{R}. In order to replicate \app{gretl}'s OLS estimation, go into the \app{R} window and type at the prompt \begin{code} model <- lm(price ~ sqft + bedrms + baths) summary(model) \end{code} \begin{figure}[htbp] \centering \includegraphics[scale=0.7]{figures/Rwindow-2} \caption{OLS regression on house prices via \app{R}} \label{fig:Rwind2} \end{figure} You should see something similar to Figure~\ref{fig:Rwind2}. Surprise --- the estimates coincide! To get out, just close the \app{R} window or type \verb|q()| at the \app{R} prompt. \subsection{Time series data} \label{sec:R-ols-arma} We now turn to an example which uses time series data: we will compare \app{gretl}'s and \app{R}'s estimates of Box and Jenkins' immortal ``airline'' model. The data are contained in the \texttt{bjg} sample dataset. The following \app{gretl} code \begin{code} open bjg arima 0 1 1 ; 0 1 1 ; lg --nc \end{code} produces the estimates shown in Table \ref{tab:airline-gretl}. \begin{table}[htbp] \caption{Airline model from Box and Jenkins (1976) --- selected portion of \app{gretl}'s estimates} \label{tab:airline-gretl} \begin{center} \begin{tabular*}{0.75\textwidth}{@{\extracolsep{\fill}} l% col 1: varname D{.}{.}{-1}% col 2: coeff D{.}{.}{-1}% col 3: sderr D{.}{.}{-1}% col 4: t-stat D{.}{.}{4}}% col 5: p-value (or slope) Variable & \multicolumn{1}{c}{Coefficient} & \multicolumn{1}{c}{Std.\ Error} & \multicolumn{1}{c}{$t$-statistic} & \multicolumn{1}{c}{p-value} \\[1ex] $\theta_{1}$ & -0.401824 & 0.0896421 & -4.4825 & 0.0000 \\ $\Theta_{1}$ & -0.556936 & 0.0731044 & -7.6184 & 0.0000 \\ \end{tabular*} \begin{tabular}{lD{.}{.}{-1}} Variance of innovations & 0.00134810 \\ Log-likelihood & 244.696 \\ Akaike information criterion & -483.39 \end{tabular} \end{center} \end{table} If we now open an \app{R} session as described in the previous subsection, the data-passing mechanism is slightly different. The \app{R} commands that read the data from \app{gretl} are in this case \begin{code} # load data from gretl gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE) gretldata <- ts(gretldata, start=c(1949, 1), frequency = 12) \end{code} Since our data were defined in \app{gretl} as time series, we use an \app{R} \emph{time-series} object (\emph{ts} for short) for the transfer. In this way we can retain in \app{R} useful information such as the periodicity of the data and the sample limits. The downside is that the names of individual series, as defined in \app{gretl}, are not valid identifiers. In order to extract the variable \texttt{lg}, one needs to use the syntax \verb|lg <- gretldata[, "lg"]|. ARIMA estimation can be carried out by issuing the following two \app{R} commands: \begin{code} lg <- gretldata[, "lg"] arima(lg, c(0,1,1), seasonal=c(0,1,1)) \end{code} which yield \begin{code} Coefficients: ma1 sma1 -0.4018 -0.5569 s.e. 0.0896 0.0731 sigma^2 estimated as 0.001348: log likelihood = 244.7, aic = -483.4 \end{code} Happily, the estimates again coincide. \section{Running an \app{R} script} \label{sec:R-scripts} Opening an \app{R} window and keying in commands is a convenient method when the job is small. In some cases, however, it would be preferable to have \app{R} execute a script prepared in advance. One way to do this is via the \texttt{source()} command in \app{R}. Alternatively, \app{gretl} offers the facility to edit an \app{R} script and run it, having the current dataset pre-loaded automatically. This feature can be accessed via the ``File, Script Files'' menu entry. By selecting ``User file'', one can load a pre-existing \app{R} script; if you want to create a new script instead, select the ``New script, R script'' menu entry. \begin{figure}[htbp] \centering \includegraphics[scale=0.7]{figures/R-edit1} \caption{Editing window for \app{R} scripts} \label{fig:R-edit1} \end{figure} In either case, you are presented with a window very similar to the editor window used for ordinary \app{gretl} scripts, as in Figure~\ref{fig:R-edit1}. There are two main differences. First, you get syntax highlighting for \app{R}'s syntax instead of \app{gretl}'s. Second, clicking on the Execute button (the gears icon), launches an instance of \app{R} in which your commands are executed. Before \app{R} is actually run, you are asked if you want to run \app{R} interactively or not (see Figure~\ref{fig:R-exec-mode}). \begin{figure}[htbp] \centering \includegraphics[scale=0.7]{figures/R-exec-mode} \caption{Editing window for \app{R} scripts} \label{fig:R-exec-mode} \end{figure} An interactive run opens an \app{R} instance similar to the one seen in the previous section: your data will be pre-loaded (if the ``pre-load data'' box is checked) and your commands will be executed. Once this is done, you will find yourself at the \app{R} prompt, where you can enter more commands. A non-interactive run, on the other hand, will execute your script, collect the output from \app{R} and present it to you in an output window; \app{R} will be run in the background. If, for example, the script in Figure~\ref{fig:R-edit1} is run non-interactively, a window similar to Figure~\ref{fig:R-output1} will appear. \begin{figure}[htbp] \centering \includegraphics[scale=0.7]{figures/R-output1} \caption{Output from a non-interactive \app{R} run} \label{fig:R-output1} \end{figure} \section{Taking stuff back and forth} \label{sec:R-passing-data} As regards the passing of data between the two programs, so far we have only considered passing series from \app{gretl} to \app{R}. In order to achieve a satisfactory degree of interoperability, more is needed. In the following sub-sections we see how matrices can be exchanged, and how data can be passed from \app{R} back to \app{gretl}. \subsection{Passing matrices from \app{gretl} to \app{R}} For passing matrices from \app{gretl} to \app{R}, you can use the \texttt{mwrite} matrix function described in section \ref{sec:matrix-csv}. For example, the following \app{gretl} code fragment generates the matrix \[ A = \left[ \begin{array}{ccc} 3 & 7 & 11 \\ 4 & 8 & 12 \\ 5 & 9 & 13 \\ 6 & 10 & 14 \end{array} \right] \] and stores it into the file \texttt{mymatfile.mat}. \begin{code} matrix A = mshape(seq(3,14),4,3) err = mwrite(A, "mymatfile.mat") \end{code} In order to retrieve this matrix from \app{R}, all you have to do is \begin{code} A <- as.matrix(read.table("mymatfile.mat", skip=1)) \end{code} Although in principle you can give your matrix file any valid filename, a couple of conventions may prove useful. First, you may want to use an informative file suffix such as ``\texttt{.mat}'', but this is a matter of taste. More importantly, the exact location of the file created by \texttt{mwrite} could be an issue. By default, if no path is specified in the file name, \app{gretl} stores matrix files in the current work directory. However, it may be wise for the purpose at hand to use the directory in which \app{gretl} stores all its temporary files, whose name is stored in the built-in string \texttt{dotdir} (see section \ref{sec:named-strings}). The value of this string is automatically passed to \app{R} as the string variable \texttt{gretl.dotdir}, so the above example may be rewritten more cleanly as \app{Gretl} side: \begin{code} matrix A = mshape(seq(3,14),4,3) err = mwrite(A, "@dotdir/mymatfile.mat") \end{code} \app{R} side: \begin{code} fname <- paste(gretl.dotdir, "mymatfile.mat", sep="") A <- as.matrix(read.table(fname, skip=1)) \end{code} \subsection{Passing data from \app{R} to \app{gretl}} \label{sec:Rpassing-data} For passing data in the opposite direction, \app{gretl} defines a special function that can be used in the \app{R} environment. An \app{R} object will be written as a temporary file in \app{gretl}'s \texttt{dotdir} directory, from where it can be easily retrieved from within \app{gretl}. The name of this function is \texttt{gretl.export()}, and it accepts one argument, the object to be exported. At present, the objects that can be exported with this method are matrices, data frames and time-series objects. The function creates a text file, with the same name as the exported object, in \app{gretl}'s temporary directory. Data frames and time-series objects are stored as CSV files, and can be retrieved by using \app{gretl}'s \texttt{append} command. Matrices are stored in a special text format that is understood by \app{gretl} (see section~\ref{sec:matrix-csv}); the file suffix is in this case \texttt{.mat}, and to read the matrix in \app{gretl} you must use the \texttt{mread()} function. As an example, we take the airline data and use them to estimate a structural time series model \`a la Harvey (1989). The model we will use is the \emph{Basic Structural Model} (BSM), in which a time series is decomposed into three terms: \[ y_t = \mu_t + \gamma_t + \varepsilon_t \] where $\mu_t$ is a trend component, $\gamma_t$ is a seasonal component and $\varepsilon_t$ is a noise term. In turn, the following is assumed to hold: \begin{eqnarray*} \Delta \mu_t & = & \beta_{t-1} + \eta_t \\ \Delta \beta_t & = & \zeta_t \\ \Delta_s \gamma_t & = & \Delta \omega_t \end{eqnarray*} where $\Delta_s$ is the seasonal differencing operator, $(1-L^s)$, and $\eta_t$, $\zeta_t$ and $\omega_t$ are mutually uncorrelated white noise processes. The object of the analysis is to estimate the variances of the noise components (which may be zero) and to recover estimates of the latent processes $\mu_t$ (the ``level''), $\beta_t$ (the ``slope'') and $\gamma_t$. \app{Gretl} does not provide (yet) a command for estimating this class of models, so we will use \app{R}'s \texttt{StructTS} command and import the results back into \app{gretl}. Once the \texttt{bjg} dataset is loaded in \app{gretl}, we pass the data to \app{R} and execute the following script: \begin{code} # extract the log series y <- gretldata[, "lg"] # estimate the model strmod <- StructTS(y) # save the fitted components (smoothed) compon <- as.ts(tsSmooth(strmod)) # save the estimated variances vars <- as.matrix(strmod$coef) # export into gretl's temp dir gretl.export(compon) gretl.export(vars) \end{code} %$ In this case, running the above in \app{R} produces nothing more that the echoing of commands: \begin{code} > # load data from gretl > gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE) > gretldata <- ts(gretldata, start=c(1949, 1), frequency = 12) > # load script from gretl > # extract the log series > y <- gretldata[, "lg"] > # estimate the model > strmod <- StructTS(y) > # save the fitted components (smoothed) > compon <- as.ts(tsSmooth(strmod)) > # save the estimated variances > vars <- as.matrix(strmod$coef) > # export into gretl's temp dir > gretl.export(compon) > gretl.export(vars) \end{code} %$ However, we see from the output that the two \texttt{gretl.export()} commands ran without errors. Hence, we are ready to pull the results back into \app{gretl} by executing the following commands, either from the console or by creating a small script:\footnote{This example will work on Linux and presumably on OSX without modifications. On the Windows platform, you may have to substitute the ``\texttt{/}'' character with ``$\backslash$''.} \begin{code} append @dotdir/compon.csv vars = mread("@dotdir/vars.mat") \end{code} The first command reads the estimated time-series components from a CSV file, which is the format that the passing mechanism employs for series. The matrix \texttt{vars} is read from the file \texttt{vars.mat}. \begin{figure}[htbp] \centering \includegraphics{figures/BSM-output} \caption{Estimated components from BSM} \label{fig:BSM-output} \end{figure} After the above commands have been executed, three new series will have appeared in the \app{gretl} workspace, namely the estimates of the three components; by plotting them together with the original data, you should get a graph similar to Figure~\ref{fig:BSM-output}. The estimates of the variances can be seen by printing the \texttt{vars} matrix, as in \begin{code} ? print vars vars (4 x 1) 0.00077185 0.0000 0.0013969 0.0000 \end{code} That is, \begin{equation*} \hat{\sigma}^2_{\eta} = 0.00077185, \quad \hat{\sigma}^2_{\zeta} = 0, \quad \hat{\sigma}^2_{\omega} = 0.0013969, \quad \hat{\sigma}^2_{\varepsilon} = 0 \end{equation*} Notice that, since $\hat{\sigma}^2_{\zeta} = 0$, the estimate for $\beta_t$ is constant and the level component is simply a random walk with a drift. \section{Interacting with \app{R} from the command line} \label{sec:foreign-command} Up to this point we have spoken only of interaction with \app{R} via the GUI program. In order to do the same from the command line interface, \app{gretl} provides the \texttt{foreign} command. This enables you to embed non-native commands within a \app{gretl} script.\footnote{In future this facility may be extended to handle interaction with other programs, but for the present only \app{R} commands are accepted.} A ``foreign'' block takes the form \begin{code} foreign language=R [--send-data] [--quiet] ... R commands ... end foreign \end{code} and achieves the same effect as submitting the enclosed \app{R} commands via the GUI in the non-interactive mode (see section~\ref{sec:R-scripts} above). The \option{send-data} option arranges for auto-loading of the data present in the \app{gretl} session. The \option{quiet} option prevents the output from \app{R} from being echoed in the \app{gretl} output. \begin{script}[htbp] \caption{Estimation of the Basic Structural Model --- simple} \begin{scode} open bjg.gdt foreign language=R --send-data y <- gretldata[, "lg"] strmod <- StructTS(y) compon <- as.ts(tsSmooth(strmod)) vars <- as.matrix(strmod$coef) gretl.export(compon) gretl.export(vars) end foreign append @dotdir/compon.csv rename level lg_level rename slope lg_slope rename sea lg_seas vars = mread("@dotdir/vars.mat") \end{scode} \label{RStructTS-simple} \end{script} %$ Using this method, replicating the example in the previous subsection is rather easy: basically, all it takes is encapsulating the content of the \app{R} script in a \texttt{foreign}\ldots\texttt{end foreign} block; see example \ref{RStructTS-simple}. \begin{script}[htbp] \caption{Estimation of the Basic Structural Model --- via a function} \begin{scode} function RStructTS(series myseries) smpl ok(myseries) --restrict sx = argname(myseries) foreign language=R --send-data --quiet @sx <- gretldata[, "myseries"] strmod <- StructTS(@sx) compon <- as.ts(tsSmooth(strmod)) gretl.export(compon) end foreign append @dotdir/compon.csv rename level @sx_level rename slope @sx_slope rename sea @sx_seas list ret = @sx_level @sx_slope @sx_seas return list ret end function # ------------ main ------------------------- open bjg.gdt list X = RStructTS(lg) \end{scode} \label{RStructTS-func} \end{script} The above syntax, despite being already quite useful by itself, shows its full power when it is used in conjunction with user-written functions. Example~\ref{RStructTS-func} shows how to define a \app{gretl} function that calls \app{R} internally. A note on performance: at present, when \app{R} is called from within \app{gretl} using a \texttt{foreign} block, the \app{R} program is started up on each invocation, which can be quite time consuming. For maximum performance, you should organize your script so as to group together as many \app{R} operations as possible, hence minimizing the number of distinct \texttt{foreign} blocks.\footnote{In future we may be able to improve on this, using calls to the \app{R} shared library in place of invocations of the program.} %%% Local Variables: %%% mode: latex %%% TeX-master: "gretl-guide" %%% End: