Goals
This tutorial demonstrates the GMM estimation of an instrumental variables model using the gmmFit and gmmFitIV procedures. After completing this tutorial you should be able to estimate an instrumental variables model using:
- gmmFitIV
- gmmFit
Introduction
In this example, we will expand on the OLS model to estimate an instrumental variables model. We will again demonstrate how to estimate the model using both gmmFit and gmmFitIV. The linear model examines the relationship between the dependent variable rent and two regressors: housing values hsngval and the percentage of the population living in urban areas pcturban.
$$rent = \alpha + \beta_1 \cdot hsngval + \beta_2 \cdot pcturban + u$$
The data for this model is stored in the GAUSS dataset "hsng.dat".
The new addition to this model is the endogeneity of the variable hsngval. As a solution for the endogeneity, we will instrument for hsngval using pcturban, family income (faminc) and three regional dummies (reg2, reg3, reg4).
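To make the role of the instruments concrete, the setup can be written as a set of moment conditions. This is a sketch in standard IV notation; the symbols $z_t$, $u_t$, $q$ and $k$ are introduced here for illustration, and the constant is listed as an instrument because the estimation output reports it as one.
$$z_t = (1,\ pcturban_t,\ faminc_t,\ reg2_t,\ reg3_t,\ reg4_t)', \qquad E[z_t u_t] = 0$$
$$u_t = rent_t - \alpha - \beta_1 \cdot hsngval_t - \beta_2 \cdot pcturban_t$$
With $q = 6$ instruments and $k = 3$ parameters, the model is overidentified, leaving $q - k = 3$ overidentifying restrictions.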
Estimation with gmmFitIV
The gmmFitIV procedure uses the GAUSS formula string syntax to set up estimation. In the case of the instrumental variables model you must include three pieces of information to set up the model:
- The dataset name.
- A formula string representing the model.
- An instrumental variable string.
//Dataset
dataset = getGAUSShome $+ "examples/hsng.dat";
//Model formula
formula = "rent ~ hsngval + pcturban";
//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";
call gmmFitIV(dataset, formula, inst_var);
The instrumental variable string lists the instrument names separated by +, in the form variable_1 + variable_2 + ... + variable_k.
The output from our gmmFitIV estimation reads:
Dependent Variable:                        rent
Number of Observations:                      50
Number of Moments:                            0
Number of Parameters:                         3
Degrees of freedom:                          47

                             Standard                 Prob
Variable        Estimate        Error     t-value     >|t|
-----------------------------------------------------------
CONSTANT      112.122713    10.545763      10.632    0.000
hsngval         0.001464     0.000404       3.627    0.001
pcturban        0.761548     0.264387       2.880    0.006

Instruments: pcturban, faminc, reg2, reg3, reg4, Constant

Hansen Test Statistic of the Moment Restrictions
Chi-Sq(3) =        6.9753314
P-value of J-stat: 0.072688216
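The Chi-Sq(3) in the Hansen test can be read off from the model dimensions. As a sketch that assumes the usual textbook definition of the statistic (the symbols $\bar{g}$, $\hat{W}$, $q$ and $k$ are not part of the GAUSS output), the test has the form
$$J = N\,\bar{g}(\hat{\theta})'\,\hat{W}\,\bar{g}(\hat{\theta}), \qquad \bar{g}(\theta) = \frac{1}{N}\sum_{t=1}^{N} z_t u_t(\theta)$$
where $\hat{W}$ is the weight matrix from the final estimation step. Under valid moment restrictions, $J$ is asymptotically chi-squared with $q - k$ degrees of freedom. With six instruments (including the constant) and three parameters, $q - k = 6 - 3 = 3$. The reported p-value of about 0.073 means the restrictions are not rejected at the 5% level, although they would be at the 10% level.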
Estimation with gmmFit
Load data
When using the gmmFit procedure, we must start our estimation by loading our data and separating it into three data matrices: y, x, and z.
//Load data file
data = loadd(getGAUSShome $+ "examples/hsng.dat","rent + hsngval +
pcturban + faminc +
reg2 + reg3 + reg4");
//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];
//Extract instrumental variables matrix
z = data[., 3:7];
//Add constant to z
z = ones(rows(z), 1)~z;
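Before moving on, it can help to confirm that the matrices have the expected dimensions. The following is a minimal sketch that assumes y, x, and z were created as above; the names q and k are introduced here for illustration only.

```gauss
// Number of moment conditions: instrument columns, including the constant
q = cols(z);

// Number of parameters: one intercept plus one slope per column of x
k = cols(x) + 1;

print "Observations: " rows(y);
print "Moments:      " q;
print "Parameters:   " k;

// Order condition: at least as many moments as parameters
if q < k;
    print "Model is under-identified: more instruments are needed.";
endif;
```

For this dataset the counts should agree with the estimation output: 50 observations, 6 moments, and 3 parameters.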
Write moment equation
The next step for our gmmFit estimation is to define our moment procedure. The instrumental variable model uses moments based on $E[z_tu_t(\theta_0)] = 0$ with $u_t(\theta_0) = y_t - \alpha - \beta_1 x_{1t} - \beta_2 x_{2t}$, where $x_{1t}$ is hsngval and $x_{2t}$ is pcturban. Note that the moment procedure now has four inputs because z is added to the inputs.
proc meqn(b, yt, xt, zt);
    local ut, dt;
    /** Residuals: b[1] is the intercept, b[2] and b[3] are the slopes **/
    ut = yt - b[1] - b[2]*xt[., 1] - b[3]*xt[., 2];
    /** Moment conditions: residual times each instrument **/
    dt = ut .* zt;
    retp(dt);
endp;
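To see what the moment procedure returns, it can be evaluated directly at a trial parameter vector. This is a minimal sketch that assumes y, x, z, and meqn are defined as above; the values in b0 are arbitrary illustrative numbers, not estimates, and the names b0, m, and gbar are hypothetical.

```gauss
// Hypothetical trial parameters: intercept, hsngval slope, pcturban slope
b0 = { 100, 0.001, 1 };

// One row per observation, one column per instrument
m = meqn(b0, y, x, z);

// Column means give the sample counterpart of E[z_t u_t]
gbar = meanc(m);

print "Rows and columns of the moment matrix: " rows(m) cols(m);
print "Sample moments at b0:";
print gbar;
```

GMM estimation chooses the parameters that bring these sample moments as close to zero as possible, as measured by the weight matrix.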
Set model parameters
For this example, rather than setting specific starting values for the parameters, we will specify the number of parameters to be estimated using gctl.numParams. This specification will allow GAUSS to find starting parameters.
//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
//Set number of parameters to estimate
gctl.numParams = 3;
We will also set up the initial weight matrix for the gmmFit estimation so it will replicate the default model of the gmmFitIV procedure.
In this model, the exogenous variables are contained in the data matrix z and the default initial weight matrix used by gmmFitIV will be equal to $\left(\frac{1}{N}Z'Z\right)^{-1}$. We can tell gmmFit to use the same matrix through the gmmControl member gctl.wInitMat:
//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(z))*(z'z));
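For a linear model and a fixed weight matrix, the GMM estimator has a closed form, which with this particular weight matrix coincides with two-stage least squares. The sketch below computes it directly as a sanity check. It assumes y, x, and z are defined as above; the names xc, w, and b_w are hypothetical, and because gmmFit may update the weight matrix during estimation, the result is not guaranteed to match the gmmFit estimates exactly.

```gauss
// Regressor matrix with a leading constant, matching b[1] in meqn
xc = ones(rows(x), 1)~x;

// Same initial weight matrix as gctl.wInitMat
w = invpd((1/rows(z))*(z'*z));

// Closed-form linear GMM estimate for a fixed weight matrix:
// b = (X'Z W Z'X)^(-1) X'Z W Z'y
b_w = inv(xc'*z*w*z'*xc)*(xc'*z*w*z'*y);

print "Closed-form estimate (constant, hsngval, pcturban):";
print b_w;
```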
Finally, we add variable names. This time we wish to add both the model variable names using gctl.varNames and the instrument names using gctl.instNames:
//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };
//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };
Call gmmFit
We are now ready to call gmmFit. Notice that this time z must be included as an input into gmmFit:
call gmmFit(&meqn, y, x, z, gctl);
The output from our gmmFit estimation reads:
Dependent Variable:                        rent
Number of Observations:                      50
Number of Moments:                            6
Number of Parameters:                         3
Degrees of freedom:                          47

                             Standard                 Prob
Variable        Estimate        Error     t-value     >|t|
-----------------------------------------------------------
CONSTANT      112.122790    10.545745      10.632    0.000
hsngval         0.001464     0.000404       3.627    0.001
pcturban        0.761552     0.264387       2.880    0.006
Conclusion
Congratulations! You have:
- Estimated an instrumental variables model using gmmFitIV.
- Estimated an instrumental variables model using gmmFit.
For convenience, the full program text is reproduced below.
//Dataset
dataset = getGAUSShome $+ "examples/hsng.dat";
//Model formula
formula = "rent ~ hsngval + pcturban";
//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";
call gmmFitIV(dataset, formula, inst_var);
//Load data file
data = loadd(getGAUSShome $+ "examples/hsng.dat","rent + hsngval +
pcturban + faminc +
reg2 + reg3 + reg4");
//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];
//Extract instrumental variables matrix
z = data[., 3:7];
//Add constant to z
z = ones(rows(z),1)~z;
//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
//Set number of parameters to estimate
gctl.numParams = 3;
//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(z))*(z'z));
//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };
//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };
call gmmFit(&meqn, y, x, z, gctl);
proc meqn(b, yt, xt, zt);
    local ut, dt;
    /** Residuals: b[1] is the intercept, b[2] and b[3] are the slopes **/
    ut = yt - b[1] - b[2]*xt[., 1] - b[3]*xt[., 2];
    /** Moment conditions: residual times each instrument **/
    dt = ut .* zt;
    retp(dt);
endp;