Goals
This tutorial demonstrates GMM estimation of a simple OLS model using the gmmFit and gmmFitIV procedures. After completing this tutorial you should be able to estimate an OLS model with exogenous regressors using either procedure.
Introduction
In this example, we will estimate a simple OLS model using GMM. Because the model is linear, we can estimate it with both gmmFit and gmmFitIV. The model examines the relationship between gas mileage and vehicle weight and length:
$$mpg = \alpha + \beta_1*weight + \beta_2*length$$
The data for this model is stored in the dataset auto2.dta, located in the GAUSS examples folder.
Estimation with gmmFitIV
While the gmmFit procedure minimizes the GMM objective function to estimate the model parameters, gmmFitIV computes the analytic GMM estimates for instrumental variables. gmmFitIV provides a compact method for estimating IV and OLS models. In fact, we can estimate the model with a single call to gmmFitIV:
//Create dataset file name with full path
dset_name = getGAUSShome() $+ "examples/auto2.dta";
//Perform estimation
call gmmFitIV(dset_name, "mpg ~ weight + length");
The output from our gmmFitIV estimation reads:
Dependent Variable: mpg
Number of Observations: 74
Number of Moments: 3
Number of Parameters: 3
Degrees of freedom: 71
                           Standard                 Prob
Variable      Estimate        Error     t-value     >|t|
--------------------------------------------------------
CONSTANT     47.884873     7.506021       6.380    0.000
weight       -0.003851     0.001947      -1.978    0.052
length       -0.079593     0.067753      -1.175    0.244
The estimates from gmmFitIV are the same as the estimates from gmmFit, as you will see. However, note that the gmmFitIV table includes variable names. This occurs because GAUSS is able to extract variable names from the formula string used to identify the model in gmmFitIV.
Estimation with gmmFit
Load data
In order to estimate our model using gmmFit we must first load our data into data matrices. For this example, we will use just three variables from the auto2.dta dataset, mpg, weight, and length.
//Create dataset file name with full path
dset_name = getGAUSShome() $+ "examples/auto2.dta";
//Load variables 'mpg', 'weight' and 'length'
//into matrix 'data'
data = loadd(dset_name , "mpg + weight + length");
The formula string passed to loadd lists the variables to load, separated by +, in the form variable_1 + variable_2 + ... + variable_k. The columns in the matrix data will be in the order the variables are specified in the formula string. We can use this information to create two separate data matrices, y for our dependent variable and X for our independent variables.
//Declare 'y' variable
y = data[., 1];
//'X' variables
X = data[., 2:3];
Finally, we want to include a constant in this model. This is not done automatically with the gmmFit procedure and a column of ones must be concatenated to the beginning of the already defined data matrix X:
//Concatenate a column of ones to the 'X' data
X = ones(rows(data), 1) ~ data[., 2:3];
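With y and X assembled, the closed-form OLS estimates give a reference point against which to compare the GMM output. The snippet below is a minimal sketch rather than part of the original tutorial: the name b_ols is a hypothetical label introduced here, and the data are reloaded so the block runs on its own.

```gauss
// Reload the data so this block is self-contained
dset_name = getGAUSShome() $+ "examples/auto2.dta";
data = loadd(dset_name, "mpg + weight + length");

// Dependent variable and regressors (constant first)
y = data[., 1];
X = ones(rows(data), 1) ~ data[., 2:3];

// Normal-equations solution: b = (X'X)^(-1) X'y
// 'b_ols' is a hypothetical name used only in this sketch
b_ols = invpd(X'X) * (X'y);

// Rows: constant, weight, length
print b_ols;
```

Because the regressors are treated as exogenous and the model is linear, the estimates reported by gmmFitIV and gmmFit should agree with b_ols up to rounding and optimizer tolerance.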
Write the moment equation
The next step for our gmmFit estimation is to define our moment procedure. For this example, we will estimate a linear model with moments based on $E[x_t u_t(\theta_0)] = 0$, where $u_t(\theta_0) = y_t - x_t'\theta_0$, $x_t = (1, weight_t, length_t)'$, and $\theta_0 = (\alpha, \beta_1, \beta_2)'$:
proc meqn(b, yt, xt);
local ut,dt;
/** OLS resids **/
ut = yt - b[1] - b[2]*xt[., 2] - b[3]*xt[., 3];
/** Moment conditions **/
dt = ut.*xt;
retp(dt);
endp;
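It may help to see why these moment conditions reproduce OLS. The following is a short sketch of the standard argument, written with the stacked data matrices $X$ (rows $x_t'$, including the constant) and $y$; the symbol $\bar{g}$ for the sample moment vector is notation introduced here, not taken from the GAUSS documentation:

$$\bar{g}(\theta) = \frac{1}{N}\sum_{t=1}^{N} x_t\,(y_t - x_t'\theta) = \frac{1}{N}X'(y - X\theta)$$

$$\bar{g}(\hat{\theta}) = 0 \;\Rightarrow\; X'X\hat{\theta} = X'y \;\Rightarrow\; \hat{\theta} = (X'X)^{-1}X'y$$

With three moments and three parameters the model is exactly identified, so the sample moments can be set exactly to zero and the resulting estimator is the OLS estimator.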
Set Model Parameters
Model parameters are controlled using a gmmControl structure. Therefore, prior to setting model parameters we must declare an instance of the gmmControl structure and fill the instance with default values.
//Declare `gctl` to be a `gmmControl` struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
The first thing we must set in the gmmControl structure is the start values of the parameters, using gctl.bStart.
//Set starting values
gctl.bStart = { 41, -0.005, -0.001 };
Finally, we will set up the initial weight matrix for the gmmFit estimation so it will replicate the default model of the gmmFitIV procedure. Because the variables weight and length are assumed to be exogenous in this model, the initial weight matrix used by gmmFitIV will be equal to $(\frac{1}{N}X'X)^{-1}$. We can tell gmmFit to use the same matrix through the gmmControl member gctl.wInitMat:
//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(X))*(X'X));
Call gmmFit
We are finally ready to call gmmFit. For this example, we will use the GAUSS keyword call to run gmmFit and print results directly to the input/output screen.
call gmmFit(&meqn, y, X, gctl);
The output from our gmmFit estimation reads:
Dependent Variable: Y
Number of Observations: 74
Number of Moments: 3
Number of Parameters: 3
Degrees of freedom: 71
                           Standard                 Prob
Variable      Estimate        Error     t-value     >|t|
--------------------------------------------------------
Beta1        47.884629     7.506023       6.379    0.000
Beta2        -0.003852     0.001947      -1.978    0.052
Beta3        -0.079591     0.067753      -1.175    0.244
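These results match our gmmFitIV results from earlier in this tutorial, apart from the variable names and small differences in the last reported digits (for example, 47.884629 versus 47.884873 for the constant). Differences of this size are expected because gmmFit minimizes the objective function numerically, while gmmFitIV computes the analytic solution.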
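The generic labels Beta1, Beta2 and Beta3 can be replaced with variable names through gctl.varNames. The lines below are a minimal sketch, assuming gctl, meqn, y and X from the steps above, and assuming that a column string array built with the $| operator is an accepted layout for this member:

```gauss
// Assumed layout: independent variables first, dependent
// variable last; the constant is not named here
gctl.varNames = "weight" $| "length" $| "mpg";

// Re-run the estimation so the table uses these names
call gmmFit(&meqn, y, X, gctl);
```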
Variable names can be added to the gmmFit table using the gmmControl structure member gctl.varNames. The gctl.varNames structure member must be a string array that lists all independent variables first and includes the dependent variable as the last element. You do NOT have to name the constant; this is done automatically by GAUSS.
Conclusion
Congratulations! You have:
- Estimated an OLS model using gmmFitIV.
- Estimated an OLS model using gmmFit.
Our next tutorial will demonstrate the estimation of an OLS model with endogenous variables.
For convenience, the full program text is reproduced below.
//Create dataset file name with full path
dset_name = getGAUSShome() $+ "examples/auto2.dta";
//Perform estimation
call gmmFitIV(dset_name, "mpg ~ weight + length");
//Create dataset file name with full path
dset_name = getGAUSShome() $+ "examples/auto2.dta";
//Load variables 'mpg', 'weight' and 'length'
//into matrix 'data'
data = loadd(dset_name , "mpg + weight + length");
//Declare 'y' variable
y = data[.,1];
//'X' variables
X = data[.,2:3];
//Concatenate a column of ones to the 'X' data
X = ones(rows(data),1) ~ data[.,2:3];
//Declare `gctl` to be a `gmmControl` struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();
//Set starting values
gctl.bStart = { 41, -0.005, -0.001 };
//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(X))*(X'X));
call gmmFit(&meqn, y, X, gctl);
proc meqn(b, yt, xt);
local ut,dt;
/** OLS resids **/
ut = yt - b[1] - b[2]*xt[.,2] - b[3]*xt[.,3];
/** Moment conditions **/
dt = ut.*xt;
retp(dt);
endp; 