# OLS Estimation with Endogenous Regressors

### Goals

This tutorial demonstrates GMM estimation of a linear model with an endogenous regressor using the gmmFit and gmmFitIV procedures. After completing this tutorial you should be able to estimate an instrumental variables model using:

• The gmmFitIV procedure.
• The gmmFit procedure.

## Introduction

In this example, we expand on the OLS model to estimate an instrumental variables model, again using both gmmFit and gmmFitIV. The linear model examines how rent (the dependent variable) relates to housing values (hsngval) and the percentage of the population living in urban areas (pcturban).

$$rent = \alpha + \beta_1 \cdot hsngval + \beta_2 \cdot pcturban + u$$

The data for this model is stored in the GAUSS dataset "hsng.dat".

What is new in this model is that the variable hsngval is endogenous. To address the endogeneity, we will instrument for hsngval using pcturban, family income (faminc) and three regional dummies (reg2, reg3, reg4).

## Estimation with gmmFitIV

The gmmFitIV procedure uses the GAUSS formula string syntax to set up estimation. In the case of the instrumental variables model you must include three pieces of information to set up the model:

1. The dataset name.
2. A formula string representing the model.
3. An instrumental variable string.

//Dataset
dataset = getGAUSSHome() $+ "examples/hsng.dat";

//Model formula
formula = "rent ~ hsngval + pcturban";

//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";

call gmmFitIV(dataset, formula, inst_var);

The output from our gmmFitIV estimation reads

Dependent Variable:                      rent
Number of Observations:                    50
Number of Moments:                          0
Number of Parameters:                       3
Degrees of freedom:                        47

Standard                Prob
Variable     Estimate      Error     t-value     >|t|
-----------------------------------------------------------

CONSTANT   112.122713   10.545763    10.632     0.000
hsngval      0.001464    0.000404     3.627     0.001
pcturban     0.761548    0.264387     2.880     0.006

Instruments: pcturban, faminc, reg2, reg3, reg4, Constant

Hansen Test Statistic of the Moment Restrictions
Chi-Sq(3) = 6.9753314
P-value of J-stat: 0.072688216

## Estimation with gmmFit

### Load data

When using the gmmFit procedure, we must start by loading our data and separating it into three data matrices: y, x, and z.

//Load data file
data = loadd(getGAUSSHome() $+ "examples/hsng.dat",
             "rent + hsngval + pcturban + faminc + reg2 + reg3 + reg4");

//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];

//Extract instrumental variables matrix
z = data[., 3:7];

z = ones(rows(z), 1)~z;
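
Before estimating, it can help to compare the number of instruments with the number of parameters. The snippet below is a minimal sketch that assumes x and z have been built as above; the names nparams, ninst and noverid are illustrative and are not part of the GMM library.

```gauss
// Illustrative check of the order condition (names are hypothetical)
nparams = cols(x) + 1;        // two slope coefficients plus the constant
ninst   = cols(z);            // five instruments plus the column of ones
noverid = ninst - nparams;    // overidentifying restrictions

print "Number of parameters:         " nparams;
print "Number of instruments:        " ninst;
print "Overidentifying restrictions: " noverid;
```

With six columns in z and three parameters, this leaves three overidentifying restrictions, which agrees with the Chi-Sq(3) reported for the Hansen test in the gmmFitIV output.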

### Write moment equation

The next step for our gmmFit estimation is to define our moment procedure. The instrumental variables model uses moments based on $E[z_t u_t(\theta_0)] = 0$ with $u_t(\theta_0) = y_t - \alpha - \beta_1 hsngval_t - \beta_2 pcturban_t$ and $\theta_0 = (\alpha, \beta_1, \beta_2)$. Note that the moment procedure now has four inputs because z has been added.

proc meqn(b, yt, xt, zt);

local ut, dt;

/**  Residuals          **/
ut = yt - b[1] - b[2]*xt[., 1] - b[3]*xt[., 2];

/**  Moment conditions  **/
dt = ut .* zt;

retp(dt);

endp;
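
To connect the procedure to the estimator: meqn returns a matrix with one row per observation, where row $t$ holds $z_t u_t(\theta)$. The following is a sketch of the textbook GMM objective, assuming gmmFit averages these rows and minimizes the usual quadratic form; the symbols $g_N$ and $W$ are introduced here for illustration only.

$$g_N(\theta) = \frac{1}{N}\sum_{t=1}^{N} z_t u_t(\theta), \qquad \hat{\theta} = \arg\min_{\theta} \; g_N(\theta)' \, W \, g_N(\theta)$$

Here $W$ is a weight matrix with one row and one column per moment condition, so it is $6 \times 6$ in this example.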

### Set model parameters

For this example, rather than setting specific starting values for the parameters, we will specify the number of parameters to be estimated using gctl.numParams. This specification will allow GAUSS to find starting parameters.

//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();

//Set number of parameters to estimate
gctl.numParams = 3;

We will also set the initial weight matrix for the gmmFit estimation so that it replicates the default behavior of the gmmFitIV procedure. In this model, the exogenous variables (the instruments) are contained in the data matrix z, and the default initial weight matrix used by gmmFitIV is $\left(\frac{1}{N}Z'Z\right)^{-1}$. We can tell gmmFit to use the same matrix through the gmmControl member gctl.wInitMat.

//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(z))*(z'z));
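
For a linear model, a fixed weight matrix gives a closed-form minimizer of the GMM objective, which with this particular weight matrix is the two-stage least squares estimate. The snippet below is a rough sketch of that first-step calculation, assuming y, x and z are defined as above. The names x1, w, zx, zy and b_first are illustrative and are not part of the GMM library.

```gauss
// Sketch only: closed-form first-step estimate for a fixed weight matrix
x1 = ones(rows(x), 1) ~ x;            // regressors with a constant column
w  = invpd((1/rows(z)) * (z'z));      // same initial weight matrix as above

zx = z'x1;                            // instruments times regressors
zy = z'y;                             // instruments times dependent variable

// Minimizer of the quadratic form in the sample moments
b_first = inv(zx' * w * zx) * (zx' * w * zy);
print b_first;
```

If the estimation routine updates the weight matrix after the first step, the final estimates it reports will generally differ from b_first, so treat this only as a check on the starting point.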

Finally, we add variable names. This time we add both the model variable names using gctl.varNames and the instrument names using gctl.instNames.

//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };

//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };

### Call gmmFit

We are now ready to call gmmFit. Notice that this time z must be included as an input to gmmFit.

call gmmFit(&meqn, y, x, z, gctl);

The output from our gmmFit estimation reads

Dependent Variable:                      rent
Number of Observations:                    50
Number of Moments:                          6
Number of Parameters:                       3
Degrees of freedom:                        47

Standard                Prob
Variable     Estimate      Error     t-value     >|t|
-----------------------------------------------------------

CONSTANT   112.122790   10.545745    10.632     0.000
hsngval      0.001464    0.000404     3.627     0.001
pcturban     0.761552    0.264387     2.880     0.006
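
The gmmFitIV output shown earlier also reports the Hansen J statistic and its p-value. As a sketch of how the two relate, assuming the statistic is compared against a chi-square distribution with degrees of freedom equal to the number of overidentifying restrictions, the upper-tail probability can be computed with cdfChic. The names jstat, jdf and pval are illustrative.

```gauss
// Values copied from the gmmFitIV output
jstat = 6.9753314;
jdf   = 3;                     // 6 moment conditions minus 3 parameters

// Upper-tail probability of the chi-square distribution
pval = cdfChic(jstat, jdf);
print pval;
```

The result should be close to the reported p-value of about 0.0727, so the overidentifying restrictions are not rejected at the 5% level but would be at the 10% level.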

### Conclusion

Congratulations! You have:

• Estimated an instrumental variables model using gmmFitIV.
• Estimated an instrumental variables model using gmmFit.

For convenience, the full program text is reproduced below.

//Dataset
dataset = getGAUSSHome() $+ "examples/hsng.dat";

//Model formula
formula = "rent ~ hsngval + pcturban";

//String of instrumental variables
inst_var = "pcturban + faminc + reg2 + reg3 + reg4";

call gmmFitIV(dataset, formula, inst_var);

//Load data file
data = loadd(getGAUSSHome() $+ "examples/hsng.dat",
             "rent + hsngval + pcturban + faminc + reg2 + reg3 + reg4");

//Extract x and y matrix
y = data[., 1];
x = data[., 2:3];

//Extract instrumental variables matrix
z = data[., 3:7];
z = ones(rows(z),1)~z;

//Declare gctl to be a gmmControl struct
//and fill with default settings
struct gmmControl gctl;
gctl = gmmControlCreate();

//Set number of parameters to estimate
gctl.numParams = 3;

//Set initial weight matrix
gctl.wInitMat = invpd((1/rows(z))*(z'z));

//Variable names
gctl.varNames = { "hsngval", "pcturban", "rent" };

//Instrument names
gctl.instNames = { "pcturban", "faminc", "reg2", "reg3", "reg4" };

call gmmFit(&meqn, y, x, z, gctl);

proc meqn(b, yt, xt, zt);

local ut,dt;

/**  Residuals          **/
ut = yt - b[1] - b[2]*xt[.,1] - b[3]*xt[.,2];

/**  Moment conditions  **/
dt = ut .* zt;

retp(dt);

endp;
