Hello,

I have cross-sectional individual-level data that could be structured as a panel, but for now let's ignore this potential panel structure. I want to allow for correlation among individuals at the state (U.S. states) level, which is the most aggregate level. Could you please illustrate how to compute the one-way cluster-robust covariance matrix (clustering by state) for a linear model in the cross-sectional context?

## 2 Answers

The one-way cluster-robust standard errors can be computed using the "sandwich" estimator of the covariance matrix:

$$\mathrm{VCE}(\hat{\beta}) = (X'X)^{-1}\,\Omega\,(X'X)^{-1}$$

In the panel case, where we have $N$ groups and $T$ time periods per group, $\Omega$ is found by summing over the groups $i = 1, \dots, N$:

$$\Omega = \sum_{i=1}^{N} \Omega_i, \qquad \Omega_i = X_i' u_i u_i' X_i$$

where

$$u_i \equiv (u_{i1}, \dots, u_{iT})', \qquad X_i \equiv (x_{i1}, \dots, x_{iT})'.$$

Thinking in a pooled OLS framework, $u_i$ is the $T \times 1$ vector of pooled OLS residuals for group *i*, and $X_i$ is the matrix of regressors for group *i*, with one row per time period.

In GAUSS, this can be done with the **olsmt** procedure, which returns its results in an **olsmtOut** structure. If that structure is named *oOut*, the member *oOut.resid* holds the estimated residuals.

A complete tutorial on using both structures and the **olsmt** procedure can be found on the Aptech tutorial page.

This should provide some guidance for finding the cluster-robust covariance matrix in GAUSS. I will update this post within the next 24 hours with a complete code example that computes the one-way cluster-robust covariance matrix for panel data.

## Data Generation

This example demonstrates the use of the "sandwich" covariance method to compute a one-way cluster-robust covariance matrix in GAUSS. The randomly generated data used in this example can be replicated using **rndseed** and the **rndn** procedure. To generate a panel with 6 independent regressors and 30 groups of 20 observations each:

```
rndseed 1046823;

//Dimensions
n_groups = 30;
n_obs_each = 20;
n_variables = 6;

//Generate pooled Y data
y = rndn(n_obs_each*n_groups, 1);

//Generate pooled X data
x = rndn(n_obs_each*n_groups, n_variables);
```

## Model Estimation

Once the data is generated, the pooled OLS model is estimated using **olsmt**:

```
//Set up olsmt control structure
struct olsmtControl oc0;
oc0 = olsmtControlCreate;

//Compute residuals
oc0.res = 1;

//Print output
oc0.output = 1;

//Turn constant off
oc0.con = 0;

//Output structure
struct olsmtOut oOut;
oOut = olsmt(oc0, 0, y, x);
```

## One-way Cluster Robust Covariance

For clarity, this example uses a **GAUSS** *for* loop to calculate the one-way cluster-robust covariance matrix. The first step is to create a vector of group indicators. Since the generated data has *n_groups* groups, first construct a sequential vector ranging from one to *n_groups* using **seqa**:

```
//Generate sequential vector
group = seqa(1 , 1 , n_groups);
```

Next, reshape *group* such that there are *n_obs_each* in each group:

```
//Reshape group vector
group = vec(reshape(group, n_obs_each, n_groups));
```

Next, using a *for* loop, the within-group covariance matrix is calculated for each group from the residuals stored in the **olsmtOut** structure, *oOut.resid*. These covariances are stored in a three-dimensional GAUSS array, *Vcx_i*:

```
//Initialize the variance-covariance array
Vcx_i = arrayinit(n_groups|n_variables|n_variables, 0);

//Loop through groups
for i(1, n_groups, 1);
    //Regressors and residuals for group i
    x_i = selif(x, group .== i);
    u_i = selif(oOut.resid, group .== i);

    Vcx_i[i,.,.] = x_i'*(u_i*u_i')*x_i;
endfor;
```

Sum across the groups:

```
//Sum across the groups (note groups are held in the third dimension)
Vcx_tot = arraytomat(asum(Vcx_i, 3));
```

Finally, use the "sandwich" method to find the panel covariance matrix and apply the small-sample scaling:

```
//Find avar_beta
avar_beta2 = inv(x'x)*Vcx_tot*inv(x'x);

//Small-sample correction: G/(G-1) * (N-1)/(N-K), N = total observations
n_total = rows(x);
c = (n_groups/(n_groups-1))*((n_total-1)/(n_total-cols(x)));
avar_oneway = c*avar_beta2;
```

## Eliminating the *for* Loop

Though the *for* loop is intuitive, replacing the loop with matrix operations can be much more computationally efficient. The first step is to create an indicator matrix that separates the groups:

```
//Block-diagonal indicator matrix: ones within a group, zeros across groups
Eg = ones(n_obs_each, n_obs_each);
Sg = eye(n_groups) .*. Eg;
```

The indicator matrix can then be used to compute a "giant" matrix of group-specific covariances:

```
//Giant matrix of covariances
sig = (oOut.resid*oOut.resid').*Sg;
```
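To see why this masking step reproduces the loop-based sum, it may help to write it out. The sketch below uses notation that is not in the original post: $G$ groups of $T$ observations each, $\mathbf{1}_T$ a $T \times 1$ vector of ones, $\circ$ the element-wise product (GAUSS `.*`) and $\otimes$ the Kronecker product (GAUSS `.*.`). It assumes the rows of $X$ and $u$ are stacked group by group, as they are in the generated data.

```latex
\begin{aligned}
S &= I_G \otimes \mathbf{1}_T \mathbf{1}_T' \\
\Sigma &= (u u') \circ S
        = \begin{pmatrix}
            u_1 u_1' & 0 & \cdots & 0 \\
            0 & u_2 u_2' & \cdots & 0 \\
            \vdots & \vdots & \ddots & \vdots \\
            0 & 0 & \cdots & u_G u_G'
          \end{pmatrix} \\
X' \Sigma X &= \sum_{i=1}^{G} X_i' u_i u_i' X_i
\end{aligned}
```

The element-wise product zeroes every cross-group residual product and leaves the within-group blocks untouched, so the "meat" is the same as the one accumulated in the *for* loop.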

Finally, the "sandwich" equation and the small sample correction constant are used to calculate the one-way cluster robust covariance matrix:

```
//Find covariance matrix
//Sandwich "bread"
inv_xx = inv(x'x);

//Sandwich "meat"
x_sig_x = x'*sig*x;

//Small-sample correction: G/(G-1) * (N-1)/(N-K), N = total observations
n_total = rows(x);
c = (n_groups/(n_groups-1))*((n_total-1)/(n_total-cols(x)));

//Cluster-robust covariance matrix
avar_beta = c*(inv_xx*x_sig_x*inv_xx);

//Print solution
print "One-way cluster robust covariance matrix with small sample correction:";
print avar_beta;
print;
print "One-way cluster robust se:";
print sqrt(diag(avar_beta));
print;
print "Non-adjusted se:";
print sqrt(diag(oOut.vc));
```

## Creating a GAUSS Procedure

One final step is to create a GAUSS procedure for finding the one-way cluster-robust covariance matrix. The code below creates and demonstrates the use of the **covCluster** procedure, which can be called either from the command line or from within a program file. This procedure is to be used post-estimation and requires four inputs: the residuals (*pooled_res*), a "giant" stacked matrix of independent regressors (*big_x*), the number of groups (*n_groups*), and the number of observations per group (*t*):

```
//Setup example
//Use ols residuals from past example
pooled_res = oOut.resid;

//Call function
avar_beta3 = covCluster(pooled_res, x, n_groups, n_obs_each);

//Declare procedure
proc (1) = covCluster(pooled_res, big_x, n_groups, t);
    local Eg, Sg, sig, inv_xx, x_sig_x, avar_beta, avar_oneway, c, n_vars, n_total;

    n_vars = cols(big_x);
    n_total = rows(big_x);

    //Block-diagonal indicator matrix
    Eg = ones(t, t);
    Sg = eye(n_groups) .*. Eg;

    //Group-specific covariance matrix
    sig = (pooled_res*pooled_res') .* Sg;

    //Sandwich "bread" and "meat"
    inv_xx = inv(big_x'big_x);
    x_sig_x = big_x'*sig*big_x;
    avar_beta = inv_xx*x_sig_x*inv_xx;

    //Small-sample correction: G/(G-1) * (N-1)/(N-K), N = total observations
    c = (n_groups/(n_groups-1))*((n_total-1)/(n_total-n_vars));
    avar_oneway = c*avar_beta;

    retp(avar_oneway);
endp;
```
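The procedure above relies on every group having the same number of observations, which is unlikely when individuals are clustered by state as in the original question. The following is a minimal sketch of a variant for clusters of unequal size. The name *covClusterById* and its arguments are hypothetical and not part of GAUSS. It assumes the cluster identifier is an N x 1 vector of integer codes running from 1 to G with no gaps, and it uses the common G/(G-1) * (N-1)/(N-K) small-sample correction, with N the total number of observations.

```
//Hypothetical helper (not a built-in GAUSS procedure)
//ASSUMPTION: id holds integer cluster codes 1, 2, ..., G and
//every code in that range appears at least once
proc (1) = covClusterById(res, x, id);
    local g, n, k, meat, x_i, u_i, s_i, bread, c;

    g = maxc(id);    //Number of clusters
    n = rows(x);     //Total number of observations
    k = cols(x);     //Number of regressors

    //Sandwich "meat": sum over clusters of (X_i'u_i)(X_i'u_i)'
    meat = zeros(k, k);
    for i(1, g, 1);
        x_i = selif(x, id .== i);
        u_i = selif(res, id .== i);
        s_i = x_i'*u_i;
        meat = meat + s_i*s_i';
    endfor;

    //Sandwich "bread"
    bread = inv(x'x);

    //Small-sample correction
    c = (g/(g-1))*((n-1)/(n-k));

    retp(c*bread*meat*bread);
endp;

//Example call, reusing the residuals and the group vector built earlier
avar_by_id = covClusterById(oOut.resid, x, group);
```

Because each cluster contributes only a K x 1 vector, this version never forms the N x N residual matrix, which can matter when the number of individuals is large.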