### Introduction

In this blog, we examine one of the fundamentals of panel data analysis, the one-way error component model. Today we will:

- Explain the theoretical one-way error component model.
- Consider fixed effects vs. random effects.
- Estimate models using an empirical example.

## The theoretical one-way error component model

The one-way error component model is a panel data model that allows for an individual-specific or time-specific error component. With an individual-specific component, the model is:

$$ \begin{equation}y_{it} = \alpha + X_{it} \beta + u_{it} \label{OWEM}\end{equation}$$ $$ u_{it} = \mu_{i} + \nu_{it} $$

where the subscript *i* indicates cross-sections of households, individuals, firms, countries, etc. and the subscript *t* indicates time periods.

In this model, the individual-specific error component, $\mu_{i}$, captures any unobserved effects that are different across individuals but fixed across time.

Term | Role in the one-way error component model |
---|---|
$\alpha$ | Parameter of interest: an intercept that is constant across all individuals and time periods. |
$\beta$ | Parameter of interest: the effect of x on y, constant across all individuals and time periods. |
$\mu_i$ | Individual-specific variation in y that stays constant across time for each individual. In the fixed effects model this is an individual-specific effect to be estimated. In the random effects model it follows a random distribution with parameters that must be estimated. |
$\nu_{it}$ | Usual stochastic regression disturbance, which varies across time and individuals. |
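
To make the roles of these components concrete, the following GAUSS sketch simulates a small panel from the model. The dimensions, the seed, and the parameter values (`n`, `t`, `a0`, `b0`) are hypothetical choices made purely for illustration, and both error components are assumed to be standard normal.

```
// Hypothetical dimensions and parameter values (illustration only)
n = 3;        // number of individuals
t = 4;        // number of time periods
a0 = 1;       // intercept, alpha
b0 = 0.5;     // slope, beta

// Fix the seed so the draws are repeatable
rndseed 4321;

// Individual effect: one draw per individual, repeated for each period
mu = rndn(n, 1) .*. ones(t, 1);

// Regressor and idiosyncratic disturbance vary over both i and t
x = rndn(n*t, 1);
nu = rndn(n*t, 1);

// Composite error and dependent variable, stacked by individual
u = mu + nu;
y = a0 + x*b0 + u;

// Group identifier: 1,1,1,1,2,2,2,2,3,3,3,3
grps = seqa(1, 1, n) .*. ones(t, 1);

print grps~y~x;
```

Within each block of `t` rows the value of `mu` is identical, which is the sense in which the individual effect is fixed across time.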

## Fixed effects vs. random effects

The two most common approaches to modeling individual-specific error components are the fixed effects model and the random effects model.

The key difference between these two approaches is how we believe the individual error component behaves.

### The fixed effects model

In the fixed effects model the individual error component:

- Can be thought of as an individual-specific intercept term.
- Captures any time-invariant, individual-specific variables that are omitted from the regression.
- Is allowed to be correlated with the other variables included in the model.

Given these assumptions, the fixed effects model can be thought of as a pooled OLS model with individual-specific intercepts:

$$\begin{equation}y_{it} = \delta_{i} + X_{it} \beta + \nu_{it}\label{FEM}\end{equation}$$

The intercept term, $\delta_i$, varies across individuals but is constant across time for each individual. It is the sum of the constant intercept term, $\alpha$, and the individual-specific error term, $\mu_i$, so that $\delta_i = \alpha + \mu_i$.

The distinguishing feature of the fixed effects model is that $\delta_i$ is treated as a true, but unobservable, effect which we must estimate.

### The random effects model

In the random effects model the individual-specific error component, $\mu_i$:

- Is distributed randomly and is independent of $\nu_{it}$.
- Is appropriate when individuals are drawn randomly from a large population, such as in household studies (Baltagi, 2008).
- Is assumed to be uncorrelated with all other variables in the model.
- Affects the model through the covariance structure of the error term.

For example, consider the total error disturbance in the model, $ u_{it} = \mu_{i} + \nu_{it} $. The covariance of the error at time *t* and time *s* depends on the variances of both $\mu_{i}$ and $\nu_{it}$:

$$\begin{equation}cov(u_{it}, u_{is}) = \left\{ \begin{array}{ll} \sigma_{\mu}^2 & \text{for } t \neq s \\ \sigma_{\mu}^2 + \sigma_{\nu}^2 & \text{for } t = s \\ \end{array} \right. \label{REM}\end{equation} $$
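
As an illustration, the covariances above can be collected into a matrix. Stacking the disturbances of individual *i* into the vector $u_i = (u_{i1}, \ldots, u_{iT})'$ (a notation introduced here only for this sketch) and assuming the errors have mean zero, the case $T = 3$ looks like this:

$$ E[u_i u_i'] = \begin{bmatrix} \sigma_{\mu}^2 + \sigma_{\nu}^2 & \sigma_{\mu}^2 & \sigma_{\mu}^2 \\ \sigma_{\mu}^2 & \sigma_{\mu}^2 + \sigma_{\nu}^2 & \sigma_{\mu}^2 \\ \sigma_{\mu}^2 & \sigma_{\mu}^2 & \sigma_{\mu}^2 + \sigma_{\nu}^2 \end{bmatrix} $$

Every off-diagonal entry equals $\sigma_{\mu}^2$, so any two disturbances for the same individual have correlation $\sigma_{\mu}^2 / (\sigma_{\mu}^2 + \sigma_{\nu}^2)$, no matter how far apart *t* and *s* are.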

The distinguishing feature of the random effects model is that $\mu_i$ does not have a true value but rather follows a random distribution with parameters that we must estimate.

## Estimation

### The fixed effects model

In the fixed effects model, the individual effects introduce an endogeneity that will result in biased estimates if not properly accounted for.

Fortunately, we can make consistent estimates using one of three estimation techniques:

- Within-group estimation
- First differences estimation
- Least squares dummy variable (LSDV) estimation

The first two of these techniques focus on eliminating the individual effects before estimation. The LSDV method directly incorporates these effects using dummy variables.
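
A short derivation shows why demeaning and differencing remove the individual effect. It assumes the one-way model defined earlier and writes $\bar{y}_i$, $\bar{X}_i$ and $\bar{\nu}_i$ for the averages over time for individual *i* (symbols introduced here for the sketch). Averaging the model over *t* gives

$$ \bar{y}_{i} = \alpha + \bar{X}_{i} \beta + \mu_{i} + \bar{\nu}_{i} . $$

Subtracting this from the original equation, or subtracting the equation for period $t-1$ from the equation for period $t$, yields

$$ y_{it} - \bar{y}_{i} = (X_{it} - \bar{X}_{i}) \beta + (\nu_{it} - \bar{\nu}_{i}) , $$

$$ y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1}) \beta + (\nu_{it} - \nu_{i,t-1}) . $$

In both cases $\alpha$ and $\mu_i$ cancel because they do not vary with *t*.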

Estimator | Within-group | LSDV | First differences |
---|---|---|---|
Data transformation | Demean the data. | Use dummy variables. | Difference the data. |
Regression equation | $$\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i} $$ | $$Y_{it} = X_{it} \beta_{fe} + \alpha D_{i} + \nu_{it}$$ | $$\Delta{Y}_{it} = \Delta{X}_{it} \beta_{fe} + \Delta{\nu}_{it} $$ |

**Let's consider an example panel dataset** with three individuals and three time periods shown in the table below.

Individual | Time Period | $Y_{it}$ | Within-Group Ave. $\bar{Y}_{i}$ | $X_{it}$ | Within-Group Ave. $\bar{X}_{i}$ |
---|---|---|---|---|---|
1 | 1 | 3.901 | 2.744 | 0.978 | 1.174 |
1 | 2 | 2.345 | 2.744 | 1.798 | 1.174 |
1 | 3 | 1.987 | 2.744 | 0.745 | 1.174 |
2 | 1 | 1.250 | 1.715 | 1.652 | 1.425 |
2 | 2 | 0.654 | 1.715 | 0.438 | 1.425 |
2 | 3 | 3.240 | 1.715 | 2.185 | 1.425 |
3 | 1 | 0.901 | 2.077 | 2.119 | 1.653 |
3 | 2 | 1.341 | 2.077 | 1.516 | 1.653 |
3 | 3 | 3.989 | 2.077 | 1.324 | 1.653 |

**Example within-group estimation**

We will estimate the fixed effects model using the within-group method. This can be done in three steps:

- Find the within-subject means.
- Demean the dependent and independent variables using the within-subject means.
- Run a linear regression using the demeaned variables.

**Finding the within-subject means**

To find the within-subject mean of Y for individual one we compute:

$$ \bar{Y_{1}} = \frac{(3.901 + 2.345 + 1.987)}{3} = 2.7443 .$$

We can find the within-subject means using the `withinMeans` procedure from the `pdlib` library. The `withinMeans` procedure requires two inputs:

- `grps`: (T*N) x 1 matrix, group identifier.
- `data`: (T*N) x k matrix, panel data.

Using our sample data stored in the GAUSS data file simple_data.dat:

```
// Load data
data = loadd("simple_data.dat");
// Assign groups variable
grps = data[., 1];
// Assign y~x matrix
reg_data = data[.,3:4];
// Find group means
grp_means = withinMeans(grps, reg_data);
print "Group means for Y and X:";
grp_means;
```

Our output reads:

    Group means for Y and X:
           2.7443        1.1737
           1.7147        1.4250
           2.0770        1.6530
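
For readers who want to see what such a computation could look like without `pdlib`, here is a minimal sketch using only built-in GAUSS functions. It is not the library's implementation. It assumes the data are typed in from the sample table rather than loaded from `simple_data.dat`, and it relies on the group identifiers being the integers 1, 2, 3, which is what `design` needs.

```
// Sample data typed in from the table (assumption: same ordering as the file)
grps = { 1, 1, 1, 2, 2, 2, 3, 3, 3 };
y = { 3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989 };
x = { 0.978, 1.798, 0.745, 1.652, 0.438, 2.185, 2.119, 1.516, 1.324 };

// 9 x 3 dummy matrix: column j is 1 for the rows that belong to group j
d = design(grps);

// Group sums divided by group sizes give a 3 x 2 matrix of group means
grp_means = (d' * (y~x)) ./ sumc(d);

print "Group means for Y and X:";
print grp_means;
```

With the values entered as above, the three rows should match the within-group averages in the sample table. The same dummy matrix can then be used to demean the data, for example with `(y~x) - d*grp_means`.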

**Demeaning the data**

The next step is to demean the data. This removes any time-invariant effects. After finding the within-subject means, the data is demeaned:

$$ \begin{aligned} \widetilde{Y}_{1t} = Y_{1t} - \overline{Y}_1 : \quad & 3.901 - 2.744 = 1.157, \\ & 2.345 - 2.744 = -0.399, \\ & 1.987 - 2.744 = -0.757 . \end{aligned} $$

In **GAUSS** we can demean data using the `demeanData` procedure from the `pdlib` library. The `demeanData` procedure requires two inputs:

- `grps`: (T*N) x 1 matrix, group identifier.
- `data`: (T*N) x k matrix, panel data.

The `demeanData` procedure internally computes the within-subject means and requires just the `reg_data` and `grps` variables that we created in the first step:

```
// Remove time-invariant group means
data_tilde = demeanData(grps, reg_data);
print "Demeaned data:";
data_tilde;
print;
```

Our demeaned data is printed in the output:

    Demeaned data:
           1.1567       -0.1957
          -0.3993        0.6243
          -0.7573       -0.4287
          -0.4647        0.2270
          -1.0607       -0.9870
           1.5253        0.7600
          -1.1760        0.4660
          -0.7360       -0.1370
           1.9120       -0.3290

**Performing the regression**

Once we have transformed our *x* and *y* data we are ready to estimate the parameters of the fixed effects regression model:

$$\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i} $$

where

$$\widehat{\beta}_{fe} = (\widetilde{X_i}'\widetilde{X_i})^{-1}(\widetilde{X_i}'\widetilde{Y_i}) .$$

Using the data we previously demeaned:

```
// Extract variables
y_tilde = data_tilde[., 1];
x_tilde = data_tilde[., 2];
// Regress the demeaned dependent variable on the demeaned independent variable
coeff = inv(x_tilde'x_tilde)*(x_tilde'y_tilde);
// Print the fixed effects coefficient
print "Fixed effects coefficient:";
coeff;
```

The result reads:

    Fixed effects coefficient:
           0.3413
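
As a cross-check, the LSDV route described earlier can be sketched with built-in GAUSS functions. The snippet is an illustration, not part of `pdlib`. It assumes the sample data are typed in from the table and that the group identifiers are the integers 1, 2, 3. The names `d`, `z` and `b_lsdv` are made up for this example.

```
// Sample data typed in from the table (assumption: same ordering as the file)
grps = { 1, 1, 1, 2, 2, 2, 3, 3, 3 };
y = { 3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989 };
x = { 0.978, 1.798, 0.745, 1.652, 0.438, 2.185, 2.119, 1.516, 1.324 };

// One dummy per individual; no common constant, so the dummies are not collinear
d = design(grps);

// Regressors: three individual dummies followed by x
z = d~x;

// OLS on the untransformed data
b_lsdv = inv(z'z)*(z'y);

print "Individual intercepts:";
print b_lsdv[1:3];
print "Slope on x:";
print b_lsdv[4];
```

In exact arithmetic the slope from this regression coincides with the within-group estimate, so it should agree with the fixed effects coefficient above up to rounding. The first three elements are the estimated individual intercepts, which the within-group method does not report directly.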

**Using the fixedEffects procedure**

As an alternative to computing these three steps separately, we can use the `fixedEffects` procedure from the GAUSS panel data library, `pdlib`. This procedure runs all three steps in a single call. The `fixedEffects` procedure takes four inputs:

- `y`: (T*N) x 1 matrix, the panel of stacked dependent variables.
- `x`: (T*N) x k matrix, the panel of stacked independent variables.
- `grps`: (T*N) x 1 matrix, group identifier.
- `robust`: Scalar, an indicator variable of whether to use robust standard errors.

```
// Use fixedEffects procedure
call fixedEffects(reg_data[.,1], reg_data[.,2], grps, 1);
```

This prints:

    ------------------- FIXED EFFECTS (WITHIN) RESULTS -------------------

    Observations          :  9
    Number of Groups      :  3
    Degrees of freedom    :  2
    R-squared             :  0.026
    Adj. R-squared        :  -0.558
    Residual SS           :  11.021
    Std error of est      :  1.485
    Total SS (corrected)  :  11.319
    F = 0.054 with 1,2 degrees of freedom
    P-value = 0.838

    Variable        Coef.      Std. Error    t-Stat     P-Value
    ----------------------------------------------------------------------
    X1              0.341276   1.011041      0.337549   0.768
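
For comparison, the first differences estimator listed earlier can be sketched in the same way. This is a hedged illustration built from built-in GAUSS functions only. It assumes a balanced panel stacked by individual with periods in order, and the sample data typed in from the table. The names `dmat`, `dd` and `b_fd` are hypothetical.

```
// Sample data typed in from the table, stacked by individual
y = { 3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989 };
x = { 0.978, 1.798, 0.745, 1.652, 0.438, 2.185, 2.119, 1.516, 1.324 };

n = 3;    // individuals
t = 3;    // time periods per individual

// (t-1) x t differencing matrix: row s is period s+1 minus period s
dmat = (zeros(t-1, 1)~eye(t-1)) - (eye(t-1)~zeros(t-1, 1));

// Apply the same differencing to every individual's block of rows
dd = eye(n) .*. dmat;
dy = dd*y;
dx = dd*x;

// OLS on the differenced data, no constant
b_fd = inv(dx'dx)*(dx'dy);

print "First differences coefficient:";
print b_fd;
```

Differencing never crosses from one individual to the next because the Kronecker product keeps each block separate. With more than two time periods the first differences and within-group estimates generally differ, so this value need not equal the fixed effects coefficient reported above.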

### The random effects model

The covariance structure of the random effects model means that pooled OLS will result in inefficient estimates. Instead, the random effects model is estimated using pooled feasible generalized least squares (FGLS).

The pooled FGLS method estimates the model

$$\widetilde{Y_i} = \widetilde{W_i} \delta_{re} + \widetilde{\epsilon_i}$$

where the data is transformed using $\Omega = E[\epsilon_i \epsilon_i']$:

$$\widetilde{Y_i} = \Omega^{-\frac{1}{2}}Y_{i},$$ $$\widetilde{W_i} = \Omega^{-\frac{1}{2}}W_{i},$$ $$\widetilde{\epsilon_i} = \Omega^{-\frac{1}{2}}\epsilon_{i},$$

and

$$W_i = [1, X_i],$$ $$\delta = [\alpha, \beta']',$$ $$\epsilon_i = \mu_i i_T + \nu_i .$$

The most difficult part of estimating this model is estimating $\Omega$ and there are a number of different proposed methods.
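
For the one-way structure used in this post, $\Omega$ has a closed form, which is a standard result (see Baltagi, 2008). The sketch below reads $i_T$ in the definition of $\epsilon_i$ as a $T \times 1$ vector of ones, and introduces $I_T$ for the identity matrix and $\bar{J}_T = i_T i_T' / T$ for the averaging matrix. These last two symbols are not part of the original notation.

$$ \Omega = \sigma_{\mu}^2 \, i_T i_T' + \sigma_{\nu}^2 I_T $$

$$ \sigma_{\nu} \Omega^{-\frac{1}{2}} = I_T - \theta \bar{J}_T , \qquad \theta = 1 - \frac{\sigma_{\nu}}{\sqrt{T \sigma_{\mu}^2 + \sigma_{\nu}^2}} $$

Applied to $Y_i$, this transformation replaces each observation with $y_{it} - \theta \bar{y}_i$, where $\bar{y}_i$ is the time average for individual *i*. When $\theta = 0$ the transformation leaves the data unchanged and the estimator reduces to pooled OLS. As $\theta$ approaches one, it approaches the within-group transformation.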

**Example random effects estimation**

One of the most common approaches for estimating the random effects model:

- Estimates the between-group regression to obtain $\sigma_{\mu}^2$.
- Estimates the within-group regression to obtain $\sigma_{\nu}^2$.
- Transforms the data using $\sigma_{\mu}^2$ and $\sigma_{\nu}^2$.
- Finds the pooled OLS estimator using the transformed data.
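
The last two steps can be sketched in GAUSS once variance estimates are in hand. The snippet below is only an illustration. The values of `sigma2_mu` and `sigma2_nu` are hypothetical placeholders, not estimates from the sample data. The weight `theta` uses the standard closed form for a balanced one-way model, which may differ in detail from what `pdlib` does internally.

```
// Sample data typed in from the table, stacked by individual
grps = { 1, 1, 1, 2, 2, 2, 3, 3, 3 };
y = { 3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989 };
x = { 0.978, 1.798, 0.745, 1.652, 0.438, 2.185, 2.119, 1.516, 1.324 };

// HYPOTHETICAL variance components (placeholders, not estimated here)
sigma2_mu = 0.5;
sigma2_nu = 1.0;
t = 3;

// Quasi-demeaning weight for a balanced panel
theta = 1 - sqrt(sigma2_nu / (t*sigma2_mu + sigma2_nu));

// Regressors including a constant
w = ones(rows(y), 1)~x;

// Group means of y and w, expanded back to one row per observation
d = design(grps);
grp_means = (d' * (y~w)) ./ sumc(d);
star = (y~w) - theta*(d*grp_means);

// Pooled OLS on the transformed data
y_star = star[., 1];
w_star = star[., 2:3];
delta_re = inv(w_star'w_star)*(w_star'y_star);

print "Constant and slope:";
print delta_re;
```

Because the constant column is transformed along with `x`, the first element of `delta_re` is still the intercept of the original model. Changing the placeholder variances changes `theta` and therefore the estimates, so the printed numbers should not be compared with the library output.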

We can perform these steps in one procedure call using the `randomEffects` procedure from the GAUSS `pdlib` library.

**Using the randomEffects procedure**

The `randomEffects` procedure takes four inputs:

- `y`: (T*N) x 1 matrix, the panel of stacked dependent variables.
- `x`: (T*N) x k matrix, the panel of stacked independent variables.
- `grps`: (T*N) x 1 matrix, group identifier.
- `robust`: Scalar, an indicator variable of whether to use robust standard errors.

Continuing with our fixed effects example, we will use our sample data stored in the GAUSS data file simple_data.dat.

```
// Use randomEffects procedure
call randomEffects(reg_data[., 1], reg_data[., 2], grps, 1);
```

    ---------------------- GLS RANDOM EFFECTS RESULTS ----------------------

    Observations          :  9
    Number of Groups      :  3
    Degrees of freedom    :  2
    R-squared             :  0.004
    Adj. R-squared        :  -2.985
    Residual SS           :  12.907
    Std error of est      :  1.358
    Total SS (corrected)  :  12.956
    F = 3.314 with 2,2 degrees of freedom
    P-value = 0.232

    Variable        Coef.      Std. Error    t-Stat     P-Value
    ----------------------------------------------------------------------
    CONSTANT        1.994513   1.720996      1.158930   0.366
    X1              0.129940   1.053423      0.123350   0.913

### Conclusion

In today's blog we have covered the fundamentals of the individual error component models:

- The theoretical one-way error component model.
- Fixed effects vs. random effects.
- Estimating fixed effects and random effects.

The code and data for this blog can be found at our Aptech Blog GitHub code repository.

## References

[Baltagi, B.](https://www.wiley.com/en-us/Econometric+Analysis+of+Panel+Data%2C+5th+Edition-p-9781118672327) (2008). *Econometric analysis of panel data.* John Wiley & Sons.

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.