# Panel Data Basics: One-way Individual Effects

## Introduction

In this blog, we examine one of the fundamentals of panel data analysis: the one-way error component model. Today we will:

• Explain the theoretical one-way error component model.
• Consider fixed effects vs. random effects.
• Estimate models using an empirical example.

## The theoretical one-way error component model

The one-way error component model is a panel data model that allows for individual-specific or time-specific error components:

$$y_{it} = \alpha + X_{it} \beta + u_{it}$$ $$u_{it} = \mu_{i} + \nu_{it}$$

where the subscript $i$ indexes cross-sectional units, such as households, individuals, firms, or countries, and the subscript $t$ indexes time periods.

In this model, the individual-specific error component, $\mu_{i}$, captures any unobserved effects that are different across individuals but fixed across time.

**The one-way error component model**

| Term | Description |
|---|---|
| $\alpha$ | Intercept that is constant across all individuals and time periods. |
| $\beta$ | Parameter of interest, measuring the effect of $x$ on $y$. It is constant across all individuals and time periods. |
| $\mu_i$ | Individual-specific variation in $y$ that stays constant across time for each individual. In the fixed effects model this is an individual-specific effect to be estimated; in the random effects model it follows a random distribution with parameters that must be estimated. |
| $\nu_{it}$ | Usual stochastic regression disturbance, which varies across time and individuals. |
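To make the notation concrete, here is a minimal simulation of the model, sketched in Python for illustration. The parameter values and variable names below are our own assumptions, not part of the model itself:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 100, 5            # individuals and time periods (assumed values)
alpha, beta = 1.0, 0.5   # common intercept and slope
sigma_mu, sigma_nu = 1.0, 0.5

mu = rng.normal(0, sigma_mu, size=N)       # mu_i: one draw per individual, fixed over time
nu = rng.normal(0, sigma_nu, size=(N, T))  # nu_it: varies across individuals and time
x = rng.normal(0, 1, size=(N, T))

# y_it = alpha + x_it * beta + mu_i + nu_it
y = alpha + x * beta + mu[:, None] + nu

print(y.shape)  # (100, 5)
```

Note that `mu[:, None]` broadcasts the same individual draw across all $T$ periods, which is exactly the "different across individuals but fixed across time" property described above.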

## Fixed effects vs. random effects

The two most common approaches to modeling individual-specific error components are the fixed effects model and the random effects model.

The key difference between these two approaches is how we believe the individual error component behaves.

### The fixed effects model

In the fixed effects model the individual error component:

• Can be thought of as an individual-specific intercept term.
• Captures any omitted variables that are not included in the regression.
• Is correlated with other variables included in the model.

Given these assumptions, the fixed effects model can be thought of as a pooled OLS model with individual-specific intercepts:

$$y_{it} = \delta_{i} + X_{it} \beta + \nu_{it}$$

The intercept term, $\delta_i$, varies across individuals but is constant across time for each individual. This term is composed of the constant intercept term, $\alpha$, and the individual-specific error terms, $\mu_i$.
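Substituting the error decomposition $u_{it} = \mu_{i} + \nu_{it}$ into the original model makes this explicit:

$$y_{it} = \alpha + X_{it} \beta + \mu_{i} + \nu_{it} = (\alpha + \mu_{i}) + X_{it} \beta + \nu_{it},$$

so that $\delta_{i} = \alpha + \mu_{i}$.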

The distinguishing feature of the fixed effects model is that each $\delta_i$ is a fixed, but unobservable, quantity that we must estimate.

### The random effects model

In the random effects model the individual-specific error component, $\mu_i$:

• Is randomly distributed and independent of $\nu_{it}$.
• Arises when individuals are drawn randomly from a large population, such as in household studies (Baltagi, 2008).
• Is assumed to be uncorrelated with all other variables in the model.
• Affects the model through the covariance structure of the error term.

For example, consider the total error disturbance in the model, $u_{it} = \mu_{i} + \nu_{it}$. The covariance of the error at time $t$ and time $s$ depends on the variances of both $\mu_{i}$ and $\nu_{it}$:

$$cov(u_{it}, u_{is}) = \left\{ \begin{array}{ll} \sigma_{\mu}^2 & \text{for } t \neq s \\ \sigma_{\mu}^2 + \sigma_{\nu}^2 & \text{for } t = s \\ \end{array} \right.$$

The distinguishing feature of the random effects model is that $\mu_i$ does not have a true value but rather follows a random distribution with parameters that we must estimate.
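We can verify this covariance structure by simulation. The sketch below is in Python for illustration, and the variance values ($\sigma_{\mu}^2 = 1$, $\sigma_{\nu}^2 = 0.25$) are assumptions chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(42)

N, T = 200_000, 3
sigma_mu2, sigma_nu2 = 1.0, 0.25

mu = rng.normal(0, np.sqrt(sigma_mu2), size=N)
nu = rng.normal(0, np.sqrt(sigma_nu2), size=(N, T))
u = mu[:, None] + nu  # u_it = mu_i + nu_it

# Off-diagonal (t != s): cov(u_it, u_is) should be close to sigma_mu^2 = 1.0
print(np.cov(u[:, 0], u[:, 1])[0, 1])

# Diagonal (t == s): var(u_it) should be close to sigma_mu^2 + sigma_nu^2 = 1.25
print(np.var(u[:, 0], ddof=1))
```

The errors for the same individual are correlated across time purely because they share the same draw of $\mu_i$.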

## Estimation

### The fixed effects model

In the fixed effects model, the individual effects introduce an endogeneity that will result in biased estimates if not properly accounted for.

Fortunately, we can make consistent estimates using one of three estimation techniques:

1. Within-group estimation
2. First differences estimation
3. Least squares dummy variable (LSDV) estimation

The first two of these techniques focus on eliminating the individual effects before estimation. The LSDV method instead incorporates these effects directly using dummy variables.

| | Within-group estimator | LSDV estimator | First differences estimator |
|---|---|---|---|
| Data transformation | Demean the data. | Use dummy variables. | Difference the data. |
| Regression equation | $\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i}$ | $Y_{it} = \alpha D_{i} + X_{it} \beta_{fe} + \nu_{it}$ | $\Delta{Y}_{it} = \Delta{X}_{it} \beta_{fe} + \Delta{\nu}_{it}$ |
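All three approaches work because $\mu_i$ is constant over time for each individual. For example, first differencing the total error gives

$$\Delta u_{it} = (\mu_{i} + \nu_{it}) - (\mu_{i} + \nu_{i,t-1}) = \Delta \nu_{it},$$

and demeaning gives $u_{it} - \bar{u}_{i} = \nu_{it} - \bar{\nu}_{i}$, since $\bar{u}_{i} = \mu_{i} + \bar{\nu}_{i}$. In both cases the individual effect drops out before estimation.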

Let's consider an example panel dataset with three individuals and three time periods shown in the table below.

| Individual | Time Period | $Y_{it}$ | Within-Group Avg. $\bar{Y}_i$ | $X_{it}$ | Within-Group Avg. $\bar{X}_i$ |
|---|---|---|---|---|---|
| 1 | 1 | 3.901 | 2.744 | 0.978 | 1.174 |
| 1 | 2 | 2.345 | 2.744 | 1.798 | 1.174 |
| 1 | 3 | 1.987 | 2.744 | 0.745 | 1.174 |
| 2 | 1 | 1.250 | 1.715 | 1.652 | 1.425 |
| 2 | 2 | 0.654 | 1.715 | 0.438 | 1.425 |
| 2 | 3 | 3.240 | 1.715 | 2.185 | 1.425 |
| 3 | 1 | 0.901 | 2.077 | 2.119 | 1.653 |
| 3 | 2 | 1.341 | 2.077 | 1.516 | 1.653 |
| 3 | 3 | 3.989 | 2.077 | 1.324 | 1.653 |

#### Example within-group estimation
We will estimate the fixed effects model using the within-group method. This can be done in three steps:

1. Find the within-subject means.
2. Demean the dependent and independent variables using the within-subject means.
3. Run a linear regression using the demeaned variables.
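As a cross-check, the three steps can be carried out by hand on the sample data from the table above. The sketch below is in Python rather than GAUSS, purely for illustration:

```python
import numpy as np

ids = np.repeat([1, 2, 3], 3)
y = np.array([3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989])
x = np.array([0.978, 1.798, 0.745, 1.652, 0.438, 2.185, 2.119, 1.516, 1.324])

# Steps 1 and 2: demean y and x within each individual
y_tilde = np.concatenate([y[ids == i] - y[ids == i].mean() for i in (1, 2, 3)])
x_tilde = np.concatenate([x[ids == i] - x[ids == i].mean() for i in (1, 2, 3)])

# Step 3: OLS on the demeaned data (no intercept is needed after demeaning)
beta_fe = (x_tilde @ y_tilde) / (x_tilde @ x_tilde)
print(round(beta_fe, 4))  # 0.3413
```

This reproduces the 0.3413 fixed effects coefficient computed with GAUSS in the steps that follow.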

#### Finding the within-subject means
To find the within-subject mean of Y for individual one we compute:

$$\bar{Y}_{1} = \frac{3.901 + 2.345 + 1.987}{3} = 2.7443 .$$

We can find the within-subject means using the withinMeans procedure from the pdlib library. The withinMeans procedure requires two inputs:

• grps — (T*N) x 1 matrix, group identifier.
• data — (T*N) x k matrix, panel data.

Using our sample data stored in the GAUSS data file simple_data.dat:

```
// Load data
data = loadd("simple_data.dat");

// Assign groups variable
grps = data[., 1];

// Assign y~x matrix
reg_data = data[., 3:4];

// Find group means
grp_means = withinMeans(grps, reg_data);

print "Group means for Y and X:";
grp_means;
```

```
Group means for Y and X:

2.7443  1.1737
1.7147  1.4250
2.0770  1.6530
```

#### Demeaning the data
The next step is to demean the data. This removes any time-invariant effects. After finding the within-subject means, the data is demeaned:

$$\widetilde{Y}_{1t} = Y_{1t} - \overline{Y}_1:$$ $$3.901 - 2.744 = 1.157,$$ $$2.345 - 2.744 = -0.399,$$ $$1.987 - 2.744 = -0.757 .$$

In GAUSS we can demean data using the demeanData procedure from the pdlib library. The demeanData procedure requires two inputs:

• grps — (T*N) x 1 matrix, group identifier.
• data — (T*N) x k matrix, panel data.

The demeanData procedure internally computes the within-subject means and requires just the reg_data and grps variables that we created in the first step:

```
// Remove time-invariant group means
data_tilde = demeanData(grps, reg_data);

print "Demeaned data:";
data_tilde;
```

Our demeaned data is printed in the output:

```
Demeaned data:

 1.1567 -0.1957
-0.3993  0.6243
-0.7573 -0.4287
-0.4647  0.2270
-1.0607 -0.9870
 1.5253  0.7600
-1.1760  0.4660
-0.7360 -0.1370
 1.9120 -0.3290
```

#### Performing the regression
Once we have transformed our x and y data we are ready to estimate the parameters of the fixed effects regression model:

$$\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i}$$

where

$$\widehat{\beta}_{fe} = (\widetilde{X_i}'\widetilde{X_i})^{-1}(\widetilde{X_i}'\widetilde{Y_i}) .$$

Using the data we previously demeaned:

```
// Extract variables
y_tilde = data_tilde[., 1];
x_tilde = data_tilde[., 2];

// Regress dependent on independent variables
coeff = inv(x_tilde'x_tilde)*(x_tilde'y_tilde);

// Print the fixed effects coefficient
print "Fixed effects coefficient:";
coeff;
```

```
Fixed effects coefficient:
0.3413
```

#### Using the fixedEffects procedure
As an alternative to computing these three steps separately, we can use the fixedEffects procedure from the GAUSS panel data library, pdlib. This procedure runs all three steps in a single call. The fixedEffects procedure takes four inputs:

• y — (T*N) x 1 matrix, the panel of stacked dependent variables.
• x — (T*N) x k matrix, the panel of stacked independent variables.
• grps — (T*N) x 1 matrix, group identifier.
• robust — Scalar, an indicator of whether to use robust standard errors.

```
// Use fixedEffects procedure
call fixedEffects(reg_data[., 1], reg_data[., 2], grps, 1);
```

This prints:

```
------------------- FIXED EFFECTS (WITHIN) RESULTS -------------------

Observations          :  9
Number of Groups      :  3
Degrees of freedom    :  2
R-squared             :  0.026
Residual SS           :  11.021
Std error of est      :  1.485
Total SS (corrected)  :  11.319
F                     =  0.054        with 1,2 degrees of freedom
P-value               =  0.838

Variable            Coef.       Std. Error       t-Stat       P-Value
----------------------------------------------------------------------
X1                0.341276       1.011041       0.337549       0.768
```

### The random effects model

The covariance structure of the random effects model means that pooled OLS will result in inefficient estimates. Instead, the random effects model is estimated using pooled feasible generalized least squares (FGLS).

The pooled FGLS method estimates the model

$$\widetilde{Y_i} = \widetilde{W_i} \delta_{re} + \widetilde{\epsilon_i}$$

where the data is transformed using $\Omega = E[\epsilon_i \epsilon_i']$

$$\widetilde{Y_i} = \Omega^{-\frac{1}{2}}Y_{i},$$ $$\widetilde{W_i} = \Omega^{-\frac{1}{2}}W_{i},$$ $$\widetilde{\epsilon_i} = \Omega^{-\frac{1}{2}}\epsilon_{i},$$

and

$$W_i = [1, X_i],$$ $$\delta = [\alpha, \beta']',$$ $$\epsilon_i = \mu_i i_T + \nu_i .$$

The most difficult part of estimating this model is estimating $\Omega$; a number of different methods have been proposed.

#### Example random effects estimation
One of the most common approaches to estimating the random effects model:

1. Estimates the between-group regression to obtain $\sigma_{\mu}^2$.
2. Estimates the within-group regression to obtain $\sigma_{\nu}^2$.
3. Transforms the data using $\sigma_{\mu}^2$ and $\sigma_{\nu}^2$.
4. Finds the pooled OLS estimator using the transformed data.
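Step 3 is commonly implemented as a quasi-demeaning of the data (the standard GLS transformation; see Baltagi, 2008). The sketch below is in Python for illustration; it assumes the variance components have already been estimated, and the function name is our own:

```python
import numpy as np

def quasi_demean(v, ids, sigma_mu2, sigma_nu2, T):
    """Random effects transformation: v_it - theta * vbar_i."""
    theta = 1.0 - np.sqrt(sigma_nu2 / (sigma_nu2 + T * sigma_mu2))
    out = v.astype(float).copy()
    for i in np.unique(ids):
        out[ids == i] -= theta * v[ids == i].mean()
    return out

ids = np.repeat([1, 2, 3], 3)
y = np.array([3.901, 2.345, 1.987, 1.250, 0.654, 3.240, 0.901, 1.341, 3.989])

# With sigma_mu^2 = 0, theta = 0 and the data are unchanged (pooled OLS);
# as sigma_mu^2 grows, theta -> 1 and we approach the within (fixed effects) transform.
print(quasi_demean(y, ids, 0.0, 1.0, 3)[:3])  # [3.901 2.345 1.987]
```

This makes the random effects estimator a compromise between pooled OLS ($\theta = 0$) and the within estimator ($\theta = 1$).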

We can perform these steps in one procedure call using the randomEffects procedure from the GAUSS panel data library, pdlib.

#### Using the randomEffects procedure
The randomEffects procedure takes four inputs:

• y — (T*N) x 1 matrix, the panel of stacked dependent variables.
• x — (T*N) x k matrix, the panel of stacked independent variables.
• grps — (T*N) x 1 matrix, group identifier.
• robust — Scalar, an indicator of whether to use robust standard errors.

Continuing with our fixed effects example, we will use our sample data stored in the GAUSS data file simple_data.dat.

```
// Use randomEffects procedure
call randomEffects(reg_data[., 1], reg_data[., 2], grps, 1);
```

This prints:

```
---------------------- GLS RANDOM EFFECTS RESULTS  ----------------------

Observations          :  9
Number of Groups      :  3
Degrees of freedom    :  2
R-squared             :  0.004
Residual SS           :  12.907
Std error of est      :  1.358
Total SS (corrected)  :  12.956
F                     =  3.314        with 2,2 degrees of freedom
P-value               =  0.232

Variable            Coef.       Std. Error       t-Stat       P-Value
----------------------------------------------------------------------
CONSTANT          1.994513       1.720996       1.158930       0.366
X1                0.129940       1.053423       0.123350       0.913
```

## Conclusion

In today's blog we covered the fundamentals of the one-way error component model:

• The theoretical one-way error component model.
• Fixed effects vs. random effects.
• Estimating fixed effects and random effects.

The code and data for this blog can be found at our Aptech Blog GitHub code repository.

## References

Baltagi, B. (2008). Econometric analysis of panel data. John Wiley & Sons.
