Introduction
In this blog, we examine one of the fundamentals of panel data analysis, the one-way error component model. Today we will:
 Explain the theoretical one-way error component model.
 Consider fixed effects vs. random effects.
 Estimate models using an empirical example.
The theoretical one-way error component model
The one-way error component model is a panel data model which allows for individual-specific or temporal-specific error components:
$$ \begin{equation}y_{it} = \alpha + X_{it} \beta + u_{it} \label{OWEM}\end{equation}$$ $$ u_{it} = \mu_{i} + \nu_{it} $$
where the subscript i indicates cross-sections of households, individuals, firms, countries, etc. and the subscript t indicates time periods.
In this model, the individual-specific error component, $\mu_{i}$, captures any unobserved effects that are different across individuals but fixed across time.
The one-way error component model

| Term | Description |
|---|---|
| $\alpha$ | Parameter of interest: the intercept, which is constant across all individuals and time periods. |
| $\beta$ | Parameter of interest: the effect of x on y, which is constant across all individuals and time periods. |
| $\mu_i$ | Individual-specific variation in y which stays constant across time for each individual. In the fixed effects model this is an individual-specific effect to be estimated. In the random effects model this follows a random distribution with parameters that must be estimated. |
| $\nu_{it}$ | Usual stochastic regression disturbance which varies across time and individuals. |
Fixed effects vs. random effects
The two most common approaches to modeling individual-specific error components are the fixed effects model and the random effects model.
The key difference between these two approaches is how we believe the individual error component behaves.
The fixed effects model
In the fixed effects model the individual error component:
 Can be thought of as an individual-specific intercept term.
 Captures any time-invariant omitted variables that are not included in the regression.
 Is allowed to be correlated with the other variables included in the model.
Given these assumptions, the fixed effects model can be thought of as a pooled OLS model with individual-specific intercepts:
$$\begin{equation}y_{it} = \delta_{i} + X_{it} \beta + \nu_{it}\label{FEM}\end{equation}$$
The intercept term, $\delta_i$, varies across individuals but is constant across time for each individual. This term is composed of the constant intercept term, $\alpha$, and the individual-specific error term, $\mu_i$.
The distinguishing feature of the fixed effects model is that $\delta_i$ has a true, but unobservable, effect which we must estimate.
The random effects model
In the random effects model the individual-specific error component, $\mu_i$:
 Is distributed randomly and is independent of $\nu_{it}$.
 Occurs in cases where individuals are drawn randomly from a large population, such as household studies (Baltagi, 2008).
 Is assumed to be uncorrelated with all other variables in the model.
Random effects impact our model through the covariance structure of the error term.
For example, consider the total error disturbance in the model, $ u_{it} = \mu_{i} + \nu_{it} $. For a given individual, the covariance of the error at time t and time s depends on the variances of both $\mu_{i}$ and $\nu_{it}$:
$$\begin{equation}cov(u_{it}, u_{is}) = \left\{ \begin{array}{ll} \sigma_{\mu}^2 & \text{for } t \neq s \\ \sigma_{\mu}^2 + \sigma_{\nu}^2 & \text{for } t = s \\ \end{array} \right. \label{REM}\end{equation} $$
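To make this structure concrete, here is a worked illustration for a single individual observed over three time periods. It assumes the standard setup in which $\mu_i$ and $\nu_{it}$ have mean zero, are independent of each other, and $\nu_{it}$ is uncorrelated across time. The symbols $i_3$ (a 3 x 1 vector of ones), $I_3$ (the 3 x 3 identity matrix), and $\rho$ are introduced here only for illustration. Stacking $u_i = (u_{i1}, u_{i2}, u_{i3})'$ gives:

$$ E[u_i u_i'] = \begin{bmatrix} \sigma_{\mu}^2 + \sigma_{\nu}^2 & \sigma_{\mu}^2 & \sigma_{\mu}^2 \\ \sigma_{\mu}^2 & \sigma_{\mu}^2 + \sigma_{\nu}^2 & \sigma_{\mu}^2 \\ \sigma_{\mu}^2 & \sigma_{\mu}^2 & \sigma_{\mu}^2 + \sigma_{\nu}^2 \end{bmatrix} = \sigma_{\mu}^2 i_3 i_3' + \sigma_{\nu}^2 I_3 $$

Under these assumptions, any two errors for the same individual share the correlation

$$ \rho = \frac{\sigma_{\mu}^2}{\sigma_{\mu}^2 + \sigma_{\nu}^2} , $$

no matter how far apart the two time periods are.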
The distinguishing feature of the random effects model is that $\mu_i$ does not have a true value but rather follows a random distribution with parameters that we must estimate.
Estimation
The fixed effects model
In the fixed effects model, the individual effects introduce an endogeneity that will result in biased estimates if not properly accounted for.
Fortunately, we can make consistent estimates using one of three estimation techniques:
 Within-group estimation
 First differences estimation
 Least squares dummy variable (LSDV) estimation
The first two of these techniques focus on eliminating the individual effects before estimation. The LSDV method directly incorporates these effects using dummy variables.
| | Within-group estimator | LSDV estimator | First differences estimator |
|---|---|---|---|
| Data transformation | Demean the data. | Use dummy variables. | Difference the data. |
| Regression equation | $\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i}$ | $Y_{it} = X_{it} \beta_{fe} + \alpha D_{i} + \nu_{it}$ | $\Delta{Y}_{it} = \Delta{X}_{it} \beta_{fe} + \Delta{\nu}_{it}$ |
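A short derivation shows why the first two transformations remove the individual effect. This is a sketch based on the one-way error component model defined earlier. The bar notation, for example $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$, is introduced here for the within-individual average. Averaging the model over time for individual i gives:

$$ \bar{y}_{i} = \alpha + \bar{X}_{i} \beta + \mu_{i} + \bar{\nu}_{i} $$

Subtracting this average from the original equation yields the within-group (demeaned) regression:

$$ y_{it} - \bar{y}_{i} = (X_{it} - \bar{X}_{i}) \beta + (\nu_{it} - \bar{\nu}_{i}) $$

Subtracting the previous period instead yields the first differences regression:

$$ y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1}) \beta + (\nu_{it} - \nu_{i,t-1}) $$

In both cases $\alpha$ and $\mu_i$ drop out because they do not vary over time, which is also why neither transformation can identify the effect of a time-invariant regressor.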
Let's consider an example panel dataset with three individuals and three time periods shown in the table below.
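| Individual | Time Period | $Y_{it}$ | Within Group Ave. $Y_{i}$ | $X_{it}$ | Within Group Ave. $X_{i}$ |
|---|---|---|---|---|---|
| 1 | 1 | 3.901 | 2.744 | 0.978 | 1.174 |
| 1 | 2 | 2.345 | 2.744 | 1.798 | 1.174 |
| 1 | 3 | 1.987 | 2.744 | 0.745 | 1.174 |
| 2 | 1 | 1.250 | 1.715 | 1.652 | 1.425 |
| 2 | 2 | 0.654 | 1.715 | 0.438 | 1.425 |
| 2 | 3 | 3.240 | 1.715 | 2.185 | 1.425 |
| 3 | 1 | 0.901 | 2.077 | 2.119 | 1.653 |
| 3 | 2 | 1.341 | 2.077 | 1.516 | 1.653 |
| 3 | 3 | 3.989 | 2.077 | 1.324 | 1.653 |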
Example within-group estimation
We will estimate the fixed effects model using the within-group method. This can be done in three steps:
 Find the within-subject means.
 Demean the dependent and independent variables using the within-subject means.
 Run a linear regression using the demeaned variables.
Finding the within-subject means
To find the within-subject mean of Y for individual one we compute:
$$ \bar{Y_{1}} = \frac{(3.901 + 2.345 + 1.987)}{3} = 2.7443 .$$
We can find the within-subject means using the withinMeans procedure from the pdlib library. The withinMeans procedure requires two inputs:
 grps: (T*N) x 1 matrix, group identifier.
 data: (T*N) x k matrix, panel data.
Using our sample data stored in the GAUSS data file simple_data.dat:
// Load data
data = loadd("simple_data.dat");
// Assign groups variable
grps = data[., 1];
// Assign y~x matrix
reg_data = data[.,3:4];
// Find group means
grp_means = withinMeans(grps, reg_data);
print "Group means for Y and X:";
grp_means;
Our output reads:
Group means for Y and X:
2.7443 1.1737
1.7147 1.4250
2.0770 1.6530
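For readers who want to see what this step does without the pdlib library, the group means can also be computed with core GAUSS matrix operations. The following is a minimal sketch, not the library's implementation. It assumes the example table is typed in by hand with the column order individual, time period, y, x, and the name grp_means_manual is introduced here for illustration.

```gauss
// Example panel typed in by hand (assumed column order:
// individual, time period, y, x)
data = { 1 1 3.901 0.978,
         1 2 2.345 1.798,
         1 3 1.987 0.745,
         2 1 1.250 1.652,
         2 2 0.654 0.438,
         2 3 3.240 2.185,
         3 1 0.901 2.119,
         3 2 1.341 1.516,
         3 3 3.989 1.324 };

grps = data[., 1];
reg_data = data[., 3:4];

// One 0/1 indicator column per individual (9 x 3)
D = design(grps);

// Group sums (3 x 2) divided by group sizes (3 x 1)
grp_means_manual = (D'reg_data) ./ sumc(D);

print "Group means for Y and X (manual):";
print grp_means_manual;
```

Each row of the result corresponds to one individual and should match the within-group averages listed in the example table.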
Demeaning the data
The next step is to demean the data. This removes any time-invariant effects. After finding the within-subject means, the data is demeaned:
$$ \begin{aligned} \widetilde{Y_1} = Y_{1t} - \overline{Y}_1: \quad & 3.901 - 2.744 = 1.157, \\ & 2.345 - 2.744 = -0.399, \\ & 1.987 - 2.744 = -0.757 . \end{aligned} $$
In GAUSS we can demean data using the demeanData procedure from the pdlib library. The demeanData procedure requires two inputs:
 grps: (T*N) x 1 matrix, group identifier.
 data: (T*N) x k matrix, panel data.
The demeanData procedure internally computes the within-subject means and requires just the reg_data and grps variables that we created in the first step:
// Remove time-invariant group means
data_tilde = demeanData(grps, reg_data);
print "Demeaned data:";
data_tilde;
print;
Our demeaned data is printed in the output:
Demeaned data:
 1.1567 -0.1957
-0.3993  0.6243
-0.7573 -0.4287
-0.4647  0.2270
-1.0607 -0.9870
 1.5253  0.7600
-1.1760  0.4660
-0.7360 -0.1370
 1.9120 -0.3290
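The demeaning step can also be written out with core GAUSS matrix operations, which makes the transformation explicit. This is a minimal sketch under the same assumptions as before (data typed in by hand, column order individual, time period, y, x). It is not the pdlib implementation, and the names data_tilde_manual and chk are introduced here for illustration.

```gauss
// Example panel typed in by hand (assumed column order:
// individual, time period, y, x)
data = { 1 1 3.901 0.978,
         1 2 2.345 1.798,
         1 3 1.987 0.745,
         2 1 1.250 1.652,
         2 2 0.654 0.438,
         2 3 3.240 2.185,
         3 1 0.901 2.119,
         3 2 1.341 1.516,
         3 3 3.989 1.324 };

grps = data[., 1];
reg_data = data[., 3:4];

// 0/1 indicator matrix, one column per individual (9 x 3)
D = design(grps);

// Within-subject means, one row per individual (3 x 2)
grp_means = (D'reg_data) ./ sumc(D);

// Expand the means to one row per observation (9 x 2) and subtract
data_tilde_manual = reg_data - D*grp_means;

print "Demeaned data (manual):";
print data_tilde_manual;

// Sanity check: demeaned values sum to (numerically) zero
// within each individual
chk = D'data_tilde_manual;
print "Within-individual sums of the demeaned data:";
print chk;
```

The sanity check at the end is a quick way to confirm that the time-invariant component has been removed from every individual's observations.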
Performing the regression
Once we have transformed our x and y data we are ready to estimate the parameters of the fixed effects regression model:
$$\widetilde{Y_i} = \widetilde{X_i} \beta_{fe} + \widetilde{\nu_i} $$
where
$$\widehat{\beta}_{fe} = (\widetilde{X_i}'\widetilde{X_i})^{-1}(\widetilde{X_i}'\widetilde{Y_i}) .$$
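With a single regressor, this matrix formula reduces to a ratio of two sums. As a worked check, using the nine demeaned observations from the example (intermediate sums rounded to four decimals):

$$ \widehat{\beta}_{fe} = \frac{\sum_{i}\sum_{t} \widetilde{X}_{it} \widetilde{Y}_{it}}{\sum_{i}\sum_{t} \widetilde{X}_{it}^{2}} \approx \frac{0.8734}{2.5593} \approx 0.3413 $$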
Using the data we previously demeaned:
// Extract variables
y_tilde = data_tilde[., 1];
x_tilde = data_tilde[., 2];
// Regress the dependent variable on the independent variable
coeff = inv(x_tilde'x_tilde)*(x_tilde'y_tilde);
// Print the fixed effects coefficient
print "Fixed effects coefficient:";
coeff;
The result reads:
Fixed effects coefficient: 0.3413
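As a cross-check, the LSDV approach listed earlier should produce the same slope, because including one dummy variable per individual is algebraically equivalent to demeaning. The following is a minimal sketch using core GAUSS operations only. It assumes the example data is typed in by hand with the column order individual, time period, y, x, and the names z, b_lsdv, slope, and intercepts are introduced here for illustration.

```gauss
// Example panel typed in by hand (assumed column order:
// individual, time period, y, x)
data = { 1 1 3.901 0.978,
         1 2 2.345 1.798,
         1 3 1.987 0.745,
         2 1 1.250 1.652,
         2 2 0.654 0.438,
         2 3 3.240 2.185,
         3 1 0.901 2.119,
         3 2 1.341 1.516,
         3 3 3.989 1.324 };

grps = data[., 1];
y = data[., 3];
x = data[., 4];

// One dummy variable per individual (9 x 3)
D = design(grps);

// Regressors: x followed by the three dummies. No common
// constant is included, since the dummies already span it.
z = x ~ D;

// OLS on the dummy-augmented regression
b_lsdv = inv(z'z)*(z'y);

slope = b_lsdv[1, 1];
intercepts = b_lsdv[2:4, 1];

print "LSDV slope:";
print slope;
print "Individual-specific intercepts:";
print intercepts;
```

The slope should agree with the within-group coefficient of roughly 0.3413. Unlike the within-group method, LSDV also reports an estimate of each individual intercept, at the cost of a regressor matrix that grows with the number of individuals.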
Using the fixedEffects procedure
As an alternative to computing these three steps separately, we can use the fixedEffects procedure from the GAUSS panel data library, pdlib. This procedure runs all three steps in a single call. The fixedEffects procedure takes four inputs:
 y: (T*N) x 1 matrix, the panel of stacked dependent variables.
 x: (T*N) x k matrix, the panel of stacked independent variables.
 grps: (T*N) x 1 matrix, group identifier.
 robust: Scalar, an indicator variable of whether to use robust standard errors.
// Use fixedEffects procedure
call fixedEffects(reg_data[.,1], reg_data[.,2], grps, 1);
This prints:
FIXED EFFECTS (WITHIN) RESULTS

Observations          : 9
Number of Groups      : 3
Degrees of freedom    : 2
R-squared             : 0.026
Adj. R-squared        : -0.558
Residual SS           : 11.021
Std error of est      : 1.485
Total SS (corrected)  : 11.319
F = 0.054 with 1,2 degrees of freedom
P-value = 0.838

Variable    Coef.       Std. Error    t-Stat      P-Value
X1          0.341276    1.011041      0.337549    0.768
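For comparison, the first differences estimator from the list of three techniques can be sketched with core GAUSS operations. This is an illustration only, not a pdlib procedure. It assumes the example data is typed in by hand with the column order individual, time period, y, x, and that rows are sorted by individual and then by time. All variable names are introduced here for illustration.

```gauss
// Example panel typed in by hand (assumed column order:
// individual, time period, y, x), sorted by individual and time
data = { 1 1 3.901 0.978,
         1 2 2.345 1.798,
         1 3 1.987 0.745,
         2 1 1.250 1.652,
         2 2 0.654 0.438,
         2 3 3.240 2.185,
         3 1 0.901 2.119,
         3 2 1.341 1.516,
         3 3 3.989 1.324 };

n_obs = rows(data);
n_lag = n_obs - 1;

// Difference every row against the row before it
d_all = data[2:n_obs, 3:4] - data[1:n_lag, 3:4];

// Keep only differences taken within the same individual
same = data[2:n_obs, 1] .== data[1:n_lag, 1];
d_fd = selif(d_all, same);

dy = d_fd[., 1];
dx = d_fd[., 2];

// OLS without a constant: differencing removes the intercept
// along with the individual effect
b_fd = inv(dx'dx)*(dx'dy);

print "First differences coefficient:";
print b_fd;
```

With more than two time periods, the first differences and within-group estimates generally differ in finite samples, so this coefficient is not expected to match the within-group result exactly.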
The random effects model
The covariance structure of the random effects model means that pooled OLS will result in inefficient estimates. Instead, the random effects model is estimated using pooled feasible generalized least squares (FGLS).
The pooled FGLS method estimates the model
$$\widetilde{Y_i} = \widetilde{W_i} \delta_{re} + \widetilde{\epsilon_i}$$
where the data is transformed using $\Omega = E[\epsilon_i \epsilon_i']$:
$$\widetilde{Y_i} = \Omega^{-\frac{1}{2}}Y_{i},$$ $$\widetilde{W_i} = \Omega^{-\frac{1}{2}}W_{i},$$ $$\widetilde{\epsilon_i} = \Omega^{-\frac{1}{2}}\epsilon_{i},$$
and
$$W_i = [1, X_i],$$ $$\delta_{re} = [\alpha, \beta']',$$ $$\epsilon_i = \mu_i i_T + \nu_i ,$$
with $i_T$ denoting a $T \times 1$ vector of ones.
The most difficult part of estimating this model is estimating $\Omega$ and there are a number of different proposed methods.
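For the one-way model with a balanced panel, the transformation has a well-known closed form, so once the two variance components are estimated $\Omega$ never needs to be inverted numerically. As a sketch, with $\theta$ and $\bar{Y}_i$ (the within-individual average of $Y_{it}$) introduced here as shorthand, and $I_T$ the $T \times T$ identity matrix:

$$ \Omega = \sigma_{\mu}^2 i_T i_T' + \sigma_{\nu}^2 I_T $$

$$ \theta = 1 - \frac{\sigma_{\nu}}{\sqrt{T \sigma_{\mu}^2 + \sigma_{\nu}^2}} $$

$$ \sigma_{\nu} \Omega^{-\frac{1}{2}} Y_i = Y_i - \theta \bar{Y}_i i_T $$

In other words, each observation is "quasi-demeaned" by subtracting a fraction $\theta$ of its individual mean. When $\sigma_{\mu}^2 = 0$ we have $\theta = 0$ and the estimator reduces to pooled OLS. As $T \sigma_{\mu}^2$ grows relative to $\sigma_{\nu}^2$, $\theta$ approaches one and the estimator approaches the within-group estimator.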
Example random effects estimation
One of the most common approaches for estimating the random effects model:
 Estimates the between-group regression to obtain $\sigma_{\mu}^2$.
 Estimates the within-group regression to obtain $\sigma_{\nu}^2$.
 Transforms the data using $\sigma_{\mu}^2$ and $\sigma_{\nu}^2$.
 Finds the pooled OLS estimator using the transformed data.
We can perform these steps in one procedure call using the randomEffects procedure from the pdlib GAUSS library.
Using the randomEffects procedure
The randomEffects procedure takes four inputs:
 y: (T*N) x 1 matrix, the panel of stacked dependent variables.
 x: (T*N) x k matrix, the panel of stacked independent variables.
 grps: (T*N) x 1 matrix, group identifier.
 robust: Scalar, an indicator variable of whether to use robust standard errors.
Continuing with our fixed effects example, we will use our sample data stored in the GAUSS data file simple_data.dat.
// Use randomEffects procedure
call randomEffects(reg_data[., 1], reg_data[., 2], grps, 1);
GLS RANDOM EFFECTS RESULTS

Observations          : 9
Number of Groups      : 3
Degrees of freedom    : 2
R-squared             : 0.004
Adj. R-squared        : -2.985
Residual SS           : 12.907
Std error of est      : 1.358
Total SS (corrected)  : 12.956
F = 3.314 with 2,2 degrees of freedom
P-value = 0.232

Variable    Coef.       Std. Error    t-Stat      P-Value
CONSTANT    1.994513    1.720996      1.158930    0.366
X1          0.129940    1.053423      0.123350    0.913
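To show what the four estimation steps listed earlier involve, here is a minimal sketch of one possible implementation using core GAUSS operations. It is not the pdlib source. The degrees-of-freedom corrections, the truncation of a negative variance estimate at zero, and all variable names are assumptions made for illustration, and the data is typed in by hand with the assumed column order individual, time period, y, x.

```gauss
// Example panel typed in by hand (assumed column order:
// individual, time period, y, x), balanced and sorted
data = { 1 1 3.901 0.978,
         1 2 2.345 1.798,
         1 3 1.987 0.745,
         2 1 1.250 1.652,
         2 2 0.654 0.438,
         2 3 3.240 2.185,
         3 1 0.901 2.119,
         3 2 1.341 1.516,
         3 3 3.989 1.324 };

grps = data[., 1];
y = data[., 3];
x = data[., 4];

n_obs = rows(y);
D = design(grps);           // 9 x 3 indicator matrix
n_grp = cols(D);
t_per = n_obs/n_grp;        // periods per individual (balanced panel)

// Group means of y and x (3 x 1 each)
y_bar = (D'y) ./ sumc(D);
x_bar = (D'x) ./ sumc(D);

// Step 1: between-group regression on the group means.
// sig2_1 estimates T*sigma_mu^2 + sigma_nu^2.
w_b = ones(n_grp, 1) ~ x_bar;
d_b = inv(w_b'w_b)*(w_b'y_bar);
e_b = y_bar - w_b*d_b;
sig2_1 = t_per*(e_b'e_b)/(n_grp - 2);

// Step 2: within-group regression on the demeaned data
y_w = y - D*y_bar;
x_w = x - D*x_bar;
b_w = inv(x_w'x_w)*(x_w'y_w);
e_w = y_w - x_w*b_w;
sig2_nu = (e_w'e_w)/(n_obs - n_grp - 1);

// Implied variance of the individual effect,
// truncated at zero if the estimate is negative (assumption)
sig2_mu = maxc(((sig2_1 - sig2_nu)/t_per) | 0);

// Step 3: quasi-demean the data
theta = 1 - sqrt(sig2_nu/(t_per*sig2_mu + sig2_nu));
y_re = y - theta*(D*y_bar);
w_re = ((1 - theta)*ones(n_obs, 1)) ~ (x - theta*(D*x_bar));

// Step 4: pooled OLS on the transformed data
d_re = inv(w_re'w_re)*(w_re'y_re);

print "theta:";
print theta;
print "Random effects estimates (constant, slope):";
print d_re;
```

On a sample this small, the implied estimate of $\sigma_{\mu}^2$ in this sketch is negative and is truncated to zero, so theta is zero and the transformed regression reduces to pooled OLS. Under these assumptions the estimates should be close to the coefficients printed above, roughly 1.9945 for the constant and 0.1299 for the slope.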
Conclusion
In today's blog we have covered the fundamentals of the individual error component models:
 The theoretical one-way error component model.
 Fixed effects vs. random effects.
 Estimating fixed effects and random effects.
The code and data for this blog can be found at our Aptech Blog GitHub code repository.
References
Baltagi, B. (2008). Econometric analysis of panel data. John Wiley & Sons.
Erica has been working to build, distribute, and strengthen the GAUSS universe since 2012. She is an economist skilled in data analysis and software development. She has earned a B.A. and MSc in economics and engineering and has over 15 years combined industry and academic experience in data analysis and research.