### Introduction

The validity of many time series models and panel data models requires that the underlying data is stationary. As such, reliable unit root testing is an important step of any time series analysis or panel data analysis.

However, standard time series unit root tests and panel data unit root tests aren’t reliable when structural breaks are present. Because of this, when structural breaks are suspected, we must employ unit root tests that properly incorporate these breaks.

Today we will examine one of those tests, the Carrion-i-Silvestre *et al.* (2005) panel data test for stationarity in the presence of multiple structural breaks.

## Why Panel Data Unit Root Testing?

We may be tempted when working with panel data to treat the data as individual time-series, performing unit root testing on each one separately. However, one of the fundamental ideas of panel data is that there is a shared underlying component that connects the group.

It is this shared component that suggests there are advantages to be gained from testing the panel data collectively:

- Panel data contains more combined information and variation than pure time-series data or cross-sectional data.
- Collectively testing for unit roots in panels provides more power than testing individual series.
- Panel data unit root tests are more likely than time series unit root tests to have standard asymptotic distributions.

Put simply, when dealing with panel data, using tests designed specifically for panel data and testing the panel collectively can lead to more reliable results.

## Why do we Need to Worry About Structural Breaks?

It is important to properly address structural breaks when conducting unit root testing because most **standard unit root tests are biased towards non-rejection** of the unit root null hypothesis when breaks are ignored. We discuss this in greater detail in our “Unit Root Tests with Structural Breaks” blog.
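To see why an ignored break can be mistaken for persistence, here is a minimal simulation sketch. The sample size, seed, shift size, and the helper `acf1` are all hypothetical choices made for illustration; they are not part of `Carrionlib`. The sketch compares the lag-1 sample autocorrelation of white noise with that of the same noise after a single mean shift.

```
// Fix the seed so the sketch is repeatable
rndseed 4321;

// Number of observations (hypothetical)
nobs = 200;

// Stationary white noise around a constant mean of zero
e = rndn(nobs, 1);

// Same noise with a hypothetical mean shift of 5 at mid-sample
shift = zeros(nobs/2, 1) | 5*ones(nobs/2, 1);
y = e + shift;

// Lag-1 sample autocorrelations of both series
rho_e = acf1(e);
rho_y = acf1(y);

print "No break  :" rho_e;
print "With break:" rho_y;

// Hypothetical helper: lag-1 sample autocorrelation
// of a column vector
proc (1) = acf1(x);
    local d, n;
    n = rows(x);
    d = x - meanc(x);
    retp((d[2:n]' * d[1:n-1]) / (d' * d));
endp;
```

The series with the shift should show a much larger autocorrelation than the white noise, even though it is stationary around a broken mean. This is the kind of apparent persistence that a test ignoring breaks can attribute to a unit root.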

## Panel Data Stationarity Test with Structural Breaks

The Carrion-i-Silvestre, *et al.* (2005) panel data stationarity test introduces a number of important testing features:

- Tests the null hypothesis of stationarity against the alternative of non-stationarity.
- Allows for multiple, unknown structural breaks.
- Accommodates shifts in the mean and/or trend of the individual time series.
- Does not require the same breaks across the entire panel but, rather, allows for each individual to have a different number of breaks at different dates.
- Allows for homogeneous or heterogeneous long-run variances across individuals.
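For reference, the specification under the null hypothesis is often written as below. This is a sketch in notation assumed here rather than taken from this post, so consult the original paper for the exact definitions.

$$y_{i,t} = \alpha_i + \sum_{k=1}^{m_i} \theta_{i,k} DU_{i,k,t} + \beta_i t + \sum_{k=1}^{m_i} \gamma_{i,k} DT^{*}_{i,k,t} + \varepsilon_{i,t}$$

Here $m_i$ is the number of breaks for individual $i$, $DU_{i,k,t} = 1$ for $t > T^{i}_{b,k}$ and 0 otherwise, and $DT^{*}_{i,k,t} = t - T^{i}_{b,k}$ for $t > T^{i}_{b,k}$ and 0 otherwise, where $T^{i}_{b,k}$ is the date of the $k$-th break. The panel statistic averages the individual KPSS-type statistics into $LM(\lambda)$ and standardizes the result:

$$Z(\lambda) = \frac{\sqrt{N}\left(LM(\lambda) - \bar{\xi}\right)}{\bar{\zeta}} \xrightarrow{d} N(0,1)$$

where $\bar{\xi}$ and $\bar{\zeta}^2$ denote the cross-sectional averages of the individual means and variances of the individual statistics.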

## Conducting Panel Data Stationarity Tests in GAUSS

### Where can I Find the Tests?

The panel data stationarity test with structural breaks is implemented by the `pankpss` procedure in the GAUSS Carrionlib library.

The library can be directly installed using the GAUSS Package Manager or the GAUSS Application Installation Wizard, depending on your version of GAUSS.

### What Format Should my Data be in?

The `pankpss` procedure takes panel data in wide format. This means that each column of your data matrix should contain the time series observations for a different individual in the panel.

For example, if we have 100 observations of real GDP for 3 countries, our test data will be a 100 x 3 matrix.

Observation # | Country A | Country B | Country C |
---|---|---|---|
1 | 1.11 | 1.40 | 1.39 |
2 | 1.14 | 1.37 | 1.34 |
3 | 1.27 | 1.45 | 1.28 |
4 | 1.19 | 1.51 | 1.35 |
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
99 | 1.53 | 1.75 | 1.65 |
100 | 1.68 | 1.78 | 1.67 |
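As a rough illustration of what "wide format" means, the sketch below reshapes a stacked vector into a $T \times N$ matrix using the first four observations from the table above. It assumes a balanced panel whose values are stacked country by country; the variable names are hypothetical.

```
// First four observations for Countries A, B, and C,
// stacked country by country in a single column
long_vals = { 1.11, 1.14, 1.27, 1.19,
              1.40, 1.37, 1.45, 1.51,
              1.39, 1.34, 1.28, 1.35 };

// Panel dimensions for this small example
n_groups = 3;
n_periods = 4;

// reshape fills row by row, giving an N x T matrix;
// transposing yields the T x N wide layout
wide_vals = reshape(long_vals, n_groups, n_periods)';

print wide_vals;
```

Each column of `wide_vals` now holds one country's time series, which is the layout `pankpss` expects.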

### How do I Call the Test Procedure?

The first step to implementing the panel data stationarity test with structural breaks in GAUSS is to load the `Carrionlib` library. We will also use the `PDlib` library to help load our data:

`library carrionlib, pdlib;`

This statement provides access to all the procedures in the `Carrionlib` and `PDlib` libraries. After loading the libraries, the `pankpss` procedure can be called directly from the command line or within a program file.

The `pankpss` procedure takes 6 inputs:

```
{ test_hom, test_het, kpss_test, breaks_array } = pankpss(data,
model_breaks,
model_nobreaks,
kernel,
maxlags,
b_ctl);
```

- `data`: $T \times N$ matrix of panel data to be tested.
- `model_breaks`: Scalar, model to be used when structural breaks are found:
    - 1: Constant (Hadri test)
    - 2: Constant + trend (Hadri test)
    - 3: Constant + shift (in mean)
    - 4: Constant + trend + shift (in mean and trend)
- `model_nobreaks`: Scalar, model to be used when no structural breaks are found:
    - 1: Constant (Hadri test)
    - 2: Constant + trend (Hadri test)
    - 3: Constant + shift (in mean)
    - 4: Constant + trend + shift (in mean and trend)
- `kernel`: Scalar, kernel used for long-run variance computation:
    - 0: Sul, Phillips, and Choi (2003) with the Bartlett kernel
    - 1: Sul, Phillips, and Choi (2003) with the quadratic spectral kernel
- `maxlags`: Scalar, the maximum number of lags used in the estimation of the AR(p) model. The final number of lags is chosen using the BIC criterion.
- `b_ctl`: An instance of the `breakControl` structure controlling the settings for the Bai and Perron structural break estimation.

The `pankpss` procedure provides 4 returns:

- `test_hom`: Scalar, stationarity test statistic with structural breaks and homogeneous variance.
- `test_het`: Scalar, stationarity test statistic with structural breaks and heterogeneous variance.
- `kpss_test`: Matrix, individual tests. This matrix contains the test statistics in the first column, the number of breaks in the second column, the BIC chosen optimal lags, and the LWZ chosen optimal lags.
- `breaks_array`: $N \times m \times m$ array, estimated breakpoints for 1 through *m* breaks.

## Empirical Example

Let’s look further into testing for panel data stationarity with structural breaks using an empirical example.

### Data Description

The dataset contains government deficit as a percentage of GDP for nine OECD countries. The time span ranges from 1995 to 2019. This gives us a balanced panel of 9 individuals and 25 time observations each.

### Loading our data into GAUSS

Our first step is to load the data from `govt-deficit-oecd.csv` using `loadd`. This `.csv` file contains three variables, `Country`, `Year`, and `Govt_deficit`.

We will load all three, converting `Country` from a string to numerical categories using the `cat` keyword in the formula string as shown below:

```
// Load all variables and convert country to numeric categories
data = loadd("govt-deficit-oecd.csv", "Year + cat(Country) + Govt_deficit");
```

This loads our data in long format (a 225 x 3 matrix, with one row for each of the 9 countries in each of the 25 years), so we must convert it to wide format using the `pdWide` procedure. This procedure requires the time variable in the first column and the group indicator in the second column.

```
// Convert from long to wide format
wide_data = pdWide(data);
```

The first column of `wide_data` is the date vector, which should not be passed to `pankpss`. We remove this column using `delcols`:

```
// Delete first column which contains the year variable
govt_def = delcols(wide_data, 1);
```
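Before running the test, it can help to confirm that the matrix has the expected shape. The check below is only a sketch: it uses a random stand-in with the dimensions described above so that it runs on its own, and the variable names other than `govt_def` are hypothetical.

```
// Stand-in with the dimensions described above so the sketch
// runs alone; in practice, use the govt_def matrix created
// in the previous step
govt_def = rndn(25, 9);

// Expected dimensions: 25 years (1995-2019) and 9 countries
n_years = 25;
n_countries = 9;

// Actual dimensions
r = rows(govt_def);
c = cols(govt_def);

if r == n_years and c == n_countries;
    print "Dimensions match the expected T x N layout.";
else;
    print "Unexpected dimensions:" r "x" c;
endif;

// ismiss returns 1 if the matrix contains any missing values
if ismiss(govt_def);
    print "Warning: missing values found; the panel may not be balanced.";
endif;
```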

### Setting up our Model Parameters

Next, we will set the following parameters for our test:

```
// Use quadratic spectral kernel for
// long-run variance computation
kernel = 1;
// Set maximum number of lags used to
// estimate the AR(p) model
maxlags = 5;
// Maximum number of structural changes allowed
m = 5;
// Specify which model to use when no structural
// breaks are present.
// Allow for both constant and trend.
model_nobreaks = 2;
// Specify which model to use when structural
// breaks are present.
// Allow for changes in the mean and the slope.
model_breaks = 4;
/*
** Settings for structural break estimation
*/
// Declare structural break control structure
// and fill with default settings
struct breakControl b_ctl;
b_ctl = breakControlCreate(rows(wide_data));
// Print iteration output to the screen
b_ctl.printd = 1;
// Allow for the variance of the residuals
// to be different across segments.
b_ctl.hetvar = 1;
// Use LWZ to select the number of breaks.
b_ctl.estimbic = 1;
// Don't use sequential procedure to estimate breaks
b_ctl.estimseq = 0;
```

### Calling the `pankpss` Procedure

Finally, we call the `pankpss` procedure:

```
{ test_hom, test_het, kpss_test, breaks_array } = pankpss(govt_def,
model_breaks,
model_nobreaks,
kernel,
maxlags,
b_ctl);
```

## Empirical Results

The `pankpss` test prints the `test_hom` and `test_het` outputs:

Stationarity test with structural breaks | Statistic | p-value |
---|---|---|
Homogeneous | 10.711 | 0.0000 |
Heterogeneous | 53.452 | 0.0000 |

as well as the matrix of individual test results:

Column 1 | Column 2 | Column 3 | Column 4 |
---|---|---|---|
0.162 | 2 | 3 | 2 |
0.050 | 2 | 3 | 3 |
0.265 | 0 | 4 | 0 |
0.043 | 2 | 4 | 2 |
0.322 | 3 | 3 | 3 |
1.636 | 3 | 3 | 3 |
0.129 | 1 | 5 | 1 |
0.323 | 4 | 5 | 4 |
0.044 | 2 | 3 | 2 |

This matrix contains four columns:

- The KPSS stationarity test for each independent group.
- The number of estimated breaks.
- The number of breaks estimated by the BIC.
- The number of breaks estimated by the LWZ.
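The columns of this matrix can be pulled apart with ordinary indexing. The sketch below uses the first three rows of the printed matrix as a stand-in for the `kpss_test` return; the other variable names are hypothetical.

```
// Stand-in for kpss_test: the first three rows of the
// individual test results printed by pankpss
kpss_test = { 0.162 2 3 2,
              0.050 2 3 3,
              0.265 0 4 0 };

// Column 1: individual KPSS statistics
indiv_stats = kpss_test[., 1];

// Column 2: number of estimated breaks
n_breaks = kpss_test[., 2];

// Row indices of individuals with no estimated breaks
idx = seqa(1, 1, rows(kpss_test));
no_break_idx = selif(idx, n_breaks .== 0);

print "Individuals without breaks (row index):" no_break_idx;
```

Note that `selif` returns a missing value when no rows satisfy the condition, so a full program may want to check for that case.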

Finally, the `pankpss` procedure also prints the estimated breakpoints for each individual in the panel. For the sake of brevity, these are not included here.

To inspect the matrices returned by `pankpss`, see our data viewing tutorial.

## Interpreting the Results

When interpreting the results from the `pankpss` test, it helps to remember a few key things:

- The test considers the null hypothesis of stationarity against the alternative of non-stationarity.
- We reject the null hypothesis of stationarity at:
    - Large values of the test statistic.
    - Small p-values.
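The decision rule can be sketched in a few lines. This assumes the panel statistics are asymptotically standard normal under the null, with rejection in the upper tail; the significance level and variable names `alpha`, `p_hom`, and `p_het` are hypothetical, and `pankpss` already reports its own p-values.

```
// Panel statistics reported in the pankpss output
test_hom = 10.711;
test_het = 53.452;

// Hypothetical significance level
alpha = 0.01;

// Upper-tail standard normal p-values (assumes an asymptotically
// N(0,1) statistic with rejection for large values)
p_hom = cdfnc(test_hom);
p_het = cdfnc(test_het);

if p_hom < alpha;
    print "Homogeneous variance: reject stationarity.";
else;
    print "Homogeneous variance: fail to reject stationarity.";
endif;

if p_het < alpha;
    print "Heterogeneous variance: reject stationarity.";
else;
    print "Heterogeneous variance: fail to reject stationarity.";
endif;
```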

### Panel Data Test Statistic

The test statistic for our panel, assuming homogeneous variances:

- Is equal to 10.711 with a p-value of 0.0000.
- Suggests that we reject the null hypothesis of stationarity at the 1% level.

The test statistic for our panel, assuming heterogeneous variances:

- Is equal to 53.452 with a p-value of 0.000.
- Suggests that we reject the null hypothesis of stationarity at the 1% level.

These results tell us that regardless of whether we assume heterogeneous or homogeneous variances, we can reject the null hypothesis of stationarity for the panel. Given this, we must make proper adjustments to account for non-stationarity when modeling our data.

### Individual Test Results

Country | Statistic | Breaks |
---|---|---|
Austria | 0.691 | 2003;2008 |
France | 0.181 | 2001;2008 |
Germany | 0.704 | None |
Ireland | 0.044 | 2007;2010 |
Italy | 0.173 | 1997;2006;2009 |
Luxembourg | 0.037 | 1999;2004;2008 |
Norway | 0.185 | 2008 |
Spain | 0.092 | 1999;2006;2009;2012 |
United Kingdom | 0.035 | 2000;2008 |

## Conclusion

Today's blog considers the panel data stationarity test proposed by Carrion-i-Silvestre *et al.* (2005). This test is built upon two crucial aspects of unit root testing:

- Panel data specific tests should be used with panel data.
- Structural breaks should be accounted for.

Ignoring these two facts can result in unreliable results.

After today, you should have a stronger understanding of how to implement the panel data stationarity test with structural breaks in GAUSS and how to interpret the results.

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.
