## Introduction

Economists are increasingly exploring the potential of machine learning models for economic forecasting. This blog introduces three machine learning regression techniques for economic modeling through an empirical application to the real U.S. GDP output gap.

We look specifically at:

- Measuring the output gap.
- The fundamentals of three machine learning regression models.
- Model estimation using the GAUSS Machine Learning library.

## Measuring GDP Output Gap

The GDP output gap is a macroeconomic indicator that measures the difference between actual GDP and potential GDP. It is an interesting and useful economic statistic:

- It indicates whether the economy is operating with unemployment, inefficiencies, or inflationary pressures, making it useful for policymaking.
- Potential GDP is unobservable and must be estimated, and a large literature is devoted to how best to estimate it.
- Positive output gaps indicate that the economy is operating over potential GDP and at risk of inflation.
- Negative output gaps indicate that the economy is operating below potential GDP and possibly in recession.

Our goal today is to demonstrate different machine learning regression techniques. For simplicity, we're going to use the output gap based on the Congressional Budget Office's estimate of real potential GDP to train our model.
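For reference, the output gap is conventionally expressed as a percentage of potential GDP. The formula below is that standard convention, stated here as an assumption about the CBO-based series rather than something confirmed above. The symbols $Y_t$ (actual real GDP) and $Y_t^*$ (potential real GDP) are notation introduced for illustration:

$$ gap_t = 100 \times \frac{Y_t - Y_t^*}{Y_t^*} $$

Under this convention, a value of $-2$ means actual output is 2% below potential.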

## The Models

Today we will look at three machine learning models used specifically for predicting continuous data:

- Decision forest regression (also known as random forest regression).
- LASSO regression.
- Ridge regression.

### Decision Forest Regression

#### Decision Trees

Decision forest regression utilizes decision trees for continuous data which:

- Segment the data into subsets using data-based *splitting rules*.
- Assign the average of the target variable within a subset as the prediction for all observations that fall inside that subset.

To implement a single decision tree, a sample is split into segments using *recursive binary splitting*. This iterative approach determines where and how to split the data based on what leads to the lowest residual sum of squares (RSS).
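To make the splitting rule concrete, here is a minimal sketch of one step of recursive binary splitting on a single predictor. It is an illustration only, not how the GAUSS Machine Learning library implements its trees: `bestSplit` is a hypothetical helper, the data are made-up values, and the sketch assumes the values of `x` are distinct.

```
// Find the single best split of one predictor by minimizing RSS.
// 'bestSplit' is a hypothetical helper, not a GAUSS library procedure.
// Assumes the values in x are distinct.
proc (2) = bestSplit(x, y);
    local n, cuts, left, right, rss, best_rss, best_cut;

    n = rows(x);

    // Candidate cut points: midpoints between adjacent sorted values of x
    cuts = sortc(x, 1);
    cuts = (cuts[1:n-1] + cuts[2:n]) / 2;

    // Start from the RSS of the unsplit sample
    best_rss = sumc((y - meanc(y)).^2);
    best_cut = cuts[1];

    for i(1, rows(cuts), 1);
        // Target values on each side of the candidate cut
        left = selif(y, x .< cuts[i]);
        right = selif(y, x .>= cuts[i]);

        // Each side is predicted by its own mean
        rss = sumc((left - meanc(left)).^2) + sumc((right - meanc(right)).^2);

        // Keep the cut with the lowest RSS seen so far
        if rss < best_rss;
            best_rss = rss;
            best_cut = cuts[i];
        endif;
    endfor;

    retp(best_cut, best_rss);
endp;

// Toy data (hypothetical values)
x = { 1, 2, 3, 4, 5, 6 };
y = { 1.0, 1.2, 0.9, 3.1, 2.9, 3.0 };

{ cut, rss } = bestSplit(x, y);
print "Best cut point:" cut;
print "RSS after split:" rss;
```

With these toy values the selected cut point is 3.5, which separates the low-valued observations from the high-valued ones. A full tree would apply the same search again within each resulting subset, across all predictors.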

#### Decision Forests

Single decision trees can have low, non-robust predictive power and suffer from high variance. This can be overcome using random decision forests that offer performance improvements by combining results from groups, or "forests", of trees.

The random decision forest algorithm:

- Draws a bootstrapped training set from the data.
- Constructs a decision tree from that bootstrapped set, randomly choosing $m$ predictors as candidates each time the data is split.
- Repeats the decision tree formation for a specified number of iterations.
- Averages the results from all trees to make a final prediction.
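In symbols, the averaging step above can be written as follows, where $B$ (the number of trees) and $\hat{f}_b$ (the prediction from the tree grown on the $b$-th bootstrapped sample) are notation introduced here for illustration:

$$ \hat{f}(x) = \frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x) $$

Averaging over many trees, each grown on a different bootstrapped sample, is what reduces the variance of a single tree.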

### LASSO and Ridge Regression

LASSO and ridge regression aim to reduce prediction variance using a modified least squares approach. Let's look a little more closely at how this works.

Recall that ordinary least squares estimates coefficients through the minimization of the residual sum of squares (RSS):

$$ RSS = \sum_{i=1}^n \bigg(y_i - \beta_0 - \sum_{j=1}^p \beta_j x_{ij}\bigg)^2 $$

Penalized least squares estimates coefficients using a modified function:

$$ S_{\lambda} = \sum_{i=1}^n \bigg(y_i - \beta_0 - \sum_{j=1}^p \beta_j x_{ij}\bigg)^2 + \lambda J_2 $$

where $\lambda$ is the tuning parameter and $\lambda J_2$ is the penalty term.

Method | Description | Penalty term |
---|---|---|
LASSO Regression | $L_1$ penalized linear regression model. | $\lambda \sum_{j=1}^p \lvert\beta_j\rvert$ |
Ridge Regression | $L_2$ penalized linear regression model. | $\lambda \sum_{j=1}^p \beta_j^2$ |
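To see the shrinkage effect of the penalty, the sketch below compares ordinary least squares with the textbook closed-form ridge solution on simulated data. It uses only base GAUSS matrix operations and is an illustration under stated assumptions: the data and `b_true` are made up, the variables are centered so the intercept is left unpenalized, and the scaling of `lambda` inside the library's own ridge procedure may differ from this plain formulation.

```
// Simulated data (hypothetical): 50 observations, 3 predictors
rndseed 8675;
n = 50;
p = 3;
X = rndn(n, p);
b_true = { 1, 0.5, 0 };
y = X * b_true + rndn(n, 1);

// Center y and X so the intercept stays out of the penalty
y_c = y - meanc(y);
X_c = X - meanc(X)';

// Tuning parameter
lambda = 0.3;

// OLS: minimizes RSS
b_ols = inv(X_c' * X_c) * X_c' * y_c;

// Ridge: minimizes RSS + lambda * (sum of squared coefficients)
b_ridge = inv(X_c' * X_c + lambda * eye(p)) * X_c' * y_c;

print "OLS coefficients:";
print b_ols;
print "Ridge coefficients:";
print b_ridge;

// The ridge coefficients have the smaller sum of squares
print "Sum of squared coefficients (OLS, ridge):";
print sumc(b_ols.^2)~sumc(b_ridge.^2);
```

Larger values of `lambda` pull the ridge coefficients further toward zero. LASSO has no closed-form solution of this kind, and its $L_1$ penalty can set some coefficients exactly to zero, which is why it is often used for variable selection.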

## Our Prediction Process

Our prediction process is motivated by the idea that as new information becomes available, it should be used to improve our forecasting model.

Based on this motivation, we use an expanding training window to make one-step ahead forecasts:

- Train the model using all observed data in the training window, features and output gap, up to time $t$.
- Predict the output gap at time $t + 1$ using the observed features at time $t + 1$.
- Expand the training window to include all observed data up to time $t + 1$.
- Repeat model training and prediction.

It's worth noting that while this method uses the most information available for prediction, there is a trade-off in timeliness. In a real-world setting, it means we only forecast the output gap one quarter ahead. This may not be far enough in advance if we're using the forecast to guide business or investment decisions.

## Predictors

Today we will use a combination of common economic indicators and GDP subcomponents as predictors.

Variable | Description | Transformations |
---|---|---|
UMCSENT | University of Michigan consumer sentiment, quarterly average. | None. |
UNRATE | Civilian unemployment rate as a percentage, quarterly average. | None. |
CR | The credit spread between Moody's BAA and AAA corporate bond yields. | None. |
TS | The difference between the yield on the 10-year Treasury bond and the 1-year Treasury bill. | None. |
FEDFUNDS | The Federal Funds rate. | First differences. |
SP500 | The S&P 500 index value at market closing. | Percent change, computed as difference in natural logs. |
CPIAUCSL | Consumer price index for all urban consumers. | Percent change, computed as difference in natural logs. |
INDPRO | The industrial production (IP) index. | Percent change, computed as difference in natural logs. |
HOUST | New privately-owned housing unit starts. | Percent change, computed as difference in natural logs. |
GAP_CH | The change in the output gap. | None. |

For our model:

- All predictors are available from FRED in levels.
- Monthly variables are aggregated to quarterly data using averages.
- Four lags of all variables are included.
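The transformations listed above can be sketched with base GAUSS functions. This is only an illustration on made-up numbers, not the data preparation used for the downloadable dataset, and the variable names are hypothetical:

```
// Hypothetical monthly index levels covering two quarters
monthly = { 100, 101, 103, 104, 106, 105 };

// Aggregate to quarterly averages:
// one row per quarter, then average across each row
quarterly = meanc(reshape(monthly, rows(monthly) / 3, 3)');
print "Quarterly averages:";
print quarterly;

// Hypothetical quarterly series in levels
lvl = { 100, 102, 101, 105, 108, 107, 110, 112 };

// First differences (first row is missing)
d_lvl = lvl - lagn(lvl, 1);

// Percent change, computed as difference in natural logs
dl_lvl = ln(lvl) - lagn(ln(lvl), 1);

// Four lags of the transformed series, one column per lag
lags = lagn(dl_lvl, 1)~lagn(dl_lvl, 2)~lagn(dl_lvl, 3)~lagn(dl_lvl, 4);

// Differencing loses 1 row and four lags lose 4 more,
// so the first 5 rows contain missing values
features = dl_lvl~lags;
print "Complete rows remaining:" rows(packr(features));
```

Losing one row to differencing and four more to lagging is consistent with trimming `max_lag + 1` rows from the top of the dataset before estimation.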

## Estimation in GAUSS

### Data Loading

Because we want to primarily focus on the models, rather than data cleaning, we don't go into the details of our data cleaning process here. Instead, the cleaned and prepped data is available for download here.

Related posts on data preparation:

- Importing FRED Data To GAUSS
- Getting to Know Your Data With GAUSS
- Preparing And Cleaning FRED Data In GAUSS

Prior to estimating any model, we load the data and separate our outcome and feature data:

```
library gml;
rndseed 23423;
/*
** Load data and prepare data
*/
// Load dataset
dataset = __FILE_DIR $+ "reg_data.gdat";
data = loadd(dataset);
// Trim rows from the top of data to account
// for lagged and differenced data
max_lag = 4;
data = trimr(data, max_lag + 1, 0);
/*
** Extract outcome and features
*/
// Extract outcome variable
y = data[., "CBO_GAP"];
// Extract features
X = delcols(data, "date"$|"CBO_GAP");
```

### General One-Step-Ahead Process

The full data sample ranges from 1967Q1 to 2022Q4. We'll start computing one-step-ahead forecasts in 1995Q1, using an initial training period of 1967Q1 to 1994Q4.

To implement the expanding window one-step-ahead forecasts, we use a GAUSS `do while` loop:

```
// Specify the last date of the initial training window
st_date = asDate("1994-Q4", "%Y-Q%q");

// Find the index of 'st_date'
st_indx = indnv(st_date, data[., "date"]);

// Iterate over the remaining observations,
// using an expanding window to fit the model.
// The final pass trains on rows 1 to rows(X)-1
// and predicts the last row.
do while st_indx < rows(X);

    // Training data up to and including st_indx
    y_train = y[1:st_indx];
    X_train = X[1:st_indx, .];

    // Features for the one-step-ahead observation
    X_test = X[st_indx+1, .];

    // Fit model
    ...

    // Compute one-step-ahead prediction
    ...

    // Update st_indx
    st_indx = st_indx + 1;
endo;
```

### Model and Prediction Procedures

The GAUSS machine learning library offers all the procedures we need for our model training and prediction.

Model | Fitting Procedure | Prediction Procedure |
---|---|---|
Decision Forest | decForestRFit | decForestPredict |
LASSO Regression | lassoFit | lmPredict |
Ridge Regression | ridgeFit | lmPredict |

To simplify our code, we will use three GAUSS procedures that combine the fitting and prediction steps for each method.

The first procedure performs fitting and prediction for the LASSO model:

```
proc (1) = osaLasso(y_train, x_train, x_test, lambda);
    local lasso_prediction;

    /*
    ** Lasso Model
    */
    // Declare 'mdl' to be an instance of a
    // lassoModel structure to hold the estimation results
    struct lassoModel mdl;

    // Estimate the model using the supplied penalty 'lambda'
    mdl = lassoFit(y_train, x_train, lambda);

    // Make predictions using test data
    lasso_prediction = lmPredict(mdl, x_test);

    retp(lasso_prediction);
endp;
```

The second procedure performs fitting and prediction for the ridge model:

```
proc (1) = osaRidge(y_train, x_train, x_test, lambda);
    local ridge_prediction;

    /*
    ** Ridge Model
    */
    // Declare 'mdl' to be an instance of a
    // ridgeModel structure to hold the estimation results
    struct ridgeModel mdl;

    // Estimate the model using the supplied penalty 'lambda'
    mdl = ridgeFit(y_train, x_train, lambda);

    // Make predictions using test data
    ridge_prediction = lmPredict(mdl, x_test);

    retp(ridge_prediction);
endp;
```

The final procedure performs fitting and prediction for the decision forest model:

```
proc (1) = osaDF(y_train, x_train, x_test, struct dfControl dfc);
    local df_prediction;

    /*
    ** Decision Forest Model
    */
    // Declare 'mdl' to be an instance of a
    // dfModel structure to hold the estimation results
    struct dfModel mdl;

    // Estimate the model using the settings in 'dfc'
    mdl = decForestRFit(y_train, x_train, dfc);

    // Make predictions using test data
    df_prediction = decForestPredict(mdl, x_test);

    retp(df_prediction);
endp;
```

### Computing Predictions

We are now ready to begin computing our predictions. First, we set the necessary tuning parameters:

```
/*
** Set up tuning parameters
*/
// L2 and L1 regularization penalty
lambda = 0.3;
/*
** Settings for decision forest
*/
// Use control structure for settings
struct dfControl dfc;
dfc = dfControlCreate();
// Turn on variable importance
dfc.variableImportanceMethod = 1;
// Turn on out-of-bag error calculation
dfc.oobError = 1;
```

Next, we initialize the starting point for our loop and our prediction storage matrix.

```
/*
** Initialize starting point and
** storage matrix for expanding
** window loop
*/
st_date = asDate("1994-Q4", "%Y-Q%q");
st_indx = indnv(st_date, data[., "date"]);
// Set up storage dataframe for predictions
// using one column for each model
osa_pred = asDF(zeros(rows(X), 3), "LASSO", "Ridge", "Decision Forest");
```

Finally, we implement our expanding window `do while` loop:

```
// The final pass trains on rows 1 to rows(X)-1
// and predicts the last row
do while st_indx < rows(X);

    // Get y and X subsets for
    // fitting and prediction
    y_train = y[1:st_indx];
    X_train = X[1:st_indx, .];
    X_test = X[st_indx+1, .];

    // LASSO Model
    osa_pred[st_indx+1, "LASSO"] = osaLasso(y_train, X_train, X_test, lambda);

    // Ridge Model
    osa_pred[st_indx+1, "Ridge"] = osaRidge(y_train, X_train, X_test, lambda);

    // Decision Forest Model
    osa_pred[st_indx+1, "Decision Forest"] = osaDF(y_train, X_train, X_test, dfc);

    // Update st_indx
    st_indx = st_indx + 1;
endo;
```

## Results

### Prediction Visualization

The graph above plots the predictions from all three of our models against the actual CBO-implied output gap. There are a few things worth noting about these results:

- All three models fail to predict the output decline associated with the start of the COVID pandemic. This isn't a surprise, as the onset of COVID was a hard-to-predict shock to the economy.
- The models underestimate the persistent effects of the 2008 global financial crisis. While all three trend in the same direction as the observed output gap, they all predict better economic performance than actually occurred. This tells us that our feature set doesn't contain the information needed to capture the ongoing effects of the financial crisis. We could potentially improve our model by incorporating more features like bank balances or home foreclosures.
- The ridge model overestimates the short-term impacts of the 2008 global financial crisis, predicting a larger drop in the output gap than both the other models and the actual output gap.

### Model Performance

We can also compare the performance of our models using the mean squared error (MSE). This can easily be calculated from our predictions and our observed output gap:

```
/*
** Computing MSE
*/
// Compute residuals
residuals = osa_pred - y;

// Filter for the prediction window. The first forecast is
// for the quarter after 'st_date', so the comparison is strict.
residuals = selif(residuals, data[., "date"] .> st_date);

// Compute the MSE for the prediction window
mse = meanc(residuals.^2);
```

A comparison of the MSE shows that the models perform similarly, with the decision forest model offering a slight advantage over LASSO and ridge.

Model | MSE |
---|---|
LASSO | 2.08 |
Ridge | 2.36 |
Decision Forest | 1.80 |

## Conclusion

In today's blog we examined the performance of several machine learning regression models used to predict the output gap. This blog is meant to provide an introduction to these models and leaves room to discuss model selection and optimization in future blogs.

After today's blog, you should have a better understanding of:

- The foundations of decision forest regression models.
- LASSO and ridge regression models.
- How machine learning models can be used to help predict economic and financial outcomes.

## Further Machine Learning Reading

- Fundamentals of Tuning Machine Learning Hyperparameters
- Applications of Principal Components Analysis in Finance
- Predicting The Output Gap With Machine Learning Regression Models
- Classification with Regularized Logistic Regression
- Understanding Cross-Validation
- Machine Learning With Real-World Data

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.

Respected Eric:

I use GAUSS Machine Learning 0.0.2 to run the code of this example. Error output shows "G0025 : Undefined symbol: 'lmPredict' [output_gap_ml.gss, line 141]"

Thanks

Eric (post author):

Hi,

Please email me directly at [email protected] and I will provide you with the latest version.

Best,

Eric