Predicting Recessions with Machine Learning Techniques

by Eric · Published February 21, 2023 · Updated April 17, 2024

Introduction

Forecasts have become a valuable commodity in today's data-driven world. Unfortunately, not all forecasting models are of equal caliber, and incorrect predictions can lead to costly decisions.

Today we will compare the performance of several prediction models used to predict recessions. In particular, we’ll look at how a traditional baseline econometric model compares to machine learning models.

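Our models will include:

A baseline probit model.
K-nearest neighbor classification.
Decision forest classification.
Ridge classification.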

The aim of today’s blog isn’t to provide a definitive answer on which model is best, but rather to provide background and context for different models. We will look more closely at model tuning and optimization in a later blog.

Background

Before diving into estimating our models, let's look more closely at the data and models we will be using.

Recession dating

Today we will focus on predicting recessions, using the NBER recession indicator. The NBER indicator:

Uses a dummy variable to represent periods of expansion and recessions.
Takes a value of 1 during a recession and 0 during an expansion.
Can be directly imported from FRED using the series ID "USREC".

Because the NBER recession data is binary data, our forecasting exercise becomes one of classification. In other words, we want to identify whether an observation is more likely to fall into the non-recession or recession category.

For this reason, we need to use models that are suitable for discrete data and classification.

Models

Probit

The probit model is a discrete choice model which:

Is commonly used in classical econometrics to model binary or ordered data.
Estimates the probability that an outcome falls into a specific category.
Has a simple log-likelihood function, which can be used to estimate the model parameters with maximum likelihood.
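
In equation form, the probit model links the predictors to the probability of a recession through the standard normal cumulative distribution function, written here as $F$. The symbols $x_i$ (the row of predictors for observation $i$) and $\beta$ (the coefficient vector) are notation assumed for this sketch:

$$P(y_i = 1 \mid x_i) = F(x_i \beta)$$

Because $F(0) = 0.5$, classifying an observation as a recession whenever its predicted probability is at least 50% is equivalent to checking whether $x_i \beta \geq 0$.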

K-Nearest Neighbor

The k-nearest neighbor (KNN) method is one of the simplest non-parametric techniques for classification and regression.

KNN relies on the intuition that if an observation is "near" another, it is likely to fall within the same category.

The KNN model:

Locates the $k$ nearest neighbors using the observed features and a measure of distance, such as Euclidean distance.
Finds the most common "class" among the $k$ nearest neighbors.
Assigns the most common "class" as the predicted category for the unknown outcome.
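
The three steps above can be sketched by hand in a few lines of GAUSS. This is only an illustration on made-up data, not how the GML library implements KNN internally. The names x_toy, y_toy, and x_new are hypothetical, and the majority vote shortcut assumes a 0/1 outcome:

```gauss
// Toy training data: two features and a binary class label
x_toy = { 0.1 0.2,
          0.0 0.4,
          0.9 1.0,
          1.1 0.8,
          1.0 1.2 };
y_toy = { 0, 0, 1, 1, 1 };

// New observation to classify
x_new = { 0.8 0.9 };

// Number of neighbors
k = 3;

// Step 1: Euclidean distance from x_new to every training row
// (sum the squared differences across features, then take the root)
dist = sqrt(sumc(((x_toy - x_new).^2)'));

// Step 2: classes of the k training rows with the smallest distances
idx = sortind(dist);
nbr_classes = y_toy[idx[1:k]];

// Step 3: majority vote. For a 0/1 outcome, the mean exceeds 0.5
// when class 1 is the most common class among the neighbors
y_pred = meanc(nbr_classes) > 0.5;

print "Predicted class: " y_pred;
```

In this toy example the three nearest neighbors all belong to class 1, so the new observation is assigned to class 1.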

Decision Trees

Decision trees are machine learning models that can be used to predict discrete or continuous data. The decision forest we estimate below combines the predictions of many individual trees.

Tree-based methods rely on a fairly simple process:

Split the data into subsets, using the characteristics of the data. For example, if “Married” is one of our observed characteristics, we can split the sample into "Yes" and "No". We can ask multiple "questions" about our data to create branches that break our data into smaller and smaller subsets.
The most frequently occurring outcome within each subset is then used as the predicted class for all observations that fall inside that subset.
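
A single split of this kind can be sketched in GAUSS as follows. The data are made up, and the split variable is fixed in advance for illustration. An actual tree algorithm searches over candidate splits using an impurity criterion, which this sketch does not attempt:

```gauss
// Hypothetical sample: 'married' is the splitting characteristic
// and 'outcome' is the binary class we want to predict
married = { 1, 1, 1, 0, 0, 0 };
outcome = { 1, 1, 0, 0, 0, 1 };

// Split the sample into the "Yes" and "No" branches
out_yes = selif(outcome, married .== 1);
out_no = selif(outcome, married .== 0);

// Predict the most frequent class within each subset
// (for a 0/1 outcome this is 1 when the subset mean exceeds 0.5)
pred_yes = meanc(out_yes) > 0.5;
pred_no = meanc(out_no) > 0.5;

print "Prediction for the Yes branch: " pred_yes;
print "Prediction for the No branch:  " pred_no;
```

Here two of the three "Yes" observations are in class 1, so that branch predicts 1, while the "No" branch predicts 0.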

Ridge Regression

Ridge regression is part of a family of linear regression models that aim to improve on standard least squares fitting. These methods use a modified least squares approach to shrink coefficient estimates towards zero, which, in turn, reduces the variance of the estimates.

Like OLS, these methods rely on minimizing the residual sum of squares (RSS) to estimate coefficients. However, they add a penalty to the RSS objective function that grows with the size of the coefficients. For ridge regression, this is an L2 penalty based on the sum of squared coefficients.
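
The penalized objective described above can be written compactly. This is the textbook ridge formulation rather than a statement of how any particular library implements it. The symbols are assumptions of this sketch: $\lambda \geq 0$ is the penalty weight, $p$ is the number of predictors, and an intercept, if one is included, is conventionally left unpenalized:

$$\hat{\beta}^{ridge} = \underset{\beta}{\arg\min} \; \sum_{i=1}^N (y_i - x_i \beta)^2 + \lambda \sum_{j=1}^p \beta_j^2$$

Setting $\lambda = 0$ recovers ordinary least squares, while larger values of $\lambda$ shrink the coefficients more strongly.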

Model Setup

Today we will include a number of variables in our model. These are chosen based on commonly used predictors in the recession modeling literature:

Recession Model Predictors
Variable	Description
INDPRO	Monthly growth rates of industrial production. Included in the level and 1-month lag.
PAYEMS	Monthly growth rates of nonfarm payrolls. Included in the level and 1-month lag.
RPI	Monthly growth rates of real personal income excluding transfer payments. Included in the level and 1-month lag.
UNRATE	Annual growth rate of headline unemployment. Included in the level and 1-month lag.
YLD	The yield curve slope, computed as the difference between the yield on the 10-year Treasury bond and the 3-month Treasury bill. Included in the level, 6-month lag, and 12-month lag.
CORP	The credit spread between Moody's BAA and AAA corporate bond yields. Included in the level, 6-month lag, and 12-month lag.

Our complete dataset ranges from January, 1963 to December, 2022.

Training period	January, 1963 to December, 1998
Testing period	January, 1999 to December, 2022

The complete dataset, including lags, is available here.

Model Comparison

There are many components to evaluating how well a classification model performs. To compare models, we will use a set of binary class metrics including:

Model Comparison Measures
Tool	Description
Confusion matrix	Summarizes the performance of a classification algorithm. Compares the number of predicted outcomes to actual outcomes in tabular form.
Accuracy	Overall model accuracy. Equal to the number of correct predictions divided by the total number of predictions.
Precision	How good a model is at correctly identifying positive outcomes. Equal to the number of true positives divided by the number of false positives plus true positives.
Recall	How good a model is at correctly predicting all the positive outcomes. Equal to the number of true positives divided by the number of false negatives plus true positives.
F-score	The harmonic mean of the precision and recall. A score of 1 indicates perfect precision and recall.
Specificity	Ability to predict a true negative. Equal to the number of true negatives divided by the number of true negatives plus false positives.
Area under the ROC	Reflects the probability that a model ranks a random positive more highly than a random negative.
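
The definitions in the table translate directly into a few lines of GAUSS. The sketch below uses hypothetical confusion matrix counts (they are not results from our models), and the variable names are chosen for illustration only:

```gauss
// Hypothetical confusion matrix counts (illustrative only)
n_tp = 8;     // true positives
n_fp = 2;     // false positives
n_fn = 4;     // false negatives
n_tn = 86;    // true negatives

// Share of all predictions that are correct: 94/100 = 0.94
acc = (n_tp + n_tn) / (n_tp + n_fp + n_fn + n_tn);

// Share of predicted positives that are truly positive: 8/10 = 0.8
prec = n_tp / (n_tp + n_fp);

// Share of actual positives that are predicted as positive: 8/12
rec = n_tp / (n_tp + n_fn);

// Share of actual negatives that are predicted as negative: 86/88
spec = n_tn / (n_tn + n_fp);

// Harmonic mean of precision and recall
f_score = 2 * prec * rec / (prec + rec);

print "Accuracy:    " acc;
print "Precision:   " prec;
print "Recall:      " rec;
print "Specificity: " spec;
print "F-score:     " f_score;
```

With these counts, accuracy is 0.94 even though a third of the positive cases are missed, which is why recall and the F-score (about 0.67 and 0.73 here) are worth reporting alongside it.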

It's important to view these metrics in the context of the data being modeled. For example, our data is not very balanced across classes. In our test set, there are 260 non-recession observations and 28 recession observations. This implies that:

Model accuracy is not a very informative metric. If we predict that all observations are non-recession, our accuracy is 90%.
F-score is a better metric for us to consider. It gives a more balanced picture of how our model performs across both the recession and non-recession class.

Estimation

We will use two GAUSS libraries to estimate our models:

Constrained Maximum Likelihood MT (CMLMT) to estimate the probit model.
GAUSS Machine Learning (GML) to estimate our machine learning models.

Loading our data and libraries

To start, we will load our data directly from the URL:

// Load libraries
library gml, cmlmt;

/*
** Load data from url
*/
url = "https://github.com/aptech/gauss_blog/blob/master/machine-learning/recession-predicting/data/final_data.gdat?raw=true";
reg_data = loadd(url);

// Compute summary statistics
dstatmt(reg_data);

This loads our regression dataset and prints a table of summary statistics to the Command Window:

----------------------------------------------------------------------------------------
Variable         Mean     Std Dev     Variance     Minimum     Maximum    Valid  Missing
----------------------------------------------------------------------------------------

date            -----       -----        -----  1963-01-01  2022-12-01      720     0
USREC          0.1181      0.3229       0.1043           0           1      720     0
INDPRO         0.1976      0.9403       0.8842       -13.2       6.275      720     0
PAYEMS         0.1428      0.5746       0.3302      -13.59       3.431      720     0
RPI            0.2627       1.253        1.569      -13.55          20      720     0
UNRATE       -0.03208       1.393        1.941        -8.6        11.1      720     0
corp           -1.021      0.4389       0.1926       -3.38       -0.32      720     0
yld             1.496       1.221        1.492       -2.65        4.42      720     0
yld_l6          1.504       1.215        1.475       -2.65        4.42      720     0
yld_l12           1.5       1.215        1.475       -2.65        4.42      720     0
corp_l6        -1.017      0.4397       0.1933       -3.38       -0.32      720     0
corp_l12       -1.015      0.4403       0.1939       -3.38       -0.32      720     0
ip_l           0.1986      0.9397        0.883       -13.2       6.275      720     0
nfp_l          0.1425      0.5747       0.3302      -13.59       3.431      720     0
rpi_l          0.2632       1.253        1.569      -13.55          20      720     0
un_l         -0.03222       1.393        1.942        -8.6        11.1      720     0

The file final_data.gdat is stored in the .gdat format, a GAUSS data file format introduced in GAUSS 23. The dataset is compiled using raw data from FRED. You can view the data import, transformation, and merging here.

Splitting Data

Next, we will use the trainTestSplit function to split the data into a test and training set.

/*
** Split data
*/

// Dependent data 
y = reg_data[., "USREC"];

// Load independent variables
x = reg_data[., 3:cols(reg_data)];

// Split data into (60%) training
// and (40%) test sets
shuffle = "False";
{ y_train, y_test, x_train, x_test } = 
     trainTestSplit(y, x, 0.6, shuffle);

Because our data is time series data, it is important to keep the sequential ordering. To do this, we turn "shuffling" off when splitting the data.
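
To make the idea of an ordered split concrete, the sketch below splits a toy series by row position. It is only an illustration of what an unshuffled 60/40 split means. The variable names are hypothetical, and the rounding rule used to find the cutoff row is an assumption rather than a description of how trainTestSplit computes it:

```gauss
// Toy series of 10 time-ordered observations: 1, 2, ..., 10
y_toy = seqa(1, 1, 10);

// Number of training rows for a 60% split
// (rounding rule assumed for illustration)
n_train = round(0.6 * rows(y_toy));

// The earliest observations form the training set and
// the most recent observations form the test set
y_toy_train = y_toy[1:n_train];
y_toy_test = y_toy[n_train+1:rows(y_toy)];

print "Training rows: " y_toy_train';
print "Test rows:     " y_toy_test';
```

The first six observations end up in the training set and the last four in the test set, so the model is never trained on data that comes after the period it is asked to predict.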

Probit Model Results

To estimate the probit model we will rely on the probit log-likelihood function, where $F$ is the standard normal cumulative distribution function:

$$LL(\beta|y;X) = \sum^N_{i=1} \big[y_i \ln(F(x_i \beta)) + (1 - y_i)\ln(1 - F(x_i \beta))\big]$$

/*
** Likelihood Function
*/
proc (1) = probit(beta_, y, X, ind);
    local mu;

    // Declare 'mm' to be a modelResults
    // structure to hold the function value
    struct modelResults mm;

    // Compute mu
    mu = X * beta_;

    // Assign the log-likelihood value to the
    // 'function' member of the modelResults structure
    mm.function = y.*lncdfn(mu) + (1-y).*lncdfnc(mu);

    // Return the model results structure
    retp(mm);
endp;

We can quickly estimate this model using the GAUSS cmlmt procedure:

/*
** Estimate model
*/
// Assign starting values for estimation
beta_strt = 0.5*ones(cols(x), 1);

// Declare 'cout' to be a cmlmtResults structure
// to hold the results of the estimation
struct cmlmtResults cout;

// Perform estimation and print results
cout = cmlmt(&probit, beta_strt, y_train, x_train);
call cmlmtPrt(cout);

The fitted probit model can be used to predict the probability that an observation lies in a recessionary period given the observed data. Using a cutoff of 50%, we will sort predictions into recession and non-recession periods:

/*
** Predictions
*/
// Extract parameters
beta_hat = pvUnpack(cout.par, "x");

// Predicted probability of recession 
y_prob = cdfn(x_test * beta_hat);

// Classify data as recession or non-recession
y_hat = where(y_prob .>= 0.5, 1, 0);

Plotted against the observed recession dates, the estimated probability of recession looks fairly good:

However, we can get a more robust evaluation of the model performance using the binaryClassMetrics procedure from the GML library:

call binaryClassMetrics(y_test, y_hat);

The first portion of this report is the Confusion Matrix:

Probit model with 50% cutoff.
==================================
                  Confusion matrix
==================================
                   Predicted class
                   ---------------
                         +       -
       True class
       ----------
            1 (+)       22       6
            0 (-)       17     243

The confusion matrix summarizes how many predictions our model got "right" and how many it got "wrong", based on which category they fall in.

The confusion matrix for our estimated probit model shows:

Of the 28 recession periods, 22 are correctly predicted (true positives) and 6 are incorrectly predicted as non-recession (false negatives).
Of the 260 non-recession periods, 243 are correctly predicted (true negatives) and 17 are incorrectly predicted as recession (false positives).
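
The four cells of a confusion matrix can also be counted directly from the observed and predicted classes. The sketch below does this on made-up vectors. It is not the GML implementation, and y_obs and y_cls are hypothetical names:

```gauss
// Hypothetical observed and predicted classes (1 = recession)
y_obs = { 1, 1, 1, 0, 0, 0, 0, 0 };
y_cls = { 1, 1, 0, 1, 0, 0, 0, 0 };

// Count each cell of the confusion matrix
n_tp = sumc((y_obs .== 1) .and (y_cls .== 1));    // true positives: 2
n_fn = sumc((y_obs .== 1) .and (y_cls .== 0));    // false negatives: 1
n_fp = sumc((y_obs .== 0) .and (y_cls .== 1));    // false positives: 1
n_tn = sumc((y_obs .== 0) .and (y_cls .== 0));    // true negatives: 4

// Arrange as in the report: rows are the true classes (+, -)
// and columns are the predicted classes (+, -)
conf_mat = (n_tp ~ n_fn) | (n_fp ~ n_tn);

print conf_mat;
```

The first row of conf_mat holds the actual recession periods and the second row the actual non-recession periods, matching the layout of the report above.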

The remaining statistics help quantify these outcomes more clearly:

             Accuracy           0.9201
            Precision           0.5641
               Recall           0.7857
              F-score           0.6567
          Specificity           0.9346
    Balanced Accuracy           0.8692

Overall, the probit model:

Has an F-score of 66%.
Is better at predicting negative outcomes (93% specificity) than positive outcomes (56% precision).
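
As a check, these statistics can be reproduced from the probit confusion matrix, which has 22 true positives, 6 false negatives, 17 false positives, and 243 true negatives out of 288 test observations:

$$\text{Accuracy} = \frac{22 + 243}{288} \approx 0.9201$$

$$\text{Precision} = \frac{22}{22 + 17} \approx 0.5641$$

$$\text{Recall} = \frac{22}{22 + 6} \approx 0.7857$$

$$\text{Specificity} = \frac{243}{243 + 17} \approx 0.9346$$

$$\text{F-score} = \frac{2 \times 0.5641 \times 0.7857}{0.5641 + 0.7857} \approx 0.6567$$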

KNN Model Results

We will start our machine learning models with the KNN model. We will fit the model on the same training data using the knnFit procedure:

/*
** Train the model
*/

// Specify the number of neighbors
k = 5;

// The knnModel structure 
// holds the trained model
struct knnModel mdl;

// Train model using KNN
mdl = knnFit(y_train, X_train, k);

After fitting the model, the knnClassify procedure can be used to predict outcomes for the test data, which we then evaluate with binaryClassMetrics:

/*
** Predictions on the test set
*/
y_hat = knnClassify(mdl, X_test);

// Print out model quality 
// evaluation statistics
print "KNN Model";
call binaryClassMetrics(y_test, y_hat);

KNN Model
==================================
                  Confusion matrix
==================================
                   Predicted class
                   ---------------
                         +       -
       True class
       ----------
            1 (+)       20       8
            0 (-)        3     257

The confusion matrix for our estimated KNN model shows:

Of the 28 recession periods, 20 are correctly predicted (true positives) and 8 are incorrectly predicted as non-recession (false negatives).
Of the 260 non-recession periods, 257 are correctly predicted (true negatives) and 3 are incorrectly predicted as recession (false positives).

         Accuracy           0.9618
        Precision           0.8696
           Recall           0.7143
          F-score           0.7843
      Specificity           0.9885
Balanced Accuracy           0.8514

The KNN model:

Has an F-score of 78%.
Is better at predicting negative outcomes than positive outcomes.

Compared to our baseline probit model the KNN model:

Shows improved performance when balancing performance across both classes, with a better F-score.
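Is better at predicting negative outcomes (99% specificity versus 93%) and more precise when it predicts a recession (87% precision versus 56%), but identifies fewer of the actual recession periods (71% recall versus 79%).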

Decision Forest Classification

Next, we fit our decision forest classification model using the decForestCFit procedure:

/*
** Train the model
*/

// The dfModel structure 
// holds the trained model
struct dfModel dfm;

// Fit training data 
// using decision forest classification
dfm = decForestCFit(y_train, x_train);

After fitting the model, the decForestPredict procedure can be used to predict outcomes for the test data, which we then evaluate with binaryClassMetrics:

/*
** Predictions on the test set
*/
y_hat = decForestPredict(dfm, x_test);

// Print out model quality 
// evaluation statistics
print "Decision Forest";
call binaryClassMetrics(y_test, y_hat);

Decision Forest
==================================
                  Confusion matrix
==================================
                   Predicted class
                   ---------------
                         +       -
       True class
       ----------
            1 (+)       25       3
            0 (-)        1     259

The confusion matrix for our estimated decision forest model shows:

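Of the 28 recession periods, 25 are correctly predicted (true positives) and 3 are incorrectly predicted as non-recession (false negatives).
Of the 260 non-recession periods, 259 are correctly predicted (true negatives) and 1 is incorrectly predicted as recession (a false positive).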

         Accuracy           0.9861
        Precision           0.9615
           Recall           0.8929
          F-score           0.9259
      Specificity           0.9962
Balanced Accuracy           0.9445

The decision forest model:

Has an F-score of 93%.
Is better at predicting negative outcomes (99% specificity) than positive outcomes (96% precision).

Compared to our baseline probit model the decision forest model:

Performs much better when balancing performance across both classes, with an F-score of 93% versus 66%.
Is better at predicting both negative outcomes and positive outcomes.

Ridge Classification

Finally, we estimate the ridge classification model using the ridgeCFit procedure:

/*
** Train the model
*/

// L2 regularization penalty
lambda = 0.5;

// Declare 'mdl' to be an instance of a
// ridgeModel structure to hold the estimation results
struct ridgeModel mdl;

// Train the model
// using the ridge classification
mdl = ridgeCFit(y_train, X_train, lambda);

The ridgeCPredict procedure can be used to predict outcomes for the test data, which we then evaluate with binaryClassMetrics:

/*
** Predictions on the test set
*/

// Predict classes for the test set
predictions = ridgeCPredict(mdl, x_test);

// Print out model quality 
// evaluation statistics
print "Ridge Classification";
call binaryClassMetrics(y_test, predictions);

Ridge Classification
==================================
                  Confusion matrix
==================================
                   Predicted class
                   ---------------
                         +       -
       True class
       ----------
            1 (+)       22       6
            0 (-)        4     256

The confusion matrix for our estimated ridge classification model shows:

Of the 28 recession periods, 22 are correctly predicted (true positives) and 6 are incorrectly predicted as non-recession (false negatives).
Of the 260 non-recession periods, 256 are correctly predicted (true negatives) and 4 are incorrectly predicted as recession (false positives).

         Accuracy           0.9653
        Precision           0.8462
           Recall           0.7857
          F-score           0.8148
      Specificity           0.9846
Balanced Accuracy           0.8852

The ridge classification model:

Has an F-score of 81%.
Is better at predicting negative outcomes (98% specificity) than positive outcomes (85% precision).

Compared to our baseline probit model the ridge classification model:

Is better than our baseline probit model when balancing performance across both classes.
Is better at predicting negative outcomes and at predicting positive outcomes.

Results Summary

	Probit	KNN	Decision Forest	Ridge Classification
True Positives	22	20	25	22
False Positives	17	3	1	4
True Negatives	243	257	259	256
False Negatives	6	8	3	6
Accuracy	92%	96%	99%	97%
Precision	56%	87%	96%	85%
Recall	79%	71%	89%	79%
F-score	66%	78%	93%	81%
Specificity	93%	99%	99%	98%
Balanced Accuracy	87%	85%	94%	89%

From the summary table, we see clearly that, even without tuning, the decision forest classification is superior by all evaluation standards to our other models. While all models perform strongly when predicting non-recession periods, the decision forest model is the clear winner for predicting recession periods.

Conclusion

In today's blog, we examined the performance of several prediction models used to predict recessions. After today's blog, you should have a better understanding of:

How to implement machine learning models in GAUSS.
How to compare classification models.
How machine learning models can be used to improve prediction.