## Introduction

Forecasts have become a valuable commodity in today's data-driven world. Unfortunately, not all forecasting models are of equal caliber, and incorrect predictions can lead to costly decisions.

Today we will compare the performance of several prediction models used to predict recessions. In particular, we’ll look at how a traditional baseline econometric model compares to machine learning models.

Our models will include:

- A baseline probit model.
- K-nearest neighbors.
- Decision forests.
- Ridge classification.

## Background

Before diving into estimating our models, let's look more closely at the data and models we will be using.

### Recession dating

Today we will focus on predicting recessions, using the NBER recession indicator. The NBER indicator:

- Uses a dummy variable to represent periods of expansion and recessions.
- Takes a value of 1 during a recession and 0 during an expansion.
- Can be directly imported from FRED using the series ID `"USREC"`.

Because the NBER recession data is binary data, our forecasting exercise becomes one of classification. In other words, we want to identify whether an observation is more likely to fall into the non-recession or recession category.

For this reason, we need to use models that are suitable for discrete data and classification.

### Models

#### Probit

The probit model is a discrete choice model which:

- Is commonly used in classical econometrics to model binary or ordered data.
- Estimates the probability that an outcome falls into a specific category.
- Has a simple log-likelihood function, which can be used to estimate the model parameters with maximum likelihood.
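In symbols, and writing $F$ for the standard normal CDF (the same $F$ that appears in the log-likelihood used later in this post), the probit model for a binary outcome can be sketched as:

$$P(y_i = 1 | x_i) = F(x_i \beta), \qquad P(y_i = 0 | x_i) = 1 - F(x_i \beta)$$

Here $y_i$ is the recession indicator, $x_i$ is the row of predictors for observation $i$, and $\beta$ is the coefficient vector to be estimated.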

#### K-Nearest Neighbor

The k-nearest neighbor (KNN) method is one of the simplest non-parametric techniques for classification and regression.

KNN relies on the intuition that if an observation is "near" another, it is likely to fall within the same category.

The KNN model:

- Locates the $k$ nearest neighbors using the observed features and a measure of distance, such as Euclidean distance.
- Finds the most common "class" among the $k$ nearest neighbors.
- Assigns the most common "class" as the predicted category for the unknown outcome.
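The steps above can be sketched in a few lines of base GAUSS. This is a toy illustration with made-up data and hypothetical variable names, not the implementation used by the GML library:

```
// Hypothetical training data: 6 observations, 2 features
x_tr = { 1.0 1.0,
         1.2 0.8,
         0.9 1.1,
         3.0 3.2,
         3.1 2.9,
         2.8 3.0 };

// Hypothetical binary classes for the training data
y_tr = { 0, 0, 0, 1, 1, 1 };

// New observation to classify
x_new = { 2.9 3.1 };

// Number of neighbors
k = 3;

// Euclidean distance from x_new to each training point
dist = sqrt(sumr((x_tr - x_new).^2));

// Row indices of the training points, nearest first
idx = sortind(dist);

// Classes of the k nearest neighbors
nbr_class = y_tr[idx[1:k, .], .];

// Majority vote for 0/1 classes
y_pred = meanc(nbr_class) .> 0.5;

print "Predicted class:" y_pred;
```

In this toy example the three nearest neighbors all belong to class 1, so the new observation is assigned to class 1.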

#### Decision Trees

Decision trees are a machine learning model which can be used to predict discrete or continuous data.

Tree-based methods rely on a fairly simple process:

- Split the data into subsets, using the characteristics of the data. For example, if “Married” is one of our observed characteristics, we can split the sample into "Yes" and "No". We can ask multiple "questions" about our data to create branches that break our data into smaller and smaller subsets.
- The most frequently occurring outcome within each subset is then used as the predicted class for all observations that fall inside that subset.

A decision forest, the model we estimate below, combines the predictions of many such trees.

#### Ridge Regression

Ridge regression is part of a family of linear regression models that aim to improve on the standard least squares fitting model. These methods use a modified least squares approach to shrink coefficient estimates towards zero, which in turn, reduces the estimates’ variances.

Like OLS, these methods rely on minimizing the residual sum of squares (RSS) to estimate coefficients. However, they add a penalty, based on cumulative coefficient size, to the RSS objective function.
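For reference, the standard ridge objective can be written as below. The notation ($\lambda$ for the penalty weight, $p$ for the number of predictors, $\beta_0$ for the intercept) is assumed here for illustration and is not taken from the GML documentation:

$$\hat{\beta}^{ridge} = \underset{\beta}{\arg\min} \left\{ \sum_{i=1}^{N} \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}$$

When $\lambda = 0$ this reduces to ordinary least squares. Larger values of $\lambda$ shrink the coefficients more strongly towards zero.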

## Model Setup

Today we will include a number of variables in our model. These are chosen based on commonly used predictors in the recession modeling literature:

**Recession Model Predictors**

| Variable | Description |
|---|---|
| INDPRO | Monthly growth rates of industrial production. Included in the level and 1-month lag. |
| PAYEMS | Monthly growth rates of nonfarm payrolls. Included in the level and 1-month lag. |
| RPI | Monthly growth rates of real personal income excluding transfer payments. Included in the level and 1-month lag. |
| UNRATE | Annual growth rate of headline unemployment. Included in the level and 1-month lag. |
| YLD | The yield curve slope, computed as the difference between the yield on the 10-year treasury bond and the 3-month treasury bill. Included in the level, 6-month lag, and 12-month lag. |
| CORP | The credit spread between Moody's BAA and AAA corporate bond yields. Included in the level, 6-month lag, and 12-month lag. |

Our complete dataset ranges from January 1963 to December 2022.

| Sample | Period |
|---|---|
| Training period | January 1963 to December 1998 |
| Testing period | January 1999 to December 2022 |

The complete dataset, including lags, is available here.

### Model Comparison

There are many components to evaluating how well a classification model performs. To compare models, we will use a set of binary class metrics including:

**Model Comparison Measures**

| Tool | Description |
|---|---|
| Confusion matrix | Summarizes the performance of a classification algorithm. Compares the number of predicted outcomes to actual outcomes in tabular form. |
| Accuracy | Overall model accuracy. Equal to the number of correct predictions divided by the total number of predictions. |
| Precision | How good a model is at correctly identifying positive outcomes. Equal to the number of true positives divided by the number of false positives plus true positives. |
| Recall | How good a model is at correctly predicting all the positive outcomes. Equal to the number of true positives divided by the number of false negatives plus true positives. |
| F-score | The harmonic mean of the precision and recall. A score of 1 indicates perfect precision and recall. |
| Specificity | Ability to predict a true negative. Equal to the number of true negatives divided by the number of true negatives plus false positives. |
| Area under the ROC | Reflects the probability that a model ranks a random positive more highly than a random negative. |
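Writing $TP$, $FP$, $TN$, and $FN$ for the counts of true positives, false positives, true negatives, and false negatives, the count-based definitions in the table correspond to:

$$\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + FP + TN + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{F-score} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{aligned}$$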

It's important to view these metrics in the context of the data being modeled. For example, our data is not very balanced across classes. In the test set, there are 260 non-recession observations and 28 recession observations. This implies that:

- Model accuracy is not a very informative metric. If we predict that all observations are non-recession, our accuracy is 90%.
- F-score is a better metric for us to consider. It gives a more balanced picture of how our model performs across both the recession and non-recession class.

## Estimation

We will use two GAUSS libraries to estimate our models:

- Constrained Maximum Likelihood MT (CMLMT) to estimate the probit model.
- GAUSS Machine Learning (GML) to estimate our machine learning models.

### Loading our data and libraries

To start, we will load our data directly from the URL:

```
// Load libraries
library gml, cmlmt;
/*
** Load data from url
*/
url = "https://github.com/aptech/gauss_blog/blob/master/machine-learning/recession-predicting/data/final_data.gdat?raw=true";
reg_data = loadd(url);
// Compute summary statistics
dstatmt(reg_data);
```

This loads our regression dataset and prints a table of summary statistics to the **Command Window**:

| Variable | Mean | Std Dev | Variance | Minimum | Maximum | Valid | Missing |
|---|---|---|---|---|---|---|---|
| date | ----- | ----- | ----- | 1963-01-01 | 2022-12-01 | 720 | 0 |
| USREC | 0.1181 | 0.3229 | 0.1043 | 0 | 1 | 720 | 0 |
| INDPRO | 0.1976 | 0.9403 | 0.8842 | -13.2 | 6.275 | 720 | 0 |
| PAYEMS | 0.1428 | 0.5746 | 0.3302 | -13.59 | 3.431 | 720 | 0 |
| RPI | 0.2627 | 1.253 | 1.569 | -13.55 | 20 | 720 | 0 |
| UNRATE | -0.03208 | 1.393 | 1.941 | -8.6 | 11.1 | 720 | 0 |
| corp | -1.021 | 0.4389 | 0.1926 | -3.38 | -0.32 | 720 | 0 |
| yld | 1.496 | 1.221 | 1.492 | -2.65 | 4.42 | 720 | 0 |
| yld_l6 | 1.504 | 1.215 | 1.475 | -2.65 | 4.42 | 720 | 0 |
| yld_l12 | 1.5 | 1.215 | 1.475 | -2.65 | 4.42 | 720 | 0 |
| corp_l6 | -1.017 | 0.4397 | 0.1933 | -3.38 | -0.32 | 720 | 0 |
| corp_l12 | -1.015 | 0.4403 | 0.1939 | -3.38 | -0.32 | 720 | 0 |
| ip_l | 0.1986 | 0.9397 | 0.883 | -13.2 | 6.275 | 720 | 0 |
| nfp_l | 0.1425 | 0.5747 | 0.3302 | -13.59 | 3.431 | 720 | 0 |
| rpi_l | 0.2632 | 1.253 | 1.569 | -13.55 | 20 | 720 | 0 |
| un_l | -0.03222 | 1.393 | 1.942 | -8.6 | 11.1 | 720 | 0 |

The file *final_data.gdat* uses the GAUSS data file format introduced in GAUSS 23. The dataset is compiled using raw data from FRED. You can view the data import, transformation, and merging here.

### Splitting Data

Next, we will use the `trainTestSplit` function to split the data into a test and training set.

```
/*
** Split data
*/
// Dependent data
y = reg_data[., "USREC"];
// Load independent variables
x = reg_data[., 3:cols(reg_data)];
// Split data into (60%) training
// and (40%) test sets
shuffle = "False";
{ y_train, y_test, x_train, x_test } =
trainTestSplit(y, x, 0.6, shuffle);
```
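With shuffling turned off, the split is chronological: the earliest 60% of rows form the training set and the remainder form the test set. A minimal sketch of the same idea on a toy vector, using plain indexing, is shown below. The variable names are hypothetical and the rounding rule (`floor`) is an assumption for illustration; `trainTestSplit` may handle non-integer split points differently.

```
// Toy "time series" of 10 ordered observations
y_toy = seqa(1, 1, 10);

// Share of observations assigned to the training set
train_pct = 0.6;

// Last row of the training set (assumed rounding rule)
n_train = floor(train_pct * rows(y_toy));

// Earliest observations form the training set
y_toy_train = y_toy[1:n_train, .];

// Remaining observations form the test set
y_toy_test = y_toy[n_train+1:rows(y_toy), .];

print "Training rows:" rows(y_toy_train);
print "Test rows:" rows(y_toy_test);
```

For our 720 monthly observations, a 60/40 chronological split corresponds to 432 training rows and 288 test rows, which matches the training and testing periods listed earlier.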

## Probit Model Results

To estimate the probit model we will rely on the probit likelihood function:

$$LL(\beta|y;X) = \sum^N_{i=1} \big[y_i \ln(F(x_i \beta)) + (1 - y_i)\ln(1 - F(x_i \beta))\big]$$

```
/*
** Likelihood Function
*/
proc (1) = probit(beta_, y, X, ind);
    local mu;

    // Declare 'mm' to be a modelResults
    // structure to hold the function value
    struct modelResults mm;

    // Compute mu
    mu = X * beta_;

    // Assign the log-likelihood value to the
    // 'function' member of the modelResults structure
    mm.function = y.*lncdfn(mu) + (1-y).*lncdfnc(mu);

    // Return the model results structure
    retp(mm);
endp;
```

We can quickly estimate this model using the GAUSS `cmlmt` procedure:

```
/*
** Estimate model
*/
// Assign starting values for estimation
beta_strt = 0.5*ones(cols(x), 1);
// Declare 'cout' to be a cmlmtResults structure
// to hold the results of the estimation
struct cmlmtResults cout;
// Perform estimation and print results
cout = cmlmt(&probit, beta_strt, y_train, x_train);
call cmlmtPrt(cout);
```

The fitted probit model can be used to predict the probability that an outcome lies in a recessionary period given the observed data. Using a cutoff of 50%, we will sort predictions into recession and non-recession periods:

```
/*
** Predictions
*/
// Extract parameters
beta_hat = pvUnpack(cout.par, "x");
// Predicted probability of recession
y_prob = cdfn(x_test * beta_hat);
// Classify data as recession or non-recession
y_hat = where(y_prob .>= 0.5, 1, 0);
```
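The choice of cutoff affects how many periods are flagged as recessions. As a small illustration with made-up index values (not the fitted model), lowering the cutoff from 50% to 30% flags more observations. The variable names here are hypothetical:

```
// Hypothetical values of x*beta for five observations
z = { -2, -0.5, 0, 0.3, 1.5 };

// Convert to probabilities with the standard normal CDF
p = cdfn(z);

// Classify with a 50% cutoff and with a 30% cutoff
class_50 = p .>= 0.5;
class_30 = p .>= 0.3;

// Columns: probability, class at 50%, class at 30%
print p~class_50~class_30;
```

The second observation, with a probability of about 0.31, is classified as a recession only under the 30% cutoff.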

Plotted against the observed recession dates, the estimated probability of recession looks fairly good:

However, we can get a more robust evaluation of the model performance using the `binaryClassMetrics` procedure from the GML library:

`call binaryClassMetrics(y_test, y_hat);`

The first portion of this report is the Confusion Matrix:

Probit model with 50% cutoff.

| True class | Predicted + | Predicted - |
|---|---|---|
| 1 (+) | 22 | 6 |
| 0 (-) | 17 | 243 |

The confusion matrix provides a summary of how many predictions our model got "right" and how many it got "wrong", based on which category they fall in.

The confusion matrix for our estimated probit model shows:

- Of the 28 recession periods, 22 are correctly predicted (true positives) and 6 are missed (false negatives).
- Of the 260 non-recession periods, 243 are correctly predicted (true negatives) and 17 are incorrectly classified as recessions (false positives).

The remaining statistics help quantify these outcomes more clearly:

| Metric | Value |
|---|---|
| Accuracy | 0.9201 |
| Precision | 0.5641 |
| Recall | 0.7857 |
| F-score | 0.6567 |
| Specificity | 0.9346 |
| Balanced Accuracy | 0.8692 |
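As a check, several of these values can be reproduced by hand from the probit confusion matrix, which has 22 true positives, 17 false positives, 243 true negatives, and 6 false negatives:

$$\begin{aligned}
\text{Accuracy} &= \frac{22 + 243}{288} \approx 0.9201 \\
\text{Precision} &= \frac{22}{22 + 17} \approx 0.5641 \\
\text{Recall} &= \frac{22}{22 + 6} \approx 0.7857 \\
\text{Specificity} &= \frac{243}{243 + 17} \approx 0.9346 \\
\text{F-score} &= \frac{2 \times 0.5641 \times 0.7857}{0.5641 + 0.7857} \approx 0.6567
\end{aligned}$$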

Overall, the probit model:

- Has an F-score of 66%.
- Is better at predicting negative outcomes (93% specificity) than positive outcomes (56% precision).

## KNN Model Results

We will start our machine learning models with the KNN model. We will fit the model on the same training data using the `knnFit` procedure:

```
/*
** Train the model
*/
// Specify the number of neighbors
k = 5;
// The knnModel structure
// holds the trained model
struct knnModel mdl;
// Train model using KNN
mdl = knnFit(y_train, x_train, k);
```

After fitting the model, the `knnClassify` procedure can be used to predict outcomes for the test data, which we then evaluate with `binaryClassMetrics`:

```
/*
** Predictions on the test set
*/
y_hat = knnClassify(mdl, x_test);
// Print out model quality
// evaluation statistics
print "KNN Model";
call binaryClassMetrics(y_test, y_hat);
```

KNN Model

| True class | Predicted + | Predicted - |
|---|---|---|
| 1 (+) | 20 | 8 |
| 0 (-) | 3 | 257 |

The confusion matrix for our estimated KNN model shows:

- Of the 28 recession periods, 20 are correctly predicted (true positives) and 8 are missed (false negatives).
- Of the 260 non-recession periods, 257 are correctly predicted (true negatives) and 3 are incorrectly classified as recessions (false positives).

| Metric | Value |
|---|---|
| Accuracy | 0.9618 |
| Precision | 0.8696 |
| Recall | 0.7143 |
| F-score | 0.7843 |
| Specificity | 0.9885 |
| Balanced Accuracy | 0.8514 |

The KNN model:

- Has an F-score of 78%.
- Is better at predicting negative outcomes than positive outcomes.

Compared to our baseline probit model, the KNN model:

- Shows improved performance when balancing across both classes, with a better F-score (78% vs. 66%).
- Is better at predicting negative outcomes (99% specificity vs. 93%) and more precise when it does predict a recession (87% precision vs. 56%), but identifies fewer of the actual recession periods (71% recall vs. 79%).

## Decision Forest Classification

Next, we fit our decision forest classification model using the `decForestCFit` procedure:

```
/*
** Train the model
*/
// The dfModel structure
// holds the trained model
struct dfModel dfm;
// Fit training data
// using decision forest classification
dfm = decForestCFit(y_train, x_train);
```

After fitting the model, the `decForestPredict` procedure can be used to predict outcomes for the test data, which we then evaluate with `binaryClassMetrics`:

```
/*
** Predictions on the test set
*/
y_hat = decForestPredict(dfm, x_test);
// Print out model quality
// evaluation statistics
print "Decision Forest";
call binaryClassMetrics(y_test, y_hat);
```

Decision Forest

| True class | Predicted + | Predicted - |
|---|---|---|
| 1 (+) | 25 | 3 |
| 0 (-) | 1 | 259 |

The confusion matrix for our estimated decision forest model shows:

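- Of the 28 recession periods, 25 are correctly predicted (true positives) and 3 are missed (false negatives).
- Of the 260 non-recession periods, 259 are correctly predicted (true negatives) and 1 is incorrectly classified as a recession (false positive).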

| Metric | Value |
|---|---|
| Accuracy | 0.9861 |
| Precision | 0.9615 |
| Recall | 0.8929 |
| F-score | 0.9259 |
| Specificity | 0.9962 |
| Balanced Accuracy | 0.9445 |

The decision forest model:

- Has an F-score of 93%.
- Is better at predicting negative outcomes (99% specificity) than positive outcomes (96% precision).

Compared to our baseline probit model, the decision forest model:

- Is much better at balancing performance across both classes (93% F-score vs. 66%).
- Is better at predicting both negative outcomes and positive outcomes.

## Ridge Classification

Finally, we estimate the ridge classification model using the `ridgeCFit` procedure:

```
/*
** Train the model
*/
// L2 regularization penalty
lambda = 0.5;
// Declare 'mdl' to be an instance of a
// ridgeModel structure to hold the estimation results
struct ridgeModel mdl;
// Train the model
// using the ridge classification
mdl = ridgeCFit(y_train, x_train, lambda);
```

The `ridgeCPredict` procedure can be used to predict outcomes for the test data, which we then evaluate with `binaryClassMetrics`:

```
/*
** Predictions on the test set
*/
// Predict classes for the test set
predictions = ridgeCPredict(mdl, x_test);
// Print out model quality
// evaluation statistics
print "Ridge Classification";
call binaryClassMetrics(y_test, predictions);
```

Ridge Classification

| True class | Predicted + | Predicted - |
|---|---|---|
| 1 (+) | 22 | 6 |
| 0 (-) | 4 | 256 |

The confusion matrix for our estimated ridge classification model shows:

- Of the 28 recession periods, 22 are correctly predicted (true positives) and 6 are missed (false negatives).
- Of the 260 non-recession periods, 256 are correctly predicted (true negatives) and 4 are incorrectly classified as recessions (false positives).

| Metric | Value |
|---|---|
| Accuracy | 0.9653 |
| Precision | 0.8462 |
| Recall | 0.7857 |
| F-score | 0.8148 |
| Specificity | 0.9846 |
| Balanced Accuracy | 0.8852 |

The ridge classification model:

- Has an F-score of 81%.
- Is better at predicting negative outcomes (98% specificity) than positive outcomes (85% precision).

Compared to our baseline probit model, the ridge classification model:

- Is better at balancing performance across both classes (81% F-score vs. 66%).
- Is better at predicting both negative outcomes and positive outcomes.

## Results Summary

| Metric | Probit | KNN | Decision Forest | Ridge Classification |
|---|---|---|---|---|
| True Positives | 22 | 20 | 25 | 22 |
| False Positives | 17 | 3 | 1 | 4 |
| True Negatives | 243 | 257 | 259 | 256 |
| False Negatives | 6 | 8 | 3 | 6 |
| Accuracy | 92% | 96% | 99% | 97% |
| Precision | 56% | 87% | 96% | 85% |
| Recall | 79% | 71% | 89% | 79% |
| F-score | 66% | 78% | 93% | 81% |
| Specificity | 93% | 99% | 99% | 98% |
| Balanced Accuracy | 87% | 85% | 94% | 89% |

From the summary table, we see clearly that, even without tuning, the decision forest classification is superior by all evaluation standards to our other models. While all models perform strongly when predicting non-recession periods, the decision forest model is the clear winner for predicting recession periods.
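The rates in the summary table follow directly from the four confusion-matrix counts. A short sketch that recomputes them in base GAUSS is shown below. The variable names are hypothetical, and the counts are copied from the table above, in the order probit, KNN, decision forest, ridge classification.

```
// Confusion-matrix counts for each model
// (probit, KNN, decision forest, ridge)
n_tp = { 22, 20, 25, 22 };
n_fp = { 17, 3, 1, 4 };
n_tn = { 243, 257, 259, 256 };
n_fn = { 6, 8, 3, 6 };

// Share of all predictions that are correct
m_accuracy = (n_tp + n_tn) ./ (n_tp + n_fp + n_tn + n_fn);

// Share of predicted recessions that are actual recessions
m_precision = n_tp ./ (n_tp + n_fp);

// Share of actual recessions that are predicted
m_recall = n_tp ./ (n_tp + n_fn);

// Share of actual non-recessions that are predicted
m_specificity = n_tn ./ (n_tn + n_fp);

// Harmonic mean of precision and recall
m_fscore = 2 * m_precision .* m_recall ./ (m_precision + m_recall);

// One row per model; columns are accuracy, precision,
// recall, F-score, and specificity
print m_accuracy~m_precision~m_recall~m_fscore~m_specificity;
```

Working from the raw counts like this makes it easy to add other measures, or to compare models on whichever metric matters most for the application.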

## Conclusion

In today's blog, we examined the performance of several prediction models used to predict recessions. After today's blog, you should have a better understanding of:

- How to implement machine learning models in GAUSS.
- How to compare classification models.
- How machine learning models can be used to improve prediction.

### Further Reading

- Applications of Principal Components Analysis in Finance
- Predicting The Output Gap With Machine Learning Regression Models
- Fundamentals of Tuning Machine Learning Hyperparameters
- Understanding Cross-Validation
- Machine Learning With Real-World Data
- Classification with Regularized Logistic Regression

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.