## Introduction

Forecasts have become a valuable commodity in today's data-driven world. Unfortunately, not all forecasting models are of equal caliber, and incorrect predictions can lead to costly decisions.

Today we will compare the performance of several prediction models used to predict recessions. In particular, we’ll look at how a traditional baseline econometric model compares to machine learning models.

Our models will include:

- A baseline probit model.
- K-nearest neighbors.
- Decision forests.
- Ridge classification.

## Background

Before diving into estimating our models, let's look more closely at the data and models we will be using.

### Recession dating

Today we will focus on predicting recessions, using the NBER recession indicator. The NBER indicator:

- Uses a dummy variable to represent periods of expansion and recessions.
- Takes a value of 1 during a recession and 0 during an expansion.
- Can be directly imported from FRED using the series ID `"USREC"`.

Because the NBER recession data is binary data, our forecasting exercise becomes one of classification. In other words, we want to identify whether an observation is more likely to fall into the non-recession or recession category.

For this reason, we will need to use models that are suitable for discrete data and classification.

### Models

#### Probit

The probit model is a discrete choice model which:

- Is commonly used in classical econometrics to model binary or ordered data.
- Estimates the probability that an outcome falls into a specific category.
- Has a simple log-likelihood function, which can be used to estimate the model parameters with maximum likelihood.
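
As a sketch of what the probit model estimates, the probability of each outcome can be written in terms of a linear index $x_i \beta$, assuming $F(\cdot)$ denotes the standard normal cumulative distribution function:

$$P(y_i = 1 \mid x_i) = F(x_i \beta), \qquad P(y_i = 0 \mid x_i) = 1 - F(x_i \beta)$$

In our application, $y_i = 1$ corresponds to a recession month.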

#### K-Nearest Neighbor

The k-nearest neighbor (KNN) method is one of the simplest non-parametric techniques for classification and regression.

KNN relies on the intuition that if an observation is "near" another, it is likely to fall within the same category.

The KNN model:

- Locates the $k$ nearest neighbors using the observed features and a measure of distance, such as Euclidean distance.
- Finds the most common "class" among the $k$ nearest neighbors.
- Assigns the most common "class" as the predicted category for the unknown outcome.
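
The three steps above can be sketched by hand in GAUSS. This is only an illustration on made-up data (the names `x_trn`, `y_trn`, and `x_new` are hypothetical), not the implementation used by the GML library:

```
// Hypothetical toy data: six training points with two features
x_trn = { 1 1, 1 2, 2 1, 6 6, 7 6, 6 7 };
y_trn = { 0, 0, 0, 1, 1, 1 };

// Hypothetical new observation to classify
x_new = { 5 5 };

// Number of neighbors (odd, to avoid ties in a binary vote)
k = 3;

// Step 1: Euclidean distance from x_new to every training row
dist = sqrt(sumr((x_trn - x_new).^2));

// Indices of the training rows, sorted from nearest to farthest
idx = sortind(dist);

// Step 2: classes of the k nearest neighbors
nn_idx = idx[1:k, 1];
nn_class = y_trn[nn_idx, 1];

// Step 3: majority vote for a 0/1 outcome
y_pred = meanc(nn_class) > 0.5;

print "Predicted class: " y_pred;
```

In this toy example the three nearest neighbors all belong to class 1, so the new observation is assigned to class 1.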

#### Decision Trees

Decision trees are machine learning models that can be used to predict discrete or continuous outcomes. A decision forest, the model we estimate below, combines the predictions of many such trees.

Tree-based methods rely on a fairly simple process:

- Split the data into subsets, using the characteristics of the data. For example, if “Married” is one of our observed characteristics, we can split the sample into "Yes" and "No". We can ask multiple "questions" about our data to create branches that break our data into smaller and smaller subsets.
- The outcome average within the subset is then used as the outcome prediction for all observations that fall inside those subsets.
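
A single split of this kind can be sketched in GAUSS. The data below is made up for illustration, and the variable names are hypothetical:

```
// Hypothetical toy data: 1 = "Yes", 0 = "No" for the "Married" characteristic
married = { 1, 1, 0, 0, 1, 0 };

// Hypothetical binary outcome for the same six observations
outcome = { 1, 1, 0, 0, 1, 1 };

// Split the outcome into the two branches
out_yes = selif(outcome, married .== 1);
out_no = selif(outcome, married .== 0);

// The average outcome in each branch is the prediction for that branch
print "Prediction if Married = Yes: " meanc(out_yes);
print "Prediction if Married = No:  " meanc(out_no);
```

A full tree repeats this step within each branch, choosing the splits that best separate the outcomes.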

#### Ridge Regression

Ridge regression is part of a family of linear regression models that aim to improve on the standard least squares fitting model. These methods use a modified least squares approach to shrink coefficient estimates towards zero, which, in turn, reduces the variance of the estimates.

Like OLS, these methods rely on minimizing the residual sum of squares (RSS) to estimate coefficients. However, they add a penalty, based on cumulative coefficient size, to the RSS objective function.
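
Assuming the penalty is the sum of squared coefficients (the L2 penalty) with a tuning parameter $\lambda \geq 0$, the ridge objective can be sketched as:

$$\hat{\beta}^{ridge} = \arg\min_{\beta} \left\{ \sum_{i=1}^{N} (y_i - x_i \beta)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}$$

Here $p$ is the number of coefficients. Setting $\lambda = 0$ recovers the OLS objective, while larger values of $\lambda$ shrink the coefficients more strongly towards zero.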

## Model Setup

Today we will include a number of variables in our model. These are chosen based on commonly used predictors in the recession modeling literature:

**Recession Model Predictors**

| Variable | Description |
|---|---|
| INDPRO | Monthly growth rates of industrial production. Included in the level and 1-month lag. |
| PAYEMS | Monthly growth rates of nonfarm payrolls. Included in the level and 1-month lag. |
| RPI | Monthly growth rates of real personal income excluding transfer payments. Included in the level and 1-month lag. |
| UNRATE | Annual growth rate of headline unemployment. Included in the level and 1-month lag. |
| YLD | The yield curve slope, computed as the difference between the yield on the 10-year treasury bond and the 3-month treasury bill. Included in the level, 6-month lag, and 12-month lag. |
| CORP | The credit spread between Moody's BAA and AAA corporate bond yields. Included in the level, 6-month lag, and 12-month lag. |

Our complete dataset ranges from January 1963 to December 2022.

| Sample | Date range |
|---|---|
| Training period | January 1963 to December 1998 |
| Testing period | January 1999 to December 2022 |

The complete dataset, including lags, is available here.

### Model Comparison

There are many components to evaluating how well a classification model performs. To compare models, we will use a set of binary class metrics including:

**Model Comparison Measures**

| Tool | Description |
|---|---|
| Confusion matrix | Summarizes the performance of a classification algorithm. Compares the number of predicted outcomes to actual outcomes in tabular form. |
| Accuracy | Overall model accuracy. Equal to the number of correct predictions divided by the total number of predictions. |
| Precision | How good a model is at correctly identifying positive outcomes. Equal to the number of true positives divided by the number of false positives plus true positives. |
| Recall | How good a model is at correctly predicting all the positive outcomes. Equal to the number of true positives divided by the number of false negatives plus true positives. |
| F-score | The harmonic mean of the precision and recall. A score of 1 indicates perfect precision and recall. |
| Specificity | Ability to predict a true negative. Equal to the number of true negatives divided by the number of true negatives plus false positives. |
| Area under the ROC | Reflects the probability that a model ranks a random positive more highly than a random negative. |
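
The count-based measures above can be computed directly from the four cells of a confusion matrix. The sketch below uses made-up counts (they are not results from this article) and hypothetical variable names:

```
// Hypothetical confusion-matrix counts, for illustration only
n_tp = 8;     // true positives
n_fp = 2;     // false positives
n_tn = 85;    // true negatives
n_fn = 5;     // false negatives

// Share of all predictions that are correct
m_acc = (n_tp + n_tn) / (n_tp + n_fp + n_tn + n_fn);

// True positives relative to all predicted positives
m_prec = n_tp / (n_tp + n_fp);

// True positives relative to all actual positives
m_rec = n_tp / (n_tp + n_fn);

// True negatives relative to all actual negatives
m_spec = n_tn / (n_tn + n_fp);

// Harmonic mean of precision and recall
m_f = 2 * m_prec * m_rec / (m_prec + m_rec);

print "Accuracy:    " m_acc;
print "Precision:   " m_prec;
print "Recall:      " m_rec;
print "Specificity: " m_spec;
print "F-score:     " m_f;
```

The counts carry an `n_` prefix because `fn` is a reserved keyword in GAUSS.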

It's important to view these metrics in the context of the data being modeled. For example, our data is not well balanced across classes. The test set contains 260 non-recession observations and 28 recession observations. This implies that:

- Model accuracy is not a very informative metric. If we predict that all observations are non-recession, our accuracy is 90%.
- F-score is a better metric for us to consider. It gives a more balanced picture of how our model performs across both the recession and non-recession class.
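
As a rough check, consider a naive model that labels every month as non-recession. Using the 28 recession months and 260 non-recession months that appear in the test-set confusion matrices reported in the results sections:

$$\text{Accuracy} = \frac{0 + 260}{288} \approx 0.90, \qquad \text{Recall} = \frac{0}{0 + 28} = 0$$

The naive model looks strong on accuracy even though it never identifies a recession. Because its recall is zero, its F-score is also zero under the usual convention.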

## Estimation

We will use two GAUSS libraries to estimate our models:

- Constrained Maximum Likelihood MT (CMLMT) to estimate the probit model.
- GAUSS Machine Learning (GML) to estimate our machine learning models.

### Loading our data and libraries

To start, we will load our data directly from the URL:

```
// Load libraries
library gml, cmlmt;
/*
** Load data from url
*/
url = "https://github.com/aptech/gauss_blog/blob/master/machine-learning/recession-predicting/data/final_data.gdat?raw=true";
reg_data = loadd(url);
// Compute summary statistics
dstatmt(reg_data);
```

This loads our regression dataset and prints a table of summary statistics to the **Command Window**:

| Variable | Mean | Std Dev | Variance | Minimum | Maximum | Valid | Missing |
|---|---|---|---|---|---|---|---|
| date | ----- | ----- | ----- | 1963-01-01 | 2022-12-01 | 720 | 0 |
| USREC | 0.1181 | 0.3229 | 0.1043 | 0 | 1 | 720 | 0 |
| INDPRO | 0.1976 | 0.9403 | 0.8842 | -13.2 | 6.275 | 720 | 0 |
| PAYEMS | 0.1428 | 0.5746 | 0.3302 | -13.59 | 3.431 | 720 | 0 |
| RPI | 0.2627 | 1.253 | 1.569 | -13.55 | 20 | 720 | 0 |
| UNRATE | -0.03208 | 1.393 | 1.941 | -8.6 | 11.1 | 720 | 0 |
| corp | -1.021 | 0.4389 | 0.1926 | -3.38 | -0.32 | 720 | 0 |
| yld | 1.496 | 1.221 | 1.492 | -2.65 | 4.42 | 720 | 0 |
| yld_l6 | 1.504 | 1.215 | 1.475 | -2.65 | 4.42 | 720 | 0 |
| yld_l12 | 1.5 | 1.215 | 1.475 | -2.65 | 4.42 | 720 | 0 |
| corp_l6 | -1.017 | 0.4397 | 0.1933 | -3.38 | -0.32 | 720 | 0 |
| corp_l12 | -1.015 | 0.4403 | 0.1939 | -3.38 | -0.32 | 720 | 0 |
| ip_l | 0.1986 | 0.9397 | 0.883 | -13.2 | 6.275 | 720 | 0 |
| nfp_l | 0.1425 | 0.5747 | 0.3302 | -13.59 | 3.431 | 720 | 0 |
| rpi_l | 0.2632 | 1.253 | 1.569 | -13.55 | 20 | 720 | 0 |
| un_l | -0.03222 | 1.393 | 1.942 | -8.6 | 11.1 | 720 | 0 |

The file *final_data.gdat* uses a GAUSS data file format introduced in GAUSS 23. The dataset is compiled using raw data from FRED. You can view the data import, transformation, and merging here.

### Splitting Data

Next, we will use the `trainTestSplit` function to split the data into training and test sets.

```
/*
** Split data
*/
// Dependent data
y = reg_data[., "USREC"];
// Load independent variables
x = reg_data[., 3:cols(reg_data)];
// Split data into (60%) training
// and (40%) test sets
shuffle = "False";
{ y_train, y_test, x_train, x_test } =
trainTestSplit(y, x, 0.6, shuffle);
```
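
The split sizes can be checked with simple arithmetic. The sketch below assumes that, with shuffling turned off, the first 60% of the rows form the training set, which is consistent with the training period listed earlier:

```
// Sample size and training share taken from the article
n_obs = 720;
train_pct = 0.6;

// Number of months in each sample
n_train = n_obs * train_pct;
n_test = n_obs - n_train;

print "Training months: " n_train;
print "Test months:     " n_test;

// 432 months is 36 years, so a January 1963 start ends in December 1998
print "Training years:  " n_train / 12;
```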

## Probit Model Results

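To estimate the probit model, we will maximize the probit log-likelihood function:

$$LL(\beta \mid y; X) = \sum_{i=1}^{N} \big[ y_i \ln F(x_i \beta) + (1 - y_i) \ln\big(1 - F(x_i \beta)\big) \big]$$

where $F(\cdot)$ is the standard normal cumulative distribution function.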

```
/*
** Likelihood Function
*/
proc (1) = probit(beta_, y, X, ind);
local mu;
// Declare 'mm' to be a modelResults
// structure to hold the function value
struct modelResults mm;
// Compute mu
mu = X * beta_;
// Assign the log-likelihood value to the
// 'function' member of the modelResults structure
mm.function = y.*lncdfn(mu) + (1-y).*lncdfnc(mu);
// Return the model results structure
retp(mm);
endp;
```
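
To see what the likelihood procedure computes, the same expression can be evaluated by hand on a small, made-up dataset. The data and the names `x_toy`, `y_toy`, and `b_toy` are hypothetical:

```
// Hypothetical toy data: a constant and one regressor
x_toy = { 1 0.2, 1 0.9, 1 1.7, 1 0.4 };
y_toy = { 0, 1, 1, 0 };

// Trial coefficients, mirroring the 0.5 starting values used for estimation
b_toy = 0.5 * ones(cols(x_toy), 1);

// Linear index
mu = x_toy * b_toy;

// Per-observation log-likelihood contributions
ll_i = y_toy .* lncdfn(mu) + (1 - y_toy) .* lncdfnc(mu);

// Total log-likelihood at the trial coefficients
print "Log-likelihood at trial values: " sumc(ll_i);
```

Maximum likelihood estimation searches for the coefficients that make this total as large as possible.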

We can quickly estimate this model using the GAUSS `cmlmt` procedure:

```
/*
** Estimate model
*/
// Assign starting values for estimation
beta_strt = 0.5*ones(cols(x), 1);
// Declare 'cout' to be a cmlmtResults structure
// to hold the results of the estimation
struct cmlmtResults cout;
// Perform estimation and print results
cout = cmlmt(&probit, beta_strt, y_train, x_train);
call cmlmtPrt(cout);
```

The fitted probit model can be used to predict the probability that an outcome lies in a recessionary period given the observed data. Using a cutoff of 50%, we will sort predictions into recession and non-recession periods:

```
/*
** Predictions
*/
// Extract parameters
beta_hat = pvUnpack(cout.par, "x");
// Predicted probability of recession
y_prob = cdfn(x_test * beta_hat);
// Classify data as recession or non-recession
y_hat = where(y_prob .>= 0.5, 1, 0);
```
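
The choice of cutoff matters for which months are flagged. As an illustration on made-up probabilities (not model output), lowering the cutoff from 50% to 30% flags more observations as recessions:

```
// Hypothetical predicted probabilities, for illustration only
p_toy = { 0.05, 0.20, 0.35, 0.55, 0.80 };

// Classify with a 50% cutoff
cls_50 = p_toy .>= 0.5;

// Classify with a 30% cutoff
cls_30 = p_toy .>= 0.3;

// Columns: probability, class at 50% cutoff, class at 30% cutoff
print p_toy~cls_50~cls_30;
```

A lower cutoff generally trades more false alarms for fewer missed recessions.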

Plotted against the observed recession dates, the estimated probability of recession looks fairly good:

However, we can get a more robust evaluation of the model performance using the `binaryClassMetrics` procedure from the GML library:

`call binaryClassMetrics(y_test, y_hat);`

The first portion of this report is the Confusion Matrix:

Probit model with 50% cutoff.

Confusion matrix:

| | | |
|---|---|---|
| Class + | 22 | 6 |
| Class - | 17 | 243 |

The confusion matrix provides a summary of how many predictions our model got "right" and how many it got "wrong", based on which category they fall in:

The confusion matrix for our estimated probit model shows:

- There are 22 correctly predicted recession periods and 6 incorrectly predicted recession periods.
- There are 243 correctly predicted non-recession and 17 incorrectly predicted non-recession periods.

The remaining statistics help quantify these outcomes more clearly:

| Metric | Value |
|---|---|
| Accuracy | 0.9201 |
| Precision | 0.7857 |
| Recall | 0.5641 |
| F-score | 0.6567 |
| Specificity | 0.9759 |
| AUC | 0.77 |

Overall, the probit model:

- Has an F-score of 66%.
- Is better at predicting negative outcomes (98% specificity) than positive outcomes (precision 79%).
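
Two of the reported probit figures can be verified by hand from the confusion matrix and the reported precision and recall:

$$\text{Accuracy} = \frac{22 + 243}{22 + 6 + 17 + 243} = \frac{265}{288} \approx 0.9201$$

$$\text{F-score} = \frac{2 \times 0.7857 \times 0.5641}{0.7857 + 0.5641} \approx 0.6567$$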

## KNN Model Results

We will start our machine learning models with the KNN model. We will fit the model on the same training data using the `knnFit` procedure:

```
/*
** Train the model
*/
// Specify the number of neighbors
k = 5;
// The knnModel structure
// holds the trained model
struct knnModel mdl;
// Train model using KNN
mdl = knnFit(y_train, X_train, k);
```

After fitting the model, the `knnClassify` procedure can be used to predict outcomes for the test data, and `binaryClassMetrics` reports the evaluation metrics:

```
/*
** Predictions on the test set
*/
y_hat = knnClassify(mdl, X_test);
// Print out model quality
// evaluation statistics
print "KNN Model";
call binaryClassMetrics(y_test, y_hat);
```

KNN Model

Confusion matrix:

| | | |
|---|---|---|
| Class + | 20 | 8 |
| Class - | 3 | 257 |

The confusion matrix for our estimated KNN model shows:

- There are 20 correctly predicted recession periods and 8 incorrectly predicted recession periods.
- There are 257 correctly predicted non-recession periods and 3 incorrectly predicted non-recession periods.

| Metric | Value |
|---|---|
| Accuracy | 0.9618 |
| Precision | 0.7143 |
| Recall | 0.8696 |
| F-score | 0.7843 |
| Specificity | 0.9698 |
| AUC | 0.9197 |

The KNN model:

- Has an F-score of 78%.
- Is better at predicting negative outcomes than positive outcomes.

Compared to our baseline probit model the KNN model:

- Shows improved performance when balancing performance across both classes, with a better F-score.
- Has slightly lower specificity (97% versus 98%) and precision (71% versus 79%), but much higher recall (87% versus 56%).

## Decision Forest Classification

Next, we fit our decision forest classification model using the `decForestCFit` procedure:

```
/*
** Train the model
*/
// The dfModel structure
// holds the trained model
struct dfModel dfm;
// Fit training data
// using decision forest classification
dfm = decForestCFit(y_train, x_train);
```

After fitting the model, the `decForestPredict` procedure can be used to predict outcomes for the test data, and `binaryClassMetrics` reports the evaluation metrics:

```
/*
** Predictions on the test set
*/
y_hat = decForestPredict(dfm, x_test);
// Print out model quality
// evaluation statistics
print "Decision Forest";
call binaryClassMetrics(y_test, y_hat);
```

Decision Forest

Confusion matrix:

| | | |
|---|---|---|
| Class + | 25 | 3 |
| Class - | 0 | 260 |

The confusion matrix for our estimated decision forest model shows:

- There are 25 correctly predicted recession periods and 3 false positives.
- There are 260 correctly predicted non-recession periods and 0 false negatives.

| Metric | Value |
|---|---|
| Accuracy | 0.9896 |
| Precision | 0.8929 |
| Recall | 1 |
| F-score | 0.9434 |
| Specificity | 0.9886 |
| AUC | 0.9943 |

The decision forest model:

- Has an F-score of 94%.
- Is better at predicting negative outcomes (99% specificity) than positive outcomes (89% precision).

Compared to our baseline probit model the decision forest model:

- Is much better than our baseline probit model when balancing performance across both classes.
- Is better at predicting both negative and positive outcomes.

## Ridge Classification

Finally, we estimate the ridge classification model using the `ridgeCFit` procedure:

```
/*
** Train the model
*/
// L2 regularization penalty
lambda = 0.5;
// Declare 'mdl' to be an instance of a
// ridgeModel structure to hold the estimation results
struct ridgeModel mdl;
// Train the model
// using the ridge classification
mdl = ridgeCFit(y_train, X_train, lambda);
```

The `ridgeCPredict` procedure can be used to predict outcomes for the test data, and `binaryClassMetrics` reports the evaluation metrics:

```
/*
** Predictions on the test set
*/
// Predict classes for the test data
predictions = ridgeCPredict(mdl, x_test);
// Print out model quality
// evaluation statistics
print "Ridge Classification";
call binaryClassMetrics(y_test, predictions);
```

Ridge Classification

Confusion matrix:

| | | |
|---|---|---|
| Class + | 14 | 14 |
| Class - | 2 | 258 |

The confusion matrix for our estimated ridge classification model shows:

- There are 14 correctly predicted recession periods and 14 incorrectly predicted recession periods.
- There are 258 correctly predicted non-recession periods and 2 incorrectly predicted non-recession periods.

| Metric | Value |
|---|---|
| Accuracy | 0.9444 |
| Precision | 0.5000 |
| Recall | 0.8750 |
| F-score | 0.6364 |
| Specificity | 0.9485 |
| AUC | 0.9118 |

The ridge classification model:

- Has an F-score of 64%.
- Is better at predicting negative outcomes (95% specificity) than positive outcomes (50% precision).

Compared to our baseline probit model the ridge classification model:

- Is slightly worse than our baseline probit model when balancing performance across both classes.
- Has lower specificity (95% versus 98%) and precision (50% versus 79%), but higher recall (88% versus 56%).

## Results Summary

| Metric | Probit | KNN | Decision Forest | Ridge Classification |
|---|---|---|---|---|
| True Positives | 22 | 20 | 25 | 14 |
| False Positives | 17 | 3 | 0 | 2 |
| True Negatives | 243 | 257 | 260 | 258 |
| False Negatives | 6 | 8 | 3 | 14 |
| Accuracy | 92% | 96% | 99% | 94% |
| Precision | 79% | 71% | 89% | 50% |
| Recall | 56% | 87% | 100% | 88% |
| F-score | 66% | 78% | 94% | 64% |
| Specificity | 98% | 97% | 99% | 95% |
| Area under the ROC | 77% | 91% | 99% | 91% |

From the summary table, we see clearly that, even without tuning, the decision forest classification is superior by all evaluation standards to our other models. While all models perform strongly when predicting non-recession periods, the decision forest model is the clear winner for predicting recession periods.

## Conclusion

In today's blog we examined the performance of several prediction models used to predict recessions. After today's blog, you should have a better understanding of:

- How to implement machine learning models in GAUSS.
- How to compare classification models.
- How machine learning models can be used to improve prediction.

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.