Introduction
The Constrained Maximum Likelihood (CML) library was one of the original constrained optimization tools in GAUSS. Like many GAUSS libraries, it was later updated to an "MT" version.
The "MT" version libraries, named for their use of multi-threading, provide significant performance improvements, greater flexibility, and a more intuitive parameter-handling system.
This blog post explores:
- The key features, differences, and benefits of upgrading from CML to CMLMT.
- A practical example to help you transition code from CML to CMLMT.
Key Features Comparison
Before diving into the details of transitioning from CML to CMLMT, it’s useful to understand how these two libraries compare. The table below highlights key differences, from optimization algorithms to constraint handling.
Feature | CML (2.0) | CMLMT (3.0) |
---|---|---|
Optimization Algorithm | Sequential Quadratic Programming (SQP) with BFGS, DFP, and Newton-Raphson methods. | SQP with improved secant algorithms and Cholesky updates for Hessian approximation. |
Parallel Computing Support | No multi-threading support. | Multi-threading enabled for numerical derivatives and bootstrapping. |
Log-Likelihood Computation | Function and derivatives computed separately, requiring redundant calculations. | Unified procedure for computing log-likelihood, first derivatives, and second derivatives, reducing redundant computations. |
Parameter Handling | Supports only a simple parameter vector. | Supports both a simple parameter vector and a PV structure (for advanced parameter management). Additionally, allows an unlimited number of data arguments in the log-likelihood function, simplifying the function and improving computation time. |
Constraints Handling | Supports linear and nonlinear equality/inequality constraints. | Improved constraint handling with an explicit control structure for optimization. |
Line Search Methods | STEPBT (quadratic/cubic fitting), BRENT, HALF, and BHHHSTEP. | Introduces the Augmented Lagrangian Penalty method for constrained models. Also includes STEPBT (quadratic/cubic fitting), BRENT, HALF, and BHHHSTEP. |
Statistical Inference | Basic hypothesis testing. | Enhanced hypothesis testing for constrained models, including profile likelihoods, bootstrapping, and Lagrange multipliers. |
Handling of Fixed Parameters | Global variables used to fix parameters. | Uses the cmlmtControl structure for setting fixed parameters. |
Run-Time Adjustments | Uses global variables to modify settings. | The cmlmtControl structure enables flexible tuning of optimization settings. |
Advantages of CMLMT
Beyond performance improvements, CMLMT introduces several key advantages that make it a more powerful and user-friendly tool for constrained maximum likelihood estimation. These improvements go beyond multi-threading: they provide greater flexibility, efficiency, and accuracy in model estimation.
Some of the most notable advantages include:
- Threading & Multi-Core Support: CMLMT enables multi-threading, significantly speeding up numerical derivatives and bootstrapping, whereas CML is single-threaded.
- Simplified Parameter Handling: Only CMLMT supports both a simple parameter vector and the `PV` structure for advanced models. Additionally, CMLMT allows dynamic arguments, making it easier to pass data to the log-likelihood function.
- More Efficient Log-Likelihood Computation: CMLMT integrates the analytic computation of the log-likelihood, first derivatives, and second derivatives into a single user-specified log-likelihood procedure, reducing redundancy.
- Augmented Lagrangian Method: CMLMT introduces an Augmented Lagrangian Penalty Line Search for handling constrained optimization.
- Enhanced Statistical Inference: CMLMT includes bootstrapping, profile likelihoods, and hypothesis testing improvements, which are limited in CML.
Converting a CML Model to CMLMT
Let's use a simple example to walk through the step-by-step transition from CML to CMLMT. In this model, we will perform constrained maximum likelihood estimation for a Poisson model.
The dataset is included in the CMLMT library.
Original CML Code
We will start by estimating the model using CML:
new;
library cml;
#include cml.ext;
cmlset;
// Load data
data = loadd(getGAUSSHome("pkgs/cmlmt/examples/cmlmtpsn.dat"));
// Set constraints for first two coefficients
// to be equal
_cml_A = { 1 -1 0 };
_cml_B = { 0 };
// Specify starting parameters
beta0 = .5|.5|.5;
// Run optimization
{ _beta, f0, g, cov, retcode } = cml(data, 0, &logl, beta0);
// Print results
CMLprt(_beta, f0, g, cov, retcode);
// Specify log-likelihood function
proc logl(b, data);
local m, x, y;
// Extract x and y
y = data[., 1];
x = data[., 2:4];
m = x * b;
retp(y .* m - exp(m));
endp;
This code prints the following output:
Mean log-likelihood        -0.670058
Number of cases            100

Covariance of the parameters computed by the following method:
Inverse of computed Hessian

Parameters    Estimates    Std. err.    Gradient
------------------------------------------------------------------
P01             0.1199       0.1010       0.0670
P02             0.1199       0.1010      -0.0670
P03             0.8343       0.2648       0.0000

Number of iterations       5
Minutes to convergence     0.00007
Step One: Switch to CMLMT Library
The first step in updating our program file is to load the CMLMT library instead of the CML library.
Original CML Code |
---|
// Clear workspace and load library
new;
library cml;
New CMLMT Code |
---|
// Clear workspace and load library
new;
library cmlmt;
Step Two: Load Data
Since data loading is handled by base GAUSS procedures, the `loadd` call itself is unchanged. However, because CMLMT lets us pass `y` and `x` to the log-likelihood function as separate arguments, we now extract them once at load time rather than inside the likelihood function.
Original CML and CMLMT Code |
---|
// Load data
x = loadd(getGAUSSHome("pkgs/cmlmt/examples/cmlmtpsn.dat"));
// Extract x and y
y = x[., 1];
x = x[., 2:4];
Step Three: Setting Constraints
The next step is to convert the global variables used to control optimization in CML into members of the `cmlmtControl` structure. To do this, we need to:
- Declare an instance of the `cmlmtControl` structure.
- Initialize the `cmlmtControl` structure with default values using `cmlmtControlCreate`.
- Assign the constraint vectors to the corresponding `cmlmtControl` structure members.
Original CML Code |
---|
// Set constraints for first two coefficients
// to be equal
_cml_A = { 1 -1 0 };
_cml_B = { 0 };
New CMLMT Code |
---|
// Declare and initialize control structure
struct cmlmtControl ctl;
ctl = cmlmtControlCreate();
// Set constraints for first two coefficients
// to be equal
ctl.A = { 1 -1 0 };
ctl.B = { 0 };
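For reference, the constraint members encode the linear equality A*b = B; here that is 1*b1 - 1*b2 + 0*b3 = 0, i.e. the first two coefficients must be equal. If we also wanted linear inequality constraints, a hedged sketch (assuming the `C` and `D` members of `cmlmtControl` mirror CML's `_cml_C`/`_cml_D` globals, imposing C*b >= D) would look like:

```
// Sketch only: linear inequality constraint, assuming
// ctl.C and ctl.D follow the C*b >= D convention of
// CML's _cml_C and _cml_D globals

// Require the third coefficient to be non-negative: b3 >= 0
ctl.C = { 0 0 1 };
ctl.D = { 0 };
```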
Step Four: Specify Starting Values
In our original CML code, we specified the starting parameters using a vector of values. In the CMLMT library, we can specify the starting values using either a parameter vector or a `PV` structure.
The advantage of the `PV` structure is that it allows parameters to be stored in different formats, such as symmetric matrices or matrices with fixed parameters. This, in turn, can simplify calculations inside the log-likelihood function.
If we use the parameter vector option, we don't need to make any changes to our original code:
Original CML and CMLMT Code |
---|
// Specify starting parameters
beta0 = .5|.5|.5;
Using the `PV` structure option requires additional steps:
- Declare an instance of the `PV` structure.
- Initialize the `PV` structure using the `pvCreate` procedure.
- Use the `pvPack` family of functions to create and define specific parameter types within the `PV` structure.
New CMLMT Code to use PV |
---|
// Declare instance of 'PV' struct
struct PV p0;
// Initialize p0
p0 = pvCreate();
// Create parameter vector
beta0 = .5|.5|.5;
// Load parameters into p0
p0 = pvPack(p0, beta0, "beta");
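If we pass a `PV` structure to `cmlmt`, the log-likelihood function receives it in place of the plain parameter vector and must unpack the parameters before use. A minimal sketch of such a likelihood function (assuming the standard GAUSS `pvUnpack` procedure, and using the CMLMT-style `modelResults` output and `ind` argument described in Step Five below):

```
// Sketch only: a likelihood function that receives a 'PV'
// structure instead of a plain parameter vector
proc logl_pv(struct PV p, y, x, ind);
    local b, m;

    // Declare modelResults structure
    struct modelResults mm;

    // Unpack the "beta" parameters stored with pvPack
    b = pvUnpack(p, "beta");

    m = x * b;

    if ind[1];
        mm.function = y .* m - exp(m);
    endif;

    retp(mm);
endp;
```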
Step Five: The Likelihood Function
In CML, the likelihood function takes only two parameters:
- A parameter vector.
- A data matrix.
Original CML Code |
---|
// Specify log-likelihood function
proc logl(b, data);
local m, x, y;
// Extract x and y
y = data[., 1];
x = data[., 2:4];
m = x * b;
retp(y .* m - exp(m));
endp;
The likelihood function in CMLMT is enhanced in several ways:
- We can pass as many arguments as needed to the likelihood function. This allows us to simplify the function, which, in turn, can speed up optimization.
- We return output from the likelihood function in the form of the `modelResults` structure. This makes computations thread-safe and allows us to specify both gradients and Hessians inside the likelihood function:
  - The likelihood function values are stored in the `mm.function` member.
  - The gradients are stored in the `mm.gradient` member.
  - The Hessians are stored in the `mm.hessian` member.
- The last input into the likelihood function must be `ind`. `ind` is passed to your log-likelihood function when it is called by CMLMT and tells your function whether CMLMT needs the gradient and Hessian, or just the function value (see the online examples). NOTE: You are never required to compute the gradient or Hessian when requested by `ind`. If you do not compute them, CMLMT will compute numerical derivatives.
New CMLMT Code |
---|
// Specify log-likelihood function
// Allows separate arguments for y & x
// Also has 'ind' as last argument
proc logl(b, y, x, ind);
local m;
// Declare modelResults structure
struct modelResults mm;
// Likelihood computation
m = x * b;
// If the first element of 'ind' is not zero,
// CMLMT wants us to compute the function value
// which we assign to mm.function
if ind[1];
mm.function = y .* m - exp(m);
endif;
retp(mm);
endp;
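Because `ind` also signals when CMLMT wants derivatives, we can optionally supply an analytic gradient through `mm.gradient`. The sketch below extends the likelihood above, assuming the convention that the second element of `ind` requests the gradient and that `mm.gradient` holds one row of derivatives per observation; if this branch is omitted, CMLMT simply falls back to numerical derivatives.

```
// Sketch only: supplying an analytic gradient
// (assumes ind[2] flags a gradient request and mm.gradient
//  expects an N x K matrix of per-observation derivatives)
proc logl_grad(b, y, x, ind);
    local m;

    struct modelResults mm;

    m = x * b;

    // Function value: per-observation Poisson log-likelihood
    if ind[1];
        mm.function = y .* m - exp(m);
    endif;

    // Gradient of y.*m - exp(m) with respect to b
    // is (y - exp(m)) .* x, row by row
    if ind[2];
        mm.gradient = (y - exp(m)) .* x;
    endif;

    retp(mm);
endp;
```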
Step Six: Run Optimization
We estimate the maximum likelihood parameters in CML using the `cml` procedure. The `cml` procedure returns five outputs, and a results table is printed using the `CMLprt` procedure.
Original CML Code |
---|
/*
** Run optimization
*/
// Run optimization
{ _beta, f0, g, cov, retcode } = cml(data, 0, &logl, beta0);
// Print results
CMLprt(_beta, f0, g, cov, retcode);
In CMLMT, estimation is performed using the `cmlmt` procedure. The `cmlmt` procedure returns a `cmlmtResults` structure, and a results table is printed using the `cmlmtPrt` procedure.
To convert to `cmlmt`, we take the following steps:
- Declare an instance of the `cmlmtResults` structure.
- Call the `cmlmt` procedure. Following an initial pointer to the log-likelihood function, the parameter and data inputs are passed to `cmlmt` in the exact order they are specified in the log-likelihood function.
- The output from `cmlmt` is stored in the `cmlmtResults` structure, `out`.
New CMLMT Code |
---|
/*
** Run optimization
*/
// Declare output structure
struct cmlmtResults out;
// Run estimation
out = cmlmt(&logl, beta0, y, x, ctl);
// Print output
cmlmtPrt(out);
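After estimation, the fitted values live in the members of `out`. As a hedged sketch (assuming, per the CMLMT documentation, that the estimates are returned in a `par` member holding a `PV` structure, readable with the standard `pvGetParVector` procedure):

```
// Sketch only: reading estimates out of the results structure,
// assuming out.par is a 'PV' structure as documented for CMLMT
b_hat = pvGetParVector(out.par);
print "Estimated coefficients:";
print b_hat;
```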
Conclusion
Upgrading from CML to CMLMT provides faster performance, improved numerical stability, and easier parameter management. The addition of multi-threading, better constraint handling, and enhanced statistical inference makes CMLMT a powerful upgrade for GAUSS users.
If you're still using CML, consider transitioning to CMLMT for a more efficient and flexible modeling experience!
Further Reading
- Beginner's Guide To Maximum Likelihood Estimation
- Maximum Likelihood Estimation in GAUSS
- Ordered Probit Estimation with Constrained Maximum Likelihood
Try out The GAUSS Constrained Maximum Likelihood MT Library
Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.