This example estimates a nested logit model using the GAUSS Discrete Choice (DC) application. It uses the Greene and Hensher dataset, with 810 observations on 4 modes of transportation: air, train, bus, and car. The features in the dataset include terminal waiting time (ttme), in-vehicle cost for all stages (invc), in-vehicle time (invt), a generalized cost measure (gc), household income (hinc), and traveling group size (psize).
Load the data
This example uses the formula string syntax to load data using loadd. The formula string syntax allows users to load, transform, and analyze data in one line.
new;
cls;
library dc;
// Load data
fname = getGAUSShome() $+ "pkgs/dc/examples/hensher.dat";
y = loadd(fname);
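Because the later steps select columns of y by position (for example, y[., 1] for the dependent variable), it can help to check the variable order before going further. The following is a minimal sketch, assuming the same file path as above and that getHeaders can read the variable names from this .dat file; it only inspects the data and does not change the analysis.

```gauss
// Sketch: inspect the dataset before selecting columns by index
fname = getGAUSShome() $+ "pkgs/dc/examples/hensher.dat";

// Variable names stored in the file (assumed to match the column order of y)
headers = getHeaders(fname);
print headers;

// Load all variables and report the dimensions of the data matrix
y = loadd(fname);
print "Rows:    " rows(y);
print "Columns: " cols(y);
```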
Set up the model parameters
The Discrete Choice Module uses a suite of dcSet functions to set various features of the model. An instance of the dcControl structure must be declared for storing all parameters prior to calling any dcSet functions.
// Step One: Declare dcControl structure
struct dcControl dcCt;
// Initialize dcControl structure
dcCt = dcControlCreate();
// Step Two: Describe data
// Name of dependent variable
dcSetYVar(&dcCt, y[., 1]);
dcSetYLabel(&dcCt, "Mode");
// Y Category Labels
dcSetYCategoryLabels(&dcCt, "Air,Train,Bus,Car");
// Specify reference category (excluded)
dcSetReferenceCategory(&dcCt, "Car");
// Names of independent variables
varlist = "TTME,GC,AIRHINC";
dcSetAttributeVars(&dcCt, y[., 2]~y[., 5]~y[., 8]);
dcSetAttributeLabels(&dcCt, "TTME,GC,AIRHINC");
// Set up nested levels
dcMakeLogitNests(&dcCt, 2);
// Set attributes and categories for lower nest (Nest One)
dcSetLogitNestAttributes(&dcCt, 1, "TTME,GC");
dcSetLogitNestCategories(&dcCt, 1, "Air,Train,Bus,Car");
// Reference category is car (last column)
mask = { 1 1 1 0,
         1 1 1 0,
         1 1 1 0 };
// Intercepts for three categories at first nest level
b0 = { 1 1 1 0 };
dcCt.startValues =
    pvPackmi(dcCt.startValues,
    b0, "b0", mask[1, .], 1);
// Two attribute variables at first level nest
g1 = { .1,
       .1 };
dcCt.startValues =
    pvPackmi(dcCt.startValues,
    g1, "g1", mask[1:2, 2], 3);
// One attribute variable in second level nest
g2 = .1;
dcCt.startValues =
    pvPackmi(dcCt.startValues,
    g2, "g2", mask[1, 3], 4);
// Two categories at second level - interaction terms
t2 = { .1,
       .1 };
dcCt.startValues =
    pvPackmi(dcCt.startValues,
    t2, "t2", mask[1:2, 2], 5);
// Set attributes and categories for upper nest (Nest Two)
dcSetLogitNestAttributes(&dcCt, 2, "AIRHINC");
dcSetLogitNestCategories(&dcCt, 2, "Fly,Ground");
// Make nest assignments
dcAssignLogitNests(&dcCt , 1, "Air,Train,Bus,Car", "Fly,Ground,Ground,Ground");
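The start values above are stored with pvPackmi, which packs a matrix into a PV structure together with a mask marking which elements are free parameters (1) and which are held fixed (0). The following is a minimal standalone sketch of that mechanism. It uses a separate, hypothetical PV instance named p rather than dcCt.startValues, so it can be run without the DC library.

```gauss
// Sketch: how a mask controls which start values are estimated
// p is a hypothetical stand-alone PV structure, not part of dcCt
struct PV p;
p = pvCreate();

// Intercept start values; the last column (Car) is the reference category
b0   = { 1 1 1 0 };
mask = { 1 1 1 0 };   // 1 = free parameter, 0 = held fixed at its start value

// Pack b0 under the name "b0" with index 1
p = pvPackmi(p, b0, "b0", mask, 1);

// Only the three unmasked elements enter the parameter vector
print "Free parameters:";
print pvGetParVector(p);

// The full 1x4 matrix, including the fixed zero, is recovered by index
print "Full matrix:";
print pvUnpack(p, 1);
```

Packing by index, as the example does with indices 1, 3, 4, and 5, lets each block of start values be retrieved later without relying on its name.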
Estimate the model
The nested logit model can be estimated using the nestedLogit function. This function takes a dcControl structure as an input and returns all output to a dcOut structure. In addition, a complete report of results can be printed to the screen using the printDCOut procedure.
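For orientation, the nest assignments made earlier (Air alone in Fly; Train, Bus, and Car in Ground) imply choice probabilities that factor into a nest probability and a within-nest probability. The sketch below uses a common textbook parameterization with made-up utilities v, nest-level utilities w, and dissimilarity parameters tau. All of these names and values are assumptions for illustration only, and the parameterization used internally by the DC module may differ.

```gauss
// Sketch: two-level nested logit probabilities (textbook form, hypothetical values)

// Hypothetical systematic utilities for Air, Train, Bus, Car
v = { 0.5, 0.2, -0.3, 0 };
// Nest membership of each mode: 1 = Fly, 2 = Ground
nest = { 1, 2, 2, 2 };
// Hypothetical nest-level utilities and dissimilarity parameters
w   = { 0.1, 0 };
tau = { 0.6, 0.4 };

// Inclusive value of each nest: log-sum of scaled utilities in the nest
iv = zeros(2, 1);
for m (1, 2, 1);
    idx = indexcat(nest, m);
    iv[m, 1] = ln(sumc(exp(v[idx, 1] ./ tau[m, 1])));
endfor;

// Probability of choosing each nest
e = exp(w + tau .* iv);
pnest = e ./ sumc(e);

// Unconditional probability = nest probability * within-nest probability
p = zeros(4, 1);
for m (1, 2, 1);
    idx = indexcat(nest, m);
    ev = exp(v[idx, 1] ./ tau[m, 1]);
    p[idx, 1] = pnest[m, 1] .* ev ./ sumc(ev);
endfor;

print "Choice probabilities (Air, Train, Bus, Car):";
print p';
// The four probabilities sum to one, up to floating-point rounding
print "Sum:" sumc(p);
```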
// Step Three: Declare output structure
struct dcOut dcOut1;
// Step Four: Call nestedLogit
dcOut1 = nestedLogit(dcCt);
// Print results
call printDCOut(dcOut1);
Output
The output from nestedLogit reads:
Nested Logit Results

Number of Observations: 210
Degrees of Freedom: 202

1 - Air
2 - Train
3 - Bus
4 - Car

Distribution Among Outcome Categories For Mode

Dependent Variable    Proportion
Air                   0.2762
Train                 0.3000
Bus                   0.1429
Car                   0.2810

COEFFICIENTS

Coefficient Estimates
-----------------------------------------------------------------------------------------------
Variables          Coefficient    se         tstat    pval
Constant: Air      6.04***        1.33       4.54     5.66e-06
Constant: Train    5.06***        0.676      7.49     6.79e-14
Constant: Bus      4.1***         0.629      6.51     7.33e-11
TTME               -0.113***      0.0118     -9.52    1.68e-21
GC                 -0.0316***     0.00743    -4.25    2.15e-05
AIRHINC            0.0153         0.0111     1.38     0.168
Corr: Fly          0.586***       0.113      5.18     2.18e-07
Corr: Ground       0.389**        0.158      2.46     0.0138
-----------------------------------------------------------------------------------------------
*p-val<0.1 **p-val<0.05 ***p-val<0.001

ODDS RATIO

Odds Ratio
----------------------------------------------------------------------------
Variables       Odds Ratio    95% Lower Bound    95% Upper Bound
TTME            0.89349       0.87302            0.91444
GC              0.96891       0.95489            0.98313
AIRHINC         1.0154        0.99355            1.0378
Corr: Fly       1.7968        1.4397             2.2425
Corr: Ground    1.4754        1.0827             2.0106
----------------------------------------------------------------------------

MARGINAL EFFECTS
Partial probability with respect to mean attributes

Marginal Effects for Attribute: TTME
---------------------------------------------------------------------------
Variables    Air             Train           Bus             Car
Air          -0.013***       0.00364**       0.00113         0.00384
             ( 0.00326)      ( 0.0013)       ( 0.00147)      ( 2.93)
Train        0.00549**       -0.0216***      0.00409*        0.0139
             ( 0.00166)      ( 0.00341)      ( 0.00214)      ( 1.85)
Bus          0.00171**       0.00409**       -0.00954***     0.00432
             ( 0.000597)     ( 0.00125)      ( 0.00269)      ( 1.4)
Car          0.00579***      0.0139***       0.00432**       -0.022
             ( 0.0015)       ( 0.00346)      ( 0.00192)      ( 1.8)
---------------------------------------------------------------------------
Attribute equations in separate columns
Estimate se in parentheses.
*p-val<0.1 **p-val<0.05 ***p-val<0.001

Marginal Effects for Attribute: GC
---------------------------------------------------------------------------
Variables    Air             Train           Bus             Car
Air          -0.00364**      0.00102*        0.000318        0.00108*
             ( 0.00135)      ( 0.000576)     ( 0.000421)     ( 0.000596)
Train        0.00154**       -0.00606***     0.00115*        0.00389**
             ( 0.000488)     ( 0.00181)      ( 0.000613)     ( 0.00158)
Bus          0.000479**      0.00115**       -0.00268**      0.00121*
             ( 0.000233)     ( 0.000431)     ( 0.00089)      ( 0.000616)
Car          0.00162*        0.00389**       0.00121         -0.00618***
             ( 0.000869)     ( 0.00132)      ( 0.000942)     ( 0.000969)
---------------------------------------------------------------------------
Attribute equations in separate columns
Estimate se in parentheses.
*p-val<0.1 **p-val<0.05 ***p-val<0.001

MARGINAL EFFECTS
Partial probability with respect to mean attributes

Marginal Effects for Attribute: AIRHINC
------------------------------------------
Variables    Fly             Ground
Fly          0.00302         -0.00302**
             ( 0.00326)      ( 0.0013)
Ground       -0.00128        0.00128
             ( 0.00166)      ( 0.00341)
------------------------------------------
Attribute equations in separate columns
Estimate se in parentheses.
*p-val<0.1 **p-val<0.05 ***p-val<0.001

********************SUMMARY STATISTICS********************

MEASURES OF FIT:

-2 Ln(Lu):                                    387.3123
-2 Ln(Lr): All coeffs equal zero              582.2436
-2 Ln(Lr): J-1 intercepts                     567.5175
LR Chi-Square (coeffs equal zero):            194.9313
    d.f.                                      8.0000
    p-value =                                 0.0000
LR Chi-Square (J-1 intercepts):               180.2052
    d.f.                                      5.0000
    p-value =                                 0.0000
Count R2, Percent Correctly Predicted:        148.0000
Adjusted Percent Correctly Predicted:         0.5782
Madalla's pseudo R-square:                    0.5760
McFadden's pseudo R-square:                   0.3175
Ben-Akiva and Lerman's Adjusted R-square:     0.3175
Cragg and Uhler's pseudo R-square:            0.0976
Akaike Information Criterion:                 1.9205
Bayesian Information Criterion:               2.0480
Hannan-Quinn Information Criterion:           1.9721

OBSERVED AND PREDICTED OUTCOMES

         |           Predicted
Observed |  Air    Train    Bus    Car    Total
-------------------------------------------------------------------
Air      |  37     3        2      16     58
Train    |  2      49       1      11     63
Bus      |  0      3        23     4      30
Car      |  5      14       1      39     59
-------------------------------------------------------------------
Total    |  44     69       27     70     210
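Several of the summary statistics can be reproduced by hand from other numbers in the report, which helps clarify how they appear to be defined. The sketch below is an inferred reading of the output, not a statement of the module's internal code: the formulas are standard definitions that happen to match the printed values, and every variable name is hypothetical. Because the inputs are typed in from the rounded report, results agree with the printed figures only up to rounding; the odds ratios in particular come out close to, but not exactly equal to, the reported ones.

```gauss
// Sketch: re-deriving reported fit measures from values copied from the report
m2lu = 387.3123;    // -2 Ln(Lu), fitted model
m2l0 = 582.2436;    // -2 Ln(Lr), all coefficients equal zero
m2l1 = 567.5175;    // -2 Ln(Lr), J-1 intercepts only
n = 210;            // number of observations
k = 8;              // number of estimated parameters

// Likelihood-ratio statistics and upper-tail chi-square p-values
lr0 = m2l0 - m2lu;              // 194.9313
lr1 = m2l1 - m2lu;              // 180.2052
pval0 = cdfChic(lr0, 8);
pval1 = cdfChic(lr1, 5);

// McFadden pseudo R-square relative to the intercepts-only model (about 0.3175)
mcfadden = 1 - m2lu / m2l1;

// Information criteria scaled by the number of observations
aic = (m2lu + 2 * k) / n;               // about 1.9205
bic = (m2lu + k * ln(n)) / n;           // about 2.0480
hqc = (m2lu + 2 * k * ln(ln(n))) / n;   // about 1.9721

// Classification table: rows are observed modes, columns are predicted modes
tab = { 37  3  2 16,
         2 49  1 11,
         0  3 23  4,
         5 14  1 39 };
correct = sumc(diag(tab));       // 148 correctly predicted
largest = maxc(sumc(tab'));      // 63, the largest observed row total
adjCount = (correct - largest) / (n - largest);   // about 0.5782

// Odds ratios as exponentiated coefficients (TTME, GC, AIRHINC)
b = { -0.113, -0.0316, 0.0153 };
oddsRatio = exp(b);

print "LR statistics:        " lr0 lr1;
print "p-values:             " pval0 pval1;
print "McFadden R-square:    " mcfadden;
print "AIC, BIC, HQC:        " aic bic hqc;
print "Adjusted count R2:    " adjCount;
print "Odds ratios:          " oddsRatio';
```

Note that the line labeled "Count R2, Percent Correctly Predicted" reports the count of correct predictions (148, the diagonal sum of the table) rather than a proportion; dividing by 210 gives the share of correctly predicted observations.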