decForestRFit

Purpose

Fit a decision forest regression model.

Format

dfm = decForestRFit( y_train, x_train );
dfm = decForestRFit( y_train, x_train, dfc );

Input

y_train
Nx1 vector, or NxK matrix of dependent variables.
x_train
NxP matrix of independent variables.
dfc
An instance of a dfControl structure containing the following members:
  dfc.numTrees Scalar, number of decision trees in the forest (must be an integer). Default = 100.
  dfc.obsPerTree Scalar, fraction of observations used for each tree. Range = 0 - 1.0. Default = 1.0.
  dfc.featurePerNode Scalar, number of features considered as possible splits at each node. Default = nvars/3.
  dfc.maxTreeDepth Scalar, maximum tree depth. Default = unlimited.
  dfc.minObsNode Scalar, minimum observations in each leaf node. Default = 1.
  dfc.impurityThreshold Scalar, if the impurity at a node is less than the impurityThreshold, no more splits will be performed. Default = 0.
  dfc.oobError Scalar, 1 to compute OOB error, 0 otherwise. Default = 0.
  dfc.variableImportanceMethod Scalar, method of calculating variable importance.
    0 = none.
    1 = mean decrease in impurity.
    2 = mean decrease in accuracy (MDA).
    3 = scaled MDA.
    Default = 0.
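
As a minimal sketch of how these members might be set, the snippet below creates a control structure with dfControlCreate (the function used in the Examples section) and overrides a few defaults. The specific values are illustrative assumptions, not recommendations, and y_train and x_train are assumed to already exist in the workspace.

```gauss
library gml;

// Declare the control structure and fill it with default settings
struct dfControl dfc;
dfc = dfControlCreate();

// Illustrative values only (assumptions, not tuned settings)
dfc.numTrees = 500;      // grow 500 trees instead of the default 100
dfc.obsPerTree = 0.8;    // each tree sees 80% of the observations
dfc.maxTreeDepth = 10;   // cap tree depth (default is unlimited)
dfc.minObsNode = 5;      // require at least 5 observations per leaf

// Fit the model with the modified settings
struct dfModel dfm;
dfm = decForestRFit(y_train, x_train, dfc);
```

Any member left untouched keeps the default listed above.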

Output

dfm
An instance of a dfModel structure containing the following relevant members:
  dfm.variableImportance Matrix, 1xP, variable importance measure if computation of variable importance is specified, zero otherwise.
  dfm.oobError Scalar, out-of-bag error if OOB error computation is specified, zero otherwise.
  dfm.numClasses Scalar, number of classes if classification model, zero otherwise.

Examples

Basic usage

new;
library gml;

// Get file name with full path
dataset = getGAUSSHome() $+ "pkgs/gml/examples/winequality.csv";

// Split data into (70%) training and (30%) test sets
{ y_train, y_test, X_train, X_test } = trainTestSplit(dataset, "quality ~ .", 0.7);

// Structure to hold the model fit
struct dfModel dfm;

// Fit training data using decision forest
dfm = decForestRFit(y_train, X_train);

// Make predictions
y_hat = decForestPredict(dfm, X_test);
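
One common way to judge the fit is to compare the predictions with the held-out responses. The lines below are a sketch that continues the example above, assuming y_test and y_hat are column vectors of equal length; the variable name rmse is introduced here for illustration.

```gauss
// Root mean squared error of the predictions on the test set
rmse = sqrt(meanc((y_test - y_hat).^2));

print "Test RMSE: " rmse;
```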

Variable importance and out-of-bag error

new;
library gml;

// Get file name with full path
dataset = getGAUSSHome() $+ "pkgs/gml/examples/winequality.csv";

// Split data into (70%) training and (30%) test sets
{ y_train, y_test, X_train, X_test } = trainTestSplit(dataset, "quality ~ .", 0.7);

// Declare control structure
// and fill with default settings
struct dfControl ctl;
ctl = dfControlCreate();

// Set variable importance measure to mean decrease in impurity
ctl.variableImportanceMethod = 1;

// Turn on out-of-bag error computation
ctl.oobError = 1;

// Structure to hold the model fit
struct dfModel dfm;

// Fit training data using decision forest
dfm = decForestRFit(y_train, X_train, ctl);
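
Because both options were switched on in the control structure, the corresponding members of the returned dfModel structure should now be populated. A short sketch of how they might be inspected, continuing from the fit above:

```gauss
// Out-of-bag error, computed because ctl.oobError = 1
print "OOB error: " dfm.oobError;

// Importance scores, one per column of X_train,
// computed because ctl.variableImportanceMethod = 1
print "Variable importance:";
print dfm.variableImportance;
```

If either option had been left at its default of 0, the matching member would be zero.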

Remarks

The dfModel structure contains a fourth, internally used member, opaqueModel, which contains model details used by decForestPredict.

See also

decForestPredict, decForestCFit
