Random Forest Learners

Heuristics provides learners for training random forests. This page describes these learners and gives a guide to their parameters.

Shared Parameters

All of the learners provided by Heuristics for training random forests are RandomForestLearners. In addition to the shared learner parameters, these learners support all parameters of Optimal Tree Learners with the following differences:

  • max_depth defaults to 10
  • cp is fixed to 0
  • the default value of minbucket depends on the problem type:
    • classification: 1
    • regression: 5
    • survival: 15
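
As a quick reference, the differences listed above can be collected in a small lookup. This is a minimal Python sketch for illustration only: `MINBUCKET_DEFAULTS` and `random_forest_defaults` are hypothetical names and are not part of Heuristics.

```python
# Hypothetical lookup of the defaults listed above; not part of Heuristics.
MINBUCKET_DEFAULTS = {
    "classification": 1,
    "regression": 5,
    "survival": 15,
}


def random_forest_defaults(problem_type):
    """Return the random-forest-specific defaults for a problem type."""
    if problem_type not in MINBUCKET_DEFAULTS:
        raise ValueError(f"unknown problem type: {problem_type!r}")
    return {
        "max_depth": 10,  # default for random forest learners
        "cp": 0,          # fixed, cannot be changed
        "minbucket": MINBUCKET_DEFAULTS[problem_type],
    }
```

For example, `random_forest_defaults("survival")` gives a `minbucket` of 15 alongside the shared `max_depth` and `cp` values.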

These learners support the following additional parameters to control their behavior.

num_trees

num_trees accepts a non-negative Integer to control the number of trees in the forest. The default value is 100.

max_features

max_features controls the number of features considered at each split when training the random forest. See RelativeParameterInput for the different ways to specify this value. The default value is :auto, which uses :sqrt + 20 features for regression problems and :sqrt features otherwise, where :sqrt is the square root of the total number of features.
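
To illustrate the :auto rule, the following minimal Python sketch computes the feature count it implies. The function name `resolve_auto_max_features` is hypothetical, and two details are assumptions that are not stated above: the square root is rounded down, and the result is capped at the total number of features.

```python
import math


def resolve_auto_max_features(num_features, problem_type):
    """Sketch of the :auto rule: :sqrt + 20 for regression, :sqrt otherwise."""
    if num_features < 1:
        raise ValueError("num_features must be at least 1")
    # Assumption: the square root is rounded down to an integer.
    count = math.isqrt(num_features)
    if problem_type == "regression":
        count += 20
    # Assumption: never use fewer than 1 or more than all available features.
    return max(1, min(count, num_features))
```

Under these assumptions, a classification problem with 100 features would consider 10 features at each split, and a regression problem with 100 features would consider 30.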

Classification Learners

The RandomForestClassifier is used for training random forests for classification problems. The following values for criterion are permitted:

There are no additional parameters beyond the shared parameters.

Regression Learners

The RandomForestRegressor is used for training random forests for regression problems. The following values for criterion are permitted:

In addition to the shared parameters, these learners also support the shared regression parameters as well as the Optimal Regression Tree parameters.