Learner Interface
The core object used throughout all IAI packages is the learner (often abbreviated in code as lnr). This is the model object that holds all of the parameters that configure the behavior of the algorithm, as well as the fitted model.
There are many different types of learner, depending on the type of problem and the specific algorithm being used. Refer to the documentation for each module for information about the corresponding learner types that are available.
Creating Learners
You can make a learner by creating an instance of the learner type. For example, we can create an OptimalTreeClassifier by running:
lnr = IAI.OptimalTreeClassifier()
Unfitted OptimalTreeClassifier
Parameters
Learners have a number of parameters that can be changed to control the behavior of the algorithm. These can be set in the following ways (again with OptimalTreeClassifier as an example):
When creating the learner:
lnr = IAI.OptimalTreeClassifier(random_seed=1)
Unfitted OptimalTreeClassifier: random_seed: 1
Using the learner fields to set or modify parameter values:
lnr.criterion = :gini; lnr
Unfitted OptimalTreeClassifier: criterion: gini random_seed: 1
Using the set_params! method to set or modify parameter values:
IAI.set_params!(lnr; random_seed=2)
Unfitted OptimalTreeClassifier: criterion: gini random_seed: 2
Learner objects have some fields that are used internally and should not be modified. Any field with a name ending in _ is for internal use and should not be set.
Common learner parameters
There are some parameters used by all IAI learners that can be used to control the behavior of the algorithm:
- normalize_X::Bool: whether to normalize the data so that each numeric feature lies in the interval [0, 1]. Typically improves performance when the data is poorly scaled to begin with. Defaults to true.
- criterion::Symbol: which scoring criterion to use when training the tree (refer to the documentation on scoring criteria for more information). The default value depends on the learner being used.
- treat_unknown_level_missing::Bool: when making a prediction, if a categorical or ordinal feature has an unknown level, the learner treats it as missing. May require specifying the missing data mode for certain learners first. Defaults to false.
- random_seed::Union{Nothing,Integer}: the random seed for the local search procedure. Must be a positive integer, or nothing, in which case the random seed is determined each time fit! is called using the current global random state. Defaults to nothing.
- parallel_processes::Union{Nothing,Vector{<:Integer}}: controls the parallelization of the learner fitting algorithm. Can be a vector specifying the IDs of the processes to use, or nothing, in which case all available processes are used. Defaults to nothing. See Parallelization for more information on parallelizing learner fitting.
- show_progress::Bool: controls whether to show progress meters during the learner fitting process. Defaults to true.
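As an illustrative sketch, several of these common parameters can be set together at construction time (shown here with OptimalTreeClassifier, but any IAI learner accepts them):

```julia
# Sketch: set several common parameters when constructing a learner.
# random_seed=1 makes the local search reproducible, unknown categorical
# levels are treated as missing at prediction time, and progress meters
# are suppressed during fitting.
lnr = IAI.OptimalTreeClassifier(
    random_seed=1,
    treat_unknown_level_missing=true,
    show_progress=false,
)
```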
Regression learner parameters
The following parameters are used by all regression learners:
- normalize_y::Bool: whether to normalize the target data so that it lies in the interval [0, 1]. Typically improves performance when the data is poorly scaled to begin with. Defaults to true.
- hinge_epsilon::Real: controls the behavior of the hinge loss criteria. Defaults to 1e-3.
- tweedie_variance_power::Real: controls the behavior of the Tweedie criterion. Defaults to 1.5.
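For example, these could be combined when configuring a regression learner. The sketch below assumes the OptimalTreeRegressor learner type and that the Tweedie criterion is selected with criterion=:tweedie, as described in the scoring criteria documentation:

```julia
# Sketch: a regression learner scored with the Tweedie criterion,
# with the variance power raised from its default of 1.5.
lnr = IAI.OptimalTreeRegressor(
    criterion=:tweedie,
    tweedie_variance_power=1.7,
    normalize_y=true,
)
```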
Prescription learner parameters
The following parameter is used by all prescription learners:
- prescription_factor::Real: controls the tradeoff between predictive accuracy and prescriptive performance in the Combined Performance criterion. Defaults to 0.5.
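As a sketch (assuming the OptimalTreePrescriptionMaximizer learner type from the prescription module):

```julia
# Sketch: adjust the weighting between predictive accuracy and
# prescriptive performance in the Combined Performance criterion
# (see the criterion documentation for the exact weighting).
lnr = IAI.OptimalTreePrescriptionMaximizer(prescription_factor=0.7)
```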
Policy learner parameters
The following parameter is used by all policy learners:
- normalize_y::Bool: whether to normalize the rewards so that they lie in the interval [0, 1]. Typically improves performance when the data is poorly scaled to begin with. Defaults to true.
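A sketch (assuming the OptimalTreePolicyMinimizer learner type from the policy module):

```julia
# Sketch: disable reward normalization when the rewards are already
# on a well-behaved scale.
lnr = IAI.OptimalTreePolicyMinimizer(normalize_y=false)
```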
Fitting and Evaluating Learners
Once the learner is created and its parameters chosen, it can be fitted to the training data using fit!:
using CSV, DataFrames
df = CSV.read("iris.csv", DataFrame)
X = df[:, 1:4]
y = df[:, 5]
lnr = IAI.OptimalTreeClassifier(max_depth=2, cp=0, random_seed=1)
IAI.fit!(lnr, X, y)
We can make predictions on new feature data with predict:
IAI.predict(lnr, X)
150-element Array{String,1}:
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
"setosa"
⋮
"virginica"
"virginica"
"virginica"
"virginica"
"virginica"
"virginica"
"virginica"
"virginica"
"virginica"
We can evaluate the quality of the learner on new data with score:
IAI.score(lnr, X, y)
0.96
There are a number of additional functions that can be used depending on the task type:
- Classification
- Prescription
- Policy
- Imputation
- Reward Estimation
Utilities
You can save a learner to JSON format with write_json:
IAI.write_json("learner.json", lnr)
To read a saved learner back into Julia, use read_json:
lnr = IAI.read_json("learner.json")
The following functions make it easier to work with learners and their parameters in a general manner:
- set_params! allows setting of parameters using other variables
- get_params gets all of the parameter values from a learner
- clone creates a copy of a learner with the same parameter values
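For instance, these utilities can be combined to create a modified copy of a learner without mutating the original. This sketch assumes get_params returns the parameter values keyed by name:

```julia
# Sketch: copy a learner and bump its random seed.
base = IAI.OptimalTreeClassifier(max_depth=2, random_seed=1)
params = IAI.get_params(base)        # all parameter values of base
new_lnr = IAI.clone(base)            # independent copy, same parameters
IAI.set_params!(new_lnr; random_seed=params[:random_seed] + 1)
```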