# Quick Start Guide: Optimal Classification Trees

In this example we will use Optimal Classification Trees (OCT) on the banknote authentication dataset. First we load in the data and split into training and test datasets:

```julia
using CSV, DataFrames

# Read the banknote authentication data (adjust the path to your copy of the file)
df = CSV.read("banknote_authentication.csv", DataFrame,
    header=[:variance, :skewness, :curtosis, :entropy, :class])
```
```
1372×5 DataFrame
  Row │ variance  skewness   curtosis   entropy   class
      │ Float64   Float64    Float64    Float64   Int64
──────┼─────────────────────────────────────────────────
    1 │  3.6216     8.6661   -2.8073    -0.44699      0
    2 │  4.5459     8.1674   -2.4586    -1.4621       0
    3 │  3.866     -2.6383    1.9242     0.10645      0
    4 │  3.4566     9.5228   -4.0112    -3.5944       0
    5 │  0.32924   -4.4552    4.5718    -0.9888       0
    6 │  4.3684     9.6718   -3.9606    -3.1625       0
    7 │  3.5912     3.0129    0.72888    0.56421      0
    8 │  2.0922    -6.81      8.4636    -0.60216      0
  ⋮   │    ⋮          ⋮          ⋮         ⋮        ⋮
 1366 │ -4.5046    -5.8126   10.8867    -0.52846      1
 1367 │ -2.41       3.7433   -0.40215   -1.2953       1
 1368 │  0.40614    1.3492   -1.4501    -0.55949      1
 1369 │ -1.3887    -4.8773    6.4774     0.34179      1
 1370 │ -3.7503   -13.4586   17.5932    -2.7771       1
 1371 │ -3.5637    -8.3827   12.393     -1.2823       1
 1372 │ -2.5419    -0.65804   2.6842     1.1952       1
                                    1357 rows omitted
```
```julia
X = df[:, 1:4]
y = df[:, 5]
(train_X, train_y), (test_X, test_y) = IAI.split_data(:classification, X, y,
                                                      seed=1)
```

### Optimal Classification Trees

We will use a `GridSearch` to fit an `OptimalTreeClassifier`:

```julia
grid = IAI.GridSearch(
    IAI.OptimalTreeClassifier(
        random_seed=1,
    ),
    max_depth=1:5,
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
```

*(Optimal Trees visualization)*

We can make predictions on new data using `predict`:

```julia
IAI.predict(grid, test_X)
```
```
412-element Vector{Int64}:
 0
 0
 0
 0
 0
 0
 0
 0
 0
 1
 ⋮
 1
 1
 1
 1
 1
 1
 1
 1
 1
```

We can evaluate the quality of the tree using `score` with any of the supported loss functions. For example, the misclassification score on the training set (the proportion of points classified correctly, so higher is better):

```julia
IAI.score(grid, train_X, train_y, criterion=:misclassification)
```
```
0.9989583333333333
```
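For intuition, the misclassification score can be sketched in a few lines of plain Julia (this is illustrative only and independent of the IAI API; the labels below are hypothetical, not output from the tree above):

```julia
# Sketch of the misclassification score: the fraction of points whose
# predicted label matches the true label (higher is better).
# Hypothetical labels for illustration.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1]

accuracy = sum(y_true .== y_pred) / length(y_true)  # 0.8
```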

Or the AUC on the test set:

```julia
IAI.score(grid, test_X, test_y, criterion=:auc)
```
```
0.9956331877729259
```
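As a reminder of what AUC measures, here is an illustrative sketch in plain Julia (not part of the IAI API, and the scores and labels are hypothetical): the AUC is the probability that a randomly chosen positive example is ranked above a randomly chosen negative one, with ties counting half.

```julia
# Sketch of AUC: the probability that a random positive example outranks
# a random negative one (ties count 1/2). Hypothetical scores and labels.
function auc(scores, labels)
    pos = scores[labels .== 1]  # scores of positive examples
    neg = scores[labels .== 0]  # scores of negative examples
    total = 0.0
    for p in pos, n in neg
        total += p > n ? 1.0 : p == n ? 0.5 : 0.0
    end
    return total / (length(pos) * length(neg))
end

auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```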

We can also plot the ROC curve on the test set:

```julia
IAI.ROCCurve(grid, test_X, test_y, positive_label=1)
```

*(ROC curve visualization)*

### Optimal Classification Trees with Hyperplanes

To use Optimal Classification Trees with hyperplane splits (OCT-H), you should set the `hyperplane_config` parameter:

```julia
grid = IAI.GridSearch(
    IAI.OptimalTreeClassifier(
        random_seed=1,
        max_depth=2,
        hyperplane_config=(sparsity=:all,),
    ),
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
```

*(Optimal Trees visualization)*

Now we can find the performance on the test set with hyperplanes:

```julia
IAI.score(grid, test_X, test_y, criterion=:auc)
```
```
1.0
```

It seems that a very small tree with hyperplane splits is able to model this dataset perfectly.
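To see why hyperplane splits can be more powerful, compare the two split types in a plain-Julia sketch (the thresholds and weights below are hypothetical, not taken from the fitted tree):

```julia
# An axis-aligned (parallel) split tests a single feature against a
# threshold; a hyperplane split tests a weighted combination of features.
# All thresholds and weights here are hypothetical.
x = (variance = 0.3, skewness = 4.0)

axis_split(x)       = x.variance < 0.5                            # OCT
hyperplane_split(x) = 0.8 * x.variance + 0.3 * x.skewness < 1.0   # OCT-H

axis_split(x), hyperplane_split(x)  # (true, false)
```

A single hyperplane can separate classes that lie on either side of a slanted boundary, which an axis-aligned tree can only approximate with many staircase-like splits.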

### Optimal Classification Trees with Logistic Regression in Leaves

You can also train a tree with logistic regression models fitted in the leaves after fixing the tree structure, using the `refit_learner` parameter:

```julia
grid = IAI.GridSearch(
    IAI.OptimalTreeClassifier(
        random_seed=1,
        minbucket=10,
        refit_learner=IAI.GridSearch(
            IAI.OptimalFeatureSelectionClassifier(),
            sparsity=1:3,
        ),
    ),
    max_depth=1:2,
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
```

*(Optimal Trees visualization)*

```julia
IAI.score(grid, test_X, test_y, criterion=:auc)
```
```
0.9979955616006876
```

It seems that a tree with a single split and logistic regressions in the leaves is able to model this dataset almost perfectly.
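As a rough sketch of what happens inside each leaf (plain Julia with hypothetical coefficients, not the fitted model): the leaf's logistic regression maps a sparse linear combination of features through the logistic function to produce a class-1 probability.

```julia
# In each leaf, a logistic regression converts a (sparse) linear score
# into a probability for class 1. The coefficients here are hypothetical.
sigmoid(z) = 1 / (1 + exp(-z))

# e.g. a leaf whose refitted model uses only `variance` (sparsity = 1)
prob_class1(variance) = sigmoid(-2.0 * variance + 0.5)

prob_class1(3.62)   # low probability:  likely class 0
prob_class1(-2.54)  # high probability: likely class 1
```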

For more details on classification trees with logistic regression, see the guide to classification trees with logistic regression.