Quick Start Guide: Optimal Classification Trees
This is an R version of the corresponding OptimalTrees quick start guide.
In this example we will use Optimal Classification Trees (OCT) on the banknote authentication dataset. First we load in the data and split into training and test datasets:
df <- read.table("data_banknote_authentication.txt", sep = ",",
                 col.names = c("variance", "skewness", "curtosis", "entropy",
                               "class"))
variance skewness curtosis entropy class
1 3.62160 8.6661 -2.80730 -0.44699 0
2 4.54590 8.1674 -2.45860 -1.46210 0
3 3.86600 -2.6383 1.92420 0.10645 0
4 3.45660 9.5228 -4.01120 -3.59440 0
5 0.32924 -4.4552 4.57180 -0.98880 0
6 4.36840 9.6718 -3.96060 -3.16250 0
7 3.59120 3.0129 0.72888 0.56421 0
8 2.09220 -6.8100 8.46360 -0.60216 0
9 3.20320 5.7588 -0.75345 -0.61251 0
10 1.53560 9.1772 -2.27180 -0.73535 0
11 1.22470 8.7779 -2.21350 -0.80647 0
12 3.98990 -2.7066 2.39460 0.86291 0
[ reached 'max' / getOption("max.print") -- omitted 1360 rows ]
X <- df[, 1:4]
y <- df[, 5]
split <- iai::split_data("classification", X, y, seed = 1)
train_X <- split$train$X
train_y <- split$train$y
test_X <- split$test$X
test_y <- split$test$y
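As a quick sanity check, we can confirm how the rows were divided. (This sketch assumes split_data's default train_proportion is 0.7, which can be overridden with the train_proportion argument; verify both against your installed version.)

```r
# Confirm how split_data divided the rows between training and test.
# The train_proportion argument (assumed default 0.7) controls this ratio.
nrow(train_X)            # number of training rows
nrow(test_X)             # number of test rows
nrow(train_X) / nrow(X)  # fraction of data used for training
```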
Optimal Classification Trees
We will use a grid_search to fit an optimal_tree_classifier:
grid <- iai::grid_search(
  iai::optimal_tree_classifier(
    random_seed = 1
  ),
  max_depth = 1:5
)
iai::fit(grid, train_X, train_y)
iai::get_learner(grid)
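After fitting, the grid object records which parameter combination performed best. Assuming the R interface mirrors the Julia API's get_best_params (treat the name as an assumption to verify), we can inspect the selected max_depth:

```r
# Inspect the max_depth value selected by the grid search
# (get_best_params is assumed to mirror the Julia function of the same name)
iai::get_best_params(grid)
```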
We can make predictions on new data using predict:
iai::predict(grid, test_X)
[1] 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[39] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[ reached getOption("max.print") -- omitted 352 entries ]
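If predicted class probabilities are needed rather than hard labels (for example, to apply a custom decision threshold), the package also provides predict_proba (assumed here to mirror the Julia function of the same name):

```r
# Probability of each class for every test row; one column per class label
# (predict_proba is an assumed name matching the Julia API)
probs <- iai::predict_proba(grid, test_X)
head(probs)
```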
We can evaluate the quality of the tree using score with any of the supported loss functions. For example, the misclassification score on the training set (score is normalized so that higher is better, so this is the proportion of points classified correctly):
iai::score(grid, train_X, train_y, criterion = "misclassification")
[1] 0.9989583
Or the AUC on the test set:
iai::score(grid, test_X, test_y, criterion = "auc")
[1] 0.9956332
We can also plot the ROC curve on the test set:
iai::roc_curve(grid, test_X, test_y, positive_label = 1)
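The returned ROC curve object can also be saved for later inspection. This sketch assumes write_html accepts ROC curve objects in the R interface as it does in the Julia API; treat that as an assumption to verify.

```r
# Save the ROC curve as a standalone interactive HTML page
# (write_html accepting ROC curve objects is assumed from the Julia API)
roc <- iai::roc_curve(grid, test_X, test_y, positive_label = 1)
iai::write_html("roc_curve.html", roc)
```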
Optimal Classification Trees with Hyperplanes
To use Optimal Classification Trees with hyperplane splits (OCT-H), set the hyperplane_config parameter:
grid <- iai::grid_search(
  iai::optimal_tree_classifier(
    random_seed = 1,
    max_depth = 2,
    hyperplane_config = list(sparsity = "all")
  )
)
iai::fit(grid, train_X, train_y)
iai::get_learner(grid)
Now we can find the performance on the test set with hyperplanes:
iai::score(grid, test_X, test_y, criterion = "auc")
[1] 1
It seems that a very small tree with hyperplane splits is able to model this dataset perfectly.
Optimal Classification Trees with Logistic Regression in Leaves
You can also train a tree that fits a logistic regression model in each leaf after fixing the tree structure, using the refit_learner parameter:
grid <- iai::grid_search(
  iai::optimal_tree_classifier(
    random_seed = 1,
    minbucket = 10,
    refit_learner = iai::grid_search(
      iai::optimal_feature_selection_classifier(),
      sparsity = 1:3
    )
  ),
  max_depth = 1:2
)
iai::fit(grid, train_X, train_y)
iai::get_learner(grid)
iai::score(grid, test_X, test_y, criterion = "auc")
[1] 0.9979956
It seems that a tree with a single split and logistic regressions in the leaves is able to model this dataset almost perfectly.
For more details on classification trees with logistic regression, see the guide to classification trees with logistic regression.