Quick Start Guide: Multi-task Optimal Classification Trees
This is an R version of the corresponding OptimalTrees quick start guide.
In this example we will use Optimal Classification Trees (OCT) on the acute inflammations dataset to solve a multi-task classification problem.
This guide assumes you are familiar with OCTs and focuses on aspects that are unique to the multi-task setting. For a general introduction to OCTs, please refer to the OCT quickstart guide.
First we load the data:
# The file is tab-separated, UTF-16LE encoded, and uses "," as the decimal mark
df <- read.table("diagnosis.data",
    sep = "\t",
    fileEncoding = "UTF-16LE",
    dec = ",",
    col.names = c("temp", "nausea", "lumbar_pain", "urine_pushing",
                  "micturition_pains", "burning", "inflammation", "nephritis"),
    stringsAsFactors = TRUE
)
temp nausea lumbar_pain urine_pushing micturition_pains burning inflammation
1 35.5 no yes no no no no
2 35.9 no no yes yes yes yes
3 35.9 no yes no no no no
4 36.0 no no yes yes yes yes
5 36.0 no yes no no no no
6 36.0 no yes no no no no
7 36.2 no no yes yes yes yes
nephritis
1 no
2 no
3 no
4 no
5 no
6 no
7 no
[ reached 'max' / getOption("max.print") -- omitted 113 rows ]
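The `dec` and `stringsAsFactors` arguments matter here because the file writes temperatures with a decimal comma and stores the symptoms as yes/no text. The following minimal sketch shows their effect on a small made-up file (the rows and the three-column layout are illustrative assumptions, and the UTF-16LE encoding of the real file is left out to keep the example self-contained):

```r
# Toy stand-in for diagnosis.data: tab-separated, comma as decimal mark.
# The values below are invented for illustration only.
tmp <- tempfile(fileext = ".data")
writeLines(c("35,5\tno\tyes", "37,9\tyes\tno", "40,0\tno\tyes"), tmp)

toy <- read.table(tmp,
    sep = "\t",
    dec = ",",
    col.names = c("temp", "nausea", "lumbar_pain"),
    stringsAsFactors = TRUE
)

# temp is parsed as numeric, the yes/no columns become factors
str(toy)
```

Without `dec = ","`, the temperature column would not be parsed as a number, so it could not be used as a numeric feature.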
The goal is to predict two diseases of the urinary system: acute inflammation of the urinary bladder and acute nephritis. We therefore separate these two targets from the rest of the features, and split the data into training and test sets:
targets <- c("inflammation", "nephritis")
X <- df[, !names(df) %in% targets]
y <- df[targets]
split <- iai::split_data("multi_classification", X, y, seed = 1)
train_X <- split$train$X
train_y <- split$train$y
test_X <- split$test$X
test_y <- split$test$y
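The column-selection idiom used to separate the targets can be tried in isolation. This is a minimal sketch on an invented four-row data frame (the `toy_` names and values are hypothetical, not taken from the dataset); it also tabulates each target, which is a quick way to check the class balance of every task before fitting:

```r
# Invented data with two features and the two target columns
toy <- data.frame(
    temp = c(35.5, 37.9, 40.0, 41.2),
    nausea = factor(c("no", "yes", "no", "yes")),
    inflammation = factor(c("no", "yes", "yes", "no")),
    nephritis = factor(c("no", "no", "yes", "yes"))
)

targets <- c("inflammation", "nephritis")
toy_X <- toy[, !names(toy) %in% targets]  # every column that is not a target
toy_y <- toy[targets]                     # data frame with one column per task

names(toy_X)
names(toy_y)

# Class counts for each task
lapply(toy_y, table)
```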
Multi-task Optimal Classification Trees
We will use a grid_search to fit an optimal_tree_multi_classifier:
grid <- iai::grid_search(
    iai::optimal_tree_multi_classifier(
        random_seed = 1
    ),
    # Candidate values of max_depth to search over
    max_depth = 1:5
)
iai::fit(grid, train_X, train_y)
iai::get_learner(grid)
We can make predictions on new data using predict:
iai::predict(grid, test_X)
$inflammation
[1] "no" "no" "yes" "no" "yes" "no" "yes" "yes" "no" "yes" "yes" "no"
[13] "yes" "yes" "no" "yes" "yes" "yes" "yes" "no" "no" "no" "no" "yes"
[25] "no" "yes" "no" "no" "yes" "yes" "no" "no" "yes" "no" "yes" "no"
$nephritis
[1] "no" "no" "no" "no" "no" "no" "no" "no" "no" "no" "no" "no"
[13] "no" "no" "no" "no" "no" "no" "no" "yes" "yes" "no" "yes" "yes"
[25] "yes" "yes" "no" "yes" "yes" "yes" "yes" "yes" "yes" "yes" "yes" "yes"
This returns a list containing the predictions for each of the tasks, which can easily be converted to a data frame:
as.data.frame(iai::predict(grid, test_X))
inflammation nephritis
1 no no
2 no no
3 yes no
4 no no
5 yes no
6 no no
7 yes no
8 yes no
9 no no
10 yes no
11 yes no
12 no no
13 yes no
14 yes no
15 no no
16 yes no
17 yes no
18 yes no
19 yes no
20 no yes
21 no yes
22 no no
23 no yes
24 yes yes
25 no yes
26 yes yes
27 no no
28 no yes
29 yes yes
30 yes yes
[ reached 'max' / getOption("max.print") -- omitted 6 rows ]
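The conversion works because the result is a named list of equal-length vectors, one per task. As a minimal sketch with hand-written predictions (the `toy_pred` values are invented and stand in for the output of `iai::predict`), the resulting data frame can then be used with ordinary base R tools, for example to cross-tabulate the two predicted diagnoses:

```r
# Hypothetical predictions in the same shape as the list shown above
toy_pred <- list(
    inflammation = c("no", "yes", "yes"),
    nephritis = c("no", "no", "yes")
)

# One column per task, one row per sample
toy_pred_df <- as.data.frame(toy_pred)
toy_pred_df

# How often each combination of predicted diagnoses occurs
table(
    inflammation = toy_pred_df$inflammation,
    nephritis = toy_pred_df$nephritis
)
```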
We can also generate the predictions for a specific task by passing the task label:
iai::predict(grid, test_X, "inflammation")
[1] "no" "no" "yes" "no" "yes" "no" "yes" "yes" "no" "yes" "yes" "no"
[13] "yes" "yes" "no" "yes" "yes" "yes" "yes" "no" "no" "no" "no" "yes"
[25] "no" "yes" "no" "no" "yes" "yes" "no" "no" "yes" "no" "yes" "no"
We can evaluate the quality of the tree using score with any of the supported loss functions. For multi-task problems, the returned score is the average of the scores of the individual tasks:
iai::score(grid, test_X, test_y, criterion = "misclassification")
[1] 1
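To make the averaging concrete, here is a minimal base R sketch that computes a per-task score by hand and then averages across tasks. It assumes the per-task score is plain accuracy (the fraction of correct predictions), which is an assumption about how the criterion is reported rather than something stated in this guide, and it uses invented labels rather than the test set:

```r
# Invented true labels and predictions for two tasks
toy_truth <- data.frame(
    inflammation = c("no", "yes", "yes", "no"),
    nephritis = c("no", "no", "yes", "yes")
)
toy_task_pred <- list(
    inflammation = c("no", "yes", "no", "no"),   # one mistake
    nephritis = c("no", "no", "yes", "yes")      # all correct
)

# Accuracy of each task separately (assumed per-task score)
task_accuracy <- sapply(names(toy_task_pred), function(task) {
    mean(toy_task_pred[[task]] == toy_truth[[task]])
})
task_accuracy

# Multi-task score: the average of the per-task scores
mean(task_accuracy)
```

In this toy case the per-task accuracies are 0.75 and 1, so the averaged score is 0.875.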
We can also calculate the score of a single task by specifying this task:
iai::score(grid, test_X, test_y, "nephritis", criterion = "auc")
[1] 1
The other standard API functions (e.g. predict_proba, roc_curve) can be called as normal. As above, by default they will generate output for all tasks, and a task can be specified to return information for a single task.
Extensions
The standard OCT extensions (e.g. hyperplane splits, logistic regression) are also available in the multi-task setting and controlled in the usual way.
For instance, we can use Optimal Classification Trees with hyperplane splits:
grid <- iai::grid_search(
    iai::optimal_tree_multi_classifier(
        random_seed = 1,
        max_depth = 2,
        hyperplane_config = list(sparsity = "all")
    )
)
iai::fit(grid, train_X, train_y)
iai::get_learner(grid)