Quick Start Guide: Optimal Regression Trees
In this example we will use Optimal Regression Trees (ORT) on the yacht hydrodynamics dataset. First, we load the data and split it into training and test datasets:
using CSV, DataFrames
df = CSV.read(
    "yacht_hydrodynamics.data", DataFrame,
    delim=' ',            # file uses ' ' as separators rather than ','
    ignorerepeated=true,  # sometimes columns are separated by more than one ' '
    header=[:position, :prismatic, :length_displacement, :beam_draught,
            :length_beam, :froude, :resistance],
)
308×7 DataFrame
 Row │ position  prismatic  length_displacement  beam_draught  length_beam  fr ⋯
     │ Float64   Float64    Float64              Float64       Float64      Fl ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │     -2.3      0.568                 4.78          3.99         3.17     ⋯
   2 │     -2.3      0.568                 4.78          3.99         3.17
   3 │     -2.3      0.568                 4.78          3.99         3.17
   4 │     -2.3      0.568                 4.78          3.99         3.17
   5 │     -2.3      0.568                 4.78          3.99         3.17     ⋯
   6 │     -2.3      0.568                 4.78          3.99         3.17
   7 │     -2.3      0.568                 4.78          3.99         3.17
   8 │     -2.3      0.568                 4.78          3.99         3.17
  ⋮  │    ⋮          ⋮               ⋮                ⋮            ⋮          ⋱
 302 │     -2.3      0.6                   4.34          4.23         2.73     ⋯
 303 │     -2.3      0.6                   4.34          4.23         2.73
 304 │     -2.3      0.6                   4.34          4.23         2.73
 305 │     -2.3      0.6                   4.34          4.23         2.73
 306 │     -2.3      0.6                   4.34          4.23         2.73     ⋯
 307 │     -2.3      0.6                   4.34          4.23         2.73
 308 │     -2.3      0.6                   4.34          4.23         2.73
                                                  2 columns and 293 rows omitted
# Separate the features from the target (residuary resistance)
X = df[:, 1:(end - 1)]
y = df[:, end]
(train_X, train_y), (test_X, test_y) = IAI.split_data(:regression, X, y, seed=1)
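As a quick sanity check, we can confirm how many points ended up in each split (assuming the default behavior of split_data, which holds out roughly 30% of the data for testing, we expect 216 training and 92 test points here):

size(train_X, 1), size(test_X, 1)
(216, 92)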
Optimal Regression Trees
We will use a GridSearch to fit an OptimalTreeRegressor:
grid = IAI.GridSearch(
    IAI.OptimalTreeRegressor(
        random_seed=123,
    ),
    max_depth=1:5,  # validate over tree depths from 1 to 5
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
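Before predicting, we can also check which parameter combination the validation procedure selected, for example:

# Query the max_depth chosen by the grid search
IAI.get_best_params(grid)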
We can make predictions on new data using predict:
IAI.predict(grid, test_X)
92-element Array{Float64,1}:
0.7884042553191487
0.7884042553191487
0.7884042553191487
3.909555555555555
3.909555555555555
13.35666666666667
22.072222222222226
0.7884042553191487
0.7884042553191487
0.7884042553191487
⋮
3.909555555555555
7.9833333333333325
13.35666666666667
0.7884042553191487
0.7884042553191487
3.909555555555555
13.35666666666667
34.575384615384614
49.91583333333334
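To relate these predictions to the scores reported next, here is a sketch of computing the out-of-sample $R^2$ by hand. Note that score may use a slightly different baseline (for example, the training-set mean), so treat this as an approximation:

using Statistics

pred_y = IAI.predict(grid, test_X)
ss_res = sum((test_y .- pred_y) .^ 2)        # residual sum of squares
ss_tot = sum((test_y .- mean(test_y)) .^ 2)  # total sum of squares
1 - ss_res / ss_tot                          # approximates the test-set score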
We can evaluate the quality of the tree using score with any of the supported loss functions. For example, the $R^2$ on the training set:
IAI.score(grid, train_X, train_y, criterion=:mse)
0.9912939792003822
Or on the test set:
IAI.score(grid, test_X, test_y, criterion=:mse)
0.9885237962078779
Optimal Regression Trees with Hyperplanes
To use Optimal Regression Trees with hyperplane splits (ORT-H), you should set the hyperplane_config parameter:
grid = IAI.GridSearch(
    IAI.OptimalTreeRegressor(
        random_seed=123,
        hyperplane_config=(sparsity=:all,),  # allow all variables in each split
    ),
    max_depth=1:4,
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
Now we can find the performance on the test set with hyperplanes:
IAI.score(grid, test_X, test_y, criterion=:mse)
0.9861182667312003
It looks like the addition of hyperplane splits did not help much here. It seems that the main variable affecting the target is froude, so perhaps allowing multiple variables per split is not that useful for this dataset.
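One way to check this intuition is to inspect the variable importance of the fitted tree, which ranks the features by their contribution to the splits:

# Rank features by importance in the fitted learner;
# froude should dominate if the intuition above is correct
IAI.variable_importance(IAI.get_learner(grid))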
Optimal Regression Trees with Linear Predictions
To use Optimal Regression Trees with linear regression in the leaves (ORT-L), you should set the regression_sparsity parameter to :all and use the regression_lambda parameter to control the degree of regularization.
grid = IAI.GridSearch(
    IAI.OptimalTreeRegressor(
        random_seed=123,
        max_depth=2,
        regression_sparsity=:all,  # fit linear models in the leaves
    ),
    regression_lambda=[0.005, 0.01, 0.05],  # validate the regularization strength
)
IAI.fit!(grid, train_X, train_y)
IAI.get_learner(grid)
We can find the performance on the test set:
IAI.score(grid, test_X, test_y, criterion=:mse)
0.984222547936994
We can see that the ORT-L model is much smaller than the models with constant predictions and has similar performance.
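As a rough way to quantify "much smaller", we can count the nodes in the fitted ORT-L tree and compare against the deeper constant-prediction trees above (a sketch using get_num_nodes on the fitted learner):

# A depth-2 tree has at most 7 nodes, versus up to 31 for a depth-5 tree
IAI.get_num_nodes(IAI.get_learner(grid))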