Parameter Tuning

We provide the GridSearch object as an easy way of tuning parameters. The grid search takes a learner and one or more parameters, each with a range of values to search over. The learner is trained on the training data with each combination of parameter values, and the combination that gives the best out-of-sample performance is selected. The learner is then re-trained on the entire data set with this best combination to produce the final model.
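Conceptually, the search procedure above can be sketched in plain Julia as follows. This is a simplified illustration only, not the actual IAI implementation; the function and argument names are hypothetical:

```julia
# Conceptual sketch: score every parameter combination out-of-sample,
# pick the best, then refit on the full data set.
function grid_search_sketch(train_and_score, refit, param_combinations)
    best_params, best_score = nothing, -Inf
    for params in param_combinations
        # train on the training split and score on the validation split
        score = train_and_score(params)
        if score > best_score
            best_params, best_score = params, score
        end
    end
    # refit on the entire data set with the winning combination
    return refit(best_params), best_params
end
```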

Using the Grid Search

You construct a GridSearch by passing a learner along with the parameters that you want to validate over. For example, here we create a grid search to tune the max_depth parameter of an IAI.OptimalTreeClassifier:

grid = IAI.GridSearch(
  IAI.OptimalTreeClassifier(),
  max_depth=1:4,
)
GridSearch - Unfitted OptimalTreeClassifier

GridSearch Params:
  (max_depth=1,)
  (max_depth=2,)
  (max_depth=3,)
  (max_depth=4,)

The grid of parameter values to search over can be specified using keyword arguments to GridSearch as above, or alternatively you can pass a Dict or NamedTuple of key/value pairs to specify the search parameters.
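For example, the same grid as above could be specified with a NamedTuple or a Dict (a sketch using the constructors shown in this section):

```julia
# Keyword-argument, NamedTuple, and Dict forms all specify the same grid
grid = IAI.GridSearch(IAI.OptimalTreeClassifier(), (max_depth=1:4,))
grid = IAI.GridSearch(IAI.OptimalTreeClassifier(), Dict(:max_depth => 1:4))
```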

To conduct the grid search on the data, use the IAI.fit! function. This will automatically split the data into training and validation sets before searching for the best parameters:

using CSV, DataFrames
df = CSV.read("iris.csv", DataFrame)
X = df[:, 1:4]
y = df[:, 5]
IAI.fit!(grid, X, y)
All Grid Results:

│ Row │ max_depth │ cp         │ train_score │ valid_score │ rank_valid_score │
│     │ Int64     │ Float64    │ Float64     │ Float64     │ Int64            │
├─────┼───────────┼────────────┼─────────────┼─────────────┼──────────────────┤
│ 1   │ 1         │ 0.25       │ 0.666667    │ 0.666667    │ 4                │
│ 2   │ 2         │ 0.00714286 │ 0.961905    │ 0.955556    │ 3                │
│ 3   │ 3         │ 0.00714286 │ 0.990476    │ 0.964444    │ 2                │
│ 4   │ 4         │ 0.00714286 │ 1.0         │ 0.977778    │ 1                │

Best Params:
  cp => 0.007142857142857143
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalLength < 2.45
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalLength < 4.95
      4) Split: PetalWidth < 1.65
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Split: SepalWidth < 3.1
          7) Predict: virginica (100.00%), [0,0,6], 6 points, error 0
          8) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
      9) Predict: virginica (95.65%), [0,2,44], 46 points, error 2

Alternatively, you can use a pre-specified split of the data into training and validation sets, such as one created with IAI.split_data. This is especially useful when comparing many methods, as it ensures each is trained using the same training/validation split:

(train_X, train_y), (valid_X, valid_y) = IAI.split_data(:classification, X, y)
IAI.fit!(grid, train_X, train_y, valid_X, valid_y)
All Grid Results:

│ Row │ max_depth │ cp         │ train_score │ valid_score │ rank_valid_score │
│     │ Int64     │ Float64    │ Float64     │ Float64     │ Int64            │
├─────┼───────────┼────────────┼─────────────┼─────────────┼──────────────────┤
│ 1   │ 1         │ 0.25       │ 0.666667    │ 0.666667    │ 4                │
│ 2   │ 2         │ 0.00714286 │ 0.961905    │ 0.955556    │ 3                │
│ 3   │ 3         │ 0.00714286 │ 0.990476    │ 0.964444    │ 2                │
│ 4   │ 4         │ 0.00714286 │ 1.0         │ 0.977778    │ 1                │

Best Params:
  cp => 0.007142857142857143
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalLength < 2.45
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalLength < 4.95
      4) Split: PetalWidth < 1.65
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Split: SepalWidth < 3.1
          7) Predict: virginica (100.00%), [0,0,6], 6 points, error 0
          8) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
      9) Predict: virginica (95.65%), [0,2,44], 46 points, error 2

After fitting, you can get the best learner using IAI.get_learner:

IAI.get_learner(grid)
(displays an interactive Optimal Trees visualization of the best learner)

You can get the best parameter combination using IAI.get_best_params:

IAI.get_best_params(grid)
Dict{Symbol,Any} with 2 entries:
  :cp        => 0.00714286
  :max_depth => 4
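Since the result is a Dict keyed by parameter symbols, one possible use is splatting it as keyword arguments to construct a standalone learner with the tuned settings. This is a sketch; whether it fits your workflow depends on the learner:

```julia
# Build a fresh learner configured with the tuned parameters
best = IAI.get_best_params(grid)
lnr = IAI.OptimalTreeClassifier(; best...)
IAI.fit!(lnr, X, y)
```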

The full results of the grid search process are accessible via IAI.get_grid_results:

IAI.get_grid_results(grid)
4×5 DataFrames.DataFrame
│ Row │ max_depth │ cp         │ train_score │ valid_score │ rank_valid_score │
│     │ Int64     │ Float64    │ Float64     │ Float64     │ Int64            │
├─────┼───────────┼────────────┼─────────────┼─────────────┼──────────────────┤
│ 1   │ 1         │ 0.25       │ 0.666667    │ 0.666667    │ 4                │
│ 2   │ 2         │ 0.00714286 │ 0.961905    │ 0.955556    │ 3                │
│ 3   │ 3         │ 0.00714286 │ 0.990476    │ 0.964444    │ 2                │
│ 4   │ 4         │ 0.00714286 │ 1.0         │ 0.977778    │ 1                │

Info

You can also use the fitted grid in any of the API functions where you would normally use a learner (e.g. IAI.predict, IAI.score, etc.).
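For example (a sketch, assuming the grid has been fitted as above):

```julia
# The fitted grid delegates to its best learner
y_pred = IAI.predict(grid, X)  # predictions from the best model
acc = IAI.score(grid, X, y)    # evaluate the best model on (X, y)
```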

By default, the criterion used to evaluate the models and select the best parameter combination is the same as the criterion used to train the models. You can specify a different criterion using the validation_criterion keyword argument when fitting:

IAI.fit!(grid, X, y, validation_criterion=:gini)
All Grid Results:

│ Row │ max_depth │ cp         │ train_score │ valid_score │ rank_valid_score │
│     │ Int64     │ Float64    │ Float64     │ Float64     │ Int64            │
├─────┼───────────┼────────────┼─────────────┼─────────────┼──────────────────┤
│ 1   │ 1         │ 0.25       │ 0.666667    │ 0.5         │ 4                │
│ 2   │ 2         │ 0.00714286 │ 0.961905    │ 0.885022    │ 3                │
│ 3   │ 3         │ 0.0047619  │ 0.990476    │ 0.922944    │ 2                │
│ 4   │ 4         │ 0.00357143 │ 1.0         │ 0.966667    │ 1                │

Best Params:
  cp => 0.0035714285714285713
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 1.55
    2) Split: PetalLength < 2.6
      3) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
      4) Split: PetalLength < 4.95
        5) Predict: versicolor (100.00%), [0,45,0], 45 points, error 0
        6) Predict: virginica (100.00%), [0,0,3], 3 points, error 0
    7) Split: PetalLength < 4.85
      8) Split: SepalWidth < 3.1
        9) Predict: virginica (100.00%), [0,0,3], 3 points, error 0
        10) Predict: versicolor (100.00%), [0,3,0], 3 points, error 0
      11) Split: PetalWidth < 1.75
        12) Split: SepalLength < 6.95
          13) Predict: versicolor (100.00%), [0,2,0], 2 points, error 0
          14) Predict: virginica (100.00%), [0,0,1], 1 points, error 0
        15) Predict: virginica (100.00%), [0,0,43], 43 points, error 0

Cross-validation

You can also conduct tuning using k-fold cross-validation rather than a single training/validation split by using IAI.fit_cv!:

IAI.fit_cv!(grid, X, y, n_folds=3)
All Grid Results:

│ Row │ max_depth │ cp        │ split1_train_score │ split2_train_score │
│     │ Int64     │ Float64   │ Float64            │ Float64            │
├─────┼───────────┼───────────┼────────────────────┼────────────────────┤
│ 1   │ 1         │ 0.25      │ 0.666667           │ 0.666667           │
│ 2   │ 2         │ 0.0225045 │ 0.959596           │ 0.960784           │
│ 3   │ 3         │ 0.0222816 │ 0.989899           │ 1.0                │
│ 4   │ 4         │ 0.0110294 │ 1.0                │ 1.0                │

│ Row │ split3_train_score │ mean_train_score │ std_train_score │
│     │ Float64            │ Float64          │ Float64         │
├─────┼────────────────────┼──────────────────┼─────────────────┤
│ 1   │ 0.666667           │ 0.666667         │ 1.35974e-16     │
│ 2   │ 0.969697           │ 0.963359         │ 0.00552084      │
│ 3   │ 0.979798           │ 0.989899         │ 0.010101        │
│ 4   │ 1.0                │ 1.0              │ 0.0             │

│ Row │ split1_valid_score │ split2_valid_score │ split3_valid_score │
│     │ Float64            │ Float64            │ Float64            │
├─────┼────────────────────┼────────────────────┼────────────────────┤
│ 1   │ 0.666667           │ 0.666667           │ 0.666667           │
│ 2   │ 0.95098            │ 0.93125            │ 0.941176           │
│ 3   │ 0.960784           │ 0.935417           │ 0.931373           │
│ 4   │ 0.956863           │ 0.9375             │ 0.966667           │

│ Row │ mean_valid_score │ std_valid_score │ split1_cp │ split2_cp │
│     │ Float64          │ Float64         │ Float64   │ Float64   │
├─────┼──────────────────┼─────────────────┼───────────┼───────────┤
│ 1   │ 0.666667         │ 0.0             │ 0.25      │ 0.25      │
│ 2   │ 0.941136         │ 0.00986526      │ 0.212121  │ 0.220588  │
│ 3   │ 0.942525         │ 0.0159422       │ 0.223485  │ 0.0147059 │
│ 4   │ 0.953676         │ 0.0148421       │ 0.227273  │ 0.0110294 │

│ Row │ split3_cp  │ mean_cp   │ std_cp   │ overall_valid_score │
│     │ Float64    │ Float64   │ Float64  │ Float64             │
├─────┼────────────┼───────────┼──────────┼─────────────────────┤
│ 1   │ 0.25       │ 0.25      │ 0.0      │ 0.666667            │
│ 2   │ 0.0151515  │ 0.149287  │ 0.116242 │ 0.941136            │
│ 3   │ 0.030303   │ 0.0894979 │ 0.116298 │ 0.942525            │
│ 4   │ 0.00378788 │ 0.0806967 │ 0.12699  │ 0.949101            │

│ Row │ rank_valid_score │
│     │ Int64            │
├─────┼──────────────────┤
│ 1   │ 4                │
│ 2   │ 3                │
│ 3   │ 2                │
│ 4   │ 1                │

Best Params:
  cp => 0.011029411764705881
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 0.8
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalWidth < 1.65
      4) Split: PetalLength < 4.95
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Predict: virginica (80.00%), [0,1,4], 5 points, error 1
      7) Predict: virginica (95.83%), [0,2,46], 48 points, error 2

Multiple Parameter Grids

If you would like to search over multiple disjoint parameter grids, with different parameter combinations in each grid, you can pass a vector of Dicts or NamedTuples, one for each grid. For example, OptImpute supports tuning which imputation method is used, and each method has its own parameters to tune that the other methods do not share. We can achieve this with the following GridSearch setup:

grid = IAI.GridSearch(
    IAI.ImputationLearner(),
    [
        (method=:opt_knn, knn_k=[5, 10, 15]),
        (method=:opt_svm, svm_c=[0.1, 1, 10]),
        (method=:opt_tree,),
    ],
)
GridSearch - Unfitted OptKNNImputationLearner

GridSearch Params:
  (method=opt_knn,knn_k=5,)
  (method=opt_knn,knn_k=10,)
  (method=opt_knn,knn_k=15,)
  (method=opt_svm,svm_c=0.1,)
  (method=opt_svm,svm_c=1.0,)
  (method=opt_svm,svm_c=10.0,)
  (method=opt_tree,)

We can see that the two grids are expanded separately, and the grid search will now find the best method to use along with the best corresponding parameter for that method.

Imputation

Note that for imputation you can use IAI.fit_transform! and IAI.fit_transform_cv! in the same way as described above for IAI.fit! and IAI.fit_cv!.
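As a sketch (assuming an imputation grid like the one above and a feature matrix X containing missing values):

```julia
# Tune the imputation method and fill missing values in one step
X_imputed = IAI.fit_transform!(grid, X)

# Or tune with k-fold cross-validation instead of a single split
X_imputed = IAI.fit_transform_cv!(grid, X, n_folds=3)
```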