Parameter Tuning

We provide the GridSearch object as an easy way of tuning parameters. A grid search takes a learner and one or more parameters, each with a range of values to search over. For each combination of parameter values, the learner is trained on the training data and evaluated on held-out validation data, and the combination with the best out-of-sample performance is selected. The learner is then re-trained on the entire data set using this best combination to produce the final model.
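The train/validate/refit procedure described above can be sketched in plain Python (this is an illustration of the general idea, not the IAI implementation; the toy k-NN learner and all names here are hypothetical):

```python
# Toy learner: 1-D k-nearest-neighbors with hyperparameter k.
def train(k, X, y):
    def predict(x):
        # majority vote among the k nearest training points
        nearest = sorted(zip(X, y), key=lambda pair: abs(pair[0] - x))[:k]
        votes = [label for _, label in nearest]
        return max(set(votes), key=votes.count)
    return predict

def score(model, X, y):
    # out-of-sample accuracy
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def grid_search(param_values, train_X, train_y, valid_X, valid_y):
    # 1. train one model per candidate parameter value
    # 2. keep the value with the best validation score
    # 3. refit that value on the entire data set
    scores = {p: score(train(p, train_X, train_y), valid_X, valid_y)
              for p in param_values}
    best = max(scores, key=scores.get)
    final = train(best, train_X + valid_X, train_y + valid_y)
    return best, final, scores

train_X = [0, 1, 2, 3, 10, 11, 12]
train_y = [0, 0, 1, 0, 1, 1, 1]   # note the noisy label at x = 2
valid_X = [2.2, 0.5, 11.5]
valid_y = [0, 0, 1]
best_k, final_model, scores = grid_search(
    [1, 3, 5], train_X, train_y, valid_X, valid_y)
# k = 1 overfits the noisy point; k = 3 wins on the validation set
```

The key design point mirrored here is that the validation set is used only to rank the parameter values; the winning value is then refit on all of the data so no observations are wasted in the final model.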

You construct a GridSearch by passing a learner along with the parameters that you want to validate over. For example, here we create a grid search to tune the max_depth parameter of an OptimalTreeClassifier:

grid = IAI.GridSearch(
  IAI.OptimalTreeClassifier(
    random_seed=1,
  ),
  max_depth=1:4,
)
GridSearch - Unfitted OptimalTreeClassifier:
  random_seed: 1

GridSearch Params:
  (max_depth=1,)
  (max_depth=2,)
  (max_depth=3,)
  (max_depth=4,)

The grid of parameter values to search over can be specified using keyword arguments to GridSearch as above, or by passing a Dict or NamedTuple of key/value pairs specifying the search parameters.

To conduct the grid search on the data, use the fit! function. This will automatically split the data into training and validation sets before searching for the best parameters:

using CSV, DataFrames
df = CSV.read("iris.csv", DataFrame)
X = df[:, 1:4]
y = df[:, 5]
IAI.fit!(grid, X, y)
All Grid Results:

 Row │ max_depth  cp          train_score  valid_score  rank_valid_score
     │ Int64      Float64     Float64      Float64      Int64
─────┼───────────────────────────────────────────────────────────────────
   1 │         1  0.25           0.666667     0.666667                 4
   2 │         2  0.228571       0.971429     0.911111                 3
   3 │         3  0.0357143      0.980952     0.915556                 2
   4 │         4  0.00357143     0.990476     0.924444                 1

Best Params:
  cp => 0.0035714285714285713
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 0.8
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalLength < 4.95
      4) Split: PetalWidth < 1.65
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Split: SepalWidth < 3.1
          7) Predict: virginica (100.00%), [0,0,6], 6 points, error 0
          8) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
      9) Split: PetalLength < 5.05
        10) Split: SepalLength < 6.5
          11) Predict: virginica (100.00%), [0,0,3], 3 points, error 0
          12) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
        13) Predict: virginica (97.62%), [0,1,41], 42 points, error 0.02381

Alternatively, you can use a pre-specified split of the data into training and validation, such as one created with split_data. This is especially useful when comparing many methods and ensuring that each is trained using the same training/validation split:

(train_X, train_y), (valid_X, valid_y) = IAI.split_data(:classification, X, y,
                                                        seed=1)
IAI.fit!(grid, train_X, train_y, valid_X, valid_y)
All Grid Results:

 Row │ max_depth  cp          train_score  valid_score  rank_valid_score
     │ Int64      Float64     Float64      Float64      Int64
─────┼───────────────────────────────────────────────────────────────────
   1 │         1  0.25           0.666667     0.666667                 4
   2 │         2  0.228571       0.971429     0.911111                 3
   3 │         3  0.0357143      0.980952     0.915556                 2
   4 │         4  0.00357143     0.990476     0.924444                 1

Best Params:
  cp => 0.0035714285714285713
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 0.8
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalLength < 4.95
      4) Split: PetalWidth < 1.65
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Split: SepalWidth < 3.1
          7) Predict: virginica (100.00%), [0,0,6], 6 points, error 0
          8) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
      9) Split: PetalLength < 5.05
        10) Split: SepalLength < 6.5
          11) Predict: virginica (100.00%), [0,0,3], 3 points, error 0
          12) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
        13) Predict: virginica (97.62%), [0,1,41], 42 points, error 0.02381
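A pre-specified split for classification is typically stratified, meaning each class contributes the same fraction of points to the training set. Conceptually it works along these lines (a plain-Python sketch of the idea; the function and arguments are illustrative, not the implementation of split_data):

```python
import random
from collections import defaultdict

def stratified_split(y, train_frac=0.7, seed=1):
    # sample a fixed fraction of each class into the training set so
    # that class proportions are (approximately) preserved
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    train_idx = []
    for idx in by_class.values():
        rng.shuffle(idx)
        train_idx.extend(idx[:round(train_frac * len(idx))])
    taken = set(train_idx)
    valid_idx = [i for i in range(len(y)) if i not in taken]
    return sorted(train_idx), valid_idx

# 10 points of one class, 20 of another: both are split 70/30
y = ["setosa"] * 10 + ["virginica"] * 20
train_idx, valid_idx = stratified_split(y)
```

Fixing the seed makes the split reproducible, which is what allows the same training/validation split to be reused when comparing many methods.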

After fitting, you can get the best learner using get_learner:

IAI.get_learner(grid)
Optimal Trees Visualization

You can get the best parameter combination using get_best_params:

IAI.get_best_params(grid)
Dict{Symbol, Any} with 2 entries:
  :cp        => 0.00357143
  :max_depth => 4

A summary of the grid search results is accessible via get_grid_result_summary:

IAI.get_grid_result_summary(grid)
4×5 DataFrame
 Row │ max_depth  cp          train_score  valid_score  rank_valid_score
     │ Int64      Float64     Float64      Float64      Int64
─────┼───────────────────────────────────────────────────────────────────
   1 │         1  0.25           0.666667     0.666667                 4
   2 │         2  0.228571       0.971429     0.911111                 3
   3 │         3  0.0357143      0.980952     0.915556                 2
   4 │         4  0.00357143     0.990476     0.924444                 1

You can also access the detailed results directly with get_grid_result_details.

Info

You can also use the fitted grid in the API functions where you would normally use a learner on data (e.g. predict, score, etc.).
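The behavior described in the note is a delegation pattern: the fitted grid holds its refitted best learner and forwards learner-style calls to it. A minimal sketch of that pattern in plain Python (class and method names are illustrative, not the IAI API):

```python
class FittedGrid:
    # holds the refitted best model plus the winning parameters
    def __init__(self, best_learner, best_params):
        self.best_learner = best_learner
        self.best_params = best_params

    def predict(self, X):
        return self.best_learner.predict(X)   # forward to the best model

    def score(self, X, y):
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)

class ConstantLearner:
    # trivial stand-in learner that always predicts one class
    def __init__(self, label):
        self.label = label
    def predict(self, X):
        return [self.label] * len(X)

grid = FittedGrid(ConstantLearner("setosa"), {"max_depth": 4})
```

Because every learner-style call is forwarded, code written against a learner works unchanged when handed the fitted grid.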

By default, the criterion used to evaluate the models and select the best parameter combination is the same as the criterion used to train the models. You can specify a different criterion using the validation_criterion keyword argument when fitting:

IAI.fit!(grid, X, y, validation_criterion=:gini)
All Grid Results:

 Row │ max_depth  cp          train_score  valid_score  rank_valid_score
     │ Int64      Float64     Float64      Float64      Int64
─────┼───────────────────────────────────────────────────────────────────
   1 │         1  0.25           0.666667     0.5                      4
   2 │         2  0.228571       0.971429     0.835512                 3
   3 │         3  0.0357143      0.980952     0.841805                 2
   4 │         4  0.00357143     0.990476     0.883843                 1

Best Params:
  cp => 0.0035714285714285713
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 0.8
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalLength < 4.95
      4) Split: PetalWidth < 1.65
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Split: SepalWidth < 3.1
          7) Predict: virginica (100.00%), [0,0,6], 6 points, error 0
          8) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
      9) Split: PetalLength < 5.05
        10) Split: SepalLength < 6.5
          11) Predict: virginica (100.00%), [0,0,3], 3 points, error 0
          12) Predict: versicolor (100.00%), [0,1,0], 1 points, error 0
        13) Predict: virginica (97.62%), [0,1,41], 42 points, error 0.02381
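Changing the validation criterion does not change the fitted models or their validation predictions; it only changes the scoring function used to rank them, which can change which parameters win. The following plain-Python sketch illustrates this with two hypothetical criteria (both are illustrative stand-ins, not IAI's implementations; a probability-based criterion is used in the spirit of :gini, which also looks at predicted class distributions rather than hard labels):

```python
def misclassification_accuracy(preds, probs, y):
    # fraction of correct hard predictions
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def mean_true_class_probability(preds, probs, y):
    # mean probability assigned to the true class (hypothetical criterion)
    return sum(pr[t] for pr, t in zip(probs, y)) / len(y)

# validation predictions of two hypothetical candidate models
truth = [1, 1, 0, 0]
results = {
    "depth_2": ([1, 1, 0, 0],                                      # all correct,
                [{0: 0.4, 1: 0.6}] * 2 + [{0: 0.6, 1: 0.4}] * 2),  # but unconfident
    "depth_4": ([1, 1, 0, 1],                                      # one mistake,
                [{0: 0.1, 1: 0.9}] * 2 + [{0: 0.9, 1: 0.1}] +
                [{0: 0.1, 1: 0.9}]),                               # but confident
}

def best_by(criterion):
    # rank the same candidates under the chosen criterion
    return max(results, key=lambda m: criterion(*results[m], truth))
```

Here `best_by(misclassification_accuracy)` selects "depth_2" while `best_by(mean_true_class_probability)` selects "depth_4", showing why the choice of validation_criterion matters.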

Cross-validation

You can also conduct tuning using k-fold cross-validation rather than a single training/validation split by using fit_cv!:

IAI.fit_cv!(grid, X, y, n_folds=3)
All Grid Results:

 Row │ max_depth  cp         split1_train_score  split2_train_score  split3_train_score  mean_train_score  std_train_score  split1_valid_score  split2_valid_score  split3_valid_score  mean_valid_score  std_valid_score  split1_cp  split2_cp  split3_cp   mean_cp    std_cp      overall_valid_score  rank_valid_score
     │ Int64      Float64    Float64             Float64             Float64             Float64           Float64          Float64             Float64             Float64             Float64           Float64          Float64    Float64    Float64     Float64    Float64     Float64              Int64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │         1  0.25                 0.666667            0.666667            0.666667          0.666667      1.35974e-16            0.666667            0.666667            0.666667          0.666667        0.0        0.25       0.25       0.25        0.25       0.0                    0.666667                 4
   2 │         2  0.219697             0.959596            0.970588            0.969697          0.966627      0.00610539             0.960784            0.9375              0.921569          0.939951        0.0197224  0.219697   0.227941   0.227273    0.22497    0.00457904             0.939951                 3
   3 │         3  0.0222816            0.979798            1.0                 0.979798          0.986532      0.0116636              0.980392            0.927083            0.92549           0.944322        0.0312479  0.0151515  0.0220588  0.0378788   0.0250297  0.0116513              0.944322                 2
   4 │         4  0.0222816            0.989899            1.0                 0.989899          0.993266      0.00583182             0.980392            0.9375              0.933333          0.950408        0.0260501  0.0227273  0.0220588  0.00378788  0.0161913  0.0107469              0.949101                 1

Best Params:
  cp => 0.022281639928698752
  max_depth => 4

Best Model - Fitted OptimalTreeClassifier:
  1) Split: PetalWidth < 0.8
    2) Predict: setosa (100.00%), [50,0,0], 50 points, error 0
    3) Split: PetalWidth < 1.65
      4) Split: PetalLength < 4.95
        5) Predict: versicolor (100.00%), [0,47,0], 47 points, error 0
        6) Predict: virginica (80.00%), [0,1,4], 5 points, error 0.2
      7) Predict: virginica (95.83%), [0,2,46], 48 points, error 0.04167
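Cross-validated tuning replaces the single validation score with a mean across folds. A plain-Python sketch of that idea (not the fit_cv! implementation; the shrunken-mean toy learner and parameter `lam` are hypothetical):

```python
def fit_mean(lam, y_train):
    # toy learner: predicts the training mean shrunk by factor lam
    mu = lam * sum(y_train) / len(y_train)
    return lambda x: mu

def cv_mse(lam, X, y, n_folds=3):
    # mean validation error of parameter lam across n_folds folds
    fold_errors = []
    for f in range(n_folds):
        held = set(range(f, len(y), n_folds))     # every n_folds-th point
        y_tr = [t for i, t in enumerate(y) if i not in held]
        model = fit_mean(lam, y_tr)
        errs = [(model(X[i]) - y[i]) ** 2 for i in held]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / n_folds

X = list(range(9))
y = [5.0, 6.0, 4.0, 5.5, 4.5, 5.0, 6.5, 3.5, 5.0]
# pick the parameter with the lowest mean cross-validated error;
# the winner would then be refit on all of the data
best_lam = min([0.5, 0.9, 1.0], key=lambda lam: cv_mse(lam, X, y))
```

Averaging over folds makes the parameter ranking less sensitive to any one lucky or unlucky validation split, at the cost of training each candidate n_folds times.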

Multiple Parameter Grids

If you would like to search over multiple disjoint parameter grids, with different parameter combinations in each grid, you can pass a vector of Dicts or NamedTuples, one for each grid. For example, OptImpute supports tuning which imputation method is used, and each method has its own parameters to tune that other methods do not require. We can achieve this with the following GridSearch setup:

grid = IAI.GridSearch(
    IAI.ImputationLearner(),
    [
        (method=:opt_knn, knn_k=[5, 10, 15]),
        (method=:opt_svm, svm_c=[0.1, 1, 10]),
        (method=:opt_tree,),
    ],
)
GridSearch - Unfitted OptKNNImputationLearner

GridSearch Params:
  (method=opt_knn,knn_k=5,)
  (method=opt_knn,knn_k=10,)
  (method=opt_knn,knn_k=15,)
  (method=opt_svm,svm_c=0.1,)
  (method=opt_svm,svm_c=1.0,)
  (method=opt_svm,svm_c=10.0,)
  (method=opt_tree,)

We can see that the two grids are expanded separately, and the grid search will now find the best method to use along with the best corresponding parameter for that method.
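The expansion of disjoint grids can be sketched in plain Python (an illustration of the expansion logic only, not the IAI implementation): each grid's list-valued entries are expanded with a Cartesian product, and the per-grid combinations are then concatenated.

```python
from itertools import product

def expand(grids):
    # expand each grid separately, then concatenate the combinations
    combos = []
    for grid in grids:
        keys = list(grid)
        value_lists = [v if isinstance(v, list) else [v]
                       for v in grid.values()]
        for values in product(*value_lists):
            combos.append(dict(zip(keys, values)))
    return combos

grids = [
    {"method": "opt_knn", "knn_k": [5, 10, 15]},
    {"method": "opt_svm", "svm_c": [0.1, 1, 10]},
    {"method": "opt_tree"},
]
combos = expand(grids)   # 3 + 3 + 1 = 7 parameter combinations
```

Because the grids are expanded separately rather than crossed with each other, a method is never paired with another method's parameters.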

Imputation

Note that for imputation you can use fit_transform! and fit_transform_cv! in the same way as described above for fit! and fit_cv!.