Reward Estimation Learners

RewardEstimation provides the RewardEstimator learner for conducting reward estimation. In addition to the shared learner parameters, the following parameters control the reward estimation procedure.

propensity_estimation_method

The method to use for propensity score estimation. The following options are available:

  • :equal to estimate equal probabilities for each treatment (for data from randomized experiments)
  • :random_forest to estimate propensity scores using a random forest
  • :xgboost to estimate propensity scores using XGBoost
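
As a rough illustration of what the :equal option means, the sketch below builds a propensity matrix in which every treatment has the same probability, 1/K for K treatments. It is plain Julia and does not call RewardEstimation; the variable names (treatments, labels, propensity) are hypothetical, and the learner's internal representation may differ.

```julia
# Illustrative only: equal propensities, as in a randomized experiment.
# `treatments` is a hypothetical vector of assigned treatment labels.
treatments = ["A", "B", "A", "C"]

labels = sort(unique(treatments))   # distinct treatment options
K = length(labels)                  # number of treatments

# One row per sample, one column per treatment, each entry equal to 1/K.
propensity = fill(1 / K, length(treatments), K)
```

Each row sums to 1, so no treatment is treated as more likely than any other.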

propensity_estimation_params

A Dict containing any parameters that should be passed to the propensity score estimation function.

propensity_min_value

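A Real value between 0 and 1 that specifies the minimum propensity score that can be predicted for any treatment. Defaults to 0, which imposes no lower bound. A positive value keeps very small propensity scores from producing very large inverse-propensity weights.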
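
The effect of a minimum propensity value can be sketched in plain Julia. The helper clip_propensities below is hypothetical and not part of RewardEstimation; it assumes the minimum is applied as a simple element-wise floor, which may not match the learner's exact behavior (for example, whether scores are renormalized afterwards).

```julia
# Hypothetical helper, not part of RewardEstimation:
# raise every propensity score below `min_value` up to `min_value`.
function clip_propensities(scores::AbstractMatrix{<:Real}, min_value::Real)
    0 <= min_value <= 1 || throw(ArgumentError("min_value must be between 0 and 1"))
    return max.(scores, min_value)
end

# Rows are samples, columns are treatments.
scores = [0.02 0.98; 0.40 0.60]

clipped = clip_propensities(scores, 0.05)

# Inverse-propensity weights are now bounded above by 1 / min_value.
weights = 1 ./ clipped
```

With a floor of 0.05, the score of 0.02 is raised to 0.05, so its inverse-propensity weight drops from 50 to 20, while scores already above the floor are unchanged.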

outcome_estimation_method

The method to use for outcome estimation. The following options are available:

  • :lasso to estimate outcomes using lasso regression
  • :random_forest to estimate outcomes using a random forest
  • :xgboost to estimate outcomes using XGBoost

outcome_estimation_params

A Dict containing any parameters that should be passed to the outcome estimation function.

reward_estimation_method

The method to use for reward estimation. The following options are available: