Reward Estimation Learners
RewardEstimation provides the RewardEstimator learner for conducting reward estimation. In addition to the shared learner parameters, the following parameters control the reward estimation procedure.
propensity_estimation_method
The method to use for propensity score estimation. The following options are available:
:equal to estimate equal probabilities for each treatment (for data from randomized experiments)
:random_forest to estimate propensity scores using a random forest
:xgboost to estimate propensity scores using XGBoost
propensity_estimation_params
A Dict containing any parameters that should be passed to the propensity score estimation function.
propensity_min_value
A Real value between 0 and 1 that specifies the minimum propensity score that can be predicted for any treatment. Defaults to 0.
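The effect of a minimum propensity score can be sketched outside the library. The Python snippet below is an illustration only, not the package's implementation: the function name clip_propensity and the dictionary layout are hypothetical, and it assumes the floor is applied by simple clipping without renormalizing the remaining scores, which the text above does not specify.

```python
def clip_propensity(scores, min_value=0.0):
    """Raise every estimated propensity score to at least min_value.

    scores maps each treatment to its estimated assignment probability.
    Assumption: plain clipping, with no renormalization afterwards.
    """
    if not 0.0 <= min_value <= 1.0:
        raise ValueError("min_value must lie between 0 and 1")
    return {t: max(p, min_value) for t, p in scores.items()}


raw = {"A": 0.02, "B": 0.98}
clipped = clip_propensity(raw, min_value=0.05)

# Estimators that divide by the propensity score weight an observation
# by 1 / score, so a floor on the score caps the largest possible weight.
weight_before = 1 / raw["A"]
weight_after = 1 / clipped["A"]
```

With the default of 0 the scores pass through unchanged, so a very small estimated propensity can still produce a very large weight.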
outcome_estimation_method
The method to use for outcome estimation. The following options are available:
:lasso to estimate outcomes using lasso regression
:random_forest to estimate outcomes using a random forest
:xgboost to estimate outcomes using XGBoost
outcome_estimation_params
A Dict containing any parameters that should be passed to the outcome estimation function.
reward_estimation_method
The method to use for reward estimation. The following options are available:
:direct_method to use the direct method estimator
:inverse_propensity_weighting to use the inverse propensity weighting estimator
:doubly_robust to use the doubly robust estimator
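To make the difference between the three estimators concrete, here is a minimal Python sketch of their textbook forms. It is not the RewardEstimator implementation: the function estimate_rewards, its arguments, and the sample data are all hypothetical, and the library may differ in details such as normalization or cross-fitting.

```python
def estimate_rewards(method, treatments, outcomes, propensity, predicted):
    """Return one {treatment: estimated reward} dict per sample.

    treatments[i]  observed treatment for sample i
    outcomes[i]    observed outcome for sample i
    propensity[i]  {treatment: estimated probability of assignment}
    predicted[i]   {treatment: estimated outcome under that treatment}
    """
    rewards = []
    for t_obs, y, p, mu in zip(treatments, outcomes, propensity, predicted):
        row = {}
        for t in p:
            observed = 1.0 if t == t_obs else 0.0
            if method == "direct_method":
                # Trust the outcome model entirely.
                row[t] = mu[t]
            elif method == "inverse_propensity_weighting":
                # Reweight the observed outcome by its assignment probability.
                row[t] = observed * y / p[t]
            elif method == "doubly_robust":
                # Outcome model plus a propensity-weighted residual correction.
                row[t] = mu[t] + observed * (y - mu[t]) / p[t]
            else:
                raise ValueError(f"unknown method: {method}")
        rewards.append(row)
    return rewards


# One sample that received treatment "A" with outcome 10.0.
args = (["A"], [10.0], [{"A": 0.5, "B": 0.5}], [{"A": 8.0, "B": 6.0}])
dm = estimate_rewards("direct_method", *args)                  # [{"A": 8.0, "B": 6.0}]
ipw = estimate_rewards("inverse_propensity_weighting", *args)  # [{"A": 20.0, "B": 0.0}]
dr = estimate_rewards("doubly_robust", *args)                  # [{"A": 12.0, "B": 6.0}]
```

The direct method depends only on the outcome estimates, inverse propensity weighting only on the propensity estimates, and the doubly robust estimator on both. The last two divide by the propensity score, which is where a nonzero propensity_min_value keeps the weights bounded.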