Reward Estimation Learners
RewardEstimation provides the RewardEstimator learner to easily conduct reward estimation. In addition to the shared learner parameters, the following parameters are used to control the reward estimation procedure.
propensity_estimation_method
The method to use for propensity score estimation. The following options are available:
:equal to estimate equal probabilities for each treatment (for data from randomized experiments)
:random_forest to estimate propensity scores using a random forest
:xgboost to estimate propensity scores using XGBoost
propensity_estimation_params
A Dict containing any parameters that should be passed to the propensity score estimation function.
propensity_min_value
A Real value between 0 and 1 that specifies the minimum propensity score that can be predicted for any treatment. Defaults to 0.
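A positive minimum matters because inverse propensity weighting divides by the predicted score, so scores near 0 produce very large weights. The sketch below floors a list of predicted scores at a minimum value and compares the resulting weights. It is a plain-Python illustration, not the library's implementation: `clip_propensity` is a hypothetical helper, and because it is not stated here whether the scores are renormalized after flooring, the sketch does not renormalize them.

```python
def clip_propensity(scores, min_value=0.0):
    """Floor each predicted propensity score at min_value (hypothetical helper)."""
    if not 0.0 <= min_value <= 1.0:
        raise ValueError("min_value must be between 0 and 1")
    return [max(p, min_value) for p in scores]


# Predicted scores for the observed treatment of three samples.
raw = [0.5, 0.02, 0.001]
clipped = clip_propensity(raw, min_value=0.05)

# Inverse propensity weights before and after flooring.
raw_weights = [1.0 / p for p in raw]
clipped_weights = [1.0 / p for p in clipped]
```

With the default of 0 the scores pass through unchanged, so the smallest score in this example would keep a weight of roughly 1000.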
outcome_estimation_method
The method to use for outcome estimation. The following options are available:
:lasso to estimate outcomes using lasso regression
:random_forest to estimate outcomes using a random forest
:xgboost to estimate outcomes using XGBoost
outcome_estimation_params
A Dict containing any parameters that should be passed to the outcome estimation function.
reward_estimation_method
The method to use for reward estimation. The following options are available:
:direct_method to use the direct method estimator
:inverse_propensity_weighting to use the inverse propensity weighting estimator
:doubly_robust to use the doubly robust estimator
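The three estimators differ in which fitted models they use: the direct method uses only the outcome predictions, inverse propensity weighting uses only the propensity scores, and the doubly robust estimator combines both. The sketch below shows the standard textbook per-sample form of each in plain Python. It is an illustration under that assumption and is not confirmed to match the library's exact computation; the function names and the example values are hypothetical.

```python
def direct_method(mu, t):
    """Reward for treatment t is the predicted outcome under t."""
    return mu[t]


def inverse_propensity_weighting(y, observed, p, t):
    """Observed outcome weighted by 1/propensity; zero for unobserved treatments."""
    return y / p[t] if observed == t else 0.0


def doubly_robust(y, observed, mu, p, t):
    """Predicted outcome plus a propensity-weighted residual correction."""
    correction = (y - mu[t]) / p[t] if observed == t else 0.0
    return mu[t] + correction


# One sample: treatment "A" was given and outcome 10.0 was observed.
y, observed = 10.0, "A"
mu = {"A": 8.0, "B": 5.0}   # predicted outcomes per treatment (made-up values)
p = {"A": 0.5, "B": 0.5}    # predicted propensity scores per treatment

rewards = {
    t: {
        "direct_method": direct_method(mu, t),
        "inverse_propensity_weighting": inverse_propensity_weighting(y, observed, p, t),
        "doubly_robust": doubly_robust(y, observed, mu, p, t),
    }
    for t in mu
}
```

For the unobserved treatment the doubly robust value falls back to the outcome prediction, while for the observed treatment it shifts the prediction by the reweighted residual.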