Reward Estimation Learners

RewardEstimation provides the `RewardEstimator` learner for conducting reward estimation. In addition to the shared learner parameters, the following parameters control the reward estimation procedure.

`propensity_estimation_method`

The method to use for propensity score estimation. The following options are available:

`:equal` to estimate equal probabilities for each treatment (for data from randomized experiments)
`:random_forest` to estimate propensity scores using a random forest
`:xgboost` to estimate propensity scores using XGBoost

`propensity_estimation_params`

A `Dict` containing any parameters that should be passed to the propensity score estimation function.

`propensity_min_value`

A Real value between 0 and 1 that specifies the minimum propensity score that can be predicted for any treatment. Defaults to 0.
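
One way to read this parameter, assuming the floor is applied as a simple clamp (the source does not state whether the scores are renormalized afterwards), is sketched below. The symbols are introduced here for illustration only: $\hat{p}_t(x)$ is the raw estimated propensity of treatment $t$ and $p_{\min}$ is the value of `propensity_min_value`.

```latex
\tilde{p}_t(x) = \max\bigl(\hat{p}_t(x),\, p_{\min}\bigr),
\qquad
\frac{1}{\tilde{p}_t(x)} \le \frac{1}{p_{\min}} \quad \text{for } p_{\min} > 0.
```

Under this reading, a positive floor caps every inverse-propensity weight at $1 / p_{\min}$ (for example, 20 when $p_{\min} = 0.05$), while the default of 0 leaves the weights unbounded.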

`outcome_estimation_method`

The method to use for outcome estimation. The following options are available:

`:lasso` to estimate outcomes using lasso regression
`:random_forest` to estimate outcomes using a random forest
`:xgboost` to estimate outcomes using XGBoost

`outcome_estimation_params`

A `Dict` containing any parameters that should be passed to the outcome estimation function.

`reward_estimation_method`

The method to use for reward estimation. The following options are available:

`:direct_method` to use the direct method estimator
`:inverse_propensity_weighting` to use the inverse propensity weighting estimator
`:doubly_robust` to use the doubly robust estimator
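
For reference, the three estimators are commonly defined as follows. This is a sketch of the standard textbook forms, not a statement of the exact formulas RewardEstimation implements. The notation is an assumption introduced here: for observation $i$ with features $x_i$, observed treatment $T_i$, and observed outcome $y_i$, $\hat{\mu}_t(x_i)$ is the estimated outcome under treatment $t$, $\hat{p}_t(x_i)$ is the estimated propensity of treatment $t$, and $\hat{\Gamma}_i(t)$ is the estimated reward of assigning treatment $t$.

```latex
\begin{align*}
\text{Direct method:} \quad
  & \hat{\Gamma}^{\mathrm{DM}}_i(t) = \hat{\mu}_t(x_i) \\
\text{Inverse propensity weighting:} \quad
  & \hat{\Gamma}^{\mathrm{IPW}}_i(t) = \frac{\mathbb{1}[T_i = t]}{\hat{p}_t(x_i)} \, y_i \\
\text{Doubly robust:} \quad
  & \hat{\Gamma}^{\mathrm{DR}}_i(t) = \hat{\mu}_t(x_i)
    + \frac{\mathbb{1}[T_i = t]}{\hat{p}_t(x_i)} \bigl( y_i - \hat{\mu}_t(x_i) \bigr)
\end{align*}
```

In these forms, the direct method depends only on the outcome model and inverse propensity weighting depends only on the propensity model. The doubly robust estimator uses both, correcting the outcome prediction with a propensity-weighted residual, so it remains consistent if either of the two models is well specified.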