Case Studies and Examples
This is a collection of case studies that demonstrate the application of IAI modules to real-world problems and datasets:
Loan Default Risk (Optimal Classification Trees)
We compare and contrast the interpretability of Optimal Trees against methods for model explainability (LIME/SHAP) using the dataset from the FICO Explainable Machine Learning Challenge.
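As a rough illustration of the workflow behind this case study, the sketch below trains an Optimal Classification Tree with grid search, assuming the Python interface to the IAI modules (the `interpretableai` package); the file path and column names are placeholders rather than the exact setup used in the case study.

```python
# Minimal sketch, assuming the Python "interpretableai" interface;
# the CSV path and column names are placeholders.
import pandas as pd
from interpretableai import iai

df = pd.read_csv("heloc_dataset.csv")                # hypothetical local copy of the data
X = df.drop(columns=["RiskPerformance"])             # features (assumed target name)
y = df["RiskPerformance"]                            # binary outcome

# Hold out a test set, then tune tree depth by grid search
(train_X, train_y), (test_X, test_y) = iai.split_data("classification", X, y, seed=1)
grid = iai.GridSearch(
    iai.OptimalTreeClassifier(random_seed=1),
    max_depth=range(2, 6),
)
grid.fit(train_X, train_y)

print(grid.score(test_X, test_y, criterion="auc"))       # out-of-sample AUC
grid.get_learner().write_html("loan_default_tree.html")  # inspect the fitted tree
```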
Hepatitis Mortality Prediction (OptImpute, Optimal Classification Trees)
We use Optimal Trees to make mortality predictions for patients with hepatitis. The dataset contains a large number of missing values, so we examine the performance of the final predictive model under a variety of schemes for handling missing values.
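One such scheme is sketched below: optimal k-NN imputation with OptImpute followed by an Optimal Classification Tree, assuming the Python interface to the IAI modules; the data here is a synthetic stand-in rather than the hepatitis dataset itself.

```python
# Minimal sketch, assuming the Python "interpretableai" interface; the data is
# a synthetic stand-in with missing entries rather than the hepatitis dataset.
import numpy as np
import pandas as pd
from interpretableai import iai

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(100, 4)),
                 columns=["age", "bilirubin", "albumin", "protime"])
X = X.mask(rng.random(X.shape) < 0.2)                # knock out ~20% of entries
y = pd.Series(rng.choice(["live", "die"], size=100))

(train_X, train_y), (test_X, test_y) = iai.split_data("classification", X, y, seed=1)

# Learn the imputation on the training data only, then apply the same rules to new data
imputer = iai.OptKNNImputationLearner(random_seed=1)
train_X_imputed = imputer.fit_transform(train_X)
test_X_imputed = imputer.transform(test_X)

# The completed data then feeds an Optimal Classification Tree as usual
grid = iai.GridSearch(iai.OptimalTreeClassifier(random_seed=1), max_depth=range(1, 4))
grid.fit(train_X_imputed, train_y)
```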
Supreme Court Outcomes (Optimal Classification Trees)
We revisit a case study from The Analytics Edge where CART is used to predict the outcomes of Supreme Court votes. We apply Optimal Trees to the same dataset to investigate the improvement over CART.
House Sale Prices (Optimal Regression Trees)
We use a house price dataset to show that regression trees with constant predictions can be inadequate when there is a strong linear relationship in the data. We then show how to build Optimal Regression Trees with linear predictions to improve model performance and interpretability.
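A minimal sketch of such a tree is below, assuming the Python interface to the IAI modules; the file path and column names are placeholders, and the hyperparameter names are assumed to match those in the Optimal Trees documentation.

```python
# Minimal sketch, assuming the Python "interpretableai" interface; the path and
# column names are placeholders, and the hyperparameter names (regression_sparsity,
# regression_lambda) are assumed to match the Optimal Trees documentation.
import pandas as pd
from interpretableai import iai

df = pd.read_csv("house_sales.csv")                  # hypothetical local copy of the data
X = df.drop(columns=["price"])
y = df["price"]
(train_X, train_y), (test_X, test_y) = iai.split_data("regression", X, y, seed=1)

grid = iai.GridSearch(
    iai.OptimalTreeRegressor(
        random_seed=1,
        regression_sparsity="all",                   # fit a linear model in each leaf
    ),
    max_depth=range(1, 4),
    regression_lambda=[0.005, 0.01, 0.05],           # regularization of the leaf models
)
grid.fit(train_X, train_y)
print(grid.score(test_X, test_y, criterion="mse"))   # out-of-sample error
```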
Mercedes-Benz Testing (Optimal Feature Selection)
We compare and contrast Optimal Feature Selection and elastic net regression as methods for conducting feature selection on the dataset from the Mercedes-Benz Greener Manufacturing Kaggle competition.
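The comparison might be set up along the lines of the sketch below, assuming the Python interface to the IAI modules and scikit-learn's ElasticNetCV; the file path is a placeholder and the column names should be checked against the actual competition files.

```python
# Minimal sketch, assuming the Python "interpretableai" interface and scikit-learn;
# the path is a placeholder and the "ID"/"y" column names should be checked
# against the actual competition files.
import pandas as pd
from interpretableai import iai
from sklearn.linear_model import ElasticNetCV

df = pd.read_csv("mercedes_train.csv")
X = pd.get_dummies(df.drop(columns=["ID", "y"]))     # one-hot encode the categoricals
y = df["y"]
(train_X, train_y), (test_X, test_y) = iai.split_data("regression", X, y, seed=1)

# Optimal Feature Selection: the number of selected features is tuned directly
grid = iai.GridSearch(
    iai.OptimalFeatureSelectionRegressor(random_seed=1),
    sparsity=range(5, 30, 5),
)
grid.fit(train_X, train_y)
print("Optimal Feature Selection:", grid.score(test_X, test_y))

# Elastic net baseline: sparsity is only controlled indirectly through the penalty
# (note the two score functions may use different criteria, so compare with care)
enet = ElasticNetCV(l1_ratio=[0.5, 0.9, 1.0], cv=5).fit(train_X, train_y)
print("Elastic net:", enet.score(test_X, test_y))
```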
Turbofan Predictive Maintenance (Optimal Classification Trees, Optimal Survival Trees)
We study a concrete case of predictive maintenance using Optimal Classification and Survival Trees. We compare our approach to classical models (e.g. XGBoost, CART) and illustrate how interpretability helps us understand the underlying failure mechanisms.
Online Imputation for Production Pipelines (OptImpute, Optimal Classification Trees)
We use the breast cancer dataset to investigate the potential impact of missing data in the online setting, and illustrate potential remedies such as Optimal Imputation.
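The essence of the online setting is that imputation rules are learned offline and then applied to each incoming observation; a minimal sketch, assuming the Python interface to the IAI modules and illustrative feature names, is below.

```python
# Minimal sketch, assuming the Python "interpretableai" interface; feature names
# and data are illustrative rather than the actual breast cancer dataset.
import numpy as np
import pandas as pd
from interpretableai import iai

rng = np.random.default_rng(1)
train_X = pd.DataFrame(rng.normal(size=(200, 3)),
                       columns=["radius", "texture", "smoothness"])
train_X = train_X.mask(rng.random(train_X.shape) < 0.1)   # historical missingness

# Fit the imputation learner once, offline
imputer = iai.OptKNNImputationLearner(random_seed=1)
imputer.fit(train_X)

# In production, a single incoming record may arrive with fields missing
new_obs = pd.DataFrame({"radius": [1.2], "texture": [np.nan], "smoothness": [0.4]})
print(imputer.transform(new_obs))                    # completed using the fitted rules
```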
Detecting Racial Bias in Jury Selection (Optimal Feature Selection, Optimal Classification Trees)
We use interpretable methods to investigate the presence of racial biases in jury selection, using data released as part of the 2019 U.S. Supreme Court case "Flowers v. Mississippi".
Revenue Optimization for Grocery Pricing (Optimal Prescriptive Trees, Optimal Policy Trees)
We utilize two prescriptive methods to develop interpretable pricing strategies for grocery items based on demographic information, resulting in an estimated 60-70% lift in revenue.
Optimal Prescription for Diabetes Management (Optimal Prescriptive Trees, Optimal Policy Trees)
We learn an interpretable diabetes management policy from observational data, with various treatment combinations and dosing options. We show that Optimal Policy Trees use the data very efficiently and lead to a clinically significant improvement in health outcomes.
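A minimal sketch of the policy-tree step is below, assuming the Python interface to the IAI modules; in the case study the reward matrix comes from reward estimation on the observational data, whereas here it is a toy matrix with one column per hypothetical treatment option.

```python
# Minimal sketch, assuming the Python "interpretableai" interface; features,
# treatment names, and rewards are toy values, not the case study's data.
import numpy as np
import pandas as pd
from interpretableai import iai

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(300, 2)), columns=["hba1c", "bmi"])

# Estimated reward (e.g. predicted improvement in outcome) under each treatment option
rewards = pd.DataFrame({
    "insulin":     rng.normal(1.0, 0.5, size=300),
    "metformin":   rng.normal(0.8, 0.5, size=300),
    "combination": rng.normal(1.2, 0.5, size=300),
})

grid = iai.GridSearch(
    iai.OptimalTreePolicyMaximizer(random_seed=1),
    max_depth=range(1, 4),
)
grid.fit(X, rewards)
print(grid.predict(X.head()))                        # recommended treatment per patient
```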
Interpretable Clustering (Optimal Classification Trees, Optimal Policy Trees)
We study two ways of using Optimal Trees to identify meaningful clusters in a credit card usage behavior dataset. The first approach trains an Optimal Tree to predict the cluster assignments made by a traditional clustering method, as sketched below. The second treats clustering as a supervised learning problem by selecting a relevant feature as the target variable to guide the clustering process.
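A minimal sketch of the first approach follows, assuming the Python interface to the IAI modules and scikit-learn's KMeans; the features are synthetic stand-ins for the credit card data.

```python
# Minimal sketch of approach 1, assuming the Python "interpretableai" interface
# and scikit-learn; the features are synthetic stand-ins for the credit card data.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from interpretableai import iai

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(500, 3)),
                 columns=["balance", "purchases", "cash_advance"])

# Step 1: traditional clustering (the labels alone are not interpretable)
clusters = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)

# Step 2: an Optimal Classification Tree that reproduces the cluster assignments,
# giving explicit rules that describe each cluster
grid = iai.GridSearch(
    iai.OptimalTreeClassifier(random_seed=1),
    max_depth=range(2, 5),
)
grid.fit(X, pd.Series(clusters).astype(str))
grid.get_learner().write_html("cluster_tree.html")   # rules describing each cluster
```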
Reducing Churn (Reward Estimation, Optimal Policy Trees)
We demonstrate how to segment customers and prescribe optimal interventions to reduce churn over time. We use Reward Estimation with Survival Outcomes to estimate the counterfactual outcomes, and Optimal Policy Trees to construct cohorts of customers with similar responses to interventions and find the optimal pricing policy for each cohort.