Advanced Visualization
A number of advanced features are available when creating interactive browser visualizations of tree learners. To access this additional functionality, use the TreePlot and Questionnaire objects for tree visualizations and questionnaires, respectively.
Each of these objects takes a tree learner as the first argument. Keyword arguments are used to control the additional functionality, as described below. Use write_html or show_in_browser to save or view the visualization. In a Jupyter notebook, the resulting visualization is displayed automatically.
Most of this additional functionality also applies to static images produced with write_png or write_dot.
Changing Names
The following keyword arguments enable you to rename various aspects of the data:
feature_renames renames the features in the data. The keys are the old feature names and the values are the corresponding new feature names:
Dict("old_feature_1" => "new_feature_1", "old_feature_2" => "new_feature_2")
level_renames renames the categoric/ordinal levels in the data. The keys are the feature names and each value is a Dict for that feature, where the keys are the old level names and the values are the new level names:
Dict("feature_1" => Dict("old_level_1" => "new_level_1"), "feature_2" => Dict("old_level_1" => "new_level_1", "old_level_2" => "new_level_2"))
label_renames renames the labels of the target for classification and prescription problems. The keys are the old label names and the values are the new label names:
Dict("old_label_1" => "new_label_1", "old_label_2" => "new_label_2")
In the following example, we use feature_renames to replace some feature codes in the data with more descriptive names:
vis_renamed_features = IAI.TreePlot(lnr, feature_renames=Dict(
  "Disp" => "Displacement",
  "HP" => "Horsepower",
  "WT" => "Weight",
))
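As a rough sketch of how the three rename arguments could be prepared together, the dictionaries can be built up front and then passed as keyword arguments. The "Origin" feature, its levels, and the "low"/"high" labels below are hypothetical placeholders, not part of the example data, and the final call is left commented out because it assumes a trained learner lnr (and that label_renames is only meaningful for classification or prescription problems):

```julia
# Old feature names => new feature names (same format as the example above)
feature_renames = Dict("Disp" => "Displacement", "HP" => "Horsepower")

# Feature name => Dict(old level name => new level name).
# "Origin" and its levels are hypothetical.
level_renames = Dict("Origin" => Dict("US" => "United States", "JP" => "Japan"))

# Old label names => new label names (hypothetical labels)
label_renames = Dict("low" => "Low MPG", "high" => "High MPG")

# Assuming `lnr` is a trained tree learner:
# IAI.TreePlot(lnr, feature_renames=feature_renames,
#              level_renames=level_renames, label_renames=label_renames)
```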
Controlling the Visualization Content
The extra_content keyword argument allows you to add or remove information for each node of the tree in the visualization. To do this, construct a vector with one Dict for each node of the tree containing the parameters you would like to control for this node.
Each Dict can contain any of the following fields:
:node_color: color in hex format for the node (e.g., #ACACAC). Only applies to tree visualizations.
:node_summary_extra: the extra HTML content to display as the node summary. For tree visualizations, this is the content in the node; for questionnaires, this is displayed above the question.
:node_details_extra: the extra HTML content to display as the node details. For tree visualizations, this is the tooltip content; for questionnaires, this is the content at the bottom of each question.
:node_details_include_default: whether to include the default node details. Defaults to true.
:node_summary_include_default: whether to include the default node summary. Defaults to true.
:node_split_extra: the extra content to display alongside the split variable below the node.
:node_criterion_extra: the extra content to display alongside the split criterion above the node.
:node_split_include_default: whether to include the default split variable below the node. Defaults to true.
:node_criterion_include_default: whether to include the default split criterion above the node. Defaults to true.
For example, the following code calculates the mean horsepower of the points that fall into each node of the tree and stores it as a vector of Dicts, each with a :node_details_extra field:
using Statistics
# One vector of row indices per node of the tree
node_inds = IAI.apply_nodes(lnr, X)
extras = map(node_inds) do inds
  Dict(:node_details_extra =>
       string("<b>Mean horsepower in node:</b> ",
              round(mean(X[inds, :HP]), digits=1)))
end
7-element Vector{Dict{Symbol, String}}:
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 146.7")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 72.4")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 160.4")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 129.6")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 105.2")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 154.1")
Dict(:node_details_extra => "<b>Mean horsepower in node:</b> 248.4")
We can then include this information in the visualization by passing the extra_content keyword argument to any of the visualization functions:
vis_extra_text = IAI.TreePlot(lnr, extra_content=extras)
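Several of the fields listed earlier can be combined in the same Dict. The following is a minimal sketch under stated assumptions: the node_means values stand in for statistics computed from real data, the node_color helper and its threshold of 150 are hypothetical, and the resulting vector would need one entry per node of the actual tree before being passed as extra_content:

```julia
# Hypothetical per-node statistics standing in for values computed from data
node_means = [146.7, 72.4, 160.4]

# Hypothetical helper: choose a hex color depending on a threshold
node_color(m) = m > 150 ? "#E57373" : "#81C784"

extras = map(node_means) do m
    Dict(
        :node_color => node_color(m),
        :node_details_extra => string("<b>Mean horsepower in node:</b> ", m),
        # Replace the default tooltip content instead of appending to it
        :node_details_include_default => false,
    )
end
```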
You can incorporate additional information using any combination of HTML/CSS/JavaScript. Note that if you include any <script> tags, the closing tag must be terminated with <\/script> for the code to function correctly.
For example, the following code uses billboard.js to visualize the split threshold at each split node in the tree:
node_inds = IAI.apply_nodes(lnr, X)
extras = map(enumerate(node_inds)) do (t, inds)
  # Leaf nodes have no split to plot
  IAI.is_leaf(lnr, t) && return ""
  feature = IAI.get_split_feature(lnr, t)
  threshold = round(IAI.get_split_threshold(lnr, t), digits=2)
  # Scatter plot of the target against the split feature for the points in
  # this node, with a grid line marking the split threshold
  Dict(:node_details_extra =>
       """
       <div id="node-plot-$t" style="width: 280px;"></div>
       <script>
       var chart = bb.generate({
         data: {
           xs: {
             y: "x"
           },
           columns: [
             ["x", $(join(X[inds, feature], ","))],
             ["y", $(join(y[inds], ","))],
           ],
           type: "scatter",
         },
         axis: {
           x: {
             label: "$feature",
             tick: {
               count: 3,
               format: function(x) { return x.toFixed(2); }
             }
           },
         },
         grid: {
           x: {
             lines: [
               {
                 value: $threshold,
                 text: "$feature < $threshold"
               }
             ]
           }
         },
         legend: {
           show: false
         },
         bindto: "#node-plot-$t"
       });
       <\\/script>
       """)
end
vis_extra_plots = IAI.TreePlot(lnr, extra_content=extras)
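When the HTML for a node is assembled from existing snippets, the closing-tag requirement noted above can be handled with a small string replacement. This is only a sketch: escape_script_close is a hypothetical helper, not part of IAI, and it assumes the closing tags appear exactly as </script>:

```julia
# Hypothetical helper: rewrite closing script tags as <\/script>
escape_script_close(html::AbstractString) =
    replace(html, "</script>" => "<\\/script>")

snippet = "<script>console.log('hi');</script>"
escaped = escape_script_close(snippet)
```

The escaped string can then be used as the value of a field such as :node_details_extra.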
Advanced Multi-learner Visualization
It is possible to use the advanced visualization controls in conjunction with multi-learner visualizations. To do this, pass TreePlot or Questionnaire objects with the appropriate options instead of learners when constructing the visualization. This functionality does not apply to static images.
For example, the following visualization combines the earlier examples with renamed features and extra outputs:
questions = ("Use learner with" => [
  "renamed features" => vis_renamed_features,
  "extra text output" => vis_extra_text,
  "extra plots" => vis_extra_plots,
])
IAI.MultiTreePlot(questions)
Advanced Questionnaire Visualization
Offering binary choices rather than entering values
Sometimes it is a simpler experience for the end user to pick between the two splits of the tree rather than entering a raw value. The binary_choice_features parameter can be used to specify a subset of features in the questionnaire that should be presented in this way, and should be passed as a FeatureSet. For instance, the following code converts all features in the questionnaire to this binary choice format:
using DataFrames
IAI.Questionnaire(lnr, binary_choice_features=All())
Changing the "Not sure" button
The missing_renames parameter allows you to change the text on the "Not sure" button used to indicate a missing value. This can improve the user experience when missing has a specific meaning in the context of a particular feature. missing_renames accepts a FeatureMapping that maps features to the desired text on the button. For instance, supposing that missing was used to indicate "First visit" for a given feature, the following code changes the label to reflect this:
IAI.Questionnaire(lnr, missing_renames=Dict("Disp" => "First visit"),
                  include_not_sure_buttons=true)
Choice of multiple units
Often it is helpful to offer the choice of multiple units when entering answers, as each user may have a different unit preference. The units parameter accepts a FeatureMapping that maps features to a set of units. Each unit contains two values: a name and a scaling factor to apply to the answer when this unit is selected.
For example, suppose that our feature represents speed, and the data we have was measured in km/hr. If we also want to allow the user to enter the speed in miles/hr, we can use the following code:
using DataFrames
IAI.Questionnaire(lnr, units=Dict(
  "Disp" => [("km/h", 1), ("mi/h", 1.60934)],
))
Now, if the user selects miles/hr, a scaling factor of 1.60934 is automatically applied to their answer to convert it back to km/hr before passing the answer to the tree logic.
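The arithmetic behind this conversion can be sketched in plain Julia. The to_training_units helper below is hypothetical and only mirrors the scaling described above; the questionnaire applies the factor itself, so this code is not needed in practice:

```julia
# Units as (name, scaling factor) pairs; each factor converts an entered
# value into the unit used by the training data (km/h here)
units = [("km/h", 1), ("mi/h", 1.60934)]

# Look up table from unit name to scaling factor
factors = Dict(units)

# Hypothetical helper mirroring the conversion described above
to_training_units(value, unit) = value * factors[unit]

speed_kmh = to_training_units(60, "mi/h")  # 60 * 1.60934
```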
Showing the logic of the decision tree
It is possible to include the logic of the decision tree splits between each question by using the show_tree_logic parameter:
IAI.Questionnaire(lnr, show_tree_logic=true)
It is also possible to control the naming of the features, levels, and missing values in the logic display by passing show_tree_logic as a dictionary:
IAI.Questionnaire(lnr, include_not_sure_buttons=true, show_tree_logic=Dict(
  "feature_renames" => Dict("Disp" => "new feature name"),
  "missing_renames" => Dict("Disp" => "new missing name"),
))
The possible keys for the dictionary are:
feature_renames and level_renames, following the format used for changing names in visualizations
missing_renames, following the format used to change the label on the "Not sure" button in the questionnaire