Differences between Python and Julia
The IAI Python interface matches the Julia API very closely, so you can refer to the Julia documentation for information on most tasks. On this page we note the main differences between the Python and Julia interfaces.
Conversion of Julia data types to Python
To work out which types to pass to an IAI function from the Python interface, refer to the equivalent function in the Julia API and translate each type to its Python equivalent. Most literal data types convert in a straightforward manner, for example:
- Int to int
- Float64 to float
- String to str
- Dict to dict
Other Julia types can be passed as follows:
- nothing can be passed using None
- a Symbol can be passed as a str
- a Vector can be passed as a list, a 1-D numpy.array or a pandas.Series
- a Matrix can be passed as a 2-D numpy.array
- a DataFrame can be passed as a pandas.DataFrame
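As a minimal sketch of the conversions listed above, the snippet below builds the Python values that would stand in for each Julia type. All variable names and values are illustrative assumptions (for example, "gini" stands in for a Julia Symbol such as :gini); no IAI function is called.

```python
import numpy as np
import pandas as pd

# Python values standing in for common Julia argument types.
# All names and values here are illustrative only.
julia_nothing = None                            # Julia `nothing`
julia_symbol = "gini"                           # Julia Symbol, e.g. :gini
julia_dict = {"a": 1}                           # Julia Dict

vector_as_list = [1.0, 2.0, 3.0]                # Julia Vector as a list
vector_as_array = np.array([1.0, 2.0, 3.0])     # Julia Vector as a 1-D array
vector_as_series = pd.Series([1.0, 2.0, 3.0])   # Julia Vector as a Series

matrix = np.array([[1.0, 2.0], [3.0, 4.0]])     # Julia Matrix as a 2-D array
frame = pd.DataFrame({"x1": [1, 2], "x2": [3.0, 4.0]})  # Julia DataFrame

# Confirm the dimensionality of the array stand-ins
print(vector_as_array.ndim, matrix.ndim)
```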
Specifying Feature Set in Python
The table below lists the Python input types for specifying a set of features in a dataframe as learner parameters. Refer to IAI.FeatureSet for the Julia equivalent. Note that if you use integers to specify column indices, the input is expected to use one-based indexing, as in Julia.
Input Type | Description | Examples |
---|---|---|
All | Use all columns | {'All' : []} |
Integer or a vector of Integers | Specify indices of columns to use | 1, [1, 3, 4] |
String or a vector of Strings | Specify names of columns to use | 'x1', ['x1', 'x3'] |
Not | Specify columns not to use | {'Not' : 1}, {'Not' : ['x2', 'x4']} |
Between | Specify range of columns to use | {'Between' : ['x1', 'x4']} |
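Because pandas positions are zero-based while integer feature sets are one-based, it can help to derive the indices from column names. The sketch below writes out each form from the table as a Python value; the helper one_based_indices and the example dataframe are hypothetical and not part of the IAI API.

```python
import pandas as pd

# Example dataframe (illustrative only)
X = pd.DataFrame({"x1": [1, 2], "x2": [3, 4], "x3": [5, 6], "x4": [7, 8]})


def one_based_indices(df, names):
    """Hypothetical helper: one-based column positions for the given names."""
    # Index.get_loc returns the zero-based position, so add 1
    return [df.columns.get_loc(name) + 1 for name in names]


# Feature-set specifications following the forms in the table
all_columns = {"All": []}
by_index = one_based_indices(X, ["x1", "x3", "x4"])
by_name = ["x1", "x3"]
excluded = {"Not": ["x2", "x4"]}
in_range = {"Between": ["x1", "x4"]}

print(by_index)
```

Here by_index evaluates to [1, 3, 4], matching the integer example in the table.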
Object-oriented interface for learners
In the IAI Python interface, the API methods relating to learners are methods of the learner objects, rather than functions that operate on learners as in the Julia interface. For instance, the IAI.fit! function in Julia:
IAI.fit!(lnr, X, y)
would be called from the Python interface as
lnr.fit(X, y)
Interactive Visualizations
The write_html and show_in_browser functions work the same in Python as in Julia for saving visualizations to file or displaying in an external browser, respectively. Additionally, visualizations will be automatically shown in Jupyter notebooks as they are for Julia.
Below is an example that shows the equivalent Python code for the advanced visualization examples in Julia. In these examples we work with the following tree learner:
We can rename the features with a dict that maps from the original names to more descriptive names:
vis_renamed_features = lnr.TreePlot(feature_renames={
    "Disp": "Displacement",
    "HP": "Horsepower",
    "WT": "Weight",
})
We can also exercise finer-grained control over what is displayed for each node, such as adding summary statistics. We create a list of dicts with the parameters controlling what to show in each node and pass this as extra_content:
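import numpy as np

# Row indices of the points that fall into each node of the tree
node_inds = lnr.apply_nodes(X)

def get_text(inds):
    # Build the extra HTML shown in the details of a single node
    content = ('<b>Mean horsepower in node:</b> ' +
               str(np.round(np.mean(X.HP.iloc[inds]), decimals=2)))
    return {'node_details_extra': content}

# One dict per node, in the same order as node_inds
extras = [get_text(inds) for inds in node_inds]
vis_extra_text = lnr.TreePlot(extra_content=extras)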
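To see what the extra_content list looks like without a trained learner, here is a self-contained sketch using stand-in data. The dataframe values and the node_inds list are made up for illustration; in practice node_inds would come from lnr.apply_nodes(X), and this sketch assumes it holds zero-based row positions, as the use of iloc in the example suggests.

```python
import pandas as pd

# Stand-in data: horsepower for four cars (illustrative values)
X = pd.DataFrame({"HP": [110.0, 93.0, 175.0, 105.0]})

# Stand-in for lnr.apply_nodes(X): row positions per node (assumed zero-based)
node_inds = [[0, 1, 2, 3], [0, 1], [2, 3]]


def get_text(inds):
    # Mean horsepower of the rows in this node, rounded to two decimals
    mean_hp = round(float(X["HP"].iloc[inds].mean()), 2)
    return {"node_details_extra": f"<b>Mean horsepower in node:</b> {mean_hp}"}


# One dict per node, ready to pass as extra_content
extras = [get_text(inds) for inds in node_inds]

for extra in extras:
    print(extra["node_details_extra"])
```

Each entry is a dict keyed by the display parameter name, so the list has one element per node in the tree.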
Finally, we can combine multiple learners in a single visualization as described in the Julia documentation. In Python, a question is a tuple of the form (question, responses), where question is the string prompt for the question and responses is a list of possible responses:
questions = ("Use learner with", [
    ("renamed features", vis_renamed_features),
    ("extra text output", vis_extra_text)
])
iai.MultiTreePlot(questions)