Decision Mining

Decision mining makes it possible, given:

  • An event log
  • A process model (an accepting Petri net)
  • A decision point

to retrieve the features of the cases that take the different directions. This permits, for example, calculating a decision tree that explains the decisions.
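The idea can be sketched in plain Python: for each case that reaches the decision point, its attributes are paired with the direction it took. The attribute names and values below are made up for illustration.

```python
# Toy sketch with hypothetical case data: each case has some attributes
# plus the direction it took at the decision point.
cases = [
    {"amount": 100, "requester": "Mike", "direction": "examine casually"},
    {"amount": 900, "requester": "Sara", "direction": "examine thoroughly"},
    {"amount": 120, "requester": "Pete", "direction": "examine casually"},
]

# group the feature vectors of the cases by the direction they took
by_direction = {}
for case in cases:
    features = {k: v for k, v in case.items() if k != "direction"}
    by_direction.setdefault(case["direction"], []).append(features)

print(sorted(by_direction))  # the two target classes of the decision
```

A classifier trained on such grouped features can then explain which attribute values lead to which direction.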

Let’s start by importing an XES log:

from pm4py.objects.log.importer.xes import importer as xes_importer

log = xes_importer.apply("tests/input_data/running-example.xes")

Calculating a model using the inductive miner:

from pm4py.algo.discovery.inductive import algorithm as inductive_miner

net, im, fm = inductive_miner.apply(log)

A visualization of the model can be obtained in the following way:

from pm4py.visualization.petrinet import visualizer

# the DEBUG parameter shows the place names, which are needed to identify the decision point
gviz = visualizer.apply(net, im, fm, parameters={visualizer.Variants.WO_DECORATION.value.Parameters.DEBUG: True})
visualizer.view(gviz)

For this example, we choose the decision point p_10. There, a decision is made between the activities examine casually and examine thoroughly.

To execute the decision mining algorithm, given a log, a model and a decision point, the following code can be used:

from pm4py.algo.enhancement.decision import algorithm as decision_mining

X, y, class_names = decision_mining.apply(log, net, im, fm, decision_point="p_10")

As we see, the outputs of the apply method are the following:

  • X: a Pandas dataframe containing the features associated to the cases leading to a decision.
  • y: a single-column Pandas dataframe containing the index of the class that is the output of the decision (in this case, the possible values are 0 and 1, since we have two target classes)
  • class_names: the names of the output classes of the decision (in this case, examine casually and examine thoroughly).

These outputs can be used in a generic way with any classification or comparison technique.
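For instance, a decision tree can be trained directly with scikit-learn. The snippet below is a sketch on synthetic data shaped like the X, y and class_names described above; the column name, values and class labels are made up and would normally come from decision_mining.apply.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# synthetic stand-ins for the decision mining outputs (hypothetical values)
X = pd.DataFrame({"amount": [100, 120, 800, 900]})
y = pd.Series([0, 0, 1, 1])  # 0/1 class indices, as described above
class_names = ["examine casually", "examine thoroughly"]

# fit a decision tree that explains the decision in terms of the features
clf = DecisionTreeClassifier()
clf.fit(X, y)

# low amounts fall into class 0, high amounts into class 1
pred = clf.predict(pd.DataFrame({"amount": [110, 850]}))
print([class_names[i] for i in pred])  # → ['examine casually', 'examine thoroughly']
```

Any other classifier with a fit/predict interface could be plugged in the same way.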

In particular, decision trees can be useful. We provide a function that automates the discovery of a decision tree from the outputs of the decision mining technique. The code that should be applied is the following:

from pm4py.algo.enhancement.decision import algorithm as decision_mining

clf, feature_names, classes = decision_mining.get_decision_tree(log, net, im, fm, decision_point="p_10")

Then, a visualization of the decision tree can be obtained in the following way:

from pm4py.visualization.decisiontree import visualizer as tree_visualizer

gviz = tree_visualizer.apply(clf, feature_names, classes)
tree_visualizer.view(gviz)
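Besides the graphical view, the learned rules can also be inspected textually with scikit-learn's export_text. The sketch below trains a minimal stand-in tree on hypothetical data, since get_decision_tree needs a log and a model; the clf returned by pm4py can be passed to export_text in the same way.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# a small stand-in tree on hypothetical feature values
clf = DecisionTreeClassifier().fit([[100], [900]], [0, 1])

# print the split rules of the tree as indented text
rules = export_text(clf, feature_names=["amount"])
print(rules)
```

This is convenient when a quick textual summary of the decision rules is enough and no Graphviz rendering is needed.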