Evaluation Log-Model

Evaluating Petri Nets

Now that it is clear how to apply a process discovery algorithm and obtain a Petri net along with an initial and a final marking, the question is how to evaluate the quality of the extracted models along the 4 dimensions of Fitness, Precision, Generalization, and Simplicity. pm4py provides algorithms to evaluate all 4 dimensions.

For the examples reported in the following sections, we assume to work with the running-example log located in the folder tests/input_data and to apply both the Alpha Miner and the Inductive Miner:

import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.algo.discovery.alpha import factory as alpha_miner
from pm4py.algo.discovery.inductive import factory as inductive_miner

log = xes_importer.import_log(os.path.join("tests","input_data","running-example.xes"))
alpha_petri, alpha_initial_marking, alpha_final_marking = alpha_miner.apply(log)
inductive_petri, inductive_initial_marking, inductive_final_marking = inductive_miner.apply(log)


Fitness

Fitness measures how well the traces used to mine the model can be replayed on it. A fitness evaluation can provide:

  • An average fitness value for the log with respect to the model, comprised between 0 and 1, that indicates how well the model can represent the behavior seen in the traces.
  • The percentage of traces in the log that are perfectly fitting according to the model.

In pm4py we provide the following algorithms to replay traces on a process model: token-based replay and alignment-based replay.

The following code is useful to get the average fitness value and the percentage of fit traces according to the token replayer:

from pm4py.evaluation.replay_fitness import factory as replay_factory

fitness_alpha = replay_factory.apply(log, alpha_petri, alpha_initial_marking, alpha_final_marking)
fitness_inductive = replay_factory.apply(log, inductive_petri, inductive_initial_marking, inductive_final_marking)

The output shows that, for the running-example log, both the Alpha Miner and the Inductive Miner models have perfect fitness:

fitness_alpha= {'percFitTraces': 100.0, 'averageFitness': 1.0}
fitness_inductive= {'percFitTraces': 100.0, 'averageFitness': 1.0}
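As background, token-based replay counts the produced (p), consumed (c), missing (m) and remaining (r) tokens while replaying the log, and combines them with the standard token-based fitness formula. A minimal stand-alone sketch with hypothetical token counts (not taken from an actual replay):

```python
# hypothetical aggregated token counts for a replayed log:
# p = produced, c = consumed, m = missing, r = remaining
p, c, m, r = 60, 60, 6, 6

# standard token-based replay fitness:
# it equals 1.0 exactly when no tokens are missing or remaining
fitness = 0.5 * (1 - m / c) + 0.5 * (1 - r / p)
print(fitness)  # 0.9
```

A perfectly fitting log (m = r = 0) yields a fitness of 1.0, consistent with the output shown above.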

To use the alignment-based replay and obtain the fitness values, the following code can be used on the Inductive Miner model. Since the Alpha Miner does not guarantee a sound workflow net, alignment-based replay cannot be applied to its model.

fitness_inductive = replay_factory.apply(log, inductive_petri, inductive_initial_marking, inductive_final_marking, variant="alignments")

Alignment computation uses multiprocessing to improve performance; therefore, it is mandatory to protect the entry point of the script with the following condition in order to compute alignments:

if __name__ == "__main__":
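This requirement is not specific to pm4py: when Python uses the 'spawn' start method (the default on Windows), the script is re-imported in every worker process, so unguarded top-level code would run again in each worker. A minimal stand-alone sketch of the pattern:

```python
import multiprocessing


def square(x):
    # work executed inside a worker process
    return x * x


if __name__ == "__main__":
    # without this guard, each spawned worker would re-run the
    # top-level code of the script, including the pool creation
    with multiprocessing.Pool(2) as pool:
        results = pool.map(square, [1, 2, 3, 4])
    print(results)  # [1, 4, 9, 16]
```

The pm4py alignment call shown above should likewise be placed inside such a guarded block.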


Precision

Precision compares the behavior allowed by the model at a given state with the behavior observed in the log at that state. A model is precise when it does not allow paths that are not present in the log. An approach to measure precision, called ETConformance, has been proposed in the following paper:

Muñoz-Gama, Jorge, and Josep Carmona. “A fresh look at precision in process conformance.” International Conference on Business Process Management. Springer, Berlin, Heidelberg, 2010.

Basically, the idea is to build an automaton from the log where the states are represented by prefixes of the traces in the log and transitions are inserted in the automaton if they are present in some trace of the log.
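As a stand-alone illustration of this idea (independent of pm4py's actual implementation), the automaton can be sketched as a mapping from each trace prefix to the set of activities that directly follow it in the log:

```python
def build_prefix_automaton(traces):
    """Map every prefix (as a tuple) to the set of activities
    that directly follow it in some trace of the log."""
    successors = {}
    for trace in traces:
        for i in range(len(trace)):
            prefix = tuple(trace[:i])
            successors.setdefault(prefix, set()).add(trace[i])
    return successors


# toy log with two traces
log = [["a", "b", "d"], ["a", "c", "d"]]
automaton = build_prefix_automaton(log)
print(sorted(automaton[("a",)]))  # ['b', 'c']
```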

Each state of the automaton is replayed on the Petri net (assuming that the log is fit according to the Petri net); for each state, we then have:

  • The reflected tasks, that is, the outgoing transitions of that state in the log automaton.
  • The activated transitions, that is, the transitions of the Petri net that are enabled, but not executed, after the trace prefix has been replayed.

The set of escaping edges of a state is defined as the difference between the activated transitions and the reflected tasks. The following sums are computed over all states of the log automaton:

  • SUM_AT: the sum of the number of activated transitions in the Petri net for each state.
  • SUM_EE: the sum of the number of escaping edges for each state.

The precision measure is then defined as 1 - SUM_EE / SUM_AT.
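The final computation can be sketched with toy, hypothetical per-state sets (not derived from a real model):

```python
# hypothetical per-state sets: transitions activated in the model
# and tasks reflected in the log automaton
states = [
    {"activated": {"b", "c"}, "reflected": {"b", "c"}},
    {"activated": {"d", "e"}, "reflected": {"d"}},
]

# escaping edges = activated transitions not reflected in the log
sum_at = sum(len(s["activated"]) for s in states)
sum_ee = sum(len(s["activated"] - s["reflected"]) for s in states)
precision = 1 - sum_ee / sum_at
print(precision)  # 0.75
```

Here the model allows one behavior ("e" in the second state) never observed in the log, so precision drops below 1.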

The following code measures the precision of the Alpha Miner and Inductive Miner models on the running-example log:

from pm4py.evaluation.precision import factory as precision_factory

precision_alpha = precision_factory.apply(log, alpha_petri, alpha_initial_marking, alpha_final_marking)
precision_inductive = precision_factory.apply(log, inductive_petri, inductive_initial_marking, inductive_final_marking)


We obtain the following values:

precision_alpha= 0.7333333333333334
precision_inductive= 0.7333333333333334

In this case, the Alpha Miner and the Inductive Miner models achieve the same precision value on the running-example log.


Generalization

Generalization indicates the ability of a process model to avoid components that are too specific and are used only in few executions of the process. Models that overfit the log generally contain many components that are too specific.

In the context of measuring generalization on a Petri net, the components taken into account are the transitions (both visible and hidden). In particular, the token-based replayer returns, for each trace, the list of transitions that have been activated during the replay; note that the implementation provided in pm4py is able to take hidden transitions into account. It is therefore easy to measure how many times each transition has been activated during the replay of the log.

The implemented approach is suggested in the paper:

Buijs, Joos CAM, Boudewijn F. van Dongen, and Wil MP van der Aalst. “Quality dimensions in process discovery: The importance of fitness, precision, generalization and simplicity.” International Journal of Cooperative Information Systems 23.01 (2014): 1440001.

Accordingly, generalization is obtained on the Petri net using the following formula, where #executions(t) is the number of times transition t has been activated during the replay and |T| is the number of transitions of the net:

generalization = 1 - ( SUM over transitions t of 1 / sqrt(#executions(t)) ) / |T|
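As a stand-alone sketch, assuming the formulation 1 minus the average, over the transitions, of 1/sqrt of their execution count, and using hypothetical execution counts (not taken from a real replay):

```python
import math

# hypothetical number of activations per transition during replay
executions = {"a": 6, "b": 3, "c": 3, "d": 6}

n = len(executions)
generalization = 1 - sum(1 / math.sqrt(k) for k in executions.values()) / n
print(round(generalization, 4))  # 0.5072
```

Transitions that are executed rarely contribute large 1/sqrt terms, lowering the generalization value, which captures the intuition that overly specific components hurt generalization.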

The following code measures the generalization of the Alpha Miner and Inductive Miner models on the running-example log:

from pm4py.evaluation.generalization import factory as generalization_factory

generalization_alpha = generalization_factory.apply(log, alpha_petri, alpha_initial_marking, alpha_final_marking)
generalization_inductive = generalization_factory.apply(log, inductive_petri, inductive_initial_marking, inductive_final_marking)


We obtain the following values:

generalization_alpha= 0.5259294594558881
generalization_inductive= 0.546264204110579

On this log, the generalization value of the Inductive Miner model is slightly higher than the generalization of the Alpha Miner model. Note that the Inductive Miner model contains skip/loop transitions in addition to the visible transitions, while the Petri net constructed by the Alpha Miner contains only visible transitions.


Simplicity

A model is simple when the end user can easily extract information from it, that is, when the execution paths of the model are clear. For Petri nets, the execution semantics is based on firing transitions, which remove tokens from their input places and add tokens to their output places. A model can then be seen as simpler when its places and transitions are connected by few arcs, i.e., when the mean degree of the nodes is low. The approach implemented in pm4py is inspired by this idea; it has been reported in the following paper and is called 'inverse arc degree':

Blum, Fabian Rojas. Metrics in process discovery. Technical Report TR/DCC-2015-6, Computer Science Department, University of Chile, 2015.

The formula applied for simplicity, where deg(n) is the number of arcs attached to node n and the mean is taken over all places and transitions of the net, is the following:

simplicity = 1 / (1 + max(mean(deg(n)) - 2, 0))
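A stand-alone sketch, assuming the inverse-arc-degree formulation 1 / (1 + max(mean degree - 2, 0)) with hypothetical node degrees (not taken from a real net):

```python
# hypothetical arc degrees of the places and transitions of a net
degrees = [2, 2, 3, 4, 2, 3]

# inverse arc degree: nets whose nodes have on average at most
# two arcs score 1.0; denser nets score lower
mean_degree = sum(degrees) / len(degrees)
simplicity = 1 / (1 + max(mean_degree - 2, 0))
print(round(simplicity, 4))  # 0.6
```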

The following code measures the simplicity of the Alpha Miner and Inductive Miner models on the running-example log:

from pm4py.evaluation.simplicity import factory as simplicity_factory

simplicity_alpha = simplicity_factory.apply(alpha_petri)
simplicity_inductive = simplicity_factory.apply(inductive_petri)


We obtain the following values:

simplicity_alpha= 0.5333333333333333
simplicity_inductive= 0.5217391304347826

The simplicity of the Alpha Miner model is slightly higher than the simplicity of the Inductive Miner model on this log.

Getting all measures in one line

In the previous sections, methods to calculate the fitness, precision, generalization and simplicity of a process model have been presented. This section shows how to retrieve all the measures at once:

from pm4py.evaluation import factory as evaluation_factory
alpha_evaluation_result = evaluation_factory.apply(log, alpha_petri, alpha_initial_marking, alpha_final_marking)

inductive_evaluation_result = evaluation_factory.apply(log, inductive_petri, inductive_initial_marking, inductive_final_marking)

We obtain the following values:

alpha_evaluation_result= {'fitness': {'perc_fit_traces': 100.0, 'average_trace_fitness': 1.0, 'log_fitness': 1.0}, 'precision': 0.7333333333333334, 'generalization': 0.5259294594558881, 'simplicity': 0.5333333333333333, 'metricsAverageWeight': 0.6981490315306387}
inductive_evaluation_result= {'fitness': {'perc_fit_traces': 100.0, 'average_trace_fitness': 1.0, 'log_fitness': 1.0}, 'precision': 0.7333333333333334, 'generalization': 0.546264204110579, 'simplicity': 0.5217391304347826, 'metricsAverageWeight': 0.7003341669696738}

These values are the same as those reported in the previous sections. In addition, the average of the 4 measures is provided under the key 'metricsAverageWeight', as a rough indication of the overall quality of the process model.
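As a quick sanity check using the values printed above for the Alpha Miner model, 'metricsAverageWeight' is the plain average of log fitness, precision, generalization and simplicity:

```python
# values taken from alpha_evaluation_result above
log_fitness = 1.0
precision = 0.7333333333333334
generalization = 0.5259294594558881
simplicity = 0.5333333333333333

average = (log_fitness + precision + generalization + simplicity) / 4
print(round(average, 6))  # 0.698149
```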