Alignments

Alignment-based replay aims to find one of the best alignments between a trace and the model. For each trace, the output of an alignment is a list of pairs in which the first element is an event of the trace or », and the second element is a transition of the model or ». The skip symbol » is printed as '>>' in the output shown later in this section. Each pair can be classified as follows:

  • Sync move: the classifier value of the event (its activity) matches the transition label; in this case, both the trace and the model advance in the same way during the replay.
  • Move on log: for pairs where the second element is », it corresponds to a replay move in the trace that is not mimicked in the model. This kind of move is unfit and signals a deviation between the trace and the model.
  • Move on model: for pairs where the first element is », it corresponds to a replay move in the model that is not mimicked in the trace. Moves on model are further distinguished as follows:
    • Moves on model involving hidden transitions: in this case, even if it is not a sync move, the move is fit.
    • Moves on model not involving hidden transitions: in this case, the move is unfit and signals a deviation between the trace and the model.
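
The classification above can be sketched in a few lines of plain Python. This is a minimal illustration, not part of the library: the helper names classify_move and is_fit are hypothetical, and the sketch assumes that each move is a pair of labels in which '>>' is the skip symbol and a model label of None denotes a hidden transition, as in the printed output shown later in this section.

```python
SKIP = ">>"  # skip symbol as it appears in the printed alignments


def classify_move(move):
    """Classify one (trace_label, model_label) pair of an alignment."""
    trace_label, model_label = move
    if trace_label == SKIP and model_label == SKIP:
        raise ValueError("a move cannot skip both the trace and the model")
    if trace_label == SKIP:
        # The model advanced alone; a None label is assumed to mean a hidden transition
        if model_label is None:
            return "move on model (hidden)"
        return "move on model (visible)"
    if model_label == SKIP:
        # The trace advanced alone
        return "move on log"
    return "sync move"


def is_fit(move):
    """Sync moves and moves on model over hidden transitions are fit."""
    return classify_move(move) in ("sync move", "move on model (hidden)")


# Hypothetical moves, made up for illustration only
example_moves = [
    ("decide", "decide"),
    (">>", None),
    ("pay compensation", ">>"),
    (">>", "check ticket"),
]
for move in example_moves:
    print(move, "->", classify_move(move), "| fit:", is_fit(move))
```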

The following example shows how to obtain alignments. First, the running-example.xes log is loaded and the Inductive Miner is applied:

import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.algo.discovery.inductive import factory as inductive_miner

log = xes_importer.import_log(os.path.join("tests", "input_data", "running-example.xes"))

net, initial_marking, final_marking = inductive_miner.apply(log)

The alignments can then be obtained with the following code:

import pm4py
from pm4py.algo.conformance.alignments import factory as align_factory

alignments = align_factory.apply_log(log, net, initial_marking, final_marking)

If we execute print(alignments) we get the following output:

[{'alignment': [('register request', 'register request'), ('examine casually', 'examine casually'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), ('examine thoroughly', 'examine thoroughly'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('pay compensation', 'pay compensation'), ('>>', None)], 'cost': 7, 'visited_states': 18, 'queued_states': 50, 'traversed_arcs': 100, 'fitness': 1.0}, {'alignment': [('register request', 'register request'), ('check ticket', 'check ticket'), ('>>', None), ('examine casually', 'examine casually'), ('>>', None), ('decide', 'decide'), ('pay compensation', 'pay compensation'), ('>>', None)], 'cost': 3, 'visited_states': 9, 'queued_states': 26, 'traversed_arcs': 45, 'fitness': 1.0}, {'alignment': [('register request', 'register request'), ('examine thoroughly', 'examine thoroughly'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reject request', 'reject request'), ('>>', None)], 'cost': 3, 'visited_states': 9, 'queued_states': 26, 'traversed_arcs': 45, 'fitness': 1.0}, {'alignment': [('register request', 'register request'), ('examine casually', 'examine casually'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('pay compensation', 'pay compensation'), ('>>', None)], 'cost': 3, 'visited_states': 9, 'queued_states': 26, 'traversed_arcs': 45, 'fitness': 1.0}, {'alignment': [('register request', 'register request'), ('examine casually', 'examine casually'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('examine casually', 'examine casually'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), 
('examine casually', 'examine casually'), ('>>', None), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reject request', 'reject request'), ('>>', None)], 'cost': 11, 'visited_states': 29, 'queued_states': 75, 'traversed_arcs': 157, 'fitness': 1.0}, {'alignment': [('register request', 'register request'), ('check ticket', 'check ticket'), ('>>', None), ('examine thoroughly', 'examine thoroughly'), ('>>', None), ('decide', 'decide'), ('reject request', 'reject request'), ('>>', None)], 'cost': 3, 'visited_states': 9, 'queued_states': 26, 'traversed_arcs': 45, 'fitness': 1.0}]

This list reports, for each trace, the corresponding alignment along with its statistics. Each trace is associated with a dictionary that contains, among others, the following entries:

  • alignment: contains the alignment (sync moves, moves on log, moves on model)
  • cost: contains the cost of the alignment according to the provided cost function
  • fitness: is equal to 1 if the trace is perfectly fitting
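
As an illustration of how these entries relate to each other, the sketch below takes the second alignment from the output above and counts its moves on model over hidden transitions. The count equals the reported cost of 3, which is consistent with a cost function that charges 1 per hidden move and 0 per sync move. That reading is an assumption inferred from the printed output, and the helper names are hypothetical.

```python
SKIP = ">>"

# Second alignment copied from the printed output above (reported cost: 3)
alignment = [
    ("register request", "register request"),
    ("check ticket", "check ticket"),
    (">>", None),
    ("examine casually", "examine casually"),
    (">>", None),
    ("decide", "decide"),
    ("pay compensation", "pay compensation"),
    (">>", None),
]


def count_hidden_model_moves(moves):
    """Count moves on model whose transition label is None (assumed hidden)."""
    return sum(1 for trace_label, model_label in moves
               if trace_label == SKIP and model_label is None)


def has_no_deviation(moves):
    """True if there is no move on log and no move on model over a visible transition."""
    for trace_label, model_label in moves:
        if model_label == SKIP:
            return False  # move on log
        if trace_label == SKIP and model_label is not None:
            return False  # move on model over a visible transition
    return True


print(count_hidden_model_moves(alignment))
print(has_no_deviation(alignment))
```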

To use a different classifier, refer to the Classifiers section in the documentation of Process Discovery. For example, the following code defines a custom classifier for each event of each trace in the log:

for trace in log:
    for event in trace:
        event["customClassifier"] = event["concept:name"] + event["concept:name"]

A parameters dictionary containing the activity key can be formed:

# import constants
from pm4py.util import constants
# define the activity key in the parameters
parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "customClassifier"}

Then the process model can be discovered:

# calculate process model using the given classifier
net, initial_marking, final_marking = inductive_miner.apply(log, parameters=parameters)

Finally, the replay is performed:

alignments = align_factory.apply_log(log, net, initial_marking, final_marking, parameters=parameters)

To get the overall log fitness value, the following code can be used:

from pm4py.evaluation.replay_fitness import factory as replay_fitness_factory

log_fitness = replay_fitness_factory.evaluate(alignments, variant="alignments")

Using print(log_fitness) the following result is obtained:

{'percFitTraces': 100.0, 'averageFitness': 1.0}
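
The two values can be read as aggregates over the per-trace results. The following sketch reproduces them in plain Python under the assumption that a trace counts as fit when its fitness equals 1.0 and that the average is a plain mean; the function summarize_fitness is hypothetical and the library may compute these values differently.

```python
def summarize_fitness(aligned_traces):
    """Aggregate per-trace 'fitness' values into log-level figures."""
    n = len(aligned_traces)
    if n == 0:
        # Convention chosen for this sketch only
        return {"percFitTraces": 0.0, "averageFitness": 0.0}
    fit_traces = sum(1 for result in aligned_traces if result["fitness"] == 1.0)
    total_fitness = sum(result["fitness"] for result in aligned_traces)
    return {
        "percFitTraces": 100.0 * fit_traces / n,
        "averageFitness": total_fitness / n,
    }


# Fitness values of the six traces in the output above (all equal to 1.0)
results = [{"fitness": 1.0} for _ in range(6)]
print(summarize_fitness(results))
```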

The following parameters can also be provided to the alignments:

  • Model cost function: maps each transition in the Petri net to the cost of a move on model over that transition.
  • Sync cost function: maps each visible transition in the Petri net to the cost of a sync move.

A custom model cost function and sync cost function can be implemented as follows:

model_cost_function = dict()
sync_cost_function = dict()
for t in net.transitions:
    # if the label is not None, we have a visible transition
    if t.label is not None:
        # associate cost 1000 to each move-on-model associated to visible transitions
        model_cost_function[t] = 1000
        # associate cost 0 to each sync move
        sync_cost_function[t] = 0
    else:
        # associate cost 1 to each move-on-model associated to hidden transitions
        model_cost_function[t] = 1

The model cost function and the sync cost function are then inserted into the parameters:

parameters[pm4py.algo.conformance.alignments.versions.state_equation_a_star.PARAM_MODEL_COST_FUNCTION] = model_cost_function
parameters[pm4py.algo.conformance.alignments.versions.state_equation_a_star.PARAM_SYNC_COST_FUNCTION] = sync_cost_function

Finally, the replay is performed:

alignments = align_factory.apply_log(log, net, initial_marking, final_marking, parameters=parameters)
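
To make the effect of such cost functions concrete, the sketch below adds up the cost of an alignment move by move. It is a simplified, library-independent illustration: the dictionaries are keyed by transition label rather than by transition object, the function alignment_cost is hypothetical, the cost of a move on log (10000) is an assumed value, and the example alignment is made up.

```python
SKIP = ">>"
ASSUMED_LOG_MOVE_COST = 10000  # assumption for this sketch, not taken from the library


def alignment_cost(moves, model_cost, sync_cost,
                   hidden_cost=1, log_move_cost=ASSUMED_LOG_MOVE_COST):
    """Sum the cost of each (trace_label, model_label) move.

    model_cost and sync_cost map visible transition labels to costs;
    a model label of None is treated as a hidden transition.
    """
    total = 0
    for trace_label, model_label in moves:
        if model_label == SKIP:
            total += log_move_cost            # move on log
        elif trace_label == SKIP:
            if model_label is None:
                total += hidden_cost          # move on model, hidden transition
            else:
                total += model_cost[model_label]  # move on model, visible transition
        else:
            total += sync_cost[model_label]   # sync move
    return total


labels = ["register request", "decide"]
model_cost = {label: 1000 for label in labels}  # visible moves on model cost 1000
sync_cost = {label: 0 for label in labels}      # sync moves cost 0

# Hypothetical alignment: one sync move, one hidden move, one visible move on model
moves = [("register request", "register request"), (">>", None), (">>", "decide")]
print(alignment_cost(moves, model_cost, sync_cost))  # 0 + 1 + 1000 = 1001
```

With a high cost on visible moves on model and a low cost on hidden ones, an optimal alignment prefers silent steps over skipping labelled activities.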