Specifying a different activity key in a Process Mining algorithm
Algorithms implemented in pm4py classify events by their activity name, which is usually stored in the concept:name event attribute. In some contexts, it is useful to use a different event attribute as the activity:
- An event log imported from a CSV file is not guaranteed to have a concept:name event attribute
- Multiple events in a case may refer to different lifecycles of the same activity
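The first point can be checked before choosing an activity key. The following is a minimal sketch that uses only Python's standard csv module, not the pm4py CSV importer; the column names (case_id, activity, timestamp), the sample rows and the helper pick_activity_key are hypothetical.

```python
import csv
import io

# Hypothetical CSV export: the activity is stored in an "activity" column,
# so there is no "concept:name" column.
CSV_TEXT = """case_id,activity,timestamp
1,register request,2010-12-30 11:02:00
1,examine thoroughly,2010-12-31 10:06:00
2,register request,2010-12-30 11:32:00
"""

def pick_activity_key(fieldnames, fallback):
    """Return "concept:name" if that column exists, otherwise the fallback column."""
    if "concept:name" in fieldnames:
        return "concept:name"
    if fallback not in fieldnames:
        raise KeyError("column %r not found in %r" % (fallback, fieldnames))
    return fallback

reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = list(reader)
activity_key = pick_activity_key(reader.fieldnames, fallback="activity")
activities = [row[activity_key] for row in rows]
print(activity_key)
```

The value returned here is the kind of attribute name that would then be passed as the activity key.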
The following example shows how to specify an activity key for the Alpha Miner algorithm:
import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.algo.discovery.alpha import factory as alpha_miner
from pm4py.util import constants
log = xes_importer.import_log(os.path.join("tests","input_data","running-example.xes"))
parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "concept:name"}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)
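To illustrate what the activity key changes: the Alpha Miner builds its model from the directly-follows relation between activity labels, so the attribute chosen as activity key determines which labels exist. The sketch below is not pm4py code. It uses plain lists of dictionaries as a stand-in for a log, and the helper name directly_follows and the sample events are hypothetical.

```python
def directly_follows(log, activity_key="concept:name"):
    """Collect the pairs (a, b) where b directly follows a in some trace."""
    pairs = set()
    for trace in log:
        labels = [event[activity_key] for event in trace]
        pairs.update(zip(labels, labels[1:]))
    return pairs

# Stand-in log: one trace, each event is a dict of attributes.
log = [[
    {"concept:name": "check", "lifecycle:transition": "start", "org:resource": "Ann"},
    {"concept:name": "check", "lifecycle:transition": "complete", "org:resource": "Ann"},
    {"concept:name": "decide", "lifecycle:transition": "complete", "org:resource": "Bob"},
]]

by_activity = directly_follows(log)
by_resource = directly_follows(log, activity_key="org:resource")
print(sorted(by_activity))
print(sorted(by_resource))
```

The same events yield different relations depending on the key, which is why the choice of activity key affects the discovered model.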
For logs imported from the XES format, the fields that can be used to classify events and apply Process Mining algorithms are usually listed in the classifiers section. The Standard classifier usually includes the activity name (the concept:name attribute) and the lifecycle (the lifecycle:transition attribute); the Event name classifier includes only the activity name.
In pm4py, algorithms are assumed to work on a single activity key. To use multiple fields, insert a new attribute into each event whose value is the concatenation of those fields.
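The concatenation described above can be sketched as follows. This uses plain dictionaries as a stand-in for pm4py events; the helper name add_combined_key, the attribute name combined and the + separator are assumptions made for illustration.

```python
def add_combined_key(log, fields, new_key, separator="+"):
    """Write new_key on every event as the joined values of the given fields."""
    for trace in log:
        for event in trace:
            missing = [f for f in fields if f not in event]
            if missing:
                raise KeyError("event lacks attributes: %s" % ", ".join(missing))
            event[new_key] = separator.join(str(event[f]) for f in fields)
    return new_key

# Stand-in log: one trace with two lifecycle events of the same activity.
log = [[
    {"concept:name": "check", "lifecycle:transition": "start"},
    {"concept:name": "check", "lifecycle:transition": "complete"},
]]
key = add_combined_key(log, ["concept:name", "lifecycle:transition"], "combined")
print([event[key] for event in log[0]])
```

A separator keeps the resulting labels unambiguous: without one, the value pairs ("ab", "c") and ("a", "bc") would produce the same string.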
Classifiers: retrieval and insertion of a corresponding attribute
The following example demonstrates the retrieval of the classifiers inside a log file, using the receipt.xes log:
import os
from pm4py.objects.log.importer.xes import factory as xes_importer
log = xes_importer.import_log(os.path.join("tests","input_data","receipt.xes"))
print(log.classifiers)
The classifiers are then printed to the screen:
{'Activity classifier': ['concept:name', 'lifecycle:transition'], 'Resource classifier': ['org:resource'], 'Group classifier': ['org:group']}
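Given such a dictionary, a classifier with a single field can in principle be used directly as activity key, whereas one with several fields first needs a combined attribute. A small sketch of that distinction, using the printed dictionary as literal data; the helper name split_classifiers is hypothetical:

```python
classifiers = {
    "Activity classifier": ["concept:name", "lifecycle:transition"],
    "Resource classifier": ["org:resource"],
    "Group classifier": ["org:group"],
}

def split_classifiers(classifiers):
    """Separate classifiers usable as-is from those needing a combined attribute."""
    direct = {name: fields[0] for name, fields in classifiers.items() if len(fields) == 1}
    combined = {name: fields for name, fields in classifiers.items() if len(fields) > 1}
    return direct, combined

direct, combined = split_classifiers(classifiers)
print(direct)
print(combined)
```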
To use the Activity classifier and write the corresponding new attribute to each event in the log, the following code can be used:
from pm4py.objects.log.util import insert_classifier
log, activity_key = insert_classifier.insert_activity_classifier_attribute(log, "Activity classifier")
print(activity_key)
Then, as before, the Alpha Miner can be applied to the log, specifying the newly inserted activity key:
from pm4py.algo.discovery.alpha import factory as alpha_miner
from pm4py.util import constants
parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: activity_key}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)
Manually inserting a new attribute
If the XES file specifies no classifiers and a different field should be used as the activity key, the attribute can be specified manually. For example, the following code reads the receipt.xes log and creates a new attribute called customClassifier, which is the activity name concatenated with the lifecycle transition:
import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.util import constants
log = xes_importer.import_log(os.path.join("tests","input_data","receipt.xes"))
for trace in log:
    for event in trace:
        event["customClassifier"] = event["concept:name"] + event["lifecycle:transition"]
Then, for example, the Alpha Miner can be applied, specifying customClassifier as the activity key:
from pm4py.algo.discovery.alpha import factory as alpha_miner
parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "customClassifier"}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)