Classifiers

Specifying a different activity key in a Process Mining algorithm

Algorithms implemented in pm4py classify events by their activity name, which is usually stored in the concept:name event attribute. In some contexts, it is useful to use a different event attribute as the activity:

  • An event log imported from a CSV file is not guaranteed to have a concept:name event attribute
  • Multiple events in a case may refer to different lifecycle transitions (for example, start and complete) of the same activity
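The CSV situation can be illustrated without pm4py. The following is a minimal sketch using only the Python standard library; the CSV content and the column names case_id, task and timestamp are made up for illustration. It shows that a log read from CSV may carry the activity under another column name, which then has to be used as the activity key:

```python
import csv
import io

# Hypothetical CSV content; the column names are illustrative assumptions.
csv_text = """case_id,task,timestamp
1,register request,2020-01-01T09:00:00
1,check ticket,2020-01-01T09:30:00
2,register request,2020-01-01T10:00:00
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Check whether the default activity attribute is present in every row.
has_default_key = all("concept:name" in row for row in rows)

# It is absent here, so the column holding the activity is named explicitly.
activity_key = "concept:name" if has_default_key else "task"

activities = [row[activity_key] for row in rows]
print(has_default_key)
print(activities)
```

In a real workflow, the chosen column name would be passed as the activity key in the parameters dictionary of the algorithm.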

The following example shows how to specify an activity key for the Alpha Miner algorithm:

import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.algo.discovery.alpha import factory as alpha_miner
from pm4py.util import constants

log = xes_importer.import_log(os.path.join("tests","input_data","running-example.xes"))

parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "concept:name"}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)

For logs imported from the XES format, the attributes that can be used to classify events and apply Process Mining algorithms are usually listed in the classifiers section of the file. The Standard classifier usually includes the activity name (the concept:name attribute) and the lifecycle (the lifecycle:transition attribute); the Event name classifier includes only the activity name.

In pm4py, algorithms are assumed to work on a single activity key. To classify events by multiple fields, a new attribute holding the concatenation of those fields should be inserted into each event and then used as the activity key.
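The effect of such a concatenation can be sketched on plain dictionaries standing in for events. The helper combine_attributes, the attribute name combined and the "+" separator are illustrative assumptions, not part of the pm4py API:

```python
def combine_attributes(event, keys, separator="+"):
    """Join the values of several event attributes into a single label."""
    return separator.join(str(event[key]) for key in keys)


# Plain dictionaries used as a stand-in for the events of one case.
events = [
    {"concept:name": "register request", "lifecycle:transition": "start"},
    {"concept:name": "register request", "lifecycle:transition": "complete"},
]

keys = ["concept:name", "lifecycle:transition"]
for event in events:
    # "combined" is a hypothetical attribute name chosen for this sketch.
    event["combined"] = combine_attributes(event, keys)

# With the activity name alone, both events share one label;
# with the combined attribute, each lifecycle transition gets its own.
labels_by_name = {event["concept:name"] for event in events}
labels_combined = {event["combined"] for event in events}
print(len(labels_by_name), len(labels_combined))
```

An algorithm given combined as its activity key would therefore treat the start and the completion of the same activity as two distinct activities.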

Classifiers: retrieval and insertion of a corresponding attribute

The following example demonstrates the retrieval of the classifiers inside a log file, using the receipt.xes log:

import os
from pm4py.objects.log.importer.xes import factory as xes_importer

log = xes_importer.import_log(os.path.join("tests","input_data","receipt.xes"))
print(log.classifiers)

The classifiers are then printed to the screen:

{'Activity classifier': ['concept:name', 'lifecycle:transition'], 'Resource classifier': ['org:resource'], 'Group classifier': ['org:group']}

To use the Activity classifier and write a corresponding new attribute into each event of the log, the following code can be used:

from pm4py.objects.log.util import insert_classifier

log, activity_key = insert_classifier.insert_activity_classifier_attribute(log, "Activity classifier")
print(activity_key)

Then, as before, the Alpha Miner can be applied to the log, specifying the newly inserted attribute as the activity key:

from pm4py.algo.discovery.alpha import factory as alpha_miner
from pm4py.util import constants

parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: activity_key}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)

Manually inserting a new attribute

If the XES file specifies no classifiers and a different field should be used as the activity key, the attribute can be inserted manually. For example, the following code reads the receipt.xes log and creates a new attribute called customClassifier that is the activity name concatenated with the lifecycle transition:

import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.util import constants

log = xes_importer.import_log(os.path.join("tests","input_data","receipt.xes"))

# Add the combined attribute to every event of every trace
for trace in log:
    for event in trace:
        event["customClassifier"] = event["concept:name"] + event["lifecycle:transition"]

Then, for example, the Alpha Miner can be applied, specifying customClassifier as the activity key:

from pm4py.algo.discovery.alpha import factory as alpha_miner

parameters = {constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "customClassifier"}
net, initial_marking, final_marking = alpha_miner.apply(log, parameters=parameters)
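The manual concatenation assumes that every event carries both attributes; with dictionary-style access, a missing attribute would typically raise a KeyError. A minimal defensive sketch on plain dictionaries is shown below. The helper custom_classifier and the fallback value "complete" are assumptions made for illustration, not pm4py behaviour:

```python
def custom_classifier(event, default_transition="complete"):
    """Build the combined label, tolerating a missing lifecycle attribute."""
    # The fallback value is an assumption chosen for this sketch.
    transition = event.get("lifecycle:transition", default_transition)
    return event["concept:name"] + transition


# Plain dictionaries used as a stand-in for events.
events = [
    {"concept:name": "Confirmation of receipt", "lifecycle:transition": "complete"},
    {"concept:name": "Confirmation of receipt"},  # lifecycle attribute missing
]

for event in events:
    event["customClassifier"] = custom_classifier(event)

print([event["customClassifier"] for event in events])
```

Whether a fallback is appropriate depends on the log; skipping or reporting events with missing attributes may be preferable in other settings.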