PM4Py-WS Log Handlers

Each process is managed by a handler that controls the following operations on the log:

  • Loading
  • Filtering
  • Analysis

    Properties that should be provided by a log handler

  • variants_number => should store the number of variants of the given log
  • cases_number => should store the number of cases of the given log
  • events_number => should store the number of events of the given log
  • first_ancestor => the handler on the original log (without filters)
  • last_ancestor => the handler on the previously filtered log (without the last added filter)
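As a rough illustration of these properties, the sketch below computes them for a toy in-memory log. The class name ToyHandler, the dictionary-based log layout, and the choice to make an unfiltered handler its own ancestor are assumptions made for this example, not part of PM4Py-WS.

```python
class ToyHandler:
    """Illustrative handler over a log given as {case ID: [activity, ...]}."""

    def __init__(self, log, first_ancestor=None, last_ancestor=None):
        self.log = log
        # Assumption: an unfiltered handler acts as its own ancestor.
        self.first_ancestor = first_ancestor if first_ancestor is not None else self
        self.last_ancestor = last_ancestor if last_ancestor is not None else self
        # The counts are computed once, when the handler is built.
        self.cases_number = len(log)
        self.events_number = sum(len(trace) for trace in log.values())
        self.variants_number = len({tuple(trace) for trace in log.values()})


handler = ToyHandler({"c1": ["A", "B"], "c2": ["A", "B"], "c3": ["A", "C", "B"]})
print(handler.cases_number, handler.events_number, handler.variants_number)
```

Storing the counts as plain attributes keeps them cheap to read; a real handler may prefer to compute them lazily.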

   Methods that should be provided by a log handler

add_filter(self, filter, all_filters)

Method that is called when a new filter needs to be added to the log. In this case, a (new) handler needs to be returned.

Parameters:

filter => The filter that is being added

all_filters => The list of all filters applied to the log (whether needed for the handler implementation or not)

Returns: a new handler where the log has been filtered

remove_filter(self, filter, all_filters)

Method that is called when a filter needs to be removed from the log. In this case, a (new) handler needs to be returned.

Parameters:

filter => The filter that is being removed

all_filters => The list of all filters applied to the log (whether needed for the handler implementation or not)

Returns: a new handler where the log has been filtered (removing the filter if applied to the log)
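The add_filter and remove_filter contracts described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class FilterableHandler is hypothetical, a log is a dictionary of traces, and a filter is a plain Python predicate on a trace (the real filter representation is not specified here). Removal is implemented by restarting from first_ancestor and re-applying the remaining filters, which is one possible strategy.

```python
class FilterableHandler:
    """Toy handler; log is {case ID: [activity, ...]}, a filter is a predicate on a trace."""

    def __init__(self, log, first_ancestor=None, last_ancestor=None):
        self.log = log
        self.first_ancestor = first_ancestor if first_ancestor is not None else self
        self.last_ancestor = last_ancestor if last_ancestor is not None else self

    def add_filter(self, filter, all_filters):
        # Filter the current log; this handler becomes the previous ancestor.
        kept = {case: trace for case, trace in self.log.items() if filter(trace)}
        return FilterableHandler(kept, self.first_ancestor, self)

    def remove_filter(self, filter, all_filters):
        # Restart from the unfiltered log and re-apply every filter except the removed one.
        handler = self.first_ancestor
        remaining = [f for f in all_filters if f is not filter]
        for f in remaining:
            handler = handler.add_filter(f, remaining)
        return handler


def has_b(trace):
    return "B" in trace


def is_short(trace):
    return len(trace) <= 2


original = FilterableHandler({"c1": ["A", "B"], "c2": ["A", "C"], "c3": ["A", "B", "C"]})
with_b = original.add_filter(has_b, [has_b])
both = with_b.add_filter(is_short, [has_b, is_short])
only_short = both.remove_filter(has_b, [has_b, is_short])
```

Both methods return a new handler and leave the existing one untouched, so earlier states of the log remain reachable through the ancestor references.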

get_schema(self, variant=dfg_freq, parameters=None)

Method that should retrieve the process schema in the specified variant and with the specified parameters.

Variants that should be provided:

  • alpha_freq => Alpha Miner decorated with frequency information
  • alpha_perf => Alpha Miner decorated with performance information
  • dfg_freq => DFG representation decorated with frequency
  • dfg_perf => DFG representation decorated with performance
  • heuristics_freq => Heuristics Net decorated with frequency
  • heuristics_perf => Heuristics Net decorated with performance
  • inductive_freq => Inductive Miner Petri net decorated with frequency
  • inductive_perf => Inductive Miner Petri net decorated with performance
  • tree => Process Tree representation through Inductive Miner

Parameters:

decreasingFactor => decreasing factor to apply in the auto filtering of PM4Py

Returns: a list of four elements:

  • The Base64-encoded representation of the process schema according to the provided variant
  • The process model (could be empty)
  • The format of the process model if the process model is not empty (e.g. PNML)
  • The name of the current handler (e.g. parquet, xes …)
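A caller might unpack the four-element result along these lines. The function stub_get_schema is a hypothetical stand-in for a real handler, and the SVG placeholder is only an assumption about what the encoded picture could contain.

```python
import base64


def stub_get_schema(variant="dfg_freq", parameters=None):
    """Hypothetical stand-in for handler.get_schema(); returns the four-element list."""
    picture = b"<svg xmlns='http://www.w3.org/2000/svg'></svg>"  # placeholder picture
    return [base64.b64encode(picture).decode("ascii"), "", "", "xes"]


encoded_schema, model, model_format, handler_name = stub_get_schema()
schema_bytes = base64.b64decode(encoded_schema)  # raw bytes of the rendered schema
has_model = bool(model)  # the model (and therefore its format) may be empty
```

Checking the model for emptiness before reading its format avoids relying on a format value that is only meaningful when a model is present.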

get_case_duration_svg(self)

A method that should return the SVG representation of the case duration graph

Returns: the SVG representation of the case duration graph

get_events_per_time_svg(self)

A method that should return the SVG representation of the events per time graph

Returns: the SVG representation of the events per time graph

get_variant_statistics(self, parameters=None)

A method that should return the variants (along with their count) of the given log, expressed as a list of dictionaries.

Inputs:

parameters: a list of optional parameters used to calculate the variants.

Returns: The variants (along with their count) of the given log, expressed as a list of dictionaries. For each variant, the dictionary should contain a variant element that is the list of activities of the variant, and another element that should contain the count. E.g.

[{"case:concept:name":713,"variant":"Confirmation of receipt,T02 Check confirmation of receipt,T04 Determine confirmation of receipt,T05 Print and send confirmation of receipt,T06 Determine necessity of stop advice,T10 Determine necessity to stop indication"},{"case:concept:name":123,"variant":"Confirmation of receipt,T06 Determine necessity of stop advice,T10 Determine necessity to stop indication,T02 Check confirmation of receipt,T04 Determine confirmation of receipt,T05 Print and send confirmation of receipt"},{"case:concept:name":116,"variant":"Confirmation of receipt"}...
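A result of this shape could be post-processed as in the sketch below. It assumes, as in the example above, that the count is stored under the case:concept:name key and that the variant is a comma-separated string of activities; the helper summarize_variants and the sample data are illustrative only.

```python
def summarize_variants(variant_stats, count_key="case:concept:name"):
    """Return (activities, count, share of cases) per variant, most frequent first."""
    total = sum(entry[count_key] for entry in variant_stats)
    rows = []
    for entry in sorted(variant_stats, key=lambda e: e[count_key], reverse=True):
        activities = entry["variant"].split(",")
        rows.append((activities, entry[count_key], entry[count_key] / total))
    return rows


# Made-up sample data in the same shape as the example above.
stats = [
    {"case:concept:name": 1, "variant": "A"},
    {"case:concept:name": 3, "variant": "A,B,C"},
]
rows = summarize_variants(stats)
```

Splitting on commas breaks down if an activity name itself contains a comma, so treat this as a convenience for display rather than a robust parser.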

get_sna(variant="handover", parameters=None)

A method that should return the Pyvis visualization of the Social Network from a given log

Inputs:

variant => the variant of the algorithm to use (handover, subcontracting, working_together, joint_activities)

parameters => parameters to use in calculating the social network (including the threshold on the arcs)

Returns: an empty string if the SNA representation fails; otherwise, an HTML string containing the Pyvis visualization of the Social Network.

get_transient(self, delay, parameters=None)

A method that is called to perform the CTMC simulation on the given log.

Parameters:

delay => The delay of the simulation, in seconds, starting from the initial state

parameters => possible optional parameters of the algorithm

Returns: a Base64-encoded string containing the SVG representation of the CTMC simulation output on the given log.

get_case_statistics(self, parameters=None)

Gets the list of cases of the current log

Parameters:

parameters => Optional parameters of the PM4Py case statistics algorithm; these include variant, which selects the variant whose cases should be returned

Returns: a list of cases contained in the event log.

get_events(self, caseid, parameters=None)

Get the events of a case of the current log

Parameters:

caseid => current case ID

parameters => Possible optional parameters.

Returns: a list of events (expressed as a key-value dictionary) of the given case ID.

download_xes_log(self)

Gets the log in the XES format

Returns: a string representing the log in the XES format

download_csv_log(self)

Gets the log in the CSV format

Returns: a string representing the log in the CSV format

get_start_activities(self, parameters=None)

Gets the start activities from the log

Parameters:

parameters => possible optional parameters of the method

Returns: a list of lists of start activities along with their count, e.g.

[["A",500],["B",300],["C",250]]

get_end_activities(self, parameters=None)

Gets the end activities from the log

Parameters:

parameters => possible optional parameters of the method

Returns: a list of lists of end activities along with their count, e.g.

[["A",500],["B",300],["C",250]]
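The start and end activity results can be derived from the first and last event of each case. The sketch below assumes a toy log stored as a dictionary of traces and orders the result by descending count; both the log layout and the ordering are assumptions of this example.

```python
from collections import Counter


def start_activities(log):
    """[[activity, count], ...] for the first activity of each non-empty case."""
    counts = Counter(trace[0] for trace in log.values() if trace)
    return [[activity, n] for activity, n in counts.most_common()]


def end_activities(log):
    """[[activity, count], ...] for the last activity of each non-empty case."""
    counts = Counter(trace[-1] for trace in log.values() if trace)
    return [[activity, n] for activity, n in counts.most_common()]


log = {"c1": ["A", "B"], "c2": ["A", "C"], "c3": ["B", "C"], "c4": []}
starts = start_activities(log)
ends = end_activities(log)
```

Empty cases are skipped, since they have neither a start nor an end activity.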

get_attributes_list(self, parameters=None)

Gets the list of attributes of the given log

Parameters:

parameters => possible optional parameters of the method

Returns: a list of attribute keys contained in the log, e.g.

["concept:name", "org:resource" …]

get_attribute_values(self, attribute_key, parameters=None)

Gets the list of values for a particular attribute

Parameters:

attribute_key => name of the attribute

parameters => possible optional parameters of the method

Returns: a list of lists of attribute values along with their number of occurrences, e.g. for the resource

[["Resource01",1228],["Resource02",580],["Resource03",552],["Resource04",483],["Resource05",445],["Resource06",430],["Resource07",424],["Resource08",356],["admin1",352],["Resource09",350],["Resource10",329],["Resource11",328],["Resource12",326],["Resource13",307],["Resource14",264],["Resource15",235],["Resource16",215],["Resource17",194],["Resource18",170],["admin2",160],["Resource19",136],["Resource20",120],["Resource21",104],["Resource22",80],["Resource23",78],["Resource24",50],["Resource25",49],["Resource26",44],["Resource27",43],["Resource28",30],["Resource29",20],["Resource30",13],["Resource31",12],["Resource32",11],["Resource33",11],["Resource34",10],["Resource35",8],["Resource36",6],["test",5],["Resource37",5],["admin3",3],["Resource38",3],["TEST",2],["Resource39",2],["Resource41",1],["Resource40",1],["Resource42",1],["Resource43",1]]
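The two attribute methods could be approximated over a flat list of events as sketched here. The event layout (one dictionary per event), the helper names, and the descending ordering of the counts are assumptions for illustration.

```python
from collections import Counter


def attributes_list(events):
    """Sorted list of attribute keys that appear on at least one event."""
    return sorted({key for event in events for key in event})


def attribute_values(events, attribute_key):
    """[[value, occurrences], ...] for one attribute, most frequent first."""
    counts = Counter(event[attribute_key] for event in events if attribute_key in event)
    return [[value, n] for value, n in counts.most_common()]


events = [
    {"concept:name": "A", "org:resource": "Resource01"},
    {"concept:name": "B", "org:resource": "Resource01"},
    {"concept:name": "A", "org:resource": "Resource02"},
    {"concept:name": "C"},
]
keys = attributes_list(events)
resources = attribute_values(events, "org:resource")
```

Events that lack the requested attribute are ignored rather than counted under a placeholder value.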

      Provided log handlers

  XesHandler

A handler designed to manage EventLog objects loaded from XES files.

Methods:

build_from_path(self, path, parameters=None)

Reads an XES log from disk and keeps it in memory

Inputs:

path => Path to the XES log file

parameters => Optional parameters of the method

  ParquetHandler

A handler designed to manage Pandas dataframe objects loaded from Parquet/CSV files.

Methods:

build_from_path(self, path, parameters=None)

Reads a Parquet log from disk and keeps it in memory

Inputs:

path => Path to the Parquet log file

parameters => Optional parameters of the method

 

build_from_csv(self, path, parameters=None)

Reads a CSV log from the disk, with the default mapping (e.g. the case ID is case:concept:name, the activity is concept:name, the timestamp is time:timestamp).

Inputs:

path => Path to the CSV log file

parameters => Optional parameters of the method, including:

pm4py.util.constants.PARAMETER_CONSTANT_CASEID_KEY => the column of the CSV that contains the case ID (default case:concept:name)

pm4py.util.constants.PARAMETER_CONSTANT_ACTIVITY_KEY => the column of the CSV that contains the activity (default concept:name)

pm4py.util.constants.PARAMETER_CONSTANT_TIMESTAMP_KEY => the column of the CSV that contains the timestamp (default time:timestamp)
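The column mapping described above can be illustrated with the standard library alone. In this sketch the three string keys are hypothetical stand-ins for the pm4py.util.constants entries, and read_csv_events is an illustrative helper, not the ParquetHandler implementation; it only shows how unspecified columns fall back to the defaults.

```python
import csv
import io

# Hypothetical stand-ins for the pm4py.util.constants keys named above.
CASEID_KEY = "caseid_key"
ACTIVITY_KEY = "activity_key"
TIMESTAMP_KEY = "timestamp_key"

DEFAULTS = {
    CASEID_KEY: "case:concept:name",
    ACTIVITY_KEY: "concept:name",
    TIMESTAMP_KEY: "time:timestamp",
}


def read_csv_events(text, parameters=None):
    """Parse CSV text and rename the mapped columns to the default attribute names."""
    parameters = parameters or {}
    # Column actually used for each role: the caller's choice, else the default.
    columns = {key: parameters.get(key, default) for key, default in DEFAULTS.items()}
    rename = {columns[key]: DEFAULTS[key] for key in DEFAULTS}
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({rename.get(col, col): value for col, value in row.items()})
    return events


raw = "case,activity,time:timestamp\n1,A,2020-01-01\n"
events = read_csv_events(raw, {CASEID_KEY: "case", ACTIVITY_KEY: "activity"})
```

Here the timestamp column already carries the default name, so only the case ID and activity columns need to be passed in the parameters.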

 
