Graphs make it possible to understand several aspects of the current log, for example the distribution of a numeric attribute, the distribution of case duration, or the distribution of events over time.

### Distribution of case duration

In the following example, the distribution of case duration is shown in two different graphs: a simple plot and a semi-logarithmic plot (logarithmic on the X-axis). The semi-logarithmic plot is less sensitive to possible outliers.

First, the Receipt log may be loaded:

    import os

    from pm4py.objects.log.importer.xes import factory as xes_importer

    log_path = os.path.join("tests", "input_data", "receipt.xes")
    log = xes_importer.import_log(log_path)

Then, the distribution related to case duration may be obtained:

    from pm4py.util import constants
    from pm4py.statistics.traces.log import case_statistics

    # x holds the case duration values, y the estimated density at each value
    x, y = case_statistics.get_kde_caseduration(
        log, parameters={constants.PARAMETER_CONSTANT_TIMESTAMP_KEY: "time:timestamp"})
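
To make the meaning of the `x` and `y` values more concrete, the following is a minimal sketch of a Gaussian kernel density estimate written in plain Python. It is not the pm4py implementation: the function `gaussian_kde`, the fixed bandwidth, and the sample durations are all assumptions chosen for illustration.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for g in grid:
        total = sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        density.append(norm * total)
    return density


# Hypothetical case durations in seconds (not taken from the Receipt log).
durations = [60.0, 120.0, 150.0, 300.0, 3600.0]

# Evenly spaced evaluation points between the shortest and the longest duration.
steps = 200
lo, hi = min(durations), max(durations)
x = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]

# Estimated density at each evaluation point (bandwidth fixed by hand here).
y = gaussian_kde(durations, x, bandwidth=100.0)
```

In this sketch the density peaks where the short durations cluster, while the single long case only produces a small bump at the far right of the X-axis.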

The simple plot can then be obtained:

    from pm4py.visualization.graphs import factory as graphs_vis_factory

    gviz = graphs_vis_factory.apply_plot(x, y, variant="cases")
    graphs_vis_factory.view(gviz)

Or the semi-logarithmic (on the X-axis) plot:

    gviz = graphs_vis_factory.apply_semilogx(x, y, variant="cases")
    graphs_vis_factory.view(gviz)
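
The reason a logarithmic X-axis copes better with outliers can be illustrated without any plotting library. The sketch below splits the same set of durations into four equal-width intervals, once on a linear scale and once on a base-10 logarithmic scale. The helper names and the duration values are hypothetical and serve only as an illustration.

```python
import math


def linear_bins(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is clamped into the last interval.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def log_bins(values, n_bins):
    """Same as linear_bins, but on a log10 scale (values must be positive)."""
    return linear_bins([math.log10(v) for v in values], n_bins)


# Hypothetical case durations in seconds: five ordinary cases and one
# outlier that lasted 30 days.
durations = [60.0, 300.0, 1800.0, 7200.0, 43200.0, 2592000.0]

# On a linear scale the outlier stretches the axis, so the five ordinary
# cases all collapse into the first interval.
linear_counts = linear_bins(durations, 4)

# On a logarithmic scale the same cases are spread across the whole axis.
log_counts = log_bins(durations, 4)
```

With these values, `linear_counts` is `[5, 0, 0, 1]`, whereas `log_counts` has at least one case in every interval.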

The following code is useful to obtain, instead, a JSON with the points composing the graph:

    from pm4py.util import constants
    from pm4py.statistics.traces.log import case_statistics

    json = case_statistics.get_kde_caseduration_json(
        log, parameters={constants.PARAMETER_CONSTANT_TIMESTAMP_KEY: "time:timestamp"})
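
As an illustration of how such a JSON string could be produced and consumed, the sketch below serializes a few points and reads them back with the standard `json` module. The layout used here, an array of `[x, y]` pairs, is an assumption and should be checked against the actual output of the library. The point values are hypothetical.

```python
import json

# Hypothetical points; in practice x and y come from the KDE step.
x = [60.0, 120.0, 180.0]
y = [0.002, 0.005, 0.001]

# Assumed layout: a JSON array of [x, y] pairs.
points_json = json.dumps(list(zip(x, y)))

# A consumer (for example a web front end) can recover the two series.
points = json.loads(points_json)
x_back = [p[0] for p in points]
y_back = [p[1] for p in points]
```

Note that this sketch keeps the `json` name for the module. Assigning the returned string to a variable called `json` would shadow the module if both are used in the same script.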

### Distribution of events over time

In the following example, a graph representing the distribution of events over time is obtained. This is particularly useful for identifying the time intervals in which the greatest number of events is recorded.

First, the Receipt log may be loaded:

    import os

    from pm4py.objects.log.importer.xes import factory as xes_importer

    log_path = os.path.join("tests", "input_data", "receipt.xes")
    log = xes_importer.import_log(log_path)

Then, the distribution related to events over time may be obtained:

    from pm4py.algo.filtering.log.attributes import attributes_filter

    x, y = attributes_filter.get_kde_date_attribute(log, attribute="time:timestamp")
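
A density over dates requires the timestamps to be treated as numbers. The sketch below shows one plausible way to do that (seconds since the Unix epoch), together with a coarser per-day count that answers the same question of when most events occur. Whether the library performs this exact conversion internally is an assumption. The timestamps are hypothetical and not taken from the Receipt log.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical event timestamps.
timestamps = [
    datetime(2011, 10, 11, 9, 0, tzinfo=timezone.utc),
    datetime(2011, 10, 12, 8, 30, tzinfo=timezone.utc),
    datetime(2011, 10, 12, 10, 15, tzinfo=timezone.utc),
    datetime(2011, 10, 12, 16, 45, tzinfo=timezone.utc),
    datetime(2011, 10, 14, 11, 0, tzinfo=timezone.utc),
]

# Numeric representation suitable for a density estimate.
seconds = [ts.timestamp() for ts in timestamps]

# Coarse alternative: number of events per calendar day.
events_per_day = Counter(ts.date() for ts in timestamps)
busiest_day, busiest_count = events_per_day.most_common(1)[0]
```

Here `busiest_day` is 12 October 2011, with three events, which is the kind of peak the density graph makes visible at a finer resolution.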

The graph can then be obtained:

    from pm4py.visualization.graphs import factory as graphs_vis_factory

    gviz = graphs_vis_factory.apply_plot(x, y, variant="dates")
    graphs_vis_factory.view(gviz)

The following code is useful to obtain, instead, a JSON with the points composing the graph:

    # attributes_filter (not case_statistics) provides the date attribute functions
    from pm4py.algo.filtering.log.attributes import attributes_filter

    json = attributes_filter.get_kde_date_attribute_json(log, attribute="time:timestamp")

### Distribution of a numeric attribute

In the following example, two graphs of the distribution of a numeric attribute are obtained: a normal plot and a semi-logarithmic (on the X-axis) plot, which is less sensitive to outliers.

First, a filtered version of the Road Traffic log is loaded:

    import os

    from pm4py.objects.log.importer.xes import factory as xes_importer

    log_path = os.path.join("tests", "input_data", "roadtraffic100traces.xes")
    log = xes_importer.import_log(log_path)

Then, the distribution of the numeric attribute **amount** is obtained:

    from pm4py.algo.filtering.log.attributes import attributes_filter

    x, y = attributes_filter.get_kde_numeric_attribute(log, "amount")
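
Before a density can be estimated, the values of the attribute have to be collected from the events that actually carry it. The sketch below shows this step on a toy log represented as plain Python lists and dictionaries. The helper `numeric_values` and the event data are hypothetical, and the library may handle missing or non-numeric values differently.

```python
# A toy log: each trace is a list of events, each event a dict of attributes.
# The values are hypothetical and not taken from the Road Traffic log.
toy_log = [
    [{"concept:name": "Create Fine", "amount": 35.0},
     {"concept:name": "Send Fine"}],
    [{"concept:name": "Create Fine", "amount": 131.0},
     {"concept:name": "Payment", "amount": "n/a"}],
]


def numeric_values(log, attribute):
    """Collect the numeric values of `attribute` over all events of the log."""
    values = []
    for trace in log:
        for event in trace:
            value = event.get(attribute)
            # Skip events without the attribute or with a non-numeric value.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                values.append(float(value))
    return values


amounts = numeric_values(toy_log, "amount")
```

Only the two events with a numeric `amount` contribute, so `amounts` is `[35.0, 131.0]`.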

The standard graph can then be obtained:

    from pm4py.visualization.graphs import factory as graphs_vis_factory

    gviz = graphs_vis_factory.apply_plot(x, y, variant="attributes")
    graphs_vis_factory.view(gviz)

Alternatively, the semi-logarithmic graph can be obtained:

    from pm4py.visualization.graphs import factory as graphs_vis_factory

    gviz = graphs_vis_factory.apply_semilogx(x, y, variant="attributes")
    graphs_vis_factory.view(gviz)
