Social Network Analysis

Social Network Analysis (SNA) on event logs makes it possible to construct and visualize the relationships between resources according to a given metric. This tutorial walks through the steps needed to construct and visualize a social network from an event log, starting with a preliminary transformation of the log object.

Transforming the log object into a SNA dataframe object

Transforming trace log objects

The SNA dataframe object is a representation suited to the calculation of metrics. Start by importing a log (for example the running-example.xes log):

import os
from pm4py.objects.log.importer.xes import factory as xes_importer

log = xes_importer.apply(os.path.join("tests", "input_data", "running-example.xes"))

It is possible to calculate the SNA dataframe using the following code:

from pm4py.algo.sna.transformer.tracelog import factory as sna_transformer

mco = sna_transformer.apply(log)

Transforming Pandas dataframes (imported from CSVs)

The SNA dataframe object can also be built from a Pandas dataframe. Start by importing a dataframe (for example the running-example.csv dataframe):

import os
from pm4py.objects.log.adapters.pandas import csv_import_adapter

df = csv_import_adapter.import_dataframe_from_path(os.path.join("tests", "input_data", "running-example.csv"))

It is possible to calculate the SNA dataframe using the following code:

from pm4py.algo.sna.transformer.pandas import factory as sna_transformer
mco = sna_transformer.apply(df)
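The internal structure of the SNA dataframe is not described here, so the following is only an illustrative sketch and not PM4Py's transformer. It uses the standard library alone, with invented data and hypothetical column names (case_id, activity, resource, timestamp), to show the kind of per-case, time-ordered resource information that the SNA metrics are computed from.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content; data and column names are assumptions for illustration.
CSV_TEXT = """case_id,activity,resource,timestamp
1,register request,Pete,2010-12-30T11:02:00
1,check ticket,Mike,2011-01-05T15:12:00
1,examine thoroughly,Sue,2010-12-31T10:06:00
2,register request,Mike,2010-12-30T11:32:00
2,check ticket,Mike,2010-12-30T12:12:00
2,examine casually,Sean,2010-12-30T14:16:00
"""


def events_by_case(csv_text):
    """Group events by case and sort each case by timestamp."""
    cases = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cases[row["case_id"]].append(row)
    for events in cases.values():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        events.sort(key=lambda e: e["timestamp"])
    return dict(cases)


def resource_traces(cases):
    """Return, per case, the ordered list of (resource, activity) pairs."""
    return {
        case_id: [(e["resource"], e["activity"]) for e in events]
        for case_id, events in cases.items()
    }


traces = resource_traces(events_by_case(CSV_TEXT))
for case_id, trace in traces.items():
    print(case_id, trace)
```

Each case becomes a sequence of (resource, activity) pairs: the order of resources within a case is what a handover-style metric needs, and the activities per resource are what an activity-profile metric needs.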

Calculating Social Network Analysis metrics

Handover of Work metric

The Handover of Work metric between resources is a directed metric that measures how often one individual passes work to another individual.

It is possible to calculate the Handover of Work metric through the following code:

from pm4py.algo.sna.metrics.handover import factory as handover_of_work
hw_matrix = handover_of_work.apply(mco)

This yields an RxR matrix, where R is the number of resources.
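To make the idea of an RxR handover matrix concrete, here is a minimal pure-Python sketch on an invented toy log. It is not PM4Py's implementation. It assumes that a handover occurs whenever two consecutive events of a case are executed by different resources, and that counts are normalized by the total number of handovers; both choices are assumptions made for illustration.

```python
# Toy log: each case is the time-ordered list of resources that worked on it.
# The data is invented for illustration.
TOY_LOG = [
    ["Pete", "Sue", "Mike", "Sara"],
    ["Mike", "Mike", "Sean", "Sara"],
    ["Pete", "Mike", "Ellen", "Sara"],
]


def handover_matrix(log):
    """Count direct handovers and normalize by the total number of handovers.

    Assumption: consecutive events by the same resource are not a handover.
    """
    resources = sorted({r for trace in log for r in trace})
    index = {r: i for i, r in enumerate(resources)}
    n = len(resources)
    counts = [[0] * n for _ in range(n)]
    for trace in log:
        # Pair every event with its direct successor in the same case.
        for src, dst in zip(trace, trace[1:]):
            if src != dst:
                counts[index[src]][index[dst]] += 1
    total = sum(sum(row) for row in counts)
    matrix = [[c / total if total else 0.0 for c in row] for row in counts]
    return resources, matrix


resources, hw = handover_matrix(TOY_LOG)
for name, row in zip(resources, hw):
    print(name, row)
```

Row i, column j holds the share of handovers going from resource i to resource j, so the matrix is in general not symmetric.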

Similar Activities metric

The Similar Activities metric between resources is an undirected metric that measures how similar the work patterns of two people are, where a work pattern is expressed as the distribution of the activities a person executes.

It is possible to calculate the Similar Activities metric through the following code:

from pm4py.algo.sna.metrics.similar_activities import factory as similar_activities

sim_act_matrix = similar_activities.apply(mco)

This yields an RxR matrix, where R is the number of resources.
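The sketch below illustrates how a similarity between work patterns could be computed. The text does not state which similarity measure PM4Py uses, so the Pearson correlation between per-resource activity counts is an assumption chosen for illustration, and the events are invented.

```python
from collections import Counter
from math import sqrt

# Toy data: (resource, activity) pairs, invented for illustration.
EVENTS = [
    ("Pete", "register request"), ("Pete", "register request"),
    ("Mike", "register request"), ("Mike", "check ticket"),
    ("Sue", "examine thoroughly"), ("Sue", "check ticket"),
    ("Ellen", "register request"), ("Ellen", "register request"),
]


def activity_profiles(events):
    """Return resources, activities, and one activity-count vector per resource."""
    resources = sorted({r for r, _ in events})
    activities = sorted({a for _, a in events})
    counts = Counter(events)
    profiles = {r: [counts[(r, a)] for a in activities] for r in resources}
    return resources, activities, profiles


def pearson(x, y):
    """Pearson correlation of two equally long vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def similar_activities_matrix(events):
    """Build the symmetric RxR matrix of pairwise profile correlations."""
    resources, _, profiles = activity_profiles(events)
    matrix = [[pearson(profiles[a], profiles[b]) for b in resources]
              for a in resources]
    return resources, matrix


resources, sim = similar_activities_matrix(EVENTS)
for name, row in zip(resources, sim):
    print(name, [round(v, 2) for v in row])
```

Because the correlation of two profiles does not depend on their order, the resulting matrix is symmetric, which is why the metric is treated as undirected.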

Visualization of Social Networks

At this point, we assume that an RxR matrix (where R is the number of resources), holding the values of the desired SNA metric, has been calculated using one of the metrics described above.

(Static) visualization through NetworkX

A graphic object can be obtained from the Handover of Work and/or the Similar Activities matrix through the following code:

from pm4py.visualization.sna import factory as sna_vis_factory

gviz_hwmatrix = sna_vis_factory.apply(mco, hw_matrix, parameters={"directed": True, "threshold": 0})
gviz_samatrix = sna_vis_factory.apply(mco, sim_act_matrix, parameters={"directed": False, "threshold": 0.5})

The parameters accepted by the apply method are the following:

  • directed: boolean that states whether the graph is directed (e.g. the Handover of Work metric is directed, while the Similar Activities metric is not)
  • threshold: float between 0 and 1; arcs whose associated value is below the threshold are not drawn
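The effect of the two parameters can be sketched in plain Python. This is a hypothetical helper, not the code used by the visualizer. It assumes that zero entries and the diagonal never produce an arc, and that an undirected graph draws each pair of resources once; the matrices are invented.

```python
def matrix_to_arcs(resources, matrix, directed=True, threshold=0.0):
    """Return (source, target, value) arcs whose value is not below threshold.

    Assumptions: zero entries and self-loops are never drawn.
    """
    arcs = []
    n = len(resources)
    for i in range(n):
        # For an undirected graph, visit each unordered pair only once.
        start = 0 if directed else i + 1
        for j in range(start, n):
            if i == j:
                continue
            value = matrix[i][j]
            if value > 0 and value >= threshold:
                arcs.append((resources[i], resources[j], value))
    return arcs


# Invented matrices for three resources.
names = ["Mike", "Pete", "Sue"]
hw_example = [
    [0.0, 0.1, 0.6],
    [0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
sim_example = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]

hw_arcs = matrix_to_arcs(names, hw_example, directed=True, threshold=0)
strong_hw_arcs = matrix_to_arcs(names, hw_example, directed=True, threshold=0.5)
sim_arcs = matrix_to_arcs(names, sim_example, directed=False, threshold=0.5)
print(hw_arcs)
print(strong_hw_arcs)
print(sim_arcs)
```

Raising the threshold removes weak arcs and leaves only the strongest relationships, which is the main lever for keeping a dense network readable.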

The two graphic objects are rendered as two pictures: the first represents the Handover of Work metric, the second represents the Similar Activities metric.

The result can be saved using the following code:

sna_vis_factory.save(gviz_hwmatrix, "hwmatrix_networkx.png")
sna_vis_factory.save(gviz_samatrix, "samatrix_networkx.png")

Interactive visualization through Pyvis

For a more complex log, for example the receipt.xes log, the NetworkX visualization becomes difficult to interpret. To aid visualization in these contexts, an interactive visualization is supported through Pyvis.

SNA visualization in PM4Py through Pyvis is not yet supported in Jupyter/IPython notebooks, so it works only outside of notebooks.

The interactive visualization is based on the production of an HTML page that can be viewed in a web browser.

A graphic object can be obtained from the Handover of Work and/or the Similar Activities matrix through the following code:

from pm4py.visualization.sna import factory as sna_vis_factory

gviz_hwmatrix = sna_vis_factory.apply(mco, hw_matrix, variant="pyvis", parameters={"directed": True, "threshold": 0})
gviz_samatrix = sna_vis_factory.apply(mco, sim_act_matrix, variant="pyvis", parameters={"directed": False, "threshold": 0.5})

sna_vis_factory.view(gviz_hwmatrix, variant="pyvis")
sna_vis_factory.view(gviz_samatrix, variant="pyvis")

The parameters accepted by the apply method are the following:

  • directed: boolean that states whether the graph is directed (e.g. the Handover of Work metric is directed, while the Similar Activities metric is not)
  • threshold: float between 0 and 1; arcs whose associated value is below the threshold are not drawn

This produces an interactive representation of the Handover of Work metric.

The HTML file can be saved through the following instructions:

sna_vis_factory.save(gviz_hwmatrix, "hwmatrix_pyvis.html", variant="pyvis")
sna_vis_factory.save(gviz_samatrix, "samatrix_pyvis.html", variant="pyvis")
