It is a great honor for the PM4Py team to announce the second major release of the Process Mining for Python library (PM4Py), i.e. PM4Py 1.1.
The main motivations for the PM4Py library are:
- Providing more freedom in performing process mining analyses compared with existing (academic) tools such as ProM, RapidProM and Apromore.
- Allowing process mining algorithms to be combined with algorithms from other data science fields, implemented in various state-of-the-art Python packages.
- Reducing the time needed for the replication of scientific experiments regarding process mining, in comparison to other open-source process mining tools.
- Lowering the barrier to entry for applying and developing process mining techniques.
- Creating a collaborative ecosystem (through the Git repo) that easily allows researchers and practitioners to share valuable code and results with the process mining world.
The library is well-documented and, furthermore, we have developed a large body of tests to guarantee the stability of the code.
The previous version of PM4Py, i.e. PM4Py 1.0, was a huge success, with more than 17,000 downloads overall!
Benchmarks for the library are made available at this URL.
In PM4Py 1.1, many new features have been added to the ecosystem:
- Loading and handling bigger logs (see the Apache Parquet support and Big Dataframe management section).
- Social Network Analysis.
- Support for feature extraction from logs and decision trees.
- Graphs for case duration, events over time, distribution of attribute values.
- Case management: exploring variants and cases.
- Stochastic Petri nets: support for Continuous Time Markov Chains and performance bounds.
- Faster and improved Inductive Miner Directly-Follows implementation.
- Diagnostics (throughput analysis and root cause analysis) based on token-based replay.
Some comments from Alessandro Berti, Software Engineer of the RWTH PADS research group and lead developer of the PM4Py project:
“In PM4Py 1.1.0 we provide a more complete set of features. This makes it more feasible to apply the library in real projects, and opens the possibility of creating a process mining interface on top of PM4Py. PM4Py is pioneering the entrance of open-source process mining projects into the world of big data, enabling an average machine to process significantly more event data in comparison to other process mining products. With 12,584 lines of code and over 175 A4 pages of documentation, PM4Py is a Swiss Army knife to handle everyday process data challenges.”