Spis treści

Simplified Alpha algorithm

During this and the next laboratory, we will discuss a simple process mining solution. For this purpose, the widely known Alpha algorithm was simplified and adjusted to discover a BPMN process model.

Let's consider the following traces (variants) of some workflow log:

a b c d e g
a b c d f g 
a c b d e g
a c b d f g

For this workflow log, based on the Alpha algorithm, we can find the following relations (x,y - events):

In the below implementation, we do not use directly „no direct succession” relation, as it can be just checked if there is causality or parallelism instead.

direct_succession = {
    'a': {'b','c'},
    'b': {'c','d'},
    'c': {'b','d'},
    'd': {'e','f'},
    'e': {'g'},
    'f': {'g'}
}

causality = {
    'a': {'b', 'c'},
    'b': {'d'},
    'c': {'d'},
    'd': {'e','f'},
    'e': {'g'},
    'f': {'g'}
}

parallel_events = {('b', 'c'), ('c', 'b')}

as well as we can find the following sets of start and end events:

start_set_events = {'a'}
end_set_events = {'g'}

For the purpose of coding, we can also define the inverted causality for these events in causality relation which have just one successor (this is not the part of the alpha algorithm itself, but can be helpful as a temporary variable for using the alpha mining patterns):

inv_causality = {
    'd': {'b', 'c'}, 
    'g': {'e', 'f'}
}

Next, use the following code to generate a simple BPMN process model:

bpmnalphatemplate.py
import graphviz
 
class MyGraph(graphviz.Digraph):
 
    def __init__(self, *args):
        super(MyGraph, self).__init__(*args)
        self.graph_attr['rankdir'] = 'LR'
        self.node_attr['shape'] = 'Mrecord'
        self.graph_attr['splines'] = 'ortho'
        self.graph_attr['nodesep'] = '0.8'
        self.edge_attr.update(penwidth='2')
 
    def add_event(self, name):
        super(MyGraph, self).node(name, shape="circle", label="")
 
    def add_and_gateway(self, *args):
        super(MyGraph, self).node(*args, shape="diamond",
                                  width=".6",height=".6",
                                  fixedsize="true",
                                  fontsize="40",label="+")
 
    def add_xor_gateway(self, *args, **kwargs):
        super(MyGraph, self).node(*args, shape="diamond",
                                  width=".6",height=".6",
                                  fixedsize="true",
                                  fontsize="35",label="×")
 
    def add_and_split_gateway(self, source, targets, *args):
        gateway = 'ANDs '+str(source)+'->'+str(targets)        
        self.add_and_gateway(gateway,*args)
        super(MyGraph, self).edge(source, gateway)
        for target in targets:
            super(MyGraph, self).edge(gateway, target)
 
    def add_xor_split_gateway(self, source, targets, *args):
        gateway = 'XORs '+str(source)+'->'+str(targets) 
        self.add_xor_gateway(gateway, *args)
        super(MyGraph, self).edge(source, gateway)
        for target in targets:
            super(MyGraph, self).edge(gateway, target)
 
    def add_and_merge_gateway(self, sources, target, *args):
        gateway = 'ANDm '+str(sources)+'->'+str(target)
        self.add_and_gateway(gateway,*args)
        super(MyGraph, self).edge(gateway,target)
        for source in sources:
            super(MyGraph, self).edge(source, gateway)
 
    def add_xor_merge_gateway(self, sources, target, *args):
        gateway = 'XORm '+str(sources)+'->'+str(target)
        self.add_xor_gateway(gateway, *args)
        super(MyGraph, self).edge(gateway,target)
        for source in sources:
            super(MyGraph, self).edge(source, gateway)
 
 
G = MyGraph()
 
# adding split gateways based on causality
for event in causality:
    if len(causality[event]) > 1:
        if tuple(causality[event]) in parallel_events:
            G.add_and_split_gateway(event,causality[event])
        else:
            G.add_xor_split_gateway(event,causality[event])
 
# adding merge gateways based on inverted causality
for event in inv_causality:
    if len(inv_causality[event]) > 1:
        if tuple(inv_causality[event]) in parallel_events:
            G.add_and_merge_gateway(inv_causality[event],event)
        else:
            G.add_xor_merge_gateway(inv_causality[event],event)
    elif len(inv_causality[event]) == 1:
        source = list(inv_causality[event])[0]
        G.edge(source,event)
 
# adding start event
G.add_event("start")
if len(start_set_events) > 1:
    if tuple(start_set_events) in parallel_events: 
        G.add_and_split_gateway(event,start_set_events)
    else:
        G.add_xor_split_gateway(event,start_set_events)
else: 
    G.edge("start",list(start_set_events)[0])
 
# adding end event
G.add_event("end")
if len(end_set_events) > 1:
    if tuple(end_set_events) in parallel_events: 
        G.add_and_merge_gateway(end_set_events,event)
    else:
        G.add_xor_merge_gateway(end_set_events,event)    
else: 
    G.edge(list(end_set_events)[0],"end")
 
G.render('simple_graphviz_graph')
G.view('simple_graphviz_graph')

Alpha Plus

Alpha+ is a simple extension of Alpha algorithm which can deal with short-loops and self-loops.

In the picture below, the models on the left-hand side are the models discovered by Alpha algorithm. The models on the right-hand side are the models which could be discovered using Alpha+ algorithm.

On the UPEL platform, there are articles about the algorithms that use Petri nets. There is no simple translation from the Petri nets to BPMN, however, you for Alpha+, the situation is similar like in the case of Heuristic miner (presented during the lecture).

So for discovering self-loops and short loops, you need to construct the additional matrix and exclude the discovered relations from the footprint matrix of the Alpha algorithm.

a b c
aτa
bτb
cτc

Excercise

Based on some log, the following direct succession relations were discovered:

direct_succession = {
    'a': {'b', 'f'},
    'b': {'c', 'd'},
    'c': {'d', 'e'},
    'd': {'c', 'e'},
    'e': {'h'},
    'f': {'g'},
    'g': {'h'},
    'h': {'i', 'j'},
    'i': {'k'},
    'j': {'k'}
}

Calculate (first manually, then in Python) the causality relations, determine parallel events as well as the set of start and end events, and try to discover the BPMN process model.

Warning: Alpha miner was originally invented to discover Petri net, so discovering directly BPMN with it is not straightforward. However, it shows the basic process mining idea and allows to notice some theoretical limits of process mining. Thus, Alpha miner is only of theoretical interest as it is too simple to be applicable to real-life logs.

Report

This lab will be conducted 26.10.2020 and 2.11.2020.
The deadline for the report is 10.11.2020 23.11.2020
.

The lab on 9.11.2020 16.11.2020 will be only as consultation hours for people who want to consult their mini-project concerning the implementation of Heuristic miner or some other advanced miner.

On the UPEL platform, you can find some log examples of various complexity:
https://upel2.cel.agh.edu.pl/weaiib/mod/resource/view.php?id=40748
Be aware that not for all of these logs you will be able to get a correct solution, as some of the constructs may not be supported by your algorithm. However, as the provided logs are quite simple, it should be easy to check your solutions using them.

The basic report should contain:

  1. a python file or an ipython notebook implementing the algorithm,
  2. an example input file for the implementation,
  3. a short pdf report on the using of the implementation and the results for the provided example, with some discussion of the interesting elements of your implementation.

For 3 points (i.e. satisfactory grade 3.0):

For 4 points (i.e. good grade 4.0):

For 5 points (i.e. very good grade 5.0) [mini-project]:

Additional 0.5-1 point may be granted for the groups which will cover some of the following issues (this should be explained and justified in the report):