Simplified Alpha algorithm

During this laboratory, we will discuss a simple process mining solution. For this purpose, the widely known Alpha algorithm was simplified and adjusted to discover a BPMN process model.

Let's consider the following traces (variants) of some workflow log:

a b c d e g
a b c d f g 
a c b d e g
a c b d f g

For this workflow log, based on the Alpha algorithm, we can find the following relations (x,y - events):

  • (>) direct succession: x>y iff for some case x is directly followed by y,
  • (→) causality: x→y iff x>y and not y>x,
  • (||) parallelism: x||y iff x>y and y>x,
  • (#) choice/unrelated/no direct succession: x#y iff not x>y and not y>x.

In the below implementation, we do not use directly „no direct succession” relation, as it can be just checked if there is causality or parallelism instead.

direct_succession = {
    'a': {'b','c'},
    'b': {'c','d'},
    'c': {'b','d'},
    'd': {'e','f'},
    'e': {'g'},
    'f': {'g'}
}

causality = {
    'a': {'b', 'c'},
    'b': {'d'},
    'c': {'d'},
    'd': {'e','f'},
    'e': {'g'},
    'f': {'g'}
}

parallel_events = {('b', 'c'), ('c', 'b')}

as well as we can find the following sets of start and end events:

start_set_events = {'a'}
end_set_events = {'g'}

For the purpose of coding, we can also define the inverted causality for these events in causality relation which have just one successor (this is not the part of the alpha algorithm itself, but can be helpful as a temporary variable for using the alpha mining patterns):

inv_causality = {
    'd': {'b', 'c'}, 
    'g': {'e', 'f'}
}

Next, use the following code to generate a simple BPMN process model:

bpmnalpha.py
import graphviz
 
class MyGraph(graphviz.Digraph):
 
    def __init__(self, *args):
        super(MyGraph, self).__init__(*args)
        self.graph_attr['rankdir'] = 'LR'
        self.node_attr['shape'] = 'Mrecord'
        self.graph_attr['splines'] = 'ortho'
        self.graph_attr['nodesep'] = '0.8'
        self.edge_attr.update(penwidth='2')
 
    def add_event(self, name):
        super(MyGraph, self).node(name, shape="circle", label="")
 
    def add_and_gateway(self, *args):
        super(MyGraph, self).node(*args, shape="diamond",
                                  width=".6",height=".6",
                                  fixedsize="true",
                                  fontsize="40",label="+")
 
    def add_xor_gateway(self, *args, **kwargs):
        super(MyGraph, self).node(*args, shape="diamond",
                                  width=".6",height=".6",
                                  fixedsize="true",
                                  fontsize="35",label="×")
 
    def add_and_split_gateway(self, source, targets, *args):
        gateway = 'ANDs '+str(source)+'->'+str(targets)        
        self.add_and_gateway(gateway,*args)
        super(MyGraph, self).edge(source, gateway)
        for target in targets:
            super(MyGraph, self).edge(gateway, target)
 
    def add_xor_split_gateway(self, source, targets, *args):
        gateway = 'XORs '+str(source)+'->'+str(targets) 
        self.add_xor_gateway(gateway, *args)
        super(MyGraph, self).edge(source, gateway)
        for target in targets:
            super(MyGraph, self).edge(gateway, target)
 
    def add_and_merge_gateway(self, sources, target, *args):
        gateway = 'ANDm '+str(sources)+'->'+str(target)
        self.add_and_gateway(gateway,*args)
        super(MyGraph, self).edge(gateway,target)
        for source in sources:
            super(MyGraph, self).edge(source, gateway)
 
    def add_xor_merge_gateway(self, sources, target, *args):
        gateway = 'XORm '+str(sources)+'->'+str(target)
        self.add_xor_gateway(gateway, *args)
        super(MyGraph, self).edge(gateway,target)
        for source in sources:
            super(MyGraph, self).edge(source, gateway)
 
 
G = MyGraph()
 
# adding split gateways based on causality
for event in causality:
    if len(causality[event]) > 1:
        if tuple(causality[event]) in parallel_events:
            G.add_and_split_gateway(event,causality[event])
        else:
            G.add_xor_split_gateway(event,causality[event])
 
# adding merge gateways based on inverted causality
for event in inv_causality:
    if len(inv_causality[event]) > 1:
        if tuple(inv_causality[event]) in parallel_events:
            G.add_and_merge_gateway(inv_causality[event],event)
        else:
            G.add_xor_merge_gateway(inv_causality[event],event)
    elif len(inv_causality[event]) == 1:
        source = list(inv_causality[event])[0]
        G.edge(source,event)
 
# adding start event
G.add_event("start")
if len(start_set_events) > 1:
    if tuple(start_set_events) in parallel_events: 
        G.add_and_split_gateway(event,start_set_events)
    else:
        G.add_xor_split_gateway(event,start_set_events)
else: 
    G.edge("start",list(start_set_events)[0])
 
# adding end event
G.add_event("end")
if len(end_set_events) > 1:
    if tuple(end_set_events) in parallel_events: 
        G.add_and_merge_gateway(end_set_events,event)
    else:
        G.add_xor_merge_gateway(end_set_events,event)    
else: 
    G.edge(list(end_set_events)[0],"end")
 
G.render('simple_graphviz_graph')
G.view('simple_graphviz_graph')

Excercise

Based on some log, the following direct succession relations were discovered:

direct_succession = {
    'a': {'b', 'f'},
    'b': {'c', 'd'},
    'c': {'d', 'e'},
    'd': {'c', 'e'},
    'e': {'h'},
    'f': {'g'},
    'g': {'h'},
    'h': {'i', 'j'},
    'i': {'k'},
    'j': {'k'}
}

Calculate (first manually, than in Python) the causality relations, determine parallel events as well as the set of start and end events, and try to discover the BPMN process model.

As the goal of this lab is to get understanding of Alpha mining algorithm, there is no report needed after this lab. But if you implement some other solution (e.g. Alpha+, Alpha++, Alpha#, Heuristic or ILP miner), you will be able to present your work during the next class and get additional (extra) points. ^_^

Warning: Alpha miner was originally invented to discover Petri net, so discovering directly BPMN with it is not straightforward. However, it shows basic process mining idea and allows to notice some theoretical limits of process mining. Thus, Alpha miner is only of theoretical interest as it is too simple to be applicable to real-life logs.

  • It does not address various issues such as noise, frequency etc.
  • It has problems with some control-flow constructs (e.g. is not able to discover some kinds of structures like short loops, but also with some more complex structures).
  • It does not allow for duplicate events in the trace.
  • It can discover inconsistent models.
pl/dydaktyka/dss/lab03.txt · ostatnio zmienione: 2019/06/27 15:50 (edycja zewnętrzna)
www.chimeric.de Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0