Rete algorithm

The aim of this tutorial is to make you familiar with the Rete algorithm, which is used in many rule-based systems to optimize finding rules having satisfied conditions.

Introduction

What is the Rule-Based System?

A Rule-Based System (RBS shortly) is a software, which task is to process the gathered knowledge in order to answer user's question. Currently, it’s commonly named Business Rules Management System (BRMS). Its knowledge base contains rules defining some conclusion occurring when some given preconditions (assessed against the facts list) are satisfied. If the condition part of a given rule is satisfied, its conclusion part may be executed – we say that such a rule is fired. The image below presents a typical scheme of an RBS.

The RSB has to decide which rules should be fired by checking the IF part of the rules. It's not a problem when we have relatively few rules, but what to do if we have a huge number of rules? What approach should we apply to get a useful and scalable software that will be able to deal with a complicated knowledge base?

Used approaches

  • constant screening for applicability, like the EasyRules engine works
  • constant screening for applicability after grouping the rules in a logical way, like the HeaRTDroid engine works -– it keeps rules in the form of XTT tables (see the figure below – yes, it’s the thermostat again)
  • using some more or less sophisticated methods to find the applicable rules, like e.g. Drools and CLIPS engines work

Origin of Rete

Dr Charles L. Forgy of Carnegie Mellon University in Pittsburgh, Pennsylvania developed this algorithm in the late 70's, but it emerged and became popular a few years later. The algorithm created by Forgy was first used in the OPS5 rule-based language created by Forgy too (as you could guess). The name of the algorithm is per the English word for an anatomical network of blood vessels and nerve fibers and should be pronounced “ree-tee” (although in Polish we say simply “re-te”). Forgy named it this way because in his opinion the conditions network in the algorithm is alive like blood or nerve nets in human body (you can see and hear more at https://youtu.be/BUw-67B_qA4

How Rete algorithm works – example

Let’s consider an example of rules in the airline miles calculations. The decision service is responsible for processing the frequent flyer’s activity.

1.IF award miles for last year or current year > 25,000 THEN status = Silver

2.IF award miles for last year or current year > 100,000 THEN status = Gold

3.IF flight is less than 500 miles THEN award 500 miles

4.IF flight is 500 miles or more THEN award flight miles

5.IF category is business or first THEN award 50% bonus miles

6.IF status is Gold and airline is not partner THEN award 100% bonus miles

7.IF status is Silver and airline is not partner THEN award 20% bonus miles

8.IF member signed up for 3-flights-for-5k-in-March and number of return flights in March 2011 = 3 THEN award 5,000 additional miles

9.IF status becomes Gold THEN award 8 upgrade certificates

Task 1

Let’s assume that the input data is a frequent client having 150k award miles, who wants to fly from Washington DC to San Fransisco (distance about 2419 miles) with a not-partner airline. We can call him Joe or Stefan. Based on the data and the ruleset, try to process it in your head in a naïve way to get the result facts list. If you do it the right way, you should get the information that Stefan is a GOLD client and has 154 838 award miles (150k so far + 2 times the flight distance). You should perform 3 iterations (the third one will show you that no more rules can be fired).

How Rete algorithm works – continuation

The main feature of the Rete algorithm is that it separates the evaluation of a hypothesis (i.e. checking the fulfilling of the condition) and execution sequencing. The algorithm gets all rules and build a tree called Rete network, where each node corresponds to an atomic condition in a rule. Nodes referring to a mutual fact core (e.g. a single object in object-based rules modelling or a single attribute in attributive logic) have a mutual root and every edge is understood as AND. In addition, each node holds a list of facts/objects/attributes values that satisfy the associated condition.

Now we can distinguish two groups of conditions:

  • Frequent Flyer Account (FAA)
    • status is Gold
    • status is Silver
    • award miles for last year or current year > 25k
    • award miles for last year or current year > 100k
    • signed up for „3-flights-for-5k-in-March” promotion
      • number of flights in March = 3
  • Flight (F)
    • miles >= 500
    • miles < 500
    • airline is not a partner
    • airline is partner
    • category is business or first

The tree describing all conditions looks as follows (the rectangles at the very bottom represent rules which conditions on the path belong to):

If we consider the input data (which is actually a facts base) and the ruleset, we see that conditions for rules 1, 2 and 4 are satisfied. They are said to be active in the agenda. The agenda contains the list of all rules that should be executed, along with the list of objects that are responsible for the conditions to be true. The agenda will sort them according to their priorities (and other conflict resolution methods). If the execution of a rules invalidates a rule with a lower priority, then this second rule will logically never execute. We have such a case when we look at the rule 1 (IF award miles for last year or current year > 25,000 THEN status = Silver) and 2 (IF award miles for last year or current year > 100,000 THEN status = Gold). Stefan can qualify for SILVER or GOLD bonus miles but certainly not both. A solution may be to assign priorities to these rules, where rule 2 will have a higher priority than rule 1. Other solution (possibly better) is to modify the model so that the ruleset will be fully deterministic.

After dealing with this problem, the agenda lists the 2 rules (GOLD status and 500+ Flight Miles). After the execution Stefan gets 2419 miles in his account (because of the rule 4) and gets the GOLD status (because of rule 2) – this makes we need to re-evaluate the subtree corresponding to the client account. We begin a new cycle and propagate new facts in out facts base, which makes that the agenda contains rule 6 (Stefan is a GOLD client and travels by plane that belongs to a non-partner airline). This means Stefan gets another 2419 award miles. In conclusion, Stefan is a GOLD client with 154,839 in his award miles account.

Leveraging the algorithm is clearly visible when you repeat inference cycles and reuse those atomic conditions.

Task 2

See the example rules model and try to build the Rete network. If you finish, compare your solution with the tips at https://informatik-forum.at/showthread.php?114766-Rete-Algorithm-Example. In case you get stuck, get inspired with the same link.

pl/dydaktyka/dss/projects/rete/start.txt · ostatnio zmienione: 2019/06/27 15:50 (edycja zewnętrzna)
www.chimeric.de Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0