Probabilistic Programming 101

This class introduces the concept of probabilistic programming: a new declarative paradigm meant to represent and solve a wide class of problems involving uncertainty and imperfect knowledge.

Be warned! This class assumes that the student knows at least the basics of probability theory, e.g. what a random variable is and how to use the chain rule and Bayes' theorem. A short (and a bit dense) introduction can be read here.

Tools

In this class we will use the ProbLog language, which is based on Prolog. If you haven't heard of this language, read this short and gentle introduction and later do the more comprehensive tutorial in your free time.

For now, you should know that you can use ProbLog in two different ways:

  1. from the command line: problog <path to file with model>
  2. in the web IDE: paste the model and press 'Evaluate'

First Model

We will start gently, by creating a fairly reasonable model of an academic teacher assessing an essay question.

First of all, there is a very high chance (let's say 90%) that the handwriting will be illegible. This is a so-called probabilistic fact and can be expressed in a single line of ProbLog code.

0.9::illegible_handwriting.

Notice that the name of the fact starts with a lowercase letter. This is important, because names starting with a capital letter are treated as variables. The other thing to note is that every statement ends with a dot (.).

Let's return to the problem. The only fair way to assess this type of answer is to toss a coin. Let's define a coin toss as another probabilistic fact.

0.5::head.

Now we will write the first rule, which says: 'If the handwriting is illegible and I get heads, the student passes the exam'.

pass_exam :- illegible_handwriting, head.

Here :- means 'if' and , means 'and' (the student passes the exam if the handwriting is illegible and heads is tossed).
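
As a quick sanity check of what this single rule implies: ProbLog treats distinct probabilistic facts as independent, so if the model contained only this rule, the probability of pass_exam would be the product of the two fact probabilities. A worked version with the numbers used above:

```latex
P(\text{pass\_exam})
  = P(\text{illegible\_handwriting}) \cdot P(\text{head})
  = 0.9 \cdot 0.5
  = 0.45
```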

Of course, there is always a chance that the handwriting will be nice and the teacher will be able to comprehend the answer. Then everything depends on the student's intelligence. We will add this to the model by introducing one more fact and one more rule.

0.4::student_knows_the_answer.
 
pass_exam :- \+ illegible_handwriting, student_knows_the_answer.

Here \+ represents logical negation.

The last (but not least) element of the model is the query definition: what would we like to know? In this case we are interested in the probability of passing the exam.

query(pass_exam).

Now put all these lines into one file (e.g. test.pbl), and run:

# problog test.pbl

…or paste it into the web IDE and press 'Evaluate'.

Either way, you should get the probability of the desired event.
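
If you want to check the number by hand, the result can be reproduced by enumerating every combination of the three probabilistic facts and summing the weights of the combinations in which pass_exam holds. The sketch below does this in plain Python. It is only an illustration of the semantics, not a description of how ProbLog computes the answer internally, and the helper names (FACTS, pass_exam, probability_of) are made up for this example.

```python
from itertools import product

# Probabilities of the three independent probabilistic facts from the model.
FACTS = {
    "illegible_handwriting": 0.9,
    "head": 0.5,
    "student_knows_the_answer": 0.4,
}


def pass_exam(world):
    """The two pass_exam rules, read as a logical 'or' of their bodies."""
    return (world["illegible_handwriting"] and world["head"]) or (
        not world["illegible_handwriting"] and world["student_knows_the_answer"]
    )


def probability_of(event):
    """Sum the weights of all possible worlds in which `event` holds."""
    names = list(FACTS)
    total = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A world's weight is the product of p (fact true) or 1 - p (fact false).
        weight = 1.0
        for name in names:
            weight *= FACTS[name] if world[name] else 1.0 - FACTS[name]
        if event(world):
            total += weight
    return total


print(round(probability_of(pass_exam), 4))  # prints 0.49
```

The value agrees with the direct calculation 0.9 · 0.5 + 0.1 · 0.4 = 0.49, since the two rules can never fire in the same world.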

Bayesian Network

The model we have written can be represented as a simple Bayesian network, as in the image on the right (just ignore constants c1, etc.). If you want to generate a Bayesian network from a ProbLog model, you can do it using the problog bn command, e.g.

# problog bn -o test.dot --format dot test.pbl

will generate a dot file, and:

# dot -Tpng test.dot -o test.png

will render the network as a png file.

Adding evidence

In the previous model we calculated the probability that a generic student passes the exam. But what if we want to calculate the same for a more specific case? Let's say that the student knows the answer. In Bayesian statistics we can represent this as evidence: something we know.

In ProbLog we can add evidence simply by writing:

evidence(student_knows_the_answer).

Add this line to the model and re-evaluate it. What is the impact of knowledge on the probability of passing the exam? Isn't it motivating?

Try adding different evidence, e.g. that the student doesn't know the answer (use \+ for negation).
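
To check your results, recall that adding evidence turns the query into a conditional probability: P(query | evidence) = P(query and evidence) / P(evidence). The plain Python sketch below computes this by brute-force enumeration for the single-student model. It is an illustration only, with made-up helper names (worlds, conditional), and is not meant to mirror ProbLog's own inference.

```python
from itertools import product

# Probabilities of the three independent probabilistic facts from the model.
FACTS = {
    "illegible_handwriting": 0.9,
    "head": 0.5,
    "student_knows_the_answer": 0.4,
}


def pass_exam(world):
    """The two pass_exam rules, read as a logical 'or' of their bodies."""
    return (world["illegible_handwriting"] and world["head"]) or (
        not world["illegible_handwriting"] and world["student_knows_the_answer"]
    )


def worlds():
    """Yield (world, weight) for every truth assignment to the facts."""
    names = list(FACTS)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name in names:
            weight *= FACTS[name] if world[name] else 1.0 - FACTS[name]
        yield world, weight


def conditional(query, evidence):
    """P(query | evidence) = P(query and evidence) / P(evidence)."""
    joint = sum(w for world, w in worlds() if evidence(world) and query(world))
    marginal = sum(w for world, w in worlds() if evidence(world))
    return joint / marginal


def knows(world):
    return world["student_knows_the_answer"]


def does_not_know(world):
    return not world["student_knows_the_answer"]


print(round(conditional(pass_exam, knows), 4))          # prints 0.55
print(round(conditional(pass_exam, does_not_know), 4))  # prints 0.45
```

Knowing the answer only helps in the 10% of cases where the handwriting is legible, which is why the probability moves only from 0.45 to 0.55.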

Students Live In Groups

Our model seems pretty realistic and shiny, but it has one major drawback: it models the assessment of only one student, while in reality most students live in herds (so-called groups) and the teacher has to assess one piece of work per student.

The naive approach would be to duplicate facts and rules for every student, e.g.

0.9::joan_has_illegible_writing.
0.9::marcus_has_illegible_writing.
 
0.5::joan_tossed_head.
0.5::marcus_tossed_head.
 
joan_pass_exam :- joan_has_illegible_writing, joan_tossed_head.
marcus_pass_exam :- marcus_has_illegible_writing, marcus_tossed_head.
 
query(joan_pass_exam).
query(marcus_pass_exam).

It could work, but it certainly wouldn't be a manageable solution for more than three students… To model bigger herds we will use the power of first-order logic!

First-Order Logic To The Rescue

First things first — we need to redefine our probabilistic facts.

0.9::has_illegible_writing(marcus).
0.9::has_illegible_writing(joan).
 
0.5::heads(marcus).
0.5::heads(joan).
 
0.4::knows_the_answer(marcus).
0.4::knows_the_answer(joan).

The facts are now predicates and therefore have arguments. It's neat, but compared to our previous approach it doesn't save much space… But let's focus on the rules. We would like to say: “The student named X passes the exam if the student named X has illegible handwriting and, when we tossed the coin for them, we got heads.” In ProbLog (and Prolog) X is represented by a variable, which starts with a capital letter.

pass_exam(Student) :- has_illegible_writing(Student), heads(Student).
pass_exam(Student) :- \+ has_illegible_writing(Student), knows_the_answer(Student).

Try to run the model above and check the probability that Joan passes the exam, for example by adding the line query(pass_exam(joan)). to the file. Try to add some evidence to the model.
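
One way to picture what the variable does: each rule stands for one copy of itself per student, with Student replaced by a concrete name (this is usually called grounding). The plain Python sketch below mimics that for the two students above by building the ground facts in a loop and evaluating the two rules for a chosen student. It is an illustration with made-up helper names (STUDENTS, FACTS, probability), not ProbLog's actual machinery.

```python
from itertools import product

STUDENTS = ["marcus", "joan"]

# Ground probabilistic facts, keyed by (predicate, student), as in the model.
FACTS = {}
for s in STUDENTS:
    FACTS[("has_illegible_writing", s)] = 0.9
    FACTS[("heads", s)] = 0.5
    FACTS[("knows_the_answer", s)] = 0.4


def pass_exam(world, student):
    """Both pass_exam(Student) rules with Student bound to `student`."""
    illegible = world[("has_illegible_writing", student)]
    return (illegible and world[("heads", student)]) or (
        not illegible and world[("knows_the_answer", student)]
    )


def probability(student):
    """P(pass_exam(student)), summed over all worlds of the ground facts."""
    atoms = list(FACTS)
    total = 0.0
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        weight = 1.0
        for atom in atoms:
            weight *= FACTS[atom] if world[atom] else 1.0 - FACTS[atom]
        if pass_exam(world, student):
            total += weight
    return total


for s in STUDENTS:
    print(s, round(probability(s), 4))
# prints:
# marcus 0.49
# joan 0.49
```

Adding a third student here means adding one name to STUDENTS, which is the same saving the first-order rules give in ProbLog. Note that this brute-force loop doubles its work with every extra fact, so it only suits toy models.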

en/dydaktyka/problog/intro.1494810597.txt.gz · Last modified: 2019/06/27 16:00 (external edit)