====== Probabilistic Programming 101 ======
  
This class will introduce the concept of probabilistic programming --- a new declarative paradigm meant to represent and solve a wide class of problems involving uncertainty and imperfect knowledge.

Be warned! This class assumes that the student knows at least the basics of probability theory, e.g. what a random variable is, and how to use the chain rule and Bayes' theorem. A short (and a bit dense) introduction can be found [[https://ermongroup.github.io/cs228-notes/preliminaries/probabilityreview/|here]].

===== Tools =====

In this class we will use the [[https://dtai.cs.kuleuven.be/problog/|ProbLog]] language, which is based on Prolog. If you haven't heard of Prolog, read [[https://bernardopires.com/2013/10/try-logic-programming-a-gentle-introduction-to-prolog/|this short and gentle introduction]] and later do the [[http://lpn.swi-prolog.org/lpnpage.php?pageid=online|more comprehensive tutorial in your free time]].

For now, you should know that you can use ProbLog in two different ways:

  - from the command line: ''problog <path to file with model>''
  - from the [[https://dtai.cs.kuleuven.be/problog/editor.html|interactive web environment]]

===== First Model =====

We will start gently, by creating a fairly reasonable model of an academic teacher assessing an essay question.

First of all, there is a very high chance (let's say 90%) that the handwriting will be illegible. This is a so-called probabilistic fact and it can be expressed in one line of ProbLog code.

<code prolog>
0.9::illegible_handwriting.
</code>

Notice that the name of the fact starts with a lowercase letter --- it's important! The other thing is that every statement ends with a dot ''.''.

Let's return to the problem... the only fair way to assess this type of answer is to toss a coin. Let's define a coin toss as another probabilistic fact.

<code prolog>
0.5::heads.
</code>

And now we will write our first rule, which says: 'If the handwriting is illegible and I toss heads, the student passes the exam'.

<code prolog>
pass_exam :- illegible_handwriting, heads.
</code>

Here '':-'' means 'if' and '','' means 'and' (the student passes the exam if the handwriting is illegible and heads is tossed).

Of course there is always a chance that the handwriting will be nice and the teacher will be able to comprehend the answer. Then everything depends on the student's knowledge. We will add this to the model by introducing one more fact and one more rule.

<code prolog>
0.4::student_knows_the_answer.

pass_exam :- \+ illegible_handwriting, student_knows_the_answer.
</code>

''\+'' represents logical negation here.

The last (but not least) element of the model is the query definition --- what would we like to know? In this case we are interested in the probability of passing the exam.

<code prolog>
query(pass_exam).
</code>

Now put all these lines into one file (e.g. test.pbl), and run:
<code bash>
# problog test.pbl
</code>

...or paste it into the [[https://dtai.cs.kuleuven.be/problog/editor.html|web IDE]] and press 'Evaluate'.

One way or another, you should get the probability of the desired event.
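
For reference, here is the complete model assembled from the snippets above. Since the two rule bodies are mutually exclusive, the result can also be checked by hand: P(pass) = 0.9 · 0.5 + 0.1 · 0.4 = 0.49.

<code prolog>
% probabilistic facts
0.9::illegible_handwriting.
0.5::heads.
0.4::student_knows_the_answer.

% the student passes either by luck or by knowledge
pass_exam :- illegible_handwriting, heads.
pass_exam :- \+ illegible_handwriting, student_knows_the_answer.

query(pass_exam).
</code>

Evaluating it should report a probability of 0.49 for ''pass_exam''.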

==== Bayesian Network ====

{{ :en:dydaktyka:problog:test.png?nolink&300|}}
The model we have written can be represented as a simple Bayesian network, as in the image on the right (just ignore the constants ''c1'', etc.). If you want to generate a Bayesian network from a ProbLog model, you can do it using the ''problog bn'' command, e.g.

<code bash>
# problog bn -o test.dot --format dot test.pbl
</code>

will generate a dot file, and:

<code bash>
# dot -Tpng test.dot -o test.png
</code>

will render the network as a PNG file.

===== Adding evidence =====

In the previous model we calculated the probability that a generic student passes the exam. But what if we want to calculate the same for a more specific case... let's say we know that the student knows the answer. In Bayesian statistics we can represent this as evidence --- something we know.

In ProbLog we can add evidence by simply writing:

<code prolog>
evidence(student_knows_the_answer, true).
</code>

Add this line to the model and re-evaluate it. What is the impact of knowledge on the probability of passing the exam? Isn't it motivating?

Try adding different evidence, e.g. that the student doesn't know the answer (use ''false'' instead of ''true'').
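
As a quick sanity check (a hand calculation, assuming the probabilities used above): with the evidence set to ''true'', the student passes either via the coin (0.9 · 0.5 = 0.45) or via legible handwriting plus knowledge (0.1 · 1 = 0.1), so the reported probability should be 0.55. With ''false'', only the coin-toss route remains, giving 0.45.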

===== Students Live In Groups =====

Our model seems pretty realistic and shiny, but it has one major drawback --- it models the assessment of only one student, while in reality most students live in herds (so-called ''groups'') and the teacher has to assess one work per student.

The naive approach would be to duplicate the facts and rules for every student, e.g.

<code prolog>
0.9::joan_has_illegible_writing.
0.9::maxwell_has_illegible_writing.

0.5::joan_tossed_heads.
0.5::maxwell_tossed_heads.

joan_pass_exam :- joan_has_illegible_writing, joan_tossed_heads.
maxwell_pass_exam :- maxwell_has_illegible_writing, maxwell_tossed_heads.

query(joan_pass_exam).
query(maxwell_pass_exam).
</code>

It could work, but it certainly wouldn't be a manageable solution for more than three students... In order to model bigger herds we will use the power of first-order logic!

==== First-Order Logic To The Rescue ====

First things first --- we need to redefine our probabilistic facts.

<code prolog>
0.9::has_illegible_writing(maxwell).
0.9::has_illegible_writing(joan).

0.5::heads(maxwell).
0.5::heads(joan).

0.4::knows_the_answer(maxwell).
0.4::knows_the_answer(joan).
</code>

The facts are //predicates// now and therefore have arguments. That's neat, but compared to our previous approach it doesn't save much space... So let's focus on the rules --- we would like to say: "A student passes the exam if he/she has illegible handwriting and when we tossed the coin for him/her, we got heads." In ProbLog (and Prolog) an unspecified object is represented by a variable, whose name starts with a capital letter.

<code prolog>
pass_exam(Student) :- has_illegible_writing(Student), heads(Student).
pass_exam(Student) :- \+ has_illegible_writing(Student), knows_the_answer(Student).
</code>

Try to run the model and check the probability that Joan passes the exam. Try to add some evidence to the model.
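
Evidence works the same way with predicates; for example, one possible piece of evidence about Joan (using the facts above):

<code prolog>
evidence(knows_the_answer(joan), true).
query(pass_exam(joan)).
</code>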

==== Non-Deterministic Rules Save The Day ====

Still, we have some redundancy in the code. Our probabilistic facts are simply duplicated, and every programmer should instantly think of a way to get rid of the duplicates. In order to do that, we first need to extract the list of students somewhere. We will use plain deterministic facts (a probability equal to 100% can be omitted).

<code prolog>
student(maxwell). student(joan).
</code>

Then we will rewrite our original probabilistic facts as probabilistic rules:

<code prolog>
0.9::has_illegible_writing(Name) :- student(Name).
</code>

Do the same with the other facts and check the new model. Isn't it neat?
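
For reference, the remaining facts rewritten in the same style would look like:

<code prolog>
0.5::heads(Name) :- student(Name).
0.4::knows_the_answer(Name) :- student(Name).
</code>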

===== Levels of illegibility =====

Okay. We haven't been fair --- there is no way to simply classify all handwriting into only two classes. There is at least a third one: 'partly_legible'. In order to model that, we will need to introduce a second argument to our probabilistic fact.

<code prolog>
0.5::handwriting(Name, illegible) :- student(Name).
0.4::handwriting(Name, partly_legible) :- student(Name).
0.1::handwriting(Name, legible) :- student(Name).
</code>
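
A side note (not required for this class): written as three separate rules, these are three independent random choices, so a student may end up with, e.g., both ''illegible'' and ''legible'' handwriting at once. ProbLog can express the three levels as mutually exclusive outcomes using an annotated disjunction:

<code prolog>
0.5::handwriting(Name, illegible);
0.4::handwriting(Name, partly_legible);
0.1::handwriting(Name, legible) :- student(Name).
</code>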

We also have to add new rules that take care of what happens when the teacher is able to read only a part of the answer.

<code prolog>
0.3::pass_exam(Student) :- handwriting(Student, partly_legible), \+ knows_the_answer(Student).
0.7::pass_exam(Student) :- handwriting(Student, partly_legible), knows_the_answer(Student).
</code>
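
Note that the rules above only handle the ''partly_legible'' case. One possible way to adapt the earlier rules to the two remaining levels (a sketch, keeping the same assumptions as before):

<code prolog>
pass_exam(Student) :- handwriting(Student, illegible), heads(Student).
pass_exam(Student) :- handwriting(Student, legible), knows_the_answer(Student).
</code>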
===== First Assignment =====

Based on the first ProbLog model you've just finished, you have to model the classical problem of //burglary vs earthquake//. The story is rather simple:

//You live in Los Angeles, a beautiful city where earthquakes are not uncommon. You've installed a new alarm system in your house, meant to protect you from burglaries. Unfortunately, the alarm is too sensitive and sometimes it gets startled by an earthquake. Because of that, you've set the alarm not to call the police. Instead, both your neighbors Joan and Maxwell call you as soon as they hear the alarm (if they are around).//

Question: what is the probability of a burglary if you are called only by Joan? How probable is it that it was only an earthquake?

==== Priors ====

The tables below contain information about the prior probabilities of events.

en/dydaktyka/problog/intro.txt · Last modified: 2019/06/27 15:49 (external edit)