en:dydaktyka:problog:lab1 [2017/05/29 14:44]
msl
en:dydaktyka:problog:lab1 [2019/06/27 15:49] (current)
Line 1: Line 1:
-====== Probabilistic Programming ​--- Medical Cases ======+====== Probabilistic Programming ​— Diagnosis and Prediction ​======
  
 This class covers use cases of Bayesian methods in the medical domain. The first part of the class is based on the article "Local computations with probabilities on graphical structures and their application to expert systems" by Steffen L. Lauritzen and David J. Spiegelhalter. The second part is inspired by "An intercausal cancellation model for Bayesian-network engineering" (International Journal of Approximate Reasoning) by S. P. Woudenberg, L. C. van der Gaag, and C. M. Rademaker.
Line 11: Line 11:
 ==== Structure ====
 
-First task of the "knowledge engineer" is to find a structure of Bayesion network which fits the story. There exist automatic tools to learn the structure from examples, but in this case the structure should be clear enough to create the network by hand.
+First task of the "knowledge engineer" is to find a structure of Bayesian network which fits the story. There exist automatic tools to learn the structure from examples, but in this case the structure should be clear enough to create the network by hand.
 
 === Assignments ===
 
   - Draw (on paper?) a Bayesian network describing the story from the previous section.
-  - Write the corresponding ProbLog program:
-    - there is no need for the first order logic here
-    - use arbitrary probabilities
 +    * if you're stuck, here is a {{ :en:dydaktyka:problog:diagnosis_network.png?linkonly |big hint}}
 +  - Write the corresponding Problog program:
 +    - Hints:
 +      * logical or in Problog: <code prolog>a_or_b :- a.
 +a_or_b :- b.</code>
 +      * you can have lung cancer even if you are not a smoker, so don't forget to include this in the model. There are two ways to do that:
 +        * classical (Bayesian network): <code prolog>0.1::cancer :- smoker.
 +0.01::cancer :- \+ smoker.</code>
 +          * pros: it's a classical approach with clean Bayesian semantics
 +          * cons: adding new variables to the model can make the number of rules grow exponentially; existing rules may also have to be modified
 +        * Problog 'additive way': <code prolog>0.1::cancer :- smoker.
 +0.01::cancer.</code>
 +          * pros: adding a new variable doesn't require changing the old rules; you just add one new rule per variable
 +          * cons: the probabilities will be different from those in the Bayesian network, so you can't just copy the Bayesian network structure ;)
 +      * x-ray can be positive even if you're healthy
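
The difference between the two encodings can be checked by hand. The sketch below is plain Python, not Problog. It assumes that several probabilistic rules with the same head act as independent causes (noisy-or), which is how the 'additive way' is usually read. The function names are made up for illustration.

```python
# Numbers taken from the two hint snippets above.
P_CANCER_SMOKER = 0.1   # 0.1::cancer :- smoker.
P_CANCER_OTHER = 0.01   # second rule in either encoding


def classical(smoker: bool) -> float:
    """Classical encoding: exactly one rule applies for each value of smoker."""
    return P_CANCER_SMOKER if smoker else P_CANCER_OTHER


def additive(smoker: bool) -> float:
    """Additive encoding: cancer is false only if every applicable cause fails."""
    p_all_fail = 1.0 - P_CANCER_OTHER          # the unconditional rule always applies
    if smoker:
        p_all_fail *= 1.0 - P_CANCER_SMOKER    # the smoker rule applies as well
    return 1.0 - p_all_fail


if __name__ == "__main__":
    for smoker in (True, False):
        print(smoker, classical(smoker), round(additive(smoker), 4))
```

For a smoker the additive encoding gives 1 - 0.99 * 0.9 = 0.109 instead of 0.1, while for a non-smoker both encodings agree at 0.01. This is why the numbers of a Bayesian network cannot be copied unchanged into the additive form.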
  
 ==== Probabilities ==== ==== Probabilities ====
Line 26: Line 38:
 === Learning ===
 
-The simplest way to have realistic priors is to not have any priors at all :) In other words --- we assume, we know nothing about probabilities. In ProbLog you can state this fact by using ''t(_)'' predicate, e.g.
+The simplest way to have realistic priors is to not have any priors at all :) In other words, we assume that we know nothing about the probabilities. In Problog you can state this fact by using the ''t(_)'' predicate, e.g.
 
 <code prolog>
Line 34: Line 46:
 This says that you know nothing about the probability of a patient being a smoker.
 
-Now when we have admitted our lack of knowledge, we can start learning! In ProbLog learning can be achieved either by command line tool:
+Now that we have admitted our lack of knowledge, we can start learning! In Problog, learning can be done either with the command-line tool:
 <code bash>
 problog lfi
Line 60: Line 72:
 The learning should result in a new model with new probabilities.
 
-<WRAP center round important 60%>
-If you receive an "Inconsistent Evidence" error, include leak probabilities in the model. Leak probabilities are probabilities stating that some random variable can be assigned to a value without any particular reason, e.g. here we state that variable ''var'' can't be true if no reason is true.
+<WRAP center round important 80%>
+If you receive an "Inconsistent Evidence" error, it means that your model is not compatible with the evidence. Sometimes the simplest way to solve this problem is to include leak probabilities in the model. Leak probabilities state that a random variable can take a value without any particular (modelled) reason, e.g. here we state that variable ''var'' can't become true because of an external reason.
 
 <code prolog>
Line 68: Line 80:
 0.0::var. % leak probability
 </code>
-Make sure to include leak probabilities such that all possible evidence can be linked to a possible world (otherwise ProbLog will return an “Inconsistent Evidence” error).
+
+The command-line tool can do it automatically (check ''problog lfi --help'').
 </WRAP>
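
To see why a leak probability helps, it can be useful to compute the probability of a single observation by hand. The sketch below is plain Python for a hypothetical two-variable model (''smoker'' and ''cancer'') with illustrative numbers. It assumes, as above, that rules with the same head combine as independent causes. An observation whose probability is zero cannot be linked to any possible world, which is the situation reported as inconsistent evidence.

```python
def evidence_probability(smoker: bool, cancer: bool,
                         p_smoker: float = 0.3,
                         p_cancer_if_smoker: float = 0.1,
                         p_leak: float = 0.0) -> float:
    """P(smoker, cancer) for the toy model
         p_smoker::smoker.
         p_cancer_if_smoker::cancer :- smoker.
         p_leak::cancer.            % leak rule
    """
    p_s = p_smoker if smoker else 1.0 - p_smoker
    # cancer stays false only if every applicable cause fails
    p_no_cancer = 1.0 - p_leak
    if smoker:
        p_no_cancer *= 1.0 - p_cancer_if_smoker
    p_c = (1.0 - p_no_cancer) if cancer else p_no_cancer
    return p_s * p_c


if __name__ == "__main__":
    # A non-smoker with cancer: impossible without a leak, possible with one.
    print(evidence_probability(smoker=False, cancer=True))
    print(evidence_probability(smoker=False, cancer=True, p_leak=0.01))
```

Without the leak rule, a non-smoking patient with cancer has probability zero, so no choice of the other parameters can explain that patient. With any non-zero leak the observation becomes possible.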
  
Line 77: Line 89:
   - replace all probabilities in your model with ''t(_)''
   - put some random learning data in the on-line IDE and check results of learning
-  - download {{ :en:dydaktyka:problog:patients_10000.txt |data of 100 000 patients}}
 +  - can you think of possible difficulties of this approach?
 +  - download {{ :en:dydaktyka:problog:patients_10000.txt |data of 10 000 patients}}
     - try to use it in on-line IDE
     - try to use it offline (cmdline)
       - you may have to ask the teacher to install the problog for you
       - you may have to limit number of iterations learning takes
 +      - you may have to provide a file for the output model
 +      - if your computer is slow, you may try to use {{ :en:dydaktyka:problog:patients_100.txt |the smaller file}}
     - what are the learned probabilities?
     - what is the probability that a smoker with a positive x-ray has lung cancer?
 +
 +==== Learning + Priors ====
 +
 +The previous section was neat, but reality is harsh: it is really difficult to get good learning data for 10 000 patients. Therefore we need to compromise and combine priors with learning. To do that in Problog, it is enough to put a probability in the ''t/1'' predicate, e.g. to say that 50% of people in the society smoke, but that you are not sure of this, you should write:
 +
 +<code prolog>
 +t(0.50)::​smoker.
 +</​code>​
 +
 +Now, let's say we have found some wise books that say:
 +
 +  * 30% of the population smokes
 +  * 0.01% of the population has been in Asia 
 +  * dyspnea is mostly caused by asthma and causes other than TB, lung cancer, or bronchitis
 +  * smoking has greater impact on lung cancer than on bronchitis
 +
 +=== Assignments === 
 +
 +  - introduce the knowledge from the wise books in your model
 +    * you do not need to add asthma or plethora of different causes
 +    * probabilities don't have to be exact to be useful
 +  - download {{ :​en:​dydaktyka:​problog:​patients_100.txt |data of 100 random patients}} and do the learning
 +    * do you see any problem with the learned model?
 +    * fix the problem ;) 
 +
 +==== Sampling ====
 +
 +Now, if you want to generate learning data yourself, you can do it by so-called sampling of the model.
 +To do that, add a query for every variable you want to sample and then use one of:
    
 +  * ''​problog sample ​ --help''​
 +  * select sample in the on-line IDE
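
To make clear what sampling produces, here is a minimal plain-Python sketch that draws patients from a hypothetical two-variable model and prints them as ''evidence/2'' facts. The model, its numbers and the function names are illustrative only. The line of dashes used to separate examples is an assumption about the learning input format, so check it against the documentation of your Problog version.

```python
import random


def sample_patient(rng: random.Random) -> dict:
    """Draw one patient from a toy two-variable model (illustrative numbers)."""
    smoker = rng.random() < 0.3
    p_cancer = 0.1 if smoker else 0.01
    cancer = rng.random() < p_cancer
    return {"smoker": smoker, "cancer": cancer}


def as_evidence(patient: dict) -> str:
    """Format one patient as evidence/2 facts, one per line."""
    return "\n".join(
        f"evidence({name}, {str(value).lower()})." for name, value in patient.items()
    )


def generate(n: int, seed: int = 0) -> str:
    """Return n sampled patients, separated by a line of dashes."""
    rng = random.Random(seed)
    return "\n-----\n".join(as_evidence(sample_patient(rng)) for _ in range(n))


if __name__ == "__main__":
    print(generate(3))
```

Fixing the seed makes the generated data reproducible, which helps when comparing learning runs.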
 +
 +=== Assignments ===
 +
 +  - generate 100 random patients and feed them to the model
 +    - it may be easier using the command line tool ;) 
 +
 +
 +==== Predicting Treatment'​s Effects ====
 +
 +Every doctor has to prescribe some kind of treatment. Our task is to provide tools that help predict the treatment's effects. Here is a new story:
 +
 +//A doctor has to treat patients with primary type-1 osteoporosis. Two common treatments for osteoporosis are calcium supplementation and medication with bisphosphonates. The bisphosphonates are much more effective than calcium, but the concurrent intake of both medications fully cancels out the effect of the bisphosphonates and partially cancels out the effect of the calcium supplementation.//
 +
 +In other words:
 +  * calcium treats osteoporosis (let's say with 15% effectiveness)
 +  * bisphosphonates also treat osteoporosis (let's say with 85% effectiveness)
 +  * if you take both:
 +    * calcium still works but is less effective (let's say 50% weaker)
 +    * bisphosphonates do not work at all 
 +
 +There is a special syntax in Problog to say that a variable has a negative impact on (inhibits) another variable. You just have to put a negation before the rule head, e.g. to say that variable ''a'' cannot be true when variable ''b'' is true: <code problog>\+a :- b.</code>, and to write that there is a 50% chance that ''a'' won't be true if ''b'' is true: <code problog>0.50::\+a :- b.</code>
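
The effect of an inhibiting rule can be illustrated with a small calculation. The sketch below is plain Python with made-up numbers. It assumes that ''a'' holds exactly when one of its causes fires and no inhibitor fires, which matches the reading of the 50% example above. Treat this as an assumption about the semantics and verify it by querying a small Problog model.

```python
def p_effect(p_cause: float, p_inhibit: float, inhibitor_present: bool) -> float:
    """P(a) for a single active cause and an optional inhibitor.

    Corresponds to a hypothetical model with one rule  p_cause::a :- c.
    (c is true) and one inhibiting rule with probability p_inhibit whose
    body b is true iff inhibitor_present.
    """
    p_blocked = p_inhibit if inhibitor_present else 0.0
    # a holds iff the cause fires and the inhibitor does not
    return p_cause * (1.0 - p_blocked)


if __name__ == "__main__":
    print(p_effect(0.8, 0.5, inhibitor_present=False))
    print(p_effect(0.8, 0.5, inhibitor_present=True))
    print(p_effect(0.8, 1.0, inhibitor_present=True))
```

An inhibitor with probability 1.0 blocks the effect completely, while a probability of 0.5 halves the chance that the cause succeeds.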
 +
 +
 +=== Assignments ===
 +
 +  - write a corresponding Problog program:
 +    * you may have to introduce an additional variable for every kind of treatment to indicate whether the treatment is inhibited, e.g. calcium results in **something** that treats the osteoporosis
 +  - what is the chance of successful treatment when we use both calcium and bisphosphonates?​
en/dydaktyka/problog/lab1.1496061868.txt.gz · Last modified: 2019/06/27 16:00 (external edit)