--- en:dydaktyka:problog:lab3 [2019/01/14 11:31]
msl [Entity Classification]
+++ en:dydaktyka:problog:lab3 [2019/06/27 15:49] (current)
@@ Line 1: / Line 1: @@
+====== Statistical Relational AI ======
+Statistical Relational AI (StaRAI) is a branch of Artificial Intelligence lying at the intersection between statistical and logical methods, applied to relational data.
+This class will cover the most common types of tasks considered by the StaRAI methods.
+Materials used in the class come from a workshop conducted by Marco Lippi at ACAI'2018 summer school in Ferrara.
+Questions:
+  - What is hidden under the term "relational data"?
+  - Could modern "deep" learning methods work be used in the same context?
+====== Link Prediction ======
+Given a relational model of a domain (e.g. graph of connections in the social network) we have to learn how to predict connection between nodes in similar networks.
+Questions:
+  - What types of networks can we spot in real life?
+  - What are the possible applications of the link predictor?
+  - What does "similar network" mean? How can we validate the predictor?
+  - What learning features can be found in the network?
+===== Toy Problem =====
+{{ :en:dydaktyka:problog:toy_link_1.png?200|}}Let assume we have a very tiny network, similar to the one shown on the right. In this problem all links are undirected and unlabeled. Nodes have labels shown using different colors.
+Our ask is to train a link predictor using [[https://dtai.cs.kuleuven.be/problog/|Problog]]. In case somebody forgot Problog installation is fairly easy given a working Python environment (''pip install problog'' and optionally ''problog install'' on Linux). In case it wasn't simple enough, one can try to use the [[https://dtai.cs.kuleuven.be/problog/editor.html|on-line interface]]. The evidence file for the problem can downloaded from {{ :en:dydaktyka:problog:link_prediction_data.pl | this link}}.
+You can start from {{ :en:dydaktyka:problog:link_prediction_empty.pl |this point}}.
+== Questions: ==
+  - How would you write a Problog model for this task?
+  - Do you find this kind of predictor satisfying? Would you call it "relational"?
+====== Entity Classification ======
+Another problem is to classify entities in the network, e.g. given a social graph, guess gender of the people involved.
+== Question: ==
+  - What are the possible applications of such classificator?
+===== Toy Problem =====
+Given a hypertext documents (or simply: linked text documents) classify them by the topic. Read the Problog model below:
+<code prolog>
+.5::topic(P,sport) :- hasword(P,game).
+.7::topic(P,food) :- hasword(P,bread).
+.9::topic(P,food) :- link(P,Q), topic(Q,food).
+hasword(p1,bread).
+hasword(p2,game).
+hasword(p3,coffee).
+link(p3,p1).
+evidence(topic(p2,sport)).
+</code>
+== Questions ==
+  - What network's features have impact on the result?
+  - Try to learn a similar (but bigger) model from the following evidence {{ :en:dydaktyka:problog:hypertext_classification_data.pl|data}} and {{ :en:dydaktyka:problog:hypertext_classification_network.pl|network definition}}. Is there any issue with creating such a model?
+  - Could you learn similar classifier using classic machine learning classifiers?
+==== Information Retrieval ====
+Let's make the toy problem a bit more interesting and create a basic search engine. Download and read a {{ :en:dydaktyka:problog:information_retrieval_data_full.pl |following file}} that contains a basic set of data about an information retrieval scenario.
+== Questions ==
+  - What has to be changed in our classifier to make it a basic search engine?
+  - Learn model parameters from scratch, using the file you have downloaded previously.
+===== Intermission: Why? =====
+Before moving to the next topic, let us analyze what good have we done.
+  - What have our programs learned? What's the output?
+  - What data had to be provided? What's the input?
+  - Is there any advantage over other methods you know? Is there any disadvantage?
+===== Structure Learning =====
+Sometimes we do not have domain knowledge - sometimes we analyze just chaotic data we can make no sense at all. Sometimes we need so called structure learning - algorithm learning not only parameters but also structure of the model.
+== Questions: ==
+  - What applications of structure learning can you imagine?
+  - Do you know any related problems/methods?
+==== Quick ProbFOIL Tutorial ====
+ProbFOIL is a probabilistic version of the famous FOIL induction system, that can learn problog models from data.
+It is based on Problog and installation is analogous (''pip install probfoil'').
+The input of ProbFOIL consists of two parts: settings and data. These are both specified in Prolog (or ProbLog) files, and they can be combined into one.
+The data consists of (probabilistic) facts. The settings define
+  * target: the predicate we want to learn
+  * modes: which predicates can be added to the rules
+  * types: type information for the predicates
+  * other settings related to the data
+To use:
+<code bash>
+probfoil data.pl
+</code>
+Multiple files can be specified and the information in them is concatenated. (For example, it is advisable to separate settings from data).
+Several command line arguments are available. Use ''--help'' to get more information.
+==== Settings format ====
+=== Target ===
+The target should be specified by adding a fact
+<code prolog>
+learn(predicate/arity).
+</code>
+=== Modes ===
+The modes should be specified by adding facts of the form
+<code prolog>
+mode(predicate(mode1, mode2, ...).
+</code>, where ''modeX'' is the mode specifier for argument ''X''. Possible mode specifiers are:
+  * ''+'': input - the variable at this position must already exist when the literal is added
+  * ''-'': output - the variable at this position does not exist yet in the rule (note that this is stricter than usual)
+  * ''c'': constant - a constant should be introduced here; possible value are derived automatically from the data
+=== Types ===
+For each relevant predicate (target and modes) there should be a type specifier. This specifier is of the form
+<code prolog>
+base(predicate(type1, type2, ...).
+</code>, where ''typeX'' is a type identifier. Type can be identified by arbitrary Prolog atoms (e.g. ''person'', ''a'', etc.)
+=== Example generation ===
+By default, examples are generated by quering the data for the target predicate. Negative examples can be specified by adding zero-probability facts, e.g.:
+<code prolog>
+.0::grandmother(john, mary).
+</code>
+Alternatively, ProbFOIL can derive negative examples automatically by taking combinations of possible values for the target arguments. Note that this can lead to a combinatorial explosion. To enable this behavior, you can specify the fact
+<code prolog>
+example_mode(auto).
+</code>
+=== Example ===
+Try to learn model from the following file:
+<code prolog>
+% Modes
+mode(male(+)).
+mode(parent(+,+)).
+mode(parent(+,-)).
+mode(parent(-,+)).
+% Type definitions
+base(parent(person,person)).
+base(male(person)).
+base(grandmother(person,person)).
+% Target
+learn(grandmother/2).
+% How to generate negative examples
+example_mode(auto).
+</code>
+You'll have to define a family using ''male'' and ''parent'' facts.
+Start with simple family and then add new members as needed.
+===== Big Fat Assignment ======
+  - Try to learn structure of the Information Retrieval model, you've done earlier by hand.
+  - Is the learned model satisfying? If not, what is the problem? Try to fix it by changing learning data by hand.
+  - Modify model to consider more than only one query. What has to be changed?

en/dydaktyka/problog/lab3.1547461860.txt.gz · Last modified: 2019/06/27 16:00 (external edit)

Show page Old revisions

Media Manager Back to top

AIwiki

Menu

Dla Studentów

Old specialized AI courses

SMaDA/SMaIDA/AIDA

Informatyka (EAIiIB)

Studia Dr

Inne materialy dydaktyczne

Archiwum

Dyplomanci

Geist Season of Code

HeKatE

Public

Differences