--- pl:dydaktyka:semweb:2016:labs:sparql [2016/09/24 17:34]
kkutt [4 SPARQL Endpoint [20 minutes]]
+++ - (current)
@@ Line 1: / Line 1: @@
-====== Querying the Semantic Web with SPARQL ======
-^  Last verification: | 20151104 |
-^  Tools required for this lab: | -- |
-===== Before the lab =====
-Reading:
-  * [[http://www.cambridgesemantics.com/sparql-by-example/slides.html|SPARQL by Example]]
-  * {{:pl:dydaktyka:semweb:sparql-cheat-sheet.pdf|SPARQL by Example: the Cheat Sheet}} (from http://www.slideshare.net/LeeFeigenbaum/sparql-cheat-sheet)
-  * [[#if_you_want_to_know_more|If you want to know more...]]
-===== Lab instructions =====
-==== - Introduction [5 minutes] ====
-  - What can we do with our RDF models? In this section some "magic" will happen on the [[wp>Periodic_table|Periodic Table]] saved as [[http://www.daml.org/2003/01/periodictable/PeriodicTable.owl|RDF]]!
-  - Open <wrap caution>[[http://sparql.org/sparql.html|SPARQLer]]</wrap> (a general-purpose SPARQL query processor).
-  - Paste ''<nowiki>http://www.daml.org/2003/01/periodictable/PeriodicTable.owl</nowiki>'' into the "Target graph URI (or use FROM in the query)" field and select ''Text'' in the "Output" dropdown list.
-  - Run the following two queries (paste code in text field and click ''Get Results''):<code | select.rq>PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
-PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
-SELECT ?element ?name
-WHERE {
-  ?element table:group ?group .
-  ?group table:name "Noble gas"^^xsd:string .
-  ?element table:name ?name .
-}
-ORDER BY ASC(?name)</code><code | construct.rq>PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
-PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
-PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
-CONSTRUCT {
-  ?element rdfs:label ?name .
-}
-WHERE {
-  ?element table:group ?group .
-  ?group table:name "Noble gas"^^xsd:string .
-  ?element table:name ?name .
-}
-ORDER BY ASC(?name)</code>
-     * Both queries run on the same dataset.
-     * Both queries extract the same data: a list of all elements in the [[wp>Noble_gas|Noble gases]] group, with their names.
-     * Analyze the queries and their results: how do they differ?
-   - 8-) What do ''SELECT'' queries do?
-   - 8-) What do ''CONSTRUCT'' queries do?
-==== - SPARQL = Pattern matching [10 minutes] ====
-  * General Idea: **SPARQL is an RDF graph pattern matching system.**
-  * For example, suppose this triple is stored in RDF: <code>:JamesDean :playedIn :Giant .</code>
-  * Replacing any part of the triple with a variable (a name that starts with a question mark) turns it into a simple query, e.g.:
-    * //Query:// '':JamesDean :playedIn **?what** .'' \\ //Answer:// '':Giant''
-    * //Query:// ''**?who** :playedIn :Giant .'' \\ //Answer:// '':JamesDean''
-    * //Query:// '':JamesDean **?what** :Giant .'' \\ //Answer:// '':playedIn''
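-  * The patterns above are only fragments. A minimal sketch of the first one as a complete ''SELECT'' query could look as follows. The ''<nowiki>http://example.org/movies#</nowiki>'' namespace bound to the empty prefix is a made-up placeholder and not part of any data set used in this lab: <code>PREFIX : <http://example.org/movies#>
-
-# The variable ?what matches whatever stands in the object position.
-SELECT ?what
-WHERE {
-  :JamesDean :playedIn ?what .
-}</code> Against data that contains only the triple above, this should return the single binding '':Giant''.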
-  - Let's get back to our [[.:intro#foaf_10_minutes|FOAF files]]. Do you have yours? 8-O
-  - Execute queries on your FOAF file (or on the ''<nowiki>http://home.agh.edu.pl/~kkutt/foaf.rdf</nowiki>'' file) to retrieve:
-    * friends who have a name and an e-mail defined
-    * friends who have a name and an e-mail defined, plus an optional homepage
-    * friends who have a name and an e-mail defined, plus an optional homepage, sorted by name in descending order
-  - 8-) Put the constructed queries in the report.
-    * **Hints:**
-      * [[http://www.w3.org/TR/sparql11-query/#optionals|SPARQL 1.1 documentation]] may be useful for specifying optional values
-      * [[http://xmlns.com/foaf/spec/|FOAF Vocabulary Specification]]
-      * Type the location of your file into the ''Target graph URI'' field, or use the ''FROM'' clause to define your data source, e.g. <code>
-PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
-PREFIX foaf: <http://xmlns.com/foaf/0.1/>
-SELECT DISTINCT ?name
-FROM <http://home.agh.edu.pl/~kkutt/foaf.rdf>
-WHERE {
-    ?x rdf:type foaf:Person .
-    ?x foaf:name ?name
-}
-LIMIT 10</code>
-==== - Constraints: FILTER [10 minutes] ====
-  * After an RDF graph pattern has been matched, you can also constrain which rows are included in or excluded from the results. This is done with the FILTER construct. Let's try it now on the [[.:intro#foaf_10_minutes|FOAF files of your friends]].
-  * Prepare and execute queries on a chosen FOAF file to retrieve:
-    * people whose name starts with 'K'
-    * people who have an e-mail on the student.agh.edu.pl server
-    * people whose name starts with 'K' **or** who have an e-mail on the student.agh.edu.pl server; make the search case-insensitive
-    * names of people who have a homepage **or** an e-mail on the student.agh.edu.pl server
-  * **Hints:**
-    * cooperate: ask your friends to give you the URI of their FOAF file :!:
-    * the SPARQL 1.1 documentation sections on [[http://www.w3.org/TR/sparql11-query/#termConstraint|constraints]] and [[http://www.w3.org/TR/sparql11-query/#alternatives|alternatives]] may be useful
-  * 8-) Put the queries in the report.
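-  * As a warm-up on different data, here is a minimal sketch of a ''FILTER'' applied to the Periodic Table file from the [[#introduction_5_minutes|Introduction]] (set ''Target graph URI'' as you did there). It assumes that element names are plain or ''xsd:string'' literals, which is what the Introduction queries suggest; the letter "h" is an arbitrary choice: <code>PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
-
-SELECT ?element ?name
-WHERE {
-  # Anything that belongs to a group is treated as an element here.
-  ?element table:group ?group .
-  ?element table:name ?name .
-  # Keep only rows whose name starts with "h"; the "i" flag ignores case.
-  FILTER regex(?name, "^h", "i")
-}
-ORDER BY ?name</code>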
-==== - SPARQL Endpoint [20 minutes] ====
-  * SPARQL queries can be run against an RDF file, as we did in the previous sections. It is also possible to use a special-purpose web service called a SPARQL Endpoint. It wraps a data set and exposes it through a service that speaks the SPARQL protocol.
-  * Many SPARQL Endpoints are available today, providing information on a variety of subjects. In this section we will use the [[http://dbpedia.org/|DBpedia]] SPARQL Endpoint at **http://dbpedia.org/sparql**.
-  - DBpedia is structured data extracted from Wikipedia and published as RDF. So, like Wikipedia, DBpedia should contain some information about Poland. What can we do? \\ We don't know which URI Poland has in DBpedia, but we know its name, and from the previous lab we know the rdfs:label property. Maybe this will help us? Let's try!
-  - Open [[http://sparql.org/sparql.html|SPARQLer]]. What do we know so far? There should be some URI (''?country'') that probably has an ''rdfs:label'' relation with the object ''"Polska"@pl''. This translates easily into a SPARQL WHERE clause: <code>?country rdfs:label "Polska"@pl .</code>
-  - To execute this query properly we also have to specify that we are asking the ''<nowiki>http://dbpedia.org/sparql</nowiki>'' Endpoint. We can't use the FROM clause, because it is meant for RDF graphs. We should use SERVICE instead. The final query:<code>PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
-SELECT ?country
-WHERE {
-    SERVICE <http://dbpedia.org/sparql> {
-        ?country rdfs:label "Polska"@pl .
-    }
-}</code>**Note 1**: The ''Target graph URI'' field must not be empty (the FOAF URI from the previous section will do). This URI is not used by the query, but SPARQLer needs it to execute the query. \\ **Note 2:** You can use more than one ''SERVICE'' clause in a SPARQL query, which lets you combine results from different SPARQL Endpoints.
-  - Success! There is something that has the ''rdfs:label'' ''"Polska"@pl''! \\ Now expand this query to find the population of Poland:
-    - 8-) How can you do this using only SPARQLer?
-    - 8-) Put the final query in the report.
-      - Hint: the result should look like this: <code>--------------
-| population |
-==============
-| 38483957   |
---------------</code>
-  - 8-) Prepare a query that returns the 10 most populous countries in Europe. Put the query in the report.
-==== - Aggregation [15 minutes] ====
-  * SPARQL provides grouping and aggregation mechanisms known from SQL:
-    * grouping: GROUP BY
-    * aggregation: COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT, and SAMPLE
-    * filter on groups: HAVING
-    * See the [[http://www.w3.org/TR/sparql11-query/#aggregates|SPARQL 1.1 documentation]] for a fuller description.
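-  * A minimal sketch of these constructs working together, run on the Periodic Table file from the [[#introduction_5_minutes|Introduction]] rather than on DBpedia. It assumes that every group has a ''table:name'', and the threshold of 5 is arbitrary: <code>PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
-
-SELECT ?groupName (COUNT(?element) AS ?elements)
-WHERE {
-  ?element table:group ?group .
-  ?group table:name ?groupName .
-}
-GROUP BY ?groupName               # one result row per group name
-HAVING (COUNT(?element) >= 5)     # drop groups with fewer than 5 elements
-ORDER BY DESC(?elements)</code>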
-  - Poland is divided into 16 voivodeships (PL: województwo), which are in turn divided into 380 counties (PL: powiat). In this task we will take a closer look at this division.
-  - Prepare a query (in [[http://sparql.org/sparql.html|SPARQLer]], against DBpedia) that returns a list of voivodeships and the number of counties in each. The list should contain only voivodeships with 7 or more counties and should be ordered by the number of counties, in descending order.
-  - The results should look like this:<code>------------------------------------------------
-| voivodeship                       | counties |
-================================================
-| "Masovian Voivodeship"@en         | 15       |
-| "Greater Poland Voivodeship"@en   | 12       |
-| "Lesser Poland Voivodeship"@en    | 11       |
-| "Podkarpackie Voivodeship"@en     | 10       |
-| "Pomeranian Voivodeship"@en       | 9        |
-| "Warmian-Masurian Voivodeship"@en | 9        |
-| "West Pomeranian Voivodeship"@en  | 9        |
-| "Opole Voivodeship"@en            | 8        |
-------------------------------------------------</code> or in Polish:<code>--------------------------------------------------
-| wojewodztwo                          | powiaty |
-==================================================
-| "Województwo mazowieckie"@pl         | 15      |
-| "Województwo wielkopolskie"@pl       | 12      |
-| "Województwo małopolskie"@pl         | 11      |
-| "Województwo podkarpackie"@pl        | 10      |
-| "Województwo pomorskie"@pl           | 9       |
-| "Województwo warmińsko-mazurskie"@pl | 9       |
-| "Województwo zachodniopomorskie"@pl  | 9       |
-| "Województwo opolskie"@pl            | 8       |
---------------------------------------------------</code>
-  - **Hint** -- useful URIs:
-    * county: ''<nowiki>http://dbpedia.org/resource/Powiat</nowiki>''
-    * voivodeship: ''<nowiki>http://dbpedia.org/resource/Voivodeship_(Poland)</nowiki>''
-  - 8-) Put the query in the report.
-==== - SPARQL as rule language [10 minutes] ====
-  * So far, we have seen that answers to SPARQL queries can take the form of a table. In this section we will look at CONSTRUCT queries, whose answers take the form of an RDF graph. You have already seen one such example in the [[#introduction_5_minutes|Introduction]].
-  * CONSTRUCT queries provide a way to introduce "rules" into RDF datasets:
-    - Let's go back to the [[.:rdfmodel#modeling_knowledge_with_rdf_triples_30_minutes|The Bold and the Beautiful]] model you prepared previously. You probably had trouble deciding which relations to put in the RDF file: ''is_father_of'', ''is_child_of'', or maybe both?
-    - CONSTRUCT queries make this simpler. You can put just one of them in the initial data set; let's assume it was ''is_father_of''. Now you can execute a CONSTRUCT query that creates the inverse relation: <code>PREFIX bb: <http://yourname/b-and-b#>
-CONSTRUCT {
-  ?child bb:is_child_of ?father .
-}
-WHERE {
-  ?father bb:is_father_of ?child
-}</code>
-    - Or maybe an ''is_uncle_of'' relation would be useful? No problem! <code>PREFIX bb: <http://yourname/b-and-b#>
-PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
-CONSTRUCT {
-  ?uncle bb:is_uncle_of ?child .
-}
-WHERE {
-  ?uncle bb:is_sibling_of ?parent;
-         a bb:Man.
-  ?child bb:is_child_of ?parent
-}</code>
-    - Suppose you don't have ''is_sibling_of'', but you do have ''is_sister_of'' and ''is_brother_of''. Simply prepare a query (or queries) that creates ''is_sibling_of'' for you.
-      * 8-) Put this query in the report.
-  * OK, we have created some new RDF triples using a CONSTRUCT query. What now? Depending on your plans, you can:
-    * add these triples back to the original dataset,
-    * create a new dataset (e.g. save the results in an RDF file).
-  * Then simply execute queries against this new knowledge.
-  * 8-) Which 3 rules would you find useful in your [[.:rdfmodel#modeling_knowledge_with_rdf_triples_30_minutes|The Bold and the Beautiful]] model (or in the [[.:rdfmodel2|Multimedia library]] model)? Put 3 CONSTRUCT queries in the report.
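-  * For the first option (adding triples back), a store that supports SPARQL 1.1 Update can apply a rule in place. The sketch below mirrors the ''is_child_of'' rule above. It is an update request rather than a query, so it needs an update-capable endpoint; that is an assumption here, since this lab only uses the SPARQLer query form: <code>PREFIX bb: <http://yourname/b-and-b#>
-
-# Same pattern as the CONSTRUCT rule, but the new triples are written into the store.
-INSERT {
-  ?child bb:is_child_of ?father .
-}
-WHERE {
-  ?father bb:is_father_of ?child .
-}</code>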
-==== - ASK and DESCRIBE queries [10 minutes] ====
-SPARQL also provides two more query types: ASK and DESCRIBE.
-  * **ASK queries** return only a yes/no answer. Even when the answer is "yes", they give no information about the triples that matched.
-    * E.g. is there anyone named "Krzysztof Kluza" in this data set? <code>PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
-ASK  { ?x foaf:name  "Krzysztof Kluza" }</code> If you run this query against the ''<nowiki>http://home.agh.edu.pl/~kkutt/foaf.rdf</nowiki>'' file, the answer will be yes.
-    * 8-) Prepare a query that checks something interesting in your [[.:rdfmodel2|Multimedia library]] :)
-      * If you have no idea what to check, prepare a query that checks whether there is anything that is a ''MusicCD'' and was published by ''Warner Music Group'' (if you don't have such classes in your library, use analogous ones that you do have).
-  * **DESCRIBE queries** return an RDF graph that describes the given resource URI(s). Which triples are included is decided by the query processor.
-    * The simplest DESCRIBE query specifies only the URI to be described: <code>DESCRIBE <http://home.agh.edu.pl/~kkutt/foaf.rdf#me></code> (it should be executed against the ''<nowiki>http://home.agh.edu.pl/~kkutt/foaf.rdf</nowiki>'' file)
-    * It is also possible to select the URI(s) from the data set using constraints defined in a WHERE clause. Read about it in the [[http://www.w3.org/TR/sparql11-query/#describe|SPARQL 1.1 documentation]].
-    * 8-) Prepare a query that describes all ''MusicCD'' items from your [[.:rdfmodel2|Multimedia library]] (if you don't have a ''MusicCD'' class in your library, use an analogous class that you do have).
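-    * A minimal sketch of a DESCRIBE query with a WHERE clause, reusing the name from the ASK example above. Run it against the same FOAF file; which triples come back depends on the query processor: <code>PREFIX foaf: <http://xmlns.com/foaf/0.1/>
-
-# Describe every resource whose foaf:name matches, instead of naming its URI.
-DESCRIBE ?x
-WHERE { ?x foaf:name "Krzysztof Kluza" }</code>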
-==== - "Negation" under Open World Assumption [5 minutes] ====
-  * RDF vs SQL:
-    * RDF: Open World Assumption
-    * SQL: Closed World Assumption
-  * Let's imagine that we are preparing a query about all the living actors who played in [[wp>Return of the Jedi|Star Wars Episode VI: Return of the Jedi]].
-    * A sketch of this query in SPARQL: <code>SELECT ?actor
-WHERE {?actor :playedIn :ReturnOfTheJedi .
-       FILTER NOT EXISTS {?actor :diedOn ?deathdate . }
-}</code>
-    * A sketch of this query in SQL: <code>SELECT actor_name
-FROM movies
-WHERE title = 'Return of the Jedi'
-AND NOT EXISTS (SELECT *
-                FROM deaths
-                WHERE movies.actor_name = deaths.name);</code>
-  * These queries look the same, but they are different!
-    * 8-) What are the Open World Assumption (OWA) and the Closed World Assumption (CWA)?
-    * 8-) What is the difference between these two queries (refer to what you know about OWA and CWA)?
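-  * For comparison, the same SPARQL sketch can also be written with the older ''OPTIONAL'' plus ''!bound'' idiom from SPARQL 1.0. The version below is a sketch only: the empty prefix is bound to a made-up ''<nowiki>http://example.org/movies#</nowiki>'' namespace, and the property names are the hypothetical ones used above: <code>PREFIX : <http://example.org/movies#>
-
-SELECT ?actor
-WHERE {
-  ?actor :playedIn :ReturnOfTheJedi .
-  OPTIONAL { ?actor :diedOn ?deathdate . }
-  # Keep only actors for whom no death date was found in the data.
-  FILTER (!bound(?deathdate))
-}</code>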
-===== Control questions =====
-   * How do we create SPARQL queries?
-   * What are the four SPARQL query types and how do they differ? What form does the result take for each of them?
-   * What is a SPARQL Endpoint?
-===== Materials =====
-SPARQL:
-  * [[http://www.w3.org/TR/sparql11-query/|SPARQL 1.1 Query Language]]
-  * [[http://www.w3.org/TR/sparql11-overview/|SPARQL 1.1 Overview]]
-  * [[http://www.cambridgesemantics.com/semantic-university/learn-sparql|Learn SPARQL @Cambridge Semantics]]
-  * {{..:..:2014:eis2014semweb-rdfsinuse.pdf|RDF/S in use}} -- lecture 2014/2015 (part about Querying RDF)
-Example SPARQL queries:
-  * [[pl:dydaktyka:semweb:2014:projects:loddemo|Linked Open Data - demo]] (//in Polish//)
-  * [[https://blog.semantic-web.at/2015/09/29/sparql-analytics-proves-boxers-live-dangerously/|SPARQL analytics proves boxers live dangerously]]
-Tools:
-  * [[http://sparql.org/sparql.html|SPARQLer]] -- general purpose tool for executing SPARQL queries
-  * [[http://sparql.org/query-validator.html|SPARQLer Query Validator]]
-  * [[http://legacy.yasgui.org/|YASGUI]] -- online visual tool for querying SPARQL Endpoints
-  * [[http://www.ldodds.com/projects/twinkle/|Twinkle: A SPARQL Query Tool]]
-  * [[http://jena.apache.org/tutorials/sparql.html|Apache Jena -- SPARQL]]
-  * [[http://querybuilder.dbpedia.org/|DBpedia query builder]]
-Open Data Sets:
-  * http://news.ycombinator.com/item?id=1493768
-  * [[http://www.s3space.com/?p=383|Long list of SPARQL Endpoints]]
-DB2RDF (RDF and Relational Databases):
-  * [[http://esw.w3.org/RdfAndSql|RDFandSQL]]
-  * [[http://sourceforge.net/apps/mediawiki/bio2rdf/index.php?title=Main_Page|Bio2RDF]]
-  * [[http://www.w3.org/wiki/ConverterToRdf#SQL|ConvertToRDF -- SQL]]

pl/dydaktyka/semweb/2016/labs/sparql.1474731258.txt.gz · last modified: 2019/06/27 15:55 (external edit)
