Querying the Semantic Web with SPARQL

Last verification: 20180914
Tools required for this lab:

Before the lab

Video [minimum!]:

Reading:

Lab instructions

During this lab we will use two services to execute SPARQL queries:

  1. SPARQLer (a general-purpose SPARQL query processor) will be used for querying RDF files.
  2. YASGUI (Yet Another SPARQL GUI) will be used for querying SPARQL endpoints (it has a more powerful editor, but it cannot be used against plain RDF files :-()

1 Introduction [5 minutes]

  1. What can we do with our RDF models? In this section some "magic" will happen on the Periodic Table saved as RDF!
  2. Open SPARQLer.
  3. Paste http://www.daml.org/2003/01/periodictable/PeriodicTable.owl into the "Target graph URI (or use FROM in the query)" field and select the text output option.
    • There is also a backup (if the original URI cannot be resolved): http://krzysztof.kutt.pl/didactics/semweb/PeriodicTable.owl
  4. Run the following two queries (paste the code into the text field and click Get Results):
    select.rq
    PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
     
    SELECT ?element ?name
    WHERE {
      ?element table:group ?group .
      ?group table:name "Noble gas"^^xsd:string .
      ?element table:name ?name .
    }
    construct.rq
    PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
     
    CONSTRUCT {
      ?element rdfs:label ?name .
    }
    WHERE {
      ?element table:group ?group .
      ?group table:name "Noble gas"^^xsd:string .
      ?element table:name ?name .
    }
    • Both queries run on the same dataset
    • Both queries extract the same data: the list of all elements in the Noble gas group, with their names
    • Analyze the queries and the results: how do they differ?
  5. 8-) What do SELECT queries do?
  6. 8-) What do CONSTRUCT queries do?
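
The SPARQLer form notes that the target graph can also be named with FROM inside the query. A minimal sketch of that variant, assuming the service is allowed to fetch the remote file; the LIMIT is there only to keep the output short:

```sparql
PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>

# Same dataset as above, but named in the query instead of the form field.
SELECT ?element ?name
FROM <http://www.daml.org/2003/01/periodictable/PeriodicTable.owl>
WHERE {
  ?element table:name ?name .
}
LIMIT 10
```

With FROM in the query, the "Target graph URI" field can be left empty.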

2 SPARQL = Pattern matching [10 minutes]

  1. Now, let's run some more queries against the Periodic Table. Prepare the following ones:
    • elements which have a name and a symbol defined
    • elements which have a name and a symbol defined and are placed in the period_7 period
    • elements which have a name and a symbol defined, are placed in the period_7 period, and have an OPTIONAL color (some of them do not have a color!)
    • elements which have a name and a symbol defined, are placed in the period_7 period, and have an OPTIONAL color, sorted by name in descending order
  2. 8-) Put the constructed queries in the report.
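
The tasks above combine three building blocks: several triple patterns that share a variable, OPTIONAL, and ORDER BY. The sketch below shows the same shape on a different question (noble gases instead of period_7, ascending instead of descending), so it is not the solution. The table:color property name is an assumption based on the task description; check it against the file.

```sparql
PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?color
WHERE {
  # Patterns sharing ?element must all match the same resource.
  ?element table:group ?group .
  ?group table:name "Noble gas"^^xsd:string .
  ?element table:name ?name .
  # OPTIONAL keeps the row even when no color triple exists;
  # ?color is then left unbound. (table:color is an assumed name.)
  OPTIONAL { ?element table:color ?color . }
}
ORDER BY ASC(?name)
```

Moving the color pattern out of the OPTIONAL block would silently drop every element that has no color, which is a quick way to see what OPTIONAL changes.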

3 Constraints: FILTER [10 minutes]

4 SPARQL Endpoint [20 minutes]

  1. DBpedia is a knowledge base of structured data extracted from Wikipedia and published as RDF. So, like Wikipedia, DBpedia should contain some information about Poland. What can we do?
    We don't know what URI Poland has in DBpedia, but we know the name Poland, and from the previous lab we know the rdfs:label property. Maybe this will help us? Let's try!
  2. Open YASGUI.
  3. What do we know so far? There should be some URI (?country) that probably has an rdfs:label relation with the object "Polska"@pl. This can be easily translated into a SPARQL WHERE clause:
    ?country rdfs:label "Polska"@pl .
  4. To execute this query properly, enter the http://dbpedia.org/sparql URI in the dropdown list at the top.
  5. Then, specify the query:
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?country
    WHERE { 
        ?country rdfs:label "Polska"@pl .
    }
  6. Success! There is something that has the rdfs:label "Polska"@pl!
    8-) Now expand this query to find Poland's population and put the final query in the report.
    • Hint: the result should look like this:
      --------------
      | population |
      ==============
      | 38483957   |
      --------------
  7. 8-) Prepare a query that returns the 10 countries in Europe with the largest population. Put the query in the report.
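
When the predicate you need is unknown, one way to explore is to list every property used on the matched resource. A sketch that relies only on the rdfs:label pattern from step 5; the LIMIT value is an arbitrary choice:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List the distinct predicates attached to whatever is labelled "Polska"@pl.
SELECT DISTINCT ?property
WHERE {
  ?country rdfs:label "Polska"@pl .
  ?country ?property ?value .
}
ORDER BY ?property
LIMIT 200
```

More than one resource may carry this label, so adding ?country to the SELECT clause can help if the results look mixed.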

5 Aggregation [15 minutes]

  1. Poland is divided into 16 voivodeships (PL: województwo), which are further divided into 380 counties (PL: powiat). In this task, we will examine this structure more closely.
  2. Prepare a query (in YASGUI, against DBpedia) which returns a list of voivodeships and the number of counties inside each of them. The list should consist only of voivodeships with 7 or more counties and should be ordered by the number of counties, in descending order.
  3. The results should look like this:
    ------------------------------------------------
    | voivodeship                       | counties |
    ================================================
    | "Masovian Voivodeship"@en         | 15       |
    | "Greater Poland Voivodeship"@en   | 12       |
    | "Lesser Poland Voivodeship"@en    | 11       |
    | "Podkarpackie Voivodeship"@en     | 10       |
    | "Pomeranian Voivodeship"@en       | 9        |
    | "Warmian-Masurian Voivodeship"@en | 9        |
    | "West Pomeranian Voivodeship"@en  | 9        |
    | "Opole Voivodeship"@en            | 8        |
    ------------------------------------------------

    or in Polish:

    --------------------------------------------------
    | wojewodztwo                          | powiaty |
    ==================================================
    | "Województwo mazowieckie"@pl         | 15      |
    | "Województwo wielkopolskie"@pl       | 12      |
    | "Województwo małopolskie"@pl         | 11      |
    | "Województwo podkarpackie"@pl        | 10      |
    | "Województwo pomorskie"@pl           | 9       |
    | "Województwo warmińsko-mazurskie"@pl | 9       |
    | "Województwo zachodniopomorskie"@pl  | 9       |
    | "Województwo opolskie"@pl            | 8       |
    --------------------------------------------------
  4. Hint – useful URIs:
    • county: http://dbpedia.org/resource/Powiat
    • voivodeship: http://dbpedia.org/resource/Voivodeships_of_Poland
  5. 8-) Put the query in the report.
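
The task needs COUNT, GROUP BY, HAVING and ORDER BY together. The sketch below shows that combination on the Periodic Table file from section 1 (run it in SPARQLer, not against DBpedia). The threshold of 5 is arbitrary, and the query is not the solution to the task:

```sparql
PREFIX table: <http://www.daml.org/2003/01/periodictable/PeriodicTable#>

# Count elements per named group, keep only the larger groups.
SELECT ?groupName (COUNT(?element) AS ?elements)
WHERE {
  ?element table:group ?group .
  ?group table:name ?groupName .
}
GROUP BY ?groupName
# HAVING filters whole groups after counting (FILTER works on single rows).
HAVING (COUNT(?element) >= 5)
ORDER BY DESC(?elements)
```

Every variable in the SELECT clause must either appear in GROUP BY or be wrapped in an aggregate such as COUNT.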

6 SPARQL as rule language [10 minutes]

7 ASK and DESCRIBE queries [10 minutes]

SPARQL also provides two more query types: ASK and DESCRIBE.
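
As a rough sketch, both can reuse the label pattern from section 4 (run each query separately in YASGUI against DBpedia). ASK only answers true or false:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Is there anything at all labelled "Polska"@pl?
ASK {
  ?country rdfs:label "Polska"@pl .
}
```

DESCRIBE returns an RDF graph about the matched resources; which triples are included is decided by the endpoint, not by the query:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Ask the endpoint for its description of every matching resource.
DESCRIBE ?country
WHERE {
  ?country rdfs:label "Polska"@pl .
}
```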

8 "Negation" under Open World Assumption [5 minutes]

9 Wikipedia, DBpedia, Wikidata

If you are interested in querying the huge amount of data available in Wikipedia, there are two projects worth a look: DBpedia and Wikidata.

They overlap in part, but are independent of each other and have different uses. For you, a student of the Semantic Web Technologies course, it does not matter much. They are simply large knowledge bases with which you can do a lot of things.
If you want to dive into this data you can start with a Big set of SPARQL queries against Wikidata.

Control questions
