Differences

Differences between the selected revision and the current revision.

pl:miw:2009:miw09_semweb_rdfize [2009/09/25 06:34]
jsi08
pl:miw:2009:miw09_semweb_rdfize [2019/06/27 15:50] (current)
Line 11: Line 11:
     * sample applications/​websites (see e.g.: [[http://​www.opencalais.com/​gallery|OpenCalaisGallery]] )     * sample applications/​websites (see e.g.: [[http://​www.opencalais.com/​gallery|OpenCalaisGallery]] )
  
-====== Meetings ====== +
-===== 20090319 ====+ 
  
 ====== Project ====== ====== Project ======
Line 30: Line 31:
  
  
-===== Semantic Web technology overview ​=====+==== Semantic Web technology overview ====
  
 Semantic Web technologies can be considered in terms of layers, each layer resting on and extending the functionality of the layers beneath it. Although the Semantic Web is often talked about as if it were a separate entity, it is an extension and enhancement of the existing Web rather than a replacement of it. Semantic Web technologies can be considered in terms of layers, each layer resting on and extending the functionality of the layers beneath it. Although the Semantic Web is often talked about as if it were a separate entity, it is an extension and enhancement of the existing Web rather than a replacement of it.
Line 278: Line 279:
 In this example there are two implicitly defined entities: the person'​s photo and their address. Since the address property always relates to an entity of type address, there is no need to explicitly include a line with typeof="​v:​Address"​. Similarly, a photo always relates to a URL pointing to an image, so there is no need to explicitly define a typeof property. In this example there are two implicitly defined entities: the person'​s photo and their address. Since the address property always relates to an entity of type address, there is no need to explicitly include a line with typeof="​v:​Address"​. Similarly, a photo always relates to a URL pointing to an image, so there is no need to explicitly define a typeof property.
  
-====== Report ====== +==== Microformats ====
-====== Presentation ====== +
-====== Materials ======+Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).
 + 
 +{{:​pl:​miw:​2009:​micro-diagram.gif|}} 
 + 
 +**Microformats are:** 
 + 
 +    * A way of thinking about data 
 +    * Design principles for formats 
 +    * Adapted to current behaviors and usage patterns (“Pave the cow paths.”) 
 +    * Highly correlated with semantic XHTML, AKA the real world semantics, AKA lowercase semantic web, AKA lossless XHTML 
 +    * A set of simple open data format standards that many are actively developing and implementing for more/better structured blogging and web microcontent publishing in general. 
 + 
 +**Microformats are not:** 
 + 
 +    * A new language 
 +    * Infinitely extensible and open-ended 
 +    * An attempt to get everyone to change their behavior and rewrite their tools 
 +    * A whole new approach that throws away what already works today 
 +    * A panacea for all taxonomies, ontologies, and other such abstractions 
 +    * Defining the whole world, or even just boiling the ocean 
 +    * Any of the above 
 + 
 +**The microformats principles** 
 + 
 +    * Solve a specific problem 
 +    * Start as simple as possible 
 +    * Design for humans first, machines second 
 +    * Reuse building blocks from widely adopted standards 
 +    * Modularity / embeddability 
 +    * Enable and encourage decentralized development,​ content, services 
 + 
 +==== Embedded RDF ==== 
 + 
 +Embedded RDF (eRDF) is a scheme for embedding a subset of RDF into XHTML or HTML by using common idioms and attributes. It introduces no new elements or attributes, and its usage of the HTML attributes stays within normal bounds. The scheme is designed to work with CSS and other HTML support technologies.
 + 
 +Note: hereafter the term HTML will be used to include both XHTML and HTML except where otherwise stated. 
 + 
 +=== Embeddable RDF === 
 + 
 +The subset of RDF that is used in this embedding scheme is called HTML Embeddable RDF. It allows some very important parts of the RDF model to be embedded but does not attempt to extend this to the full RDF model. This is very deliberate. Other attempts at embedding RDF in HTML have required the introduction of new syntax to express all the various RDF concepts. 
 + 
 +The relationship is: all HTML Embeddable RDF is valid RDF, but not all RDF is Embeddable RDF.
 + 
 +=== Extracting data from Embedded RDF === 
 + 
 +**Embedded RDF Extractor** 
 +This service extracts Embedded RDF from HTML using an XSLT stylesheet:
 +http://​research.talis.com/​2005/​erdf/​extract 
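
To give a rough idea of what such an extractor does, here is a minimal Python sketch. It is not the XSLT stylesheet used by the service. It assumes the eRDF conventions as commonly described: vocabularies declared with `<link rel="schema.PREFIX" href="NAMESPACE">`, properties written as `prefix-name` in `class` or `rel` attributes, and the document itself as the default subject. The class name `ErdfSketch`, the function `extract_erdf` and the sample page are hypothetical, and nested markup inside a property element is not handled.

```python
from html.parser import HTMLParser


class ErdfSketch(HTMLParser):
    """Very small eRDF-style extractor (illustrative assumptions only)."""

    def __init__(self, doc_uri):
        super().__init__()
        self.doc_uri = doc_uri   # assumed default subject: the document
        self.schemas = {}        # prefix -> namespace URI
        self.triples = []
        self._pending = None     # (predicate URI, text parts) of the open element

    def _expand(self, name):
        # "foaf-name" -> namespace + "name", if the prefix was declared
        prefix, sep, local = name.partition("-")
        if sep and prefix in self.schemas:
            return self.schemas[prefix] + local
        return None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = attributes.get("rel") or ""
        href = attributes.get("href")
        if tag == "link" and rel.startswith("schema.") and href:
            self.schemas[rel[len("schema."):]] = href
            return
        if href:
            for name in rel.split():
                predicate = self._expand(name)
                if predicate:
                    self.triples.append((self.doc_uri, predicate, href))
        for name in (attributes.get("class") or "").split():
            predicate = self._expand(name)
            if predicate:
                self._pending = (predicate, [])

    def handle_data(self, data):
        if self._pending:
            self._pending[1].append(data)

    def handle_endtag(self, tag):
        # Assumes the property element contains plain text only.
        if self._pending:
            predicate, parts = self._pending
            self.triples.append((self.doc_uri, predicate, "".join(parts).strip()))
            self._pending = None


def extract_erdf(html, doc_uri):
    parser = ErdfSketch(doc_uri)
    parser.feed(html)
    parser.close()
    return parser.triples


PAGE = """<html><head profile="http://purl.org/NET/erdf/profile">
<link rel="schema.foaf" href="http://xmlns.com/foaf/0.1/" />
</head><body>
<span class="foaf-name">Rob Crowther</span>
<a rel="foaf-homepage" href="http://example.org/">home</a>
</body></html>"""

for triple in extract_erdf(PAGE, "http://example.org/page"):
    print(triple)
```

Only `class` and `rel` values whose prefix was declared through a `schema.` link are treated as properties, so ordinary CSS class names are ignored.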
 + 
 +==== Existing ontologies ​==== 
 + 
 +A more complex part of the Semantic Web is to design an ontology that matches up to your data. Arriving at the right ontology is usually a critical element of successful implementation of Semantic Web projects. Fortunately,​ many ontologies already exist. 
 + 
 +**Some ontologies in use on the Web today** 
 +| **Dublin Core** ​ | This metadata element standard for cross-domain information resource description provides a simple and standardised set of conventions for describing things online in ways that make them easier to find. | 
 +| **SIOC**  | The Semantically-Interlinked Online Communities project is an ontology that expresses the information contained both explicitly and implicitly in Internet discussion methods, such as blogs, forums and mailing lists. |
 +| **FOAF** ​ | The Friend of a Friend ontology describes individuals,​ their activities and their relations to other people and objects. FOAF allows the description of social networks in a distributed fashion. | 
 +| **DOAP**  | Description Of A Project is an ontology for describing open-source projects. |
 +| **ResumeRDF** ​ | This ontology expresses a Resume or Curriculum Vitae (CV), including information such as work and academic experience or skills. | 
 + 
 +In addition, many ontologies are domain specific in fields such as technology, environmental science, chemistry and linguistics. These will apply to fewer Web sites than those listed above, however. A lot of your data is likely to fit into at least one of the areas covered by the listed ontologies, in which case you can incorporate them in your planning. 
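
As a small illustration of reusing an existing ontology rather than designing a new one, the Python sketch below describes a person with FOAF terms and prints the statements as N-Triples. The namespace URIs are the published FOAF and RDF ones; the helper `ntriple` and the example person URI are hypothetical and only serve the illustration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def ntriple(subject, predicate, obj, literal=False):
    """Format one statement as an N-Triples line (plain literals or URIs only)."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = '"' + escaped + '"'
    else:
        rendered = "<" + obj + ">"
    return "<" + subject + "> <" + predicate + "> " + rendered + " ."


person = "http://example.org/staff/robertc"  # hypothetical resource URI
lines = [
    ntriple(person, RDF_TYPE, FOAF + "Person"),
    ntriple(person, FOAF + "name", "Rob Crowther", literal=True),
    ntriple(person, FOAF + "mbox", "mailto:robertc@example.org"),
]
print("\n".join(lines))
```

Because the predicates come from a shared vocabulary, any FOAF-aware consumer can interpret these statements without knowing anything else about the site that published them.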
 + 
 + 
 +==== Existing semantic add tools ==== 
 + 
 +Whether you fully embrace the Semantic Web in your Web site infrastructure, or just want to make your existing content more useful, there are probably several opportunities to add structure to existing content on your Web site. This is the domain of Microformats, RDFa and GRDDL. The table below lists the more common information types that you can easily mark up as structured data.
 + 
 +//​Opportunities for structured markup and automatic transformation//​ 
 +^ Information type ^ Structured markup ^
 +| People and Organizations | hCard, RDF vCard | 
 +| Calendars and Events | hCalendar, RDF Calendar | 
 +| Opinions, Ratings and Reviews | VoteLinks, hReview | 
 +| Social Networks | XFN, FOAF | 
 +| Licenses | rel-license | 
 +| Tags, Keywords, Categories | rel-tag | 
 +| Lists and Outlines | XOXO | 
 + 
 +Adding the structured markup to your page is fairly simple. The two fragments below show the same HTML contact information, first without and then with the additional markup required for the RDF vCard.
 + 
 +//​Unstructured contact information//​ 
 +<code html>
 +<div class="contactinfo">
 +  Rob Crowther. Web hacker
 +  at
 +  <a href="http://example.org">
 +    Example.org
 +  </a>.
 +  You can contact me
 +  <a href="mailto:robertc@example.org">
 +    via e-mail
 +  </a> or on my work phone at 0123 456789.
 +</div>
 +</code>
 + 
 +Below you can see the contact information with additional markup required for the RDF vCard. 
 + 
 +//Contact Information using vCard// 
 +<code html>
 +<div xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#" class="contactinfo" about="http://example.org/staff/robertc">
 +  <span property="contact:fn">Rob Crowther</span>.
 +  <span property="contact:title">Web hacker</span>
 +  at
 +  <a rel="contact:org" href="http://example.org">
 +    Example.org
 +  </a>.
 +  You can contact me
 +  <a rel="contact:email" href="mailto:robertc@example.org">
 +    via e-mail
 +  </a>
 +  or on my
 +  <span property="contact:tel">
 +    <span property="contact:type">work</span>
 +    phone at
 +    <span property="contact:value">0123 456789</span>
 +  </span>.
 +</div>
 +</code>
 + 
 +As you can see, span elements were added to delimit the semantically significant bits of text, together with attributes that indicate what they mean. The namespace "contact" is linked to the RDF vCard vocabulary. The about attribute indicates that this element is about the resource represented by the URI http://example.org/staff/robertc. The metadata itself is added using the rel attribute for link relationships and the property attribute on non-links. The only slightly complex part is the telephone, because you need to specify a type as well as the number. To achieve this, you nest the type and value elements inside the tel element. With this structure in place, suitable tools can let users add the contact details to their address book with a single click of the mouse.
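
To make the role of the about, rel and property attributes more concrete, the following Python sketch collects simple (subject, predicate, object) tuples from markup of this kind. It is a deliberately reduced illustration, not a conforming RDFa parser: it assumes well-formed XHTML in which every element is closed, a single about subject, and it leaves prefixed names such as contact:fn unexpanded. The names `RdfaSketch`, `extract_triples` and `SAMPLE` are made up for this example.

```python
from html.parser import HTMLParser


class RdfaSketch(HTMLParser):
    """Collects tuples from a small RDFa subset (illustration only)."""

    def __init__(self):
        super().__init__()
        self.subject = None
        self.triples = []
        self._depth = 0
        self._open = []  # stack of (depth, predicate, text parts)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        self._depth += 1
        if "about" in attributes:
            # "about" names the resource the enclosed statements describe
            self.subject = attributes["about"]
        if "rel" in attributes and "href" in attributes:
            # link relationship: the object is the link target
            self.triples.append((self.subject, attributes["rel"], attributes["href"]))
        if "property" in attributes:
            # literal property: the object is the element's text content
            self._open.append((self._depth, attributes["property"], []))

    def handle_data(self, data):
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == self._depth:
            _, predicate, parts = self._open.pop()
            text = " ".join("".join(parts).split())  # collapse whitespace
            self.triples.append((self.subject, predicate, text))
        self._depth -= 1


def extract_triples(html):
    parser = RdfaSketch()
    parser.feed(html)
    parser.close()
    return parser.triples


SAMPLE = """<div xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"
     about="http://example.org/staff/robertc">
  <span property="contact:fn">Rob Crowther</span>
  <a rel="contact:email" href="mailto:robertc@example.org">via e-mail</a>
</div>"""

for triple in extract_triples(SAMPLE):
    print(triple)
```

A real RDFa processor additionally resolves the contact: prefix against the declared namespace and follows the subject-resolution rules of the RDFa specification, which this sketch does not attempt.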
 + 
 +Other automatic processing is possible with the other structured forms; for example, Technorati makes use of the rel-tag microformat to categorize its vast aggregation of blog posts. A rel-tag is shown below, and as you can see, it is simply a link that makes use of the rel attribute. The significant part is the last bit of the URI, after the final /. This is the tag (using the normal URI encoding conventions where a space is represented by the plus sign). 
 + 
 +//rel-tag for Technorati for the tag '​semantic web'//​ 
 + 
 +<code html>
 +<a href="http://technorati.com/tag/semantic+web" rel="tag">
 +  Semantic Web
 +</a>
 +</code>
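
The rule described above (take the last path segment of the link target and decode it) can be sketched in a few lines of Python using only the standard library. The function names `tag_from_href` and `find_rel_tags` are hypothetical; the sketch only looks at a elements whose rel attribute contains the value tag.

```python
from html.parser import HTMLParser
from urllib.parse import unquote_plus, urlparse


def tag_from_href(href):
    """Return the tag encoded in the last path segment of a rel-tag URL."""
    path = urlparse(href).path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    return unquote_plus(last_segment)  # "+" and %20 both decode to a space


class RelTagParser(HTMLParser):
    """Collects the tags of all <a rel="tag"> links in a page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        if tag == "a" and "tag" in rel_values and attributes.get("href"):
            self.tags.append(tag_from_href(attributes["href"]))


def find_rel_tags(html):
    parser = RelTagParser()
    parser.feed(html)
    parser.close()
    return parser.tags


LINK = '<a href="http://technorati.com/tag/semantic+web" rel="tag">Semantic Web</a>'
print(find_rel_tags(LINK))
```

Note that the tag is taken from the URL, not from the visible link text, so the two may differ in capitalisation or wording.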
 + 
 +==== Tools for extracting semantic annotations ==== 
 + 
 +**Browser Plugins** 
 + 
 +    * **Fuzz** is most useful for detecting embedded semantic information in web pages. It is currently the most compliant in-browser RDFa parser. 
 +    * **JavaScript bookmarklets** that you can drag to your IE, Firefox or Safari toolbar to extract, display and interact with RDFa. Great for demonstration purposes and for quickly checking RDFa you've added to your page.
 +    * **MozCC** is an extension for Mozilla-based applications,​ including Mozilla Firefox and Songbird, which provides a convenient way to examine metadata -- including Creative Commons licenses -- embedded in web pages. 
 +    * **Operator**, a Firefox extension that detects microformats and other semantic data embedded in web pages.
 +    * **Semantic Radar for Firefox**, a semantic metadata detector for Mozilla Firefox.  
 + 
 +**Web-based Services** 
 + 
 +    * **FOAFr** enables the conversion of your RDF FOAF file into RDFa. 
 +    * **irs** lets you create typed links between resources and export them as XHTML+RDFa (note: early alpha).
 +    * **mle** creates SIOC in XHTML+RDFa from hypermail archives. 
 +    * **RDFohloh** exports the information stored in Ohloh in RDF/XML, N3 and XHTML+RDFa. 
 +    * **Sindice** is a semantic web crawler that parses and indexes RDFa. 
 +    * **SPARQLBot**,​ an IRC bot consuming RDFa; allows you to perform user-defined SPARQL queries on data sources from IRC. See the SPARQLBot documentation for how to use it. 
 +    * **Yahoo Search Monkey** crawls the web looking for RDFa data that use particular vocabularies. 
 +    * **Swignition**, a parser for metadata embedded in HTML; it also extracts a number of microformats.
 + 
 +**Applications** 
 + 
 +    * **Swignition**, a parser for metadata embedded in HTML; it also extracts a number of microformats.
 +    * **Exhibit** can import data on the fly from RDFa in pages within the same domain. 
 +    * **Krextor** is a generic XML→RDF extraction library (with a command-line frontend) with support for XHTML+RDFa input (currently in an experimental stage) and support for RDFa integrated into other host languages. 
 +    * **OpenLink Data Spaces**, a distributed collaborative applications suite for creating and exploiting presence on the Linked Data Web that includes support for RDFa consumption and generation (all ODS Web pages can optionally include RDFa generated by the system as an additional mechanism for exposing RDF in a given data space).
 +    * **TopBraid** Composer is an enterprise-class modeling environment for developing Semantic Web ontologies and building semantic applications. It supports RDFa. 
 +    * **Virtuoso**,​ a SPARQL compliant Quad Store that includes an RDFization layer (within its SPARQL processor) with support for RDFa.  
 + 
 + 
 +==== Tools for adding semantic annotations ==== 
 + 
 +**In-browser RDFa Tools** 
 + 
 +    * **Fuzz** is a native Semantic Web Processor implemented as a Firefox Add-on. It can be used to view triples generated on a web page, which is helpful when authoring RDFa by hand. Checking your triples is as easy as refreshing the page and clicking the Fuzz icon. 
 + 
 +**Content Management Plugins** 
 + 
 +    * **WordPress** - A GPL-licensed XHTML+RDFa parent theme for WordPress 2.7 is available, authored by Sam Pablo Kuper.
 +    * **MediaWiki** - An extension that allows outputting the semantic data of MediaWiki pages in the RDFa format.
 + 
 +**HTML+RDFa Editors** 
 + 
 +    * **TopBraid Composer** can parse XHTML+RDFa documents to extract RDFa metadata from HTML pages. The metadata can then be treated like any other RDF source, and users can perform RDFS and OWL reasoning or SPARQL queries on it. Mash-up applications can be developed using RDFa together with the other data integration capabilities and visualization tools in TopBraid Composer. An implementation report is available.
 +    * **W3C's Amaya** now also supports RDFa editing.
 +    * **RDFa Documents extension** (version 0.1.0) for Dreamweaver versions 8 to CS4.
 + 
 +**Exporting Content as RDFa** 
 + 
 +    * SWAML exports mailing lists in RDF/XML and XHTML+RDFa using SIOC.  
 + 
 +==== RDF Generators ==== 
 + 
 +    * **Cypher** - generates RDF and SPARQL/SeRQL representations of natural language statements and phrases.
 +    * **FOAF-o-matic** - an online FOAF generator.
 +    * **INQLE** - http://​code.google.com/​p/​inqle/​ Intelligent network of Querying and Learning Engines. Open Source server, with Jena SDB back-end datastore. Runs automated, random machine learning experiments on semantic data. Stores any discovered correlations as RDF, and leverages such correlations in performing future experiments. Provides tools for loading spreadsheet data into the RDF database.  
 +    * **Krextor** - Krextor is a framework for extracting RDF from various XML languages (see this wiki page for details)  
 +    * **Ontos** - Ontos API is a public web service which returns rich semantic metadata in standard RDF-based formats for input plain text content you submit. Ontos recognizes entities and relations between them using natural language processing techniques. Although basic types of entities (people, companies, places etc.) are pre-defined,​ the user can also create OWL-driven dictionaries for custom types of entities, merge entities across documents, etc. For more details, tips and updates see Ontos API home, the official blog, and the community group.  
 +    * **Open Calais** - Open Calais from [[http://www.reuters.com|Reuters]] is a web service that automatically attaches rich semantic metadata to the content you submit. Using natural language processing, machine learning and other methods, Calais categorizes and links your document with entities (people, places, organizations, etc.), facts (person ‘x’ works for company ‘y’), and events (person ‘z’ was appointed chairman of company ‘y’ on date ‘x’). The metadata results are stored centrally and returned as RDF constructs.
 +    * **OpenLink Virtuoso** - OpenLink Virtuoso (elsewhere on this page) delivers SQL2RDF directly, and via the Sponger and its cartridges, also delivers RDF from GRDDL, RDFa, microformats,​ and many more inputs.  
 +    * **RDFa distiller** - downloadable Python package as well as online service to generate RDF from RDFa pages  
 +    * **Semantic Hacker** - Semantic Hacker's technology provides a weighted representation of the concepts contained in a piece of text. (It does not provide RDF directly yet.)
 +    * **Triplify** - Triplify is a small plugin for Web applications,​ which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.  
 +    * **Zemanta** - Zemanta API is a web API that delivers relevant tags, links, categories and pictures from your unstructured data/​content. It is semantic standards compliant, with RDF output and ability to disambiguate to entities from Linking Open Data.  
 + 
 + 
 +==== Sample applications/​websites ​==== 
 + 
 +=== Wandora === 
 + 
 +**Wandora** is a general purpose knowledge extraction, management, and publishing application based on Topic Maps and Java. More precisely, Wandora is an open source desktop application for building and managing topic maps. Wandora has a graphical user interface, layered presentation of knowledge, several data storage options, rich data extraction, import and export capabilities, and an open plug-in architecture. Wandora is licensed under the GNU GPL.
 + 
 +{{:​pl:​miw:​2009:​wandoras_extractors.gif|}} 
 + 
 +Wandora is well suited to knowledge mashups. It can extract and convert various open data feeds to the Topic Maps format (see image above). Beyond Topic Maps conversion, this feature allows a Wandora user to aggregate multidimensional knowledge bases where information from Flickr meets Geonames and Delicious, for example. Read more in the documentation.
 + 
 +WandoraWiki is the home of the Wandora software application and provides an access point to the application and its documentation. As a contribution to the Topic Maps community, the Wandora Team has also converted several well-known ontologies to the Topic Maps format. These converted ontologies (WordNet, OpenCyc, Gene Ontology, Gellish, and the Finnish General Upper Ontology, YSO) are available for download from WandoraWiki.
 + 
 +=== SemanticProxy === 
 + 
 +**SemanticProxy.com** is part of the Calais Initiative from Thomson Reuters. 
 + 
 +In the future the entire web will be one giant tightly interconnected information asset. Beyond just publishing information for humans, every site will expose its content in a way that's readable by machines. Those machines will mix, match, filter and aggregate information to greatly improve things for us humans. We're not there yet - but that's the vision of the Semantic Web. 
 + 
 +SemanticProxy.com is a little taste of what that future might look like. What SemanticProxy does is simple: it translates the content of any URL on the web to its semantic representation in RDF, HTML or Microformats. If you're an RDF crawler hungry for a little semantic content, just point yourself at SemanticProxy.com and we'll fill you up. 
 + 
 +Right now SemanticProxy.com is optimized for performance on 30 of the world'​s largest English-language news sites. In coming releases we'll continue to refine and extend its capabilities to additional areas. While you'll find it works best with news, feel free to experiment with other sites like Wikipedia or ... whatever. The results can be very encouraging. 
 + 
 +SemanticProxy isn't intended to be a great place for humans to visit. However, little semantic machines love to come by and spend a few high-quality milliseconds.  
 + 
 + 
 +=== Gnosis === 
 + 
 +Delivering Semantic Web Services - Directly to Your Desktop 
 + 
 +Gnosis is a browser extension for Firefox and Internet Explorer that automatically analyzes content as you browse and provides a variety of tools to explore the people, places, companies, and other items that you’re reading about. In this latest release, Gnosis capability has been extended to include the following:​ 
 + 
 +    * Support for Firefox 3.5 
 +    * Entity Count and Relevance Score for each entity that the viewer displays. 
 +    * Entity Disambiguation:​ The entity name displayed is the disambiguated name where applicable (companies, geographies,​ electronic products) 
 + 
 +After installing Gnosis, simply navigate to the news site that you are interested in - one of the news sites listed below would be a good place to start. Right-click and select ClearForest Gnosis or click on the Gnosis button in the toolbar. After 1-2 seconds your page has been processed and you’re ready to explore! 
 + 
 + 
 + 
 +==== RDFa Implementations ==== 
 + 
 + 
 +There are a number of RDFa implementations. 
 + 
 +**Python** 
 + 
 +    * RDFa Distiller 
 +      Author: Ivan Herman 
 +      http://​www.w3.org/​2007/​08/​pyRdfa/​ 
 + 
 +    * rdfadict 
 +      Author: Nathan Yergler 
 +      http://​pypi.python.org/​pypi/​rdfadict 
 + 
 +    * rdflib 
 +      Author: Elias Torres 
 +      http://​svn.rdflib.net/​trunk/​rdflib/​syntax/​parsers/​RDFaParser.py 
 + 
 +**PHP** 
 + 
 +    * ARC 2 
 +      Author: Benjamin Nowack (semsol) 
 +      http://​arc.semsol.org/​download 
 +    * RDFa Monkey 
 +      Author: Ruben Thys 
 +      http://​www.avthasselt.sohosted.com/​rdfamonkey/​ 
 + 
 +**XSLT** 
 + 
 +    * RDFa2XSLT 
 +      Author: Fabien Gandon
 +      http://ns.inria.fr/grddl/rdfa/
 + 
 +**JavaScript** 
 + 
 +    * RDFa Bookmarklets 
 +      Author: Ben Adida 
 +      http://​www.w3.org/​2006/​07/​SWD/​RDFa/​impl/​js/​ 
 + 
 +**Ruby** 
 + 
 +    * RDFa on Rails 
 +      Author: Cédric Mesnage 
 +      http://​rdfa.rubyforge.org/​ 
 + 
 +    * ruby-rdfa 
 +      Author: Benjamin Lyu 
 +      http://​code.google.com/​p/​ruby-rdfa/​ 
 + 
 +**C/C++**
 + 
 +    * librdfa 
 +      Author: Manu Sporny
 +      http://rdfa.digitalbazaar.com/librdfa/
 + 
 + 
 + 
 + 
  
pl/miw/2009/miw09_semweb_rdfize.1253853275.txt.gz · last modified: 2019/06/27 15:57 (external edit)