Kniblet Tutorial Part 1
Introduction and Overview
The aim of this tutorial is to demonstrate how a practical application can be built rapidly on the Talis Platform with Web standards and conventional tools. We will be building an application called Kniblet, a simple but practical online knowledgebase.
We'll begin with an overview of the technologies which will be used, followed by a review of relevant Semantic Web standards.
The technologies we will be using are as follows:
- PHP
- Apache
- Talis Platform
- Semantic Web standards
  - URI/IRI
  - HTTP
  - XML, HTML
  - RDF
  - SPARQL
PHP (with various libraries) will be used for application-specific logic, with Apache acting as the host. The user interface will primarily be regular HTML navigation and forms. The Talis Platform will be used to store RDF data and content.
Talis Platform
The Talis Platform is a Web-based environment for building Semantic Web applications and services. It is a hosted system which provides an efficient, robust storage infrastructure for documents and metadata. The Platform is accessed using a suite of Web-based services which provide sophisticated data management, query, indexing and search features.
Access to Platform stores (and auxiliary services) is entirely Web-based, using RESTful HTTP. This interface decouples application-specific logic and the user interface from storage. Applications themselves run in user space, either on separate Web servers or on local desktops. The underlying application data model will typically be expressed using the RDF model.
PHP Libraries
Kniblet will be using various PHP libraries to simplify development:
- Konstrukt is a lightweight and intuitive controller framework designed to make RESTful Web application development straightforward.
- ARC RDF Classes for PHP provide a convenient programming interface for the RDF model.
- Moriarty provides a useful and comprehensive facade to Talis Platform services, with some extra niceties for manipulating RDF built in.
Review of Semantic Web Technologies
The following is a quick refresher on the technologies we'll be using with an emphasis on the roles they play in the Semantic Web. We will go into detail where necessary later in the tutorial, but fortunately we won't need to go too deep as the tools we will be using offer convenient interfaces.
URI - Uniform Resource Identifier (RFC 3986)
URIs provide global names for resources. A resource is essentially anything that can be identified with a URI, which includes:
- documents - as found on the Web right now
- things - real-world objects, people, places, concepts
- relationships between things
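As a small illustration of what a URI is made of, the following sketch splits two URIs into the generic components described in RFC 3986, using Python's standard library. Both URIs are hypothetical examples: one names a document and the other a person.

```python
from urllib.parse import urlsplit

# Hypothetical URIs: one for a document, one for a real-world thing.
uris = [
    "http://example.org/articles/kniblet.html",
    "http://example.org/people/alice#me",
]

def describe(uri):
    """Split a URI into the generic components named in RFC 3986."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }

for uri in uris:
    print(describe(uri))
```

Nothing in the syntax says whether a URI names a document or a thing. That distinction is a matter of convention, not of structure.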
HTTP - Hypertext Transfer Protocol (RFC 2616)
HTTP provides access to URI-named documents as found on the Web. Beyond the nuts and bolts of the specification, the use of HTTP on the Web generally follows a specific architectural style known as Representational State Transfer (REST). While other approaches are possible, there are advantages in ensuring applications follow this architectural style, which can be summarised as working to a common interface. See also the W3C's Architecture of the World Wide Web recommendation.
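The "common interface" idea can be sketched by building (but not sending) a few requests against a single resource. The URI below is hypothetical, and the choice of GET, PUT and DELETE is only one plausible reading of a RESTful design. It is not a description of any particular service.

```python
from urllib.request import Request

# Hypothetical resource URI; nothing is sent over the network here.
uri = "http://example.org/notes/42"

# Read a representation, asking for RDF/XML via the Accept header.
read = Request(uri, method="GET", headers={"Accept": "application/rdf+xml"})

# Replace the resource's state with a new representation.
write = Request(
    uri,
    data=b"<p>Updated note</p>",
    method="PUT",
    headers={"Content-Type": "text/html"},
)

# Remove the resource.
remove = Request(uri, method="DELETE")

# The same small set of methods applies to any URI.
for req in (read, write, remove):
    print(req.get_method(), req.full_url)
```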
HTML - Hypertext Markup Language (W3C Specifications)
The format most commonly used for human-readable documents on the Web.
XML - Extensible Markup Language (W3C XML Specifications)
A specification commonly used for expressing data as text-based documents, including XHTML. RDF's official serialization, RDF/XML, is also XML-based. Many tools exist for working with XML, and although RDF/XML itself is ill-suited to these generic tools, XML has many other uses on the Semantic Web.
RDF - Resource Description Framework (W3C Specifications)
General-purpose data model based on:
- URIs for things
- URIs for relationships between things
In its simplest form, RDF is a set of statements (triples) of the form:
<subject>, <property>, <object>
where <subject>, <property> and <object> are URIs. The <subject> is the resource being described, the <property> is a particular characteristic of it, and the <object> is a resource which corresponds to the value of that characteristic for the resource in question.
Because the <object> resource of a statement can also be the <subject> of another statement, a set of triples can also be considered a (node & arc) graph model, hence the description of the Semantic Web as the Giant Global Graph.
The <object> of an RDF triple may also be a literal value such as a simple string, and resources without URIs (blank nodes) may appear in either the <subject> or <object> positions.
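The triple model above can be sketched with plain Python tuples. This is only an illustration of the data model, not how ARC or Moriarty represent RDF. The `http://example.org/` URIs are hypothetical, and wrapping literals in a `("literal", value)` pair is a convention made up for this sketch.

```python
# Hypothetical resource URIs; FOAF is a real, widely used vocabulary.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Each statement is a (subject, property, object) tuple.
# URIs are plain strings; a literal is wrapped as ("literal", value).
triples = {
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "alice", FOAF + "name", ("literal", "Alice")),
    (EX + "bob", FOAF + "name", ("literal", "Bob")),
}

def objects(graph, subject, prop):
    """Return every object of statements matching subject and property."""
    return [o for s, p, o in graph if s == subject and p == prop]

# The object of one statement (bob) is the subject of another,
# so the set of triples can be walked like a graph.
friend = objects(triples, EX + "alice", FOAF + "knows")[0]
print(friend)
print(objects(triples, friend, FOAF + "name"))
```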
RDF data may be expressed in several formats, the official serialization being RDF/XML. The Turtle format (a subset of N3) is considerably more human-friendly than RDF/XML.
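To give a feel for a text serialization, the sketch below writes triples out as N-Triples, a simple line-per-statement syntax that is also valid Turtle. It is a deliberately minimal illustration: it assumes URIs are plain strings and literals are `("literal", value)` pairs (a convention invented for this sketch), and it ignores blank nodes, datatypes and language tags.

```python
def term(value):
    """Format one term: URIs in angle brackets, literals in double quotes."""
    if isinstance(value, tuple):  # ("literal", text), a convention of this sketch
        text = (
            value[1]
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        return '"' + text + '"'
    return "<" + value + ">"

def to_ntriples(triples):
    """Serialize (subject, property, object) tuples, one statement per line."""
    return "\n".join(
        " ".join(term(t) for t in triple) + " ." for triple in triples
    )

# Hypothetical example data.
data = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", ("literal", "Alice")),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
]
print(to_ntriples(data))
```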
Links - defined in specifications above
References to other URIs in URI-identified documents:
- HTML links
- RDF statements - links between subject-object resources
- links in other formats (e.g. RSS item references)
URIs plus document formats enable links.
Links plus HTTP enable "the basic follow-your-nose way the web works" (see Tim Berners-Lee's diagram).
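As a rough sketch of the first step in "following your nose", the code below collects the link targets from a fragment of HTML and resolves them against the URI of the containing document. The page content and URIs are hypothetical. Only `a` elements are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect absolute URIs from the href attributes of <a> elements."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative references are resolved against the document URI.
                self.links.append(urljoin(self.base_uri, value))

# Hypothetical document and its URI.
page = '<p>See <a href="/faq">the FAQ</a> and <a href="http://example.com/">this site</a>.</p>'
collector = LinkCollector("http://example.org/index.html")
collector.feed(page)
print(collector.links)
```

Each collected URI could then be fetched over HTTP and processed in the same way.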
Linked Data - conventions based on Tim Berners-Lee's Linked Data design note
The use of links in data on the Web, starting with the four rules:
- Use URIs as names for things.
- Use HTTP URIs so that people can look up those names.
- When someone looks up a URI, provide useful information.
- Include links to other URIs, so that they can discover more things.
In RDF the property rdfs:seeAlso is useful for pointing to a source of related data (without specifying how the data is related), and the property owl:sameAs can be used to tie together different URIs which identify the exact same resource. See also: Linking Open Data project.
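The role of these two linking properties can be sketched as follows. The property URIs are the standard ones for rdfs:seeAlso (in the RDF Schema namespace) and owl:sameAs. The resources and the triple representation are hypothetical and exist only for this illustration.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical description of one resource as (subject, property, object) tuples.
alice = "http://example.org/people/alice"
triples = [
    (alice, RDFS_SEE_ALSO, "http://example.org/people/alice/foaf.rdf"),
    (alice, OWL_SAME_AS, "http://example.com/staff/a-smith"),
    (alice, "http://xmlns.com/foaf/0.1/knows", "http://example.org/people/bob"),
]

def next_links(graph, uri):
    """Group the URIs worth looking up next by the property that led to them."""
    found = {"seeAlso": [], "sameAs": []}
    for s, p, o in graph:
        if s != uri:
            continue
        if p == RDFS_SEE_ALSO:
            found["seeAlso"].append(o)  # more data about the resource
        elif p == OWL_SAME_AS:
            found["sameAs"].append(o)   # another name for the same resource
    return found

print(next_links(triples, alice))
```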
There are also conventions for making clear the difference between URIs which identify documents on the Web, and those which identify real-world things (see Cool URIs for the Semantic Web).
SPARQL - SPARQL Protocol and RDF Query Language (W3C Specification)
While Linked Data is a useful conceptual model for data on the Web, where a specific set of RDF data is of interest it can be more convenient to maintain it in an RDF store (triplestore), which acts as a kind of database comparable to an SQL database. SPARQL is a query language which operates over RDF data sets. With RDF stores and SPARQL we have a combination similar to that of SQL databases, but the design is better suited to direct use on the (Semantic) Web, largely thanks to the use of URIs as identifiers.
SPARQL queries may be applied to online datasets using HTTP (a GET method with the query itself as a parameter). The results are most commonly returned in the SPARQL Query Results XML Format, although services may support other formats such as SPARQL/JSON.
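A minimal sketch of the GET form of a SPARQL request is shown below. The endpoint URI is hypothetical and no request is actually sent. The parameter name `query` is the one defined by the SPARQL Protocol.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute the SPARQL service of a real store.
endpoint = "http://example.org/sparql"

query = """SELECT ?name
WHERE { ?person <http://xmlns.com/foaf/0.1/name> ?name }
LIMIT 10"""

def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query sent as an HTTP GET parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = sparql_get_url(endpoint, query)
print(url)

# Decoding the URL again shows what the service would receive.
received = parse_qs(urlsplit(url).query)["query"][0]
print(received == query)
```

A client would typically also send an `Accept: application/sparql-results+xml` header to request the XML results format.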
The SPARQL specification only defines read-only operations. Proposals do exist for SPARQL update (to add and delete statements from a store), but for an online store the same facilities can be provided fairly directly using standard HTTP methods. The Talis Platform supports the Changeset Protocol, a set of conventions for modifying an online RDF store.
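The general idea of describing an update as a set of statements to remove plus a set of statements to add can be sketched in memory as follows. This is not the Changeset Protocol's vocabulary or wire format, and `apply_changes` is a hypothetical helper. It only illustrates the shape of such an update.

```python
def apply_changes(store, removals, additions):
    """Return a new set of triples: removals taken out, then additions put in."""
    return (set(store) - set(removals)) | set(additions)

# Hypothetical data: triples as (subject, property, object) tuples of strings.
NAME = "http://xmlns.com/foaf/0.1/name"
store = {
    ("http://example.org/alice", NAME, "Alice"),
    ("http://example.org/bob", NAME, "Bob"),
}

# Correct one value: remove the old statement and add the new one.
updated = apply_changes(
    store,
    removals=[("http://example.org/bob", NAME, "Bob")],
    additions=[("http://example.org/bob", NAME, "Robert")],
)
print(sorted(updated))
```

Because RDF statements have no in-place "edit" operation, changing a value is expressed as one removal paired with one addition.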

