Kniblet Tutorial Part 8
From n² wiki
Simple Tagging
As we add more and more articles we're going to need another way to organise and discover them. One simple way is to allow users to tag their articles with keywords. We could then display the tags against each article and link them to "tag pages" which cluster articles together that share the same tag. We're going to use the following URI pattern for the tag pages:
http://kniblet.com/tags/{tag}
When the user visits a page following that pattern they'll see a list of articles that have been assigned that tag. The first thing to do is to modify our article editing form to allow users to enter a set of tags. This is as simple as adding another text input box:
<label for="tags">Tags:</label> <input type="text" name="tags" id="tags" size="80" value="" /> (separate tags with commas) <br />
We can test that out very easily. Now we need to handle the submission of the tag information as part of the form data. All of that is done via the ArticleController's POST method which, in turn, delegates to the Article class. The first thing to do is to modify our POST method so that it grabs the tags from the form data and passes them to the Article class. We need to add a line to read the tags field into a variable:
$tags = $this->POST['tags'];
and modify the line that passes it to the article's calculate_changeset method:
$changeset = $article->calculate_changeset( Array('title' => $title, 'body' => $body, 'tags'=>$tags), $user, $reason);
Now we can turn our attention to ensuring that the tags are included in the changeset being generated. This is simply a matter of splitting the tag data on commas and adding each individual tag as a new triple in our graph. We're choosing to use a new property http://schemas.talis.com/kniblet/tag to represent the tag for simplicity. There are several pre-existing tag ontologies that define structures around tags and the acts of tagging but our model is to treat tags as simple keywords attached to an article so literal valued properties will suffice.
We add the following just after the code adding the title and body properties to the graph in calculate_changeset:
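// explode() replaces split(), which was removed in PHP 7
foreach (explode(',', $vars['tags']) as $tag) {
    $tag = trim($tag);
    if ($tag !== '') { // skip empty entries such as a trailing comma
        $new->add_literal_triple( $this->uri, 'http://schemas.talis.com/kniblet/tag', $tag );
    }
}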
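The splitting step can be tried on its own, away from the changeset code. The following is a minimal sketch only: parse_tags is a hypothetical helper, not part of Kniblet or Moriarty, and lower-casing and de-duplicating the tags are assumptions about how we might want to tidy user input.

```php
<?php
// Hypothetical helper: turn a comma-separated form value into clean tags.
function parse_tags($raw)
{
    $tags = array();
    foreach (explode(',', $raw) as $tag) {
        // Normalising case is an assumption, not something Kniblet requires
        $tag = strtolower(trim($tag));
        // Drop empty entries and tags we have already seen
        if ($tag !== '' && !in_array($tag, $tags, true)) {
            $tags[] = $tag;
        }
    }
    return $tags;
}

print_r(parse_tags('Pets, howto,, cat , pets'));
```

Each value in the returned array could then be passed to add_literal_triple in place of the raw pieces of the form field.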
The changeset building process will automatically add the appropriate triples. We're ready to test this now. We can edit an article and enter some tags and then save the data. To see the tags at this stage we can view the RDF version of the article which might look like this:
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:k="http://schemas.talis.com/kniblet/"
    xmlns:dir="http://schemas.talis.com/2005/dir/schema#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://kniblet.com/articles/adopting-a-stray-cat">
    <rdf:type rdf:resource="http://rdfs.org/sioc/ns#Post"/>
    <k:tag>adopting</k:tag>
    <k:tag>pets</k:tag>
    <k:tag>howto</k:tag>
    <k:tag>cat</k:tag>
    <k:body>Adopting a *stray cat* can be a very simple task...</k:body>
    <dc:title>Adopting A Stray Cat</dc:title>
  </rdf:Description>
</rdf:RDF>
To make the tags display on our page we just need to read the relevant triples and convert to template variables so we can access them in our page template. This is done in the Article's get_vars method. We need to find all the http://schemas.talis.com/kniblet/tag properties and read their values into a PHP array. To do this we're going to use the triple index provided by Moriarty's SimpleGraph. This is a simple data structure that can represent an RDF graph as a set of nested arrays and hashes. The RDF above would be represented as the following structure:
{
  "http://kniblet.com/articles/adopting-a-stray-cat" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [
      { "value" : "http://rdfs.org/sioc/ns#Post", "type" : "uri" }
    ],
    "http://schemas.talis.com/kniblet/tag" : [
      { "value" : "adopting", "type" : "literal" },
      { "value" : "pets", "type" : "literal" },
      { "value" : "howto", "type" : "literal" },
      { "value" : "cat", "type" : "literal" }
    ],
    "http://schemas.talis.com/kniblet/body" : [
      { "value" : "Adopting a *stray cat* can be a very simple task...", "type" : "literal" }
    ],
    "http://purl.org/dc/elements/1.1/title" : [
      { "value" : "Adopting A Stray Cat", "type" : "literal" }
    ]
  }
}
Here's our code for reading the tags into an array for our template:
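$tags = array();
$index = $this->graph->get_index();
// Articles saved before tagging was added have no tag triples at all
if (isset($index[$this->uri]['http://schemas.talis.com/kniblet/tag'])) {
    foreach ($index[$this->uri]['http://schemas.talis.com/kniblet/tag'] as $tag_info) {
        $tags[] = $tag_info['value'];
    }
}
$vars['tags'] = $tags;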
$index[$this->uri][ 'http://schemas.talis.com/kniblet/tag' ] gives us an array of all the tag properties of our article. We just iterate through them reading the values into our array. This array is then added to our array of template variables.
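To experiment with the index structure without a store or Moriarty to hand, we can build the same nested arrays by hand. This is a sketch under that assumption: literal_values is a hypothetical helper, not part of Moriarty, and the hand-built $index only mirrors the structure shown earlier.

```php
<?php
// Hypothetical helper, not part of Moriarty: collect the literal values of
// one property of one subject from a nested-array triple index.
function literal_values(array $index, $subject, $property)
{
    $values = array();
    if (!isset($index[$subject][$property])) {
        return $values; // subject or property is absent from the graph
    }
    foreach ($index[$subject][$property] as $object) {
        if ($object['type'] === 'literal') {
            $values[] = $object['value'];
        }
    }
    return $values;
}

// Hand-built index mirroring the structure of the triple index.
$article = 'http://kniblet.com/articles/adopting-a-stray-cat';
$index = array(
    $article => array(
        'http://schemas.talis.com/kniblet/tag' => array(
            array('value' => 'adopting', 'type' => 'literal'),
            array('value' => 'pets', 'type' => 'literal'),
        ),
        'http://purl.org/dc/elements/1.1/title' => array(
            array('value' => 'Adopting A Stray Cat', 'type' => 'literal'),
        ),
    ),
);

$tags = literal_values($index, $article, 'http://schemas.talis.com/kniblet/tag');
echo implode(', ', $tags), "\n";
```

Returning an empty array for a missing property means the template can always treat the result as a list, whether or not the article has been tagged.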
We can now modify article.tpl.php to look for the array of tags and display them:
<?php
if (count($tags) > 0) {
    echo '<div>Tagged as: ';
    for ($i = 0; $i < count($tags); $i++) {
        if ($i > 0) {
            echo ", ";
        }
        echo htmlspecialchars($tags[$i]);
    }
    echo '</div>';
}
?>
Now we have a nice display of the tags assigned to each article.
Tag Pages
We can see tags for each article and while that might be mildly interesting for users of Kniblet it would be a lot more useful if we could click on the tags to discover related articles. A quick change to the template enables the linking:
<?php
// rawurlencode() keeps tags containing spaces or punctuation valid in the URI path
echo '<a href="/tags/' . rawurlencode($tags[$i]) . '">' . htmlspecialchars($tags[$i]) . '</a>';
?>
Now we need to handle requests to those URIs. This means we need another Konstrukt controller. We want to follow the ArticleList/Article pattern so we need a controller for a TagList which could forward to a TagController. But do we actually need a controller for Tags? Our tag behaviour is very simple: just display a list of articles that have that tag. It's simple enough to handle directly in the TagListController's forward method. We can always refactor the code later if we introduce more complex behaviour. Our skeleton TagListController looks like this:
class TagListController extends k_Controller {
    function forward($name) {
        $vars = array('tag' => $name);
        return $this->render("templates/tag.tpl.php", $vars);
    }

    function GET() {
        $vars = array();
        return $this->render("templates/taglist.tpl.php", $vars);
    }
}
We can create stub templates too. Now we're actually handling the URIs we specified for tags but we're not displaying the information we need. To do that we need to write a SPARQL query to select all the articles with the current tag plus those articles' titles. A select query like the following should do it:
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix k: <http://schemas.talis.com/kniblet/>
select ?article ?title
where {
?article k:tag "tag" ;
dc:title ?title .
}
We incorporate this into our forward method:
function forward($name) {
    $vars = array('tag' => $name);
    $query = 'prefix dc: <http://purl.org/dc/elements/1.1/> prefix k: <http://schemas.talis.com/kniblet/> select ?article ?title where { ?article k:tag "' . $name . '" ; dc:title ?title . }';
    $store = new Store(STORE_URI);
    $sparql = $store->get_sparql_service();
    $vars['articles'] = $sparql->select_to_array($query);
    return $this->render("templates/tag.tpl.php", $vars);
}
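Because $name comes straight from the URI, a tag containing a double quote or a backslash would break the query or change its meaning. A minimal sketch of escaping the value before it goes into the query follows. The names sparql_string and tag_query are hypothetical helpers introduced here for illustration; they are not part of Moriarty or Konstrukt.

```php
<?php
// Hypothetical helper: escape a value for use inside a double-quoted
// SPARQL string literal (backslash, double quote, CR and LF).
function sparql_string($value)
{
    return '"' . addcslashes($value, "\"\\\r\n") . '"';
}

// Builds the same tag query as forward(), around an escaped literal.
function tag_query($tag)
{
    return 'prefix dc: <http://purl.org/dc/elements/1.1/> '
         . 'prefix k: <http://schemas.talis.com/kniblet/> '
         . 'select ?article ?title where { '
         . '?article k:tag ' . sparql_string($tag) . ' ; dc:title ?title . }';
}

echo tag_query('cats'), "\n";
echo sparql_string('say "hi"'), "\n";
```

With helpers like these, forward could assign tag_query($name) to $query and leave the rest of the method unchanged.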
Now we're ready to use this in our tag template tag.tpl.php:
<h1>Tag: <?php echo htmlspecialchars($tag); ?></h1>
<p>Articles tagged with <?php echo htmlspecialchars($tag); ?></p>
<?php
if (count($articles) > 0) {
    echo '<ul>';
    foreach ($articles as $item) {
        echo '<li>';
        echo '<a href="' . htmlspecialchars($item['article']['value']) . '">' . htmlspecialchars($item['title']['value']) . '</a>';
        echo '</li>';
    }
    echo '</ul>';
}
?>
So we now have a fully interlinked tagging system. Articles can be tagged and those tags link to hub pages that display all other articles with that tag. It's very simple but quite a powerful and intuitive interlinking mechanism.
Flexible Data Modelling
An important point to make about this section of the tutorial is the ease with which we changed our data model and the minimal impact it had on our existing code. The only code we had to change was in saving and displaying the tags. We didn't have to change any queries for accessing the article data. Contrast this with a typical relational database system. We would probably already have an Article table and would need to add a Tags table joined with an article_id foreign key relationship. To generate the data for displaying an article we would need to modify our SQL to join the two tables together. Then, instead of having a single row in our result set we would have as many rows as there are tags so we'd have to introduce logic to step through these rows and aggregate the tags for display. Our application code would be more tightly coupled to the physical layout of the data in the database.
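To make the relational contrast concrete, here is a sketch of the aggregation step we would have needed. It is illustrative only: the column names article_id, title and tag, and the aggregate_article helper, are hypothetical, and the rows are hand-written rather than fetched from a real database.

```php
<?php
// Rows as a JOIN of hypothetical article and tags tables might return them:
// the article columns repeat once per tag.
$rows = array(
    array('article_id' => 7, 'title' => 'Adopting A Stray Cat', 'tag' => 'pets'),
    array('article_id' => 7, 'title' => 'Adopting A Stray Cat', 'tag' => 'howto'),
    array('article_id' => 7, 'title' => 'Adopting A Stray Cat', 'tag' => 'cat'),
);

// Collapse the repeated rows into one article with a list of tags.
function aggregate_article(array $rows)
{
    $article = null;
    foreach ($rows as $row) {
        if ($article === null) {
            $article = array('title' => $row['title'], 'tags' => array());
        }
        if ($row['tag'] !== null) { // an outer join yields NULL for untagged articles
            $article['tags'][] = $row['tag'];
        }
    }
    return $article;
}

$article = aggregate_article($rows);
echo $article['title'], ': ', implode(', ', $article['tags']), "\n";
```

With the triple index no such step is needed, because every value of a property is already grouped under its subject.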
Searching Tags
We'd like searches for articles to search the tags assigned to each article too. This requires us to alter the store's field/predicate map and query profile. We add the tag predicate to the field/predicate map and PUT it to the store's config:
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bf="http://schemas.talis.com/2006/bigfoot/configuration#"
    xmlns:frm="http://schemas.talis.com/2006/frame/schema#"
    xml:base="http://api.talis.com/stores/kniblet-dev1/" >
  <bf:FieldPredicateMap rdf:about="config/fpmaps/1">
    <frm:mappedDatatypeProperty>
      <rdf:Description rdf:about="config/fpmaps/1#title">
        <frm:property rdf:resource="http://purl.org/dc/elements/1.1/title"/>
        <frm:name>title</frm:name>
      </rdf:Description>
    </frm:mappedDatatypeProperty>
    <frm:mappedDatatypeProperty>
      <rdf:Description rdf:about="config/fpmaps/1#body">
        <frm:property rdf:resource="http://schemas.talis.com/kniblet/body"/>
        <frm:name>body</frm:name>
      </rdf:Description>
    </frm:mappedDatatypeProperty>
    <frm:mappedDatatypeProperty>
      <rdf:Description rdf:about="config/fpmaps/1#tag">
        <frm:property rdf:resource="http://schemas.talis.com/kniblet/tag"/>
        <frm:name>tag</frm:name>
      </rdf:Description>
    </frm:mappedDatatypeProperty>
  </bf:FieldPredicateMap>
</rdf:RDF>
We also add the tag field to our query profile with a weight of 1.0, which ranks a match in the tags above a match in the body (0.5) but below a match in the title (2.0):
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bf="http://schemas.talis.com/2006/bigfoot/configuration#"
    xmlns:frm="http://schemas.talis.com/2006/frame/schema#"
    xml:base="http://api.talis.com/stores/kniblet-dev1/">
  <bf:QueryProfile rdf:about="config/queryprofiles/1">
    <bf:fieldWeight>
      <rdf:Description rdf:about="config/queryprofiles/1#title">
        <bf:weight>2.0</bf:weight>
        <frm:name>title</frm:name>
      </rdf:Description>
    </bf:fieldWeight>
    <bf:fieldWeight>
      <rdf:Description rdf:about="config/queryprofiles/1#body">
        <bf:weight>0.5</bf:weight>
        <frm:name>body</frm:name>
      </rdf:Description>
    </bf:fieldWeight>
    <bf:fieldWeight>
      <rdf:Description rdf:about="config/queryprofiles/1#tag">
        <bf:weight>1.0</bf:weight>
        <frm:name>tag</frm:name>
      </rdf:Description>
    </bf:fieldWeight>
  </bf:QueryProfile>
</rdf:RDF>
Once we've done that we can search for articles and the store will rank the results based on matches in the title, body and any associated tags.
Faceting
There is another way we can expose the tags that are relevant to a given search. This is via the Platform Store's facet service. The facet service takes a query and a list of fields and returns an indication of the top indexed terms for each field for the given query. The facet service supplies a rough popularity score for each matching term. This isn't the same as a document count although for small numbers of indexed documents it does approximate one. Our store's facet service is at:
http://api.talis.com/stores/kniblet-dev1/services/facet
As usual with platform services going straight to that URI with a browser gives you a form t
By way of example, suppose we had the following articles:
- Animal Care 101: Walking a Dog (tagged as dogs, exercise)
- Animal Care 101: Grooming Horses (tagged as horses, health)
- Animal Care 101: Healthy Hamsters Need Wheels (tagged as hamsters, health, exercise)
- Animal Care 101: First Aid for Cats (tagged as cats, health)
If we use the facet service to search for "animal" and specify "tag" as one of the fields to facet on then we'd expect the following tags with popularity numbers:
- health: 3
- exercise: 2
- dogs: 1
- horses: 1
- hamsters: 1
- cats: 1
The output of the facet service is an XML document similar to this one:
<facet-results xmlns="http://schemas.talis.com/2007/facet-results#">
  <head>
    <query>animal</query>
    <fields>tag</fields>
    <top>10</top>
    <output>xml</output>
  </head>
  <fields>
    <field name="tag">
      <term value="health" number="3" />
      <term value="exercise" number="2" />
      <term value="dogs" number="1" />
      <term value="horses" number="1" />
      <term value="hamsters" number="1" />
      <term value="cats" number="1" />
    </field>
  </fields>
</facet-results>
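To see what is involved in reading that document, here is a rough sketch that parses it with PHP's DOM extension. It is not Moriarty's implementation: parse_facets is a hypothetical function, and the shape of the array it returns (field name mapped to a list of 'value' and 'number' pairs) is an assumption chosen for illustration.

```php
<?php
// Sketch only: parse a facet-results document into
// array(field name => list of array('value' => ..., 'number' => ...)).
function parse_facets($xml)
{
    $ns = 'http://schemas.talis.com/2007/facet-results#';
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $facets = array();
    foreach ($doc->getElementsByTagNameNS($ns, 'field') as $field) {
        $name = $field->getAttribute('name');
        $facets[$name] = array();
        foreach ($field->getElementsByTagNameNS($ns, 'term') as $term) {
            $facets[$name][] = array(
                'value'  => $term->getAttribute('value'),
                'number' => (int) $term->getAttribute('number'),
            );
        }
    }
    return $facets;
}

// A cut-down copy of the facet service output.
$xml = '<facet-results xmlns="http://schemas.talis.com/2007/facet-results#">'
     . '<fields><field name="tag">'
     . '<term value="health" number="3" />'
     . '<term value="exercise" number="2" />'
     . '</field></fields></facet-results>';

$facets = parse_facets($xml);
echo $facets['tag'][0]['value'], ' (', $facets['tag'][0]['number'], ")\n";
```

The elements are looked up by namespace because the whole document sits in the facet-results namespace declared on the root element.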
Moriarty supplies a FacetService class that encapsulates sending the facet request and parsing the results into a PHP array. It follows the usual pattern:
$store = new Store($uri);
$fs = $store->get_facet_service();
$facets = $fs->facets_to_array('query', array('field1','field2'));
We can use this in our article search to provide a list of tags that contain articles relevant to the search. In the GET method of ArticleListController we can add the following lines:
$facet_service = $store->get_facet_service();
$vars['facets'] = $facet_service->facets_to_array($query, array('tag'));
That puts the array of facets into a template variable so we can display the relevant tags. We add this to articlelist.tpl.php:
<?php
if (isset($facets['tag'])) {
    echo '<ul>';
    foreach ($facets['tag'] as $tag) {
        // rawurlencode() keeps tags containing spaces or punctuation valid in the URI path
        echo '<li><a href="/tags/' . rawurlencode($tag['value']) . '">' . htmlspecialchars($tag['value']) . '</a></li>';
    }
    echo '</ul>';
}
?>
Now, whenever we search we get a list of tags that may be relevant as well as a list of articles that match our search. Clicking on a tag takes the user to the relevant tag page to view all the articles that have that tag assigned. With a little more effort we could style the tags like a tag cloud, using the popularity numbers to increase the font sizes and/or weights.
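One way the tag cloud idea might be sketched is to scale each tag's popularity number linearly to a font size. This is an illustration only: tag_font_sizes is a hypothetical helper, the percentage range is arbitrary, and the 'number' key is an assumption about how the popularity score appears in the facet array (the template above only relies on 'value').

```php
<?php
// Hypothetical helper: scale each term's popularity number to a font size
// in percent, linearly between $min_size and $max_size.
function tag_font_sizes(array $terms, $min_size = 100, $max_size = 200)
{
    if (count($terms) === 0) {
        return array();
    }
    $numbers = array();
    foreach ($terms as $term) {
        $numbers[] = $term['number'];
    }
    $low = min($numbers);
    $spread = max($numbers) - $low;

    $sizes = array();
    foreach ($terms as $term) {
        // When every term has the same number, use the minimum size for all
        $ratio = $spread > 0 ? ($term['number'] - $low) / $spread : 0;
        $sizes[$term['value']] = (int) round($min_size + $ratio * ($max_size - $min_size));
    }
    return $sizes;
}

$terms = array(
    array('value' => 'health', 'number' => 3),
    array('value' => 'exercise', 'number' => 2),
    array('value' => 'dogs', 'number' => 1),
);
foreach (tag_font_sizes($terms) as $tag => $size) {
    echo '<a href="/tags/', rawurlencode($tag), '" style="font-size: ', $size, '%">',
         htmlspecialchars($tag), "</a>\n";
}
```

Since the facet numbers are rough popularity scores rather than exact counts, a coarse scale like this is probably as precise as the display needs to be.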
Summary
- Adding a new data type just requires new application logic
- Moriarty supplies a triple index to access RDF very simply
- SELECT queries can be used to pick out specific properties and values from RDF data
- Each store provides a facet service that returns the top terms for a given query

