Using Jena as a SPARQL endpoint

I’ve been involved in a few projects at work over the last couple of years that have made use of Semantic Web technologies (triple stores, RDF, OWL, SPARQL etc). For most of these I’ve made use of ARC, a really great PHP library by Ben Nowack for interacting with RDF and triple stores. As great as ARC is, it does have a few drawbacks: it’s limited to MySQL triple stores, it has some issues with OPTIONAL queries, and it doesn’t entirely support the SPARQL specification.

For these reasons, and for general flexibility, my current project needs to be able to swap the underlying triple store from ARC to Jena as required, so I had to investigate how to expose a Jena triple store as a SPARQL endpoint. Having worked this out, I now really, really appreciate how easy ARC makes this.

Jena doesn’t appear to ship with the ability to expose the ARQ SPARQL processor as a SPARQL endpoint, so you need to use a separate piece of software called Joseki. The following is the list of things I needed to do to get this working in my environment. Note that your setup may have different requirements and also I may have completely misunderstood the best way of doing this!

  1. Set up a database to use as your triple store and get a JDBC driver so Joseki can interact with it from Java
  2. Download and extract Joseki
  3. Add the JDBC driver to the Joseki classpath (e.g. for Windows by adding the following line to bin\joseki_path.bat: set CP=%CP%;C:\my_jdbc_driver\my_jdbc_driver.jar)
  4. Add the following to joseki-config.ttl:
     
     # the subject URIs (<#update>, <#query>, <#dataset>, <#model>) and the
     # JDBC URL below are example values; substitute your own
     <#update>
       rdf:type            joseki:Service ;
       rdfs:label          "My Project SPARQL/Update" ;
       joseki:serviceRef   "sparql/myproject/update" ;
       joseki:dataset      <#dataset> ;
       joseki:processor    joseki:ProcessorSPARQLUpdate .

     <#query>
       rdf:type            joseki:Service ;
       rdfs:label          "SPARQL" ;
       joseki:serviceRef   "sparql/myproject/read" ;
       joseki:dataset      <#dataset> ;
       joseki:processor    joseki:ProcessorSPARQL_FixedDS .

     <#dataset>
       rdf:type            ja:RDFDataset ;
       rdfs:label          "My Project" ;
       ja:defaultGraph     <#model> .

     <#model>
       rdf:type            ja:RDBModel ;
       ja:connection       [
                             ja:dbType     "MySQL" ;
                             ja:dbURL      <jdbc:mysql://localhost/myproject> ;
                             ja:dbUser     "myproject-database-username" ;
                             ja:dbPassword "myproject-database-password" ;
                             ja:dbClass    "com.mysql.jdbc.Driver"
                           ] ;
       ja:reificationMode  ja:minimal ;
       ja:modelName        "DEFAULT" .
        
  5. Set the JOSEKIROOT environment variable to the location where you extracted Joseki
  6. Run Joseki (from its directory) by executing bin/rdfserver.bat

Note that I wanted to be able to make use of SPARUL to update data using the SPARQL endpoint. In ARC I can use SPARQL+ (which is effectively the same for my purposes) on the same endpoint as normal SPARQL queries. For Joseki, however, I needed to expose two different endpoints: one for standard SPARQL queries and one for updates.
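
A quick way to sanity-check the two services is to hit them with curl. This is only a sketch: it assumes Joseki’s default port (2020) and the serviceRef paths from the config above, and the ‘request’ parameter name for the update service is my assumption from the SPARQL/Update protocol documents of the time, so check your Joseki version’s documentation if it doesn’t work:

# standard SPARQL query against the read-only service ('query' parameter)
curl -G 'http://localhost:2020/sparql/myproject/read' \
     --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10'

# SPARUL statement POSTed to the update service ('request' parameter name
# is an assumption on my part)
curl 'http://localhost:2020/sparql/myproject/update' \
     --data-urlencode 'request=INSERT { <http://example/s> <http://example/p> "o" }'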

The one thing I haven’t yet worked out is how to use named graphs in my Jena triple store when inserting data. I discovered that the SPARUL specification requires you to create the graph first (unlike ARC’s SPARQL+), but executing e.g. CREATE GRAPH <http://mygraph/> seems to fail silently: any following INSERT INTO <http://mygraph/> statement fails, saying that the graph doesn’t exist. Something to keep investigating. It may be something to do with support for the different types of Jena store (RDB, SDB, TDB, etc) which I don’t fully understand yet (I think my instructions above use RDB, which appears to be old, but I couldn’t get TDB or SDB working at all).
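
For anyone wanting to reproduce it, the failing sequence looks something like this (same assumed endpoint and parameter name as the curl sketch above):

# the CREATE appears to succeed...
curl 'http://localhost:2020/sparql/myproject/update' \
     --data-urlencode 'request=CREATE GRAPH <http://mygraph/>'
# ...but the INSERT still complains that the graph doesn't exist
curl 'http://localhost:2020/sparql/myproject/update' \
     --data-urlencode 'request=INSERT INTO <http://mygraph/> { <http://example/s> <http://example/p> "o" }'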

So all in all I’m pleased to have worked out how to set this up, but I will most certainly continue to use ARC where possible as Jena environments seem unnecessarily complex (although this might simply be because Jena tends to support the W3C specifications fully!).

Wireless on a Dell Mini 10v in Ubuntu 9.10

I installed the Ubuntu 9.10 Netbook Remix release candidate on my new Dell Mini 10v and the wireless didn’t work out of the box. I think this is because there isn’t an open source driver and Ubuntu doesn’t ship with proprietary drivers installed. Now that 9.10 has been released this problem may have disappeared, but in case anyone else sees it, the way to solve it is to install the proprietary wireless driver (Broadcom STA) yourself.

This is pretty easy using the Ubuntu restricted drivers tool: Ubuntu Menu -> System -> Hardware Drivers (in the 2nd box of applications). Note that you’ll need an Internet connection to actually install the driver, so hopefully you can make use of a wired connection temporarily! When I initially installed it there were actually two drivers to choose from, but it was the Broadcom STA one that worked. Now I only see a single option available.

[Screenshot: the Hardware Drivers tool]
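
If the GUI tool isn’t an option, installing the driver package directly from a terminal should achieve the same thing. Note that the package name (bcmwl-kernel-source) is an assumption on my part, so double-check it if the install fails:

# install the proprietary Broadcom STA driver (assumed package name)
sudo apt-get update
sudo apt-get install bcmwl-kernel-source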

Indicator applet API changes in Ubuntu 9.10

The API for the indicator applet has changed in Karmic, and a little internal IBM Python application that I’ve written stopped working. Only a couple of minor changes were needed, but tracking down exactly what they were was not as easy a task as I’d have liked.

Creating the indicator
The class used to represent an indicator appears to have changed from IndicatorMessage to Indicator, so I threw in the following code to try the new one and fall back to the old one:

import indicate

try:
  # Ubuntu 9.10 and above
  indicator = indicate.Indicator()
except AttributeError:
  # Ubuntu 9.04 (no Indicator class, so fall back to IndicatorMessage)
  indicator = indicate.IndicatorMessage()

Drawing attention
Previously, indicators automatically made the indicator applet draw your attention with a green dot. In Karmic the green dot appears to have been replaced by the envelope turning black, but it is no longer automatic. To make this work you need to set the draw-attention property:

indicator.set_property('draw-attention', 'true')

Note that they’ve also added a count property to display how many notifications are from the same source.

Building .deb packages for Python applications

Recently I wanted to build a .deb package for an internal IBM application I was writing so that users could easily install it, and also so we could distribute it through some internal repositories. This proved a bit harder than I expected, so this is a quick summary of how I ended up doing it. Note that your requirements might be entirely different!

The first thing to do is create the files required by the packaging process. I discovered that the dh_make command can generate a load of sample files to use as a starting point (there’s a sketch of these scaffolding steps after this list). To do this, create a directory in the format [package-name]-[version] (e.g. my-great-app-1.0) and run dh_make from within it (I specified ‘s’ for single binary when prompted). This will create the sample files in a ‘debian’ subdirectory. Delete any of these you don’t need (which is probably most of them); I kept the following:

  • changelog – change history for all versions of the app (keep to the format specified by the Debian Policy Manual)
  • compat – no idea why I needed this, but things don’t work properly later without it
  • control – the details of the package you are creating (see the specification for all configuration options)
  • dirs – the list of directories in which your app will install files (e.g. /usr/bin, /usr/share/pyshared/my-great-app, /usr/share/applications)
  • README.Debian – the README for your app
  • rules – a Makefile with instructions for how to create the package (for my Python app the important bit here was the ‘install’ section; there I created a $(CURDIR)/debian/my-great-app subdirectory and copied all the files into it as if it were /, e.g. the binary to $(CURDIR)/debian/my-great-app/usr/bin/my-great-app)
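
For reference, the scaffolding steps described above look something like this (the --createorig flag asks dh_make to create a skeleton upstream tarball so it doesn’t complain about a missing one; adjust to taste):

# create the source directory in [package-name]-[version] format
mkdir my-great-app-1.0
cd my-great-app-1.0
# generate the sample packaging files in a 'debian' subdirectory
# (-s selects a single binary package, as at the interactive prompt)
dh_make -s --createorig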

Once I’d created all those files and put them in my-great-app/packaging/debian, with my source in my-great-app/src, I created a simple build script, my-great-app/bin/build. This looked something like the following:


#!/bin/bash

export VERSION=1.0
export DEBFULLNAME="Gareth Jones"
export DEBEMAIL="my-real-email-not-this@somewhere.com"

# work from a scratch build directory alongside bin, src and packaging
cd ../build
sudo rm -rf my-great-app*

# assemble the source tree and tar it up
mkdir -p my-great-app-$VERSION
cp -u ../src/*.py ../src/*.desktop ../src/*.ico ../packaging/my-great-app my-great-app-$VERSION
tar -czf my-great-app-$VERSION.orig.tar.gz my-great-app-$VERSION/

# drop the debian packaging files into the tree and build the package
cd my-great-app-$VERSION
mkdir debian
cp -u ../../packaging/debian/* debian/
gksu dpkg-buildpackage

This should get you my-great-app_1.0-1_all.deb, plus the my-great-app_1.0-1_i386.changes, my-great-app_1.0-1.dsc and my-great-app_1.0-1.tar.gz files that your repository maintainer might want.
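
If you want to check the result before shipping it off, the package can be installed directly with dpkg:

# install the freshly built package locally to check it works
sudo dpkg -i my-great-app_1.0-1_all.deb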

A really useful video I found for helping me fill in the contents of the debian control files (and getting me through the whole process) was here. Definitely worth checking out if you need to do this yourself.

PackageKit presentation

On Wednesday we had the pleasure of Richard Hughes joining us at Hursley to talk about PackageKit. I’d heard of it but never quite bothered finding out any more than the name, so having gone to the presentation I’m pretty glad I did. PackageKit is (yet another) attempt at making software updating/installation easier on Linux. There are many existing tools for this already, but PackageKit seems particularly interesting because it’s not actually trying to replace anything; it works with and makes use of the existing tools whilst providing some real value on top. Below is a very quick summary of Richard’s presentation.

Existing stuff

  • Good packaging formats
  • Dependency solvers, downloaders and UIs bolted on
  • Can’t have automatic updates (needs password authentication)
  • Can’t use fast-user switching (installs lock the applications/databases for other users)
  • Errors/warnings in English only and really confusing to the average user
  • Installation is done by package names not application names (many-to-many relationships)
  • Can power down during update – bit dangerous!

PackageKit implementation

  • The ‘glue’
  • Integrates with existing tools (including dependency management etc)
  • Improves authentication (uses PolicyKit – fine-grained control)
  • System-activated daemon (only running when you need it)
  • Only need to write a simple integration layer between tools and PackageKit (it doesn’t even need to be complete, and it’s already done for most tools) plus a thin UI
  • Uses D-Bus (two layers – one for full control, one “just do it”)
  • Applications can integrate directly (e.g. install clipart from OpenOffice)
  • Installation/update by application not package (users know what it is they’re installing!)
  • Doesn’t allow shutdown during installs

PackageKit project

  • Easy to contribute (git with anonymous access – merged to release daily)
  • Rapid development (roughly one minor release per month)
  • Shipped with Fedora 9 (and others)
  • Strong interest from OpenMOKO, Ubuntu (and others)

I’ve installed an old-ish release on my Ubuntu machine (straight from the repositories) and it looks pretty good. Definitely gonna pay attention to this project; it looks like a big step in the right direction.
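
If you want to try it on Ubuntu, I believe the packages to install are packagekit and gnome-packagekit (the names are my best recollection, so double-check them):

# install the PackageKit daemon and its GNOME frontend (assumed names)
sudo apt-get install packagekit gnome-packagekit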

Building a RESTful Web application with PHP

Recently I’ve been putting together a Web application for a research project. I decided it was about time I really looked properly into REST so that my Web interfaces would be better structured. I won’t go into all the benefits here; you can read about them for yourself. Suffice to say it seems like a good approach to take.

This is quite a long article and you might only be interested in some of it so here are the sections:

If you have any suggestions for improvement, please let me know – this was a first attempt!

Merging and converting OpenOffice and PDF documents

For my BSL level 3 course I have a variety of OpenOffice documents (different forms etc) which make up the portfolio. I wanted a way to merge these into a single document, but they each had different margins and all sorts, which meant doing it within OpenOffice was proving to be a pain. My solution was to write a quick shell script to convert each document into PDF and then merge the resulting PDFs. The script looks something like the following:


#!/bin/bash
# convert each document to PDF
unoconv -f pdf myfile1.odt myfile2.odt ...
# merge the PDFs into a single portfolio.pdf
pdftk myfile1.pdf myfile2.pdf ... cat output portfolio.pdf
# remove the individual PDF documents
rm myfile1.pdf myfile2.pdf ...
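
If all the source documents sit in one directory, a glob-based variant (my own generalisation, with portfolio.pdf as an arbitrary output name) saves listing every file:

# convert every OpenOffice document in the directory...
unoconv -f pdf *.odt
# ...and merge the results (in glob order, i.e. alphabetical) into one PDF
pdftk *.pdf cat output portfolio.pdf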

A handy little hack to generate my entire portfolio.