Using Jena as a SPARQL endpoint

I’ve been involved in a few projects at work over the last couple of years that have made use of Semantic Web technologies (triple stores, RDF, OWL, SPARQL, etc.). For most of these I’ve made use of ARC, a really great PHP library by Ben Nowack for interacting with RDF and triple stores. As great as ARC is, it does have a few drawbacks: it is limited to MySQL triple stores, it has some issues with OPTIONAL queries, and it doesn’t entirely support the SPARQL specification.

For these reasons and for general flexibility, my current project wanted to be able to easily swap the underlying triple store from ARC to Jena as needed, so I had to investigate how to expose a Jena triple store as a SPARQL endpoint. After working this out, I now really, really appreciate how easy ARC makes this.

Jena doesn’t appear to ship with the ability to expose the ARQ SPARQL processor as a SPARQL endpoint, so you need to make use of a separate piece of software called Joseki. The following is the list of things I needed to do to get this working in my environment. Note that your setup may have different requirements, and I may have completely misunderstood the best way of doing this!

  1. Set up a database to use as your triple store and get a JDBC driver so Joseki can interact with it from Java
  2. Download and extract Joseki
  3. Add the JDBC driver to the Joseki classpath (e.g. for Windows by adding the following line to bin\joseki_path.bat: set CP=%CP%;C:\my_jdbc_driver\my_jdbc_driver.jar)
  4. Add the following to joseki-config.ttl:
     <#myProjectUpdate>
       rdf:type            joseki:Service ;
       rdfs:label          "My Project SPARQL/Update" ;
       joseki:serviceRef   "sparql/myproject/update" ;
       joseki:dataset      <#myProject> ;
       joseki:processor    joseki:ProcessorSPARQLUpdate .
    
     <#myProjectRead>
       rdf:type            joseki:Service ;
       rdfs:label          "SPARQL" ;
       joseki:serviceRef   "sparql/myproject/read" ;
       joseki:dataset      <#myProject> ;
       joseki:processor    joseki:ProcessorSPARQL_FixedDS .
    
     <#myProject>
       rdf:type            ja:RDFDataset ;
       rdfs:label          "My Project" ;
       ja:defaultGraph     <#myProjectDB> .
    
     <#myProjectDB>
       rdf:type            ja:RDBModel ;
       ja:connection       [
                             ja:dbType         "MySQL" ;
                             ja:dbURL           ;
                             ja:dbUser         "myproject-database-username" ;
                             ja:dbPassword     "myproject-database-password" ;
                             ja:dbClass        "com.mysql.jdbc.Driver"
                            ] ;
       ja:reificationMode    ja:minimal ;
       ja:modelName        "DEFAULT" .
        
  5. Set the JOSEKIROOT environment variable to the location where you extracted Joseki
  6. Run Joseki (from its directory) by executing bin/rdfserver.bat
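
Once Joseki is running, a quick way to check the read service is to send it a query over HTTP. The following is only a minimal Python sketch under a few assumptions, not something taken from my setup: it assumes Joseki is listening on localhost port 2020 (use whatever host and port your server reports on startup) and that the service is reachable at the joseki:serviceRef path from the config above. The helper names build_query_url and run_query are hypothetical. The `query` parameter itself comes from the SPARQL protocol.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: host and port are illustrative; the path is the read serviceRef.
READ_ENDPOINT = "http://localhost:2020/sparql/myproject/read"


def build_query_url(endpoint, query):
    """Return the GET URL for a SPARQL query (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(endpoint, query, timeout=10):
    """Send the query and return the raw response body as text."""
    request = Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+xml"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling run_query(READ_ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5") should return a SPARQL results document if the service is set up as described; an HTTP error at this point usually means the path or port doesn't match your configuration.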

Note that I wanted to be able to make use of SPARUL to update data using the SPARQL endpoint. In ARC I can use SPARQL+ (which is effectively the same for my purposes) on the same endpoint as normal SPARQL queries. For Joseki, however, I needed to expose two different endpoints: one for standard SPARQL queries and one for updates.
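
To illustrate the split, updates go to the update service as an HTTP POST rather than to the read service. This is a rough Python sketch with hypothetical helper names (build_update_request, send_update) and an illustrative host and port. The form field name is an assumption on my part: I've used "request" here, whereas the later SPARQL 1.1 protocol uses "update", so check what your server expects and pass it via param.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: host and port are illustrative; the path is the update serviceRef.
UPDATE_ENDPOINT = "http://localhost:2020/sparql/myproject/update"


def build_update_request(endpoint, update, param="request"):
    """Build a form-encoded POST carrying one SPARQL/Update request.

    The field name is an assumption ("request"); the SPARQL 1.1 protocol
    uses "update". Pass param= to match your server.
    """
    body = urlencode({param: update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send_update(endpoint, update, timeout=10):
    """POST the update and return the HTTP status code."""
    request = build_update_request(endpoint, update)
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request builder separate from the network call makes it easy to inspect exactly what would be sent before pointing it at a live store.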

The one thing I haven’t yet worked out how to do is use named graphs in my Jena triple store when inserting data. I discovered that the SPARUL specification requires you to create the graph first (unlike ARC’s SPARQL+), but executing e.g. CREATE GRAPH <http://mygraph/> seems to fail silently, as any following INSERT INTO <http://mygraph/> statement fails saying that the graph doesn’t exist. Something to keep investigating. It may be something to do with support for the different types of Jena store (RDB, SDB, TDB, etc.), which I don’t fully understand yet (I think my instructions above are using RDB, which appears to be old, but I couldn’t get TDB or SDB working at all).
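
For reference, the sequence I was attempting looks roughly like the following. This Python sketch only builds the two SPARUL request strings in the order the specification expects (create the graph, then insert into it); it does not fix the failure described above. The function name create_then_insert and the example.org terms in the usage note are hypothetical.

```python
def create_then_insert(graph_uri, triples):
    """Return the SPARUL requests to create a graph, then insert into it.

    triples is an iterable of (subject, predicate, object) strings that are
    already valid SPARQL terms, e.g. "<http://example.org/a>" or '"a literal"'.
    No escaping or validation is done here.
    """
    body = " ".join(f"{s} {p} {o} ." for s, p, o in triples)
    return [
        f"CREATE GRAPH <{graph_uri}>",
        f"INSERT INTO <{graph_uri}> {{ {body} }}",
    ]
```

For example, create_then_insert("http://mygraph/", [("<http://example.org/s>", "<http://example.org/p>", '"o"')]) gives the CREATE GRAPH request followed by the INSERT INTO request, each of which would be sent to the update endpoint as its own request.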

So all in all I’m pleased to have worked out how to set this up, but I will most certainly continue to use ARC where possible, as Jena environments seem unnecessarily complex (although this might simply be because Jena tends to support the W3C specifications fully!).

This entry was posted in techy solutions.
  • Ziya Akar
    Hi,

    Do you have an example of sending a SPARQL Update request to a Joseki server programmatically in Java?

    Thank you.
  • Sorry, all my interaction with the servers was using ARC which is PHP-based. I assume this can be achieved with Jena using Joseki as a SPARQL endpoint so probably worth checking out the Jena documentation.
  • Thomas
    I just learned that #mem (and obviously #jdbc) datasets are not supported. In Joseki-3.4.1 the creation of named graphs is supported with TDB 0.8.3 datasets (0.8.5 will not work). See http://openjena.org/wiki/TDB for more information on downloads and Joseki integration.
  • Thanks Thomas - useful info.
  • Martin Giese
    Hmm. Is there no way to just create a Jena Model, maybe an inference model, in Java, and then create an endpoint that fetches data from that Model?
  • Hey Martin. I imagine you could do something like that but I already had a MySQL database and was using Joseki to avoid having to write any Java. That's probably the right way to try though.