SPARQL to SQL
Two weeks ago I mentioned that, for performance reasons, I would need to reimplement part of the SPARQL engine so that it offloads most of the work to the database. Richard Cyganiak’s sparql2sql engine does just that, but it doesn’t work with the latest version of ARQ (distributed with Jena), and it doesn’t support NG4J. As development and support of the engine have been abandoned due to time constraints and numerous other projects, I’ve taken it upon myself to figure out the code and adapt it to the current versions of ARQ and NG4J. I’m pleased to report that I’ve had great success!
So far I only have very basic SELECT queries working, such as SELECT * WHERE { GRAPH ?g { ?s ?p ?o } } run against data that contains no literals. That isn’t very useful yet, but it is promising, as the hard part is now over. The code is written specifically for the HSQL database, but I plan to abstract that out so that other databases can be supported (I’ll need MySQL support for Valhalla, the server software).
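As a rough illustration of the kind of translation involved, here is a minimal Java sketch that turns a single quad pattern into a SQL SELECT. It is not the sparql2sql code: the class name, the `quads` table and its `graphname`, `subj`, `pred` and `obj` columns are all hypothetical, and constants are inlined as quoted strings only to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: table and column names are assumptions, not NG4J's real schema.
class QuadPatternSql {
    private static final String[] COLUMNS = {"graphname", "subj", "pred", "obj"};

    /** Translates one quad pattern (graph, subject, predicate, object) into SQL. */
    static String translate(String... terms) {
        if (terms.length != COLUMNS.length) {
            throw new IllegalArgumentException("expected graph, subject, predicate, object");
        }
        // Maps each variable name to the first column it was bound to.
        Map<String, String> firstColumn = new LinkedHashMap<>();
        List<String> conditions = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            String term = terms[i];
            if (term.startsWith("?")) {
                String seen = firstColumn.putIfAbsent(term.substring(1), COLUMNS[i]);
                if (seen != null) {
                    // A repeated variable becomes an equality between two columns.
                    conditions.add(seen + " = " + COLUMNS[i]);
                }
            } else {
                // A constant becomes a filter; real code should use a PreparedStatement.
                conditions.add(COLUMNS[i] + " = '" + term.replace("'", "''") + "'");
            }
        }
        List<String> projections = new ArrayList<>();
        firstColumn.forEach((name, column) -> projections.add(column + " AS " + name));
        String sql = "SELECT " + (projections.isEmpty() ? "1" : String.join(", ", projections))
                + " FROM quads";
        if (!conditions.isEmpty()) {
            sql += " WHERE " + String.join(" AND ", conditions);
        }
        return sql;
    }

    public static void main(String[] args) {
        // The query from the post: GRAPH ?g { ?s ?p ?o }
        System.out.println(translate("?g", "?s", "?p", "?o"));
    }
}
```

With all four positions left as variables, the pattern needs no WHERE clause at all, which is why this is the easiest case to get working first.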
The original sparql2sql engine didn’t support translation of OFFSET and LIMIT. Apparently ARQ used to handle these itself, but the current ARQ implementation doesn’t seem to handle them in quite the same way, so I’ve written code which translates them into SQL. This allows me to efficiently retrieve subsets of the data, which was my original objective. I still need a way to count the number of statements a query returns without iterating through them all first, though. I’m still amazed that SPARQL doesn’t support any aggregation functions, particularly COUNT(*). I’ll probably just extend the sparql2sql engine with a getSelectCount() function that I can use programmatically.
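To make the idea concrete, here is a small Java sketch of both steps: appending LIMIT and OFFSET to a generated SELECT, and wrapping the same SELECT in a COUNT(*) so the database does the counting. The class and method names are hypothetical and this is not the engine's actual code. It assumes the trailing `LIMIT n OFFSET m` form that MySQL accepts; other databases may want a different syntax, and MySQL itself does not accept OFFSET without a LIMIT.

```java
// Illustrative only: names are assumptions, not part of sparql2sql or ARQ.
class PagedSql {
    /** Marker for "no LIMIT given" in the SPARQL query. */
    static final long UNSET = -1;

    /** Appends LIMIT/OFFSET clauses to an already generated SELECT statement. */
    static String page(String select, long limit, long offset) {
        StringBuilder sql = new StringBuilder(select);
        if (limit >= 0) {
            sql.append(" LIMIT ").append(limit);
        }
        if (offset > 0) {
            // Note: some databases require a LIMIT whenever OFFSET is present.
            sql.append(" OFFSET ").append(offset);
        }
        return sql.toString();
    }

    /** Wraps a SELECT in COUNT(*) so rows are counted without being fetched. */
    static String count(String select) {
        // The subquery alias is required by some databases, e.g. MySQL.
        return "SELECT COUNT(*) FROM (" + select + ") AS counted";
    }

    public static void main(String[] args) {
        String select = "SELECT subj FROM quads"; // hypothetical generated SQL
        System.out.println(page(select, 10, 20));
        System.out.println(count(select));
    }
}
```

The count is taken over the unpaged SELECT, so it reports the total number of results rather than the size of one page.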
