Enabling the Distributed Family Tree

This is the official research blog for the Distributed Family Tree, an open network of genealogical data and metadata.  In a nutshell, the big idea is that we can combine all available genealogical information on the Internet into a single distributed network.  The foundation for this network is the subject of the master's thesis that I am currently working on.

SPARQL to SQL

Two weeks ago I mentioned that, for performance reasons, I would need to reimplement part of the SPARQL engine so that it offloads most of the work to the database.  Richard Cyganiak’s sparql2sql engine does just that, but it doesn’t work with the latest version of ARQ (distributed with Jena), and it doesn’t support NG4J.  As development and support of the engine have been abandoned due to time constraints and numerous other projects, I’ve taken it upon myself to figure out the code and adapt it to the current versions of ARQ and NG4J.  I’m pleased to report that I’ve had great success!

So far I only have very basic SELECT queries working, such as SELECT * WHERE { GRAPH ?g { ?s ?p ?o } }, and only when the data contains no literals.  Not too useful, but very promising as the hard part is now over.  The code is written specifically for the HSQL database, but I plan to abstract that out so that other databases can be supported (I’ll need MySQL support for Valhalla, the server software).
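To make the idea concrete, here is a minimal sketch of how a single quad pattern could be turned into SQL.  It is not the actual sparql2sql code: the class name, the table name quads, and the column names are all assumptions made up for illustration, and the real NG4J schema may well differ.  Variables become projected columns and URI constants become WHERE conditions; literals are left out, as in the engine's current state.

```java
import java.util.ArrayList;
import java.util.List;

class QuadPatternSketch {
    // Hypothetical table and column names; the real NG4J schema may differ.
    static final String TABLE = "quads";
    static final String[] COLUMNS = {"graph_uri", "subject", "predicate", "object"};

    /** Translates one quad pattern (graph, subject, predicate, object) into SQL. */
    static String toSql(String... terms) {
        if (terms.length != COLUMNS.length) {
            throw new IllegalArgumentException("expected exactly 4 terms");
        }
        List<String> select = new ArrayList<>();
        List<String> where = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            String term = terms[i];
            if (term.startsWith("?")) {
                // A variable becomes a projected column, aliased to the variable name.
                select.add(COLUMNS[i] + " AS " + term.substring(1));
            } else {
                // A URI constant becomes an equality test (single quotes doubled).
                where.add(COLUMNS[i] + " = '" + term.replace("'", "''") + "'");
            }
        }
        // With no variables at all, select a constant so the SQL stays valid.
        String sql = "SELECT " + (select.isEmpty() ? "1" : String.join(", ", select))
                + " FROM " + TABLE;
        if (!where.isEmpty()) {
            sql += " WHERE " + String.join(" AND ", where);
        }
        return sql;
    }

    public static void main(String[] args) {
        // The pattern from the query above: GRAPH ?g { ?s ?p ?o }
        System.out.println(toSql("?g", "?s", "?p", "?o"));
    }
}
```

This sketch ignores a variable that appears twice in one pattern (which would need an extra equality condition) and patterns that span several quads (which would need joins).  The point is only that the database, not the Java code, ends up doing the matching.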

The original sparql2sql engine didn’t support translation of OFFSET and LIMIT.  Apparently this was handled by ARQ, but the current ARQ implementation doesn’t seem to handle them in quite the same way, so I’ve written code which translates them into SQL.  This allows me to efficiently retrieve subsets of the data, which was my original objective.  I still need a way to count the number of results a query returns without iterating through them all first, though.  I’m still amazed that SPARQL doesn’t support any aggregation functions, particularly COUNT(*).  I’ll probably just extend the sparql2sql engine with a getSelectCount() method that I can call programmatically.
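Roughly, the paging and counting pieces could look like the sketch below.  Everything in it is an assumption for illustration rather than the engine's real code: the PagingSketch class, the Dialect interface, and the countSql helper are hypothetical names.  The trailing LIMIT ... OFFSET ... form is the one MySQL accepts; other databases may need a different form, which is why it sits behind an interface.

```java
class PagingSketch {
    /** Hypothetical dialect hook: each database decides how to page a SELECT. */
    interface Dialect {
        String page(String selectSql, long limit, long offset);
    }

    // MySQL-style paging: append a trailing LIMIT ... OFFSET ... clause.
    static final Dialect TRAILING_LIMIT = (sql, limit, offset) -> {
        if (limit < 0 || offset < 0) {
            throw new IllegalArgumentException("limit and offset must be non-negative");
        }
        return sql + " LIMIT " + limit + " OFFSET " + offset;
    };

    /** Wraps a SELECT in a derived table so the database counts the rows itself. */
    static String countSql(String selectSql) {
        return "SELECT COUNT(*) FROM (" + selectSql + ") AS counted";
    }

    public static void main(String[] args) {
        // "quads" and its columns are made-up names, not the real schema.
        String base = "SELECT subject AS s FROM quads";
        System.out.println(TRAILING_LIMIT.page(base, 10, 20));
        System.out.println(countSql(base));
    }
}
```

A method like getSelectCount() could run the SQL produced by countSql over the unpaged query, so the total is known before any page is fetched.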

