Enabling the Distributed Family Tree

This is the official research blog for the Distributed Family Tree, an open network of genealogical data and metadata. In a nutshell, the big idea is that we can combine all available genealogical information on the Internet into a single distributed network. The foundation for this network is the subject of the master’s thesis that I am currently working on.

Data Model Extensibility, Part 4

This is the fourth of a five-part series on the DFT data model. Part one covered the fundamentals: RDF, OWL, and Named Graphs. Part two demonstrated how basic genealogical information can be recorded in RDF. Part three showed how to record information that changed over the lifetime of an individual, such as a surname.

Citing Your Sources

Yesterday we used the example of Brandon and Samantha’s marriage to illustrate how to record information that changed over the lifetime of an individual. Let’s continue with that example to demonstrate how to record provenance. Recall that we had the following statements and graph:

#brandon           gc:name      "Brandon Gilbert"
#samantha          gc:name      "Samantha Davis"

#brandon+samantha  rdf:type     gc:Marriage
#brandon           gc:married   #brandon+samantha
#samantha          gc:married   #brandon+samantha

#graph             gc:duration  #dur
#dur               rdf:type     gc:TimeSpan
#dur               gc:start     #brandon+samantha

#graph { #samantha  gc:name  "Samantha Gilbert"  }

We happen to have a marriage certificate which gives some additional information: that the couple was married on December 22, 1903, in Albany, NY. Let’s throw that, along with the marriage information above, into a new graph:

#marriageGraph {
  #brandon+samantha  rdf:type     gc:Marriage
  #brandon+samantha  gc:date      "December 22, 1903"
  #brandon+samantha  gc:place     "Albany, NY"
  #brandon           gc:married   #brandon+samantha
  #samantha          gc:married   #brandon+samantha
}
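One way to picture a named graph in code is as a mapping from a graph name to the set of triples it contains. The Python sketch below is only an illustration: the dictionary layout and the graphs_containing helper are assumptions made for this post, not part of any DFT implementation, and the gc: terms remain provisional.

```python
# Hypothetical in-memory model: graph name -> set of (subject, predicate, object) triples.
named_graphs = {
    "#marriageGraph": {
        ("#brandon+samantha", "rdf:type", "gc:Marriage"),
        ("#brandon+samantha", "gc:date", "December 22, 1903"),
        ("#brandon+samantha", "gc:place", "Albany, NY"),
        ("#brandon", "gc:married", "#brandon+samantha"),
        ("#samantha", "gc:married", "#brandon+samantha"),
    },
}


def graphs_containing(triple, graphs):
    """Return the sorted names of every graph that holds the given triple."""
    return sorted(name for name, triples in graphs.items() if triple in triples)
```

Because the graph has a name of its own, it can later appear as the subject of other statements, which is what makes attaching a source to it possible.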

Now we can record the provenance of this data. First, we create a source called #certificate, along with descriptive properties:

#certificate  rdf:type  gp:Source
#certificate  gp:title  "Marriage Certificate"
#certificate  gp:date   "December 22, 1903"
#certificate  gp:clerk  "M. C. Lawrence"

Then we attach the source to the graph of marriage data:

#marriageGraph   gp:source  #certificate
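To see what this buys us, here is a rough Python sketch of a provenance lookup: given a statement, find the graphs that contain it, then follow gp:source from each graph to its source. The data layout and the helper names (sources_for, describe) are hypothetical and exist only for illustration; the gp: terms are as provisional here as they are in the statements above.

```python
# Triples inside named graphs (hypothetical layout: graph name -> set of triples).
graphs = {
    "#marriageGraph": {
        ("#brandon+samantha", "rdf:type", "gc:Marriage"),
        ("#brandon+samantha", "gc:date", "December 22, 1903"),
        ("#brandon+samantha", "gc:place", "Albany, NY"),
        ("#brandon", "gc:married", "#brandon+samantha"),
        ("#samantha", "gc:married", "#brandon+samantha"),
    },
}

# Statements about the source and about the graph itself live outside the graph.
metadata = [
    ("#certificate", "rdf:type", "gp:Source"),
    ("#certificate", "gp:title", "Marriage Certificate"),
    ("#certificate", "gp:date", "December 22, 1903"),
    ("#certificate", "gp:clerk", "M. C. Lawrence"),
    ("#marriageGraph", "gp:source", "#certificate"),
]


def sources_for(triple):
    """Return the sources attached to every graph that contains the triple."""
    found = []
    for name, triples in graphs.items():
        if triple in triples:
            found.extend(o for s, p, o in metadata if s == name and p == "gp:source")
    return found


def describe(source):
    """Collect the descriptive properties recorded for a source."""
    return {p: o for s, p, o in metadata if s == source}


place = ("#brandon+samantha", "gc:place", "Albany, NY")
for source in sources_for(place):
    print(source, describe(source)["gp:title"])
```

The point of the design is that one gp:source statement covers every statement in the graph, so the source does not have to be cited on each triple individually.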

Note the use of the gp: prefix, which is short for the Genealogy Provenance (GP) ontology. As with the gc: prefix, any use of this prefix is purely illustrative, because the ontology hasn’t been concretely defined yet.

Next Time

In the final part I’ll show how the DFT can accommodate contradictory data.
