Enabling the Distributed Family Tree

This is the official research blog for the Distributed Family Tree, an open network of genealogical data and metadata.  In a nutshell, the big idea is that we can combine all available genealogical information on the Internet into a single distributed network.  The foundation for this network is the substance of the Master's Thesis that I am currently working on.

Data Model Extensibility, Part 1

In a past post I mentioned that the data model I’m using is very extensible. I’d like to explore that model a little over the course of the next week to demonstrate just how it can be extended.

Subjects, Predicates, and Objects

The foundation of the data model is the Resource Description Framework (RDF), a language for representing information with well-defined semantics as a graph. The basic building blocks of RDF are resources, properties, and values:

  • Resources are the things being represented, and are identified by Uniform Resource Identifiers (URIs) such as http://www.example.com/example.htm or urn:uuid:e2f03e8d-72a1-4f7d-a9f5-1cab5f9cae80. Such a URI may or may not be the actual location of a Web page; it doesn’t really matter so long as the identifier is unique.
  • Properties are the attributes of things, also identified by URIs.  An important characteristic is that they have well-defined semantic meaning. For example, http://purl.org/dc/elements/1.1/title is a property that can be used to describe the formal name of some resource, such as this Web site.
  • Values of the attributes of things are the last basic building block, and can be identified by URIs or literal values. For example, the title of this Web site is the string literal “Enabling the Distributed Family Tree”.

These three building blocks are also known respectively as subjects, predicates, and objects. A subject and object connected by a predicate is called a statement, or triple.
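To make the idea of a statement concrete, here is a minimal sketch in plain Python. It is not how the DFT itself stores data; the `Triple` type and the `SITE` URI are assumptions made purely for illustration, and a real project would more likely use an RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a subject and an object connected by a predicate."""
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property
    obj: str        # URI of another resource, or a literal value


# The Dublin Core title property mentioned above.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Hypothetical URI standing in for this Web site.
SITE = "http://www.example.com/example.htm"

# "This Web site has the title 'Enabling the Distributed Family Tree'."
statement = Triple(SITE, DC_TITLE, "Enabling the Distributed Family Tree")

print(statement.subject)
print(statement.predicate)
print(statement.obj)
```

The sketch keeps the object as a plain string for brevity. A fuller model would also record whether the object is a URI or a literal, since RDF treats the two differently.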

The Semantic Meaning of Predicates

Available properties are defined in ontologies. An ontology is a sort of schema that dictates the semantic meaning of resources and predicates and how they relate to each other. The current standard language for publishing RDF ontologies is the OWL Web Ontology Language.

For example, http://purl.org/dc/elements/1.1/title is a property of the Dublin Core ontology, and is defined just so:

URI: http://purl.org/dc/elements/1.1/title
Label: Title
Definition: A name given to the resource.
Comment: Typically, a Title will be a name by which the resource is formally known.
Type of Term: element
Status: recommended
Date Issued: 1999-07-02

Properties are often given an abbreviated name.  This property, for example, is usually referred to as dc:title.
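The abbreviation is nothing more than a prefix standing in for the first part of the URI. As a rough sketch, assuming the conventional binding of `dc` to the Dublin Core elements namespace, expanding and abbreviating names could look like the following. The `PREFIXES` table and both helper functions are hypothetical names used only for this example.

```python
# Prefix table: the short name maps to the namespace URI it stands for.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Turn an abbreviated name such as 'dc:title' into its full URI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


def abbreviate(uri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full URI into its abbreviated form, if a known prefix applies."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri  # no known prefix, so leave the URI untouched


print(expand("dc:title"))
print(abbreviate("http://purl.org/dc/elements/1.1/title"))
```

Note that `expand` raises a `KeyError` for a prefix that is not in the table, which is a deliberate simplification here.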

Statements About Statements

A collection of statements is a graph: subjects and objects can be thought of as nodes, while predicates serve as the connecting arcs. RDF can be extended to support multiple graphs through a concept known as Named Graphs. With this extension, each graph can be given a unique identifier which can then be used as the subject or object of other statements.
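One way to picture Named Graphs is as a mapping from a graph's identifier to the set of statements it contains. The sketch below is an illustration under that assumption, not the DFT implementation. The graph URIs, the `add` and `statements_about` helpers, and the "Example Author" value are all made up for the example.

```python
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"

# Each named graph is a set of (subject, predicate, object) statements,
# keyed by the URI that identifies the graph.
graphs = defaultdict(set)


def add(graph_uri, subject, predicate, obj):
    """Record one statement inside the graph identified by graph_uri."""
    graphs[graph_uri].add((subject, predicate, obj))


def statements_about(uri):
    """Collect every statement, in any graph, whose subject is uri."""
    return {
        triple
        for triples in graphs.values()
        for triple in triples
        if triple[0] == uri
    }


# Hypothetical graph identifiers.
G1 = "http://www.example.com/graphs/1"
META = "http://www.example.com/graphs/meta"

# A statement stored in graph G1.
add(G1, "http://www.example.com/example.htm", DC + "title",
    "Enabling the Distributed Family Tree")

# A statement about graph G1 itself, stored in a second graph.
add(META, G1, DC + "creator", "Example Author")

print(statements_about(G1))
```

Because the graph's identifier is an ordinary URI, it can appear as a subject like any other resource, which is what makes statements about statements possible.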

Next Time

In part two I’ll show how basic genealogical information can be recorded in RDF.

