Wednesday, June 28, 2006

RDF impressions at YAPC

While attending YAPC 2006 in Chicago, I've once again been impressed by the need to get a set of RDF tools working in Perl. There are so many interesting tools that could be implemented if only there were some basic RDF modules. We need something that does for RDF what DBI does for database access, but with flexibility that matches RDF's. I've heard that RDF::Helper may be a start in that direction, but it doesn't look nearly visionary enough. One of my highest priorities is to get this basic RDF toolchain working. I could use Redland, but I really want my RDF-related modules to work regardless of the implementation that a user chooses. For instance, at work, I developed an application that stores its data as RDF in a Sesame repository. I should be able to switch from Redland to Sesame by changing very little code and have everything Just Work.
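To make the DBI comparison concrete, here is a minimal sketch of what a backend-neutral interface could look like. It is written in Python rather than Perl purely for illustration, and every name in it (`TripleStore`, `MemoryStore`, `connect`, the `rdfi:driver:location` string) is hypothetical; it is not the API of RDF::Helper, Redland, or Sesame.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a driver name to a store class, loosely
# modeled on how DBI selects a driver from a "dbi:Driver:..." string.
_DRIVERS = {}


def register_driver(name):
    """Class decorator that makes a backend available to connect()."""
    def decorator(cls):
        _DRIVERS[name] = cls
        return cls
    return decorator


class TripleStore(ABC):
    """The small set of operations every backend must provide."""

    @abstractmethod
    def add(self, s, p, o):
        """Store one triple."""

    @abstractmethod
    def remove(self, s, p, o):
        """Delete one triple if it is present."""

    @abstractmethod
    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None is a wildcard."""


@register_driver("memory")
class MemoryStore(TripleStore):
    """Toy in-memory backend; a real one would wrap an actual repository."""

    def __init__(self, location=""):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def remove(self, s, p, o):
        self._triples.discard((s, p, o))

    def match(self, s=None, p=None, o=None):
        for triple in sorted(self._triples):
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


def connect(dsn):
    """Open a store from a string such as 'rdfi:memory:' (made-up syntax)."""
    scheme, driver, location = dsn.split(":", 2)
    if scheme != "rdfi":
        raise ValueError(f"not an rdfi DSN: {dsn!r}")
    if driver not in _DRIVERS:
        raise LookupError(f"no driver registered for {driver!r}")
    return _DRIVERS[driver](location)


# Application code only ever sees the TripleStore interface.
store = connect("rdfi:memory:")
store.add("ex:me", "ex:clockedIn", "2006-06-28T09:00")
store.add("ex:me", "ex:hairColor", "brown")
print(list(store.match(s="ex:me", p="ex:hairColor")))
```

Under these assumptions, switching backends would mean changing only the string passed to `connect`, which is the "changing very little code" property described above.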

Here's a short list of some of the RDF modules that I want — and the inspiring/reminding talks where appropriate.

  • RDFI — Generic RDF interface like DBI for RDF

  • I attended Joe McMahon's talk about Designing for Pluggability, which reminded me of the importance of interchangeable and modifiable parts in modern software design. RDF is a great match for this pluggable design approach because it allows plugins to extend the data model as well as the code. I've been working on a timecard application for a while and I keep coming back to RDF as the perfect, extensible data model for that application. I don't want to write all the functionality into the core application. If someone wants to record the color of their hair at the time they clocked in, they should be free to extend the data model to accommodate that desire. That data should then become a first-class citizen, available for querying (SPARQL) and modification like any other data.

  • RDF::Sync — synchronize the RDF in two repositories

  • During the lightning talks, Jesse Vincent presented a brief segment (during Adam Kennedy's talk) in which he described the ideal situation where your data lives both on your laptop and out on the net. He wants to get off the airplane and have his laptop data automatically sync with the data that's stored out on the net.

    I've wanted this same thing because I often switch between a connected desktop, a connected laptop, and a disconnected laptop. I'm sick of trying flawed approaches to synchronize my bookmarks. When I'm off in the wilderness and want to find an email that I saw three months ago, I don't want to wait until I can get a net connection again. My data should be where I want it, when I want it.

    Because RDF can be used as a universal data storage format, and it's easy to synchronize that data between different repositories, I think that RDF is a great solution to this problem.

  • RDF::MSG — calculate the minimum self-contained graphs for an arbitrary RDF graph.

  • RDF synchronization and RDF signing both require a technique for splitting an RDF graph into atomic chunks. The technique is easy to implement; I just have to bundle it into a handy Perl module and toss it on the CPAN.
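The "atomic chunks" idea behind RDF::MSG can be sketched in a few lines. The version below is an illustration in Python, not the proposed Perl module, and it rests on two stated assumptions: triples are plain `(subject, predicate, object)` tuples, and blank nodes are strings beginning with `_:`. It follows the usual description of minimum self-contained graphs: a triple with no blank nodes is a chunk by itself, and triples that share a blank node, directly or through a chain of other triples, belong to the same chunk.

```python
from collections import defaultdict


def is_blank(term):
    # Assumption for this sketch: blank nodes are strings like "_:b0".
    return isinstance(term, str) and term.startswith("_:")


def minimum_self_contained_graphs(triples):
    """Partition triples into chunks connected through blank nodes."""
    triples = list(dict.fromkeys(triples))  # drop duplicates, keep order
    parent = list(range(len(triples)))      # union-find over triple indexes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    first_seen = {}  # blank node -> index of the first triple that uses it
    for i, (s, _p, o) in enumerate(triples):
        # Predicates are never blank nodes in RDF, so check s and o only.
        for term in (s, o):
            if is_blank(term):
                j = first_seen.setdefault(term, i)
                parent[find(i)] = find(j)   # merge the two triples' chunks

    groups = defaultdict(list)
    for i, triple in enumerate(triples):
        groups[find(i)].append(triple)
    return [frozenset(group) for group in groups.values()]


graph = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:address", "_:a"),
    ("_:a", "ex:city", "Chicago"),
    ("_:a", "ex:geo", "_:g"),
    ("_:g", "ex:lat", "41.9"),
]
for chunk in minimum_self_contained_graphs(graph):
    print(sorted(chunk))
```

Each chunk can then be compared, transferred, or signed independently, because no blank node is referenced from outside the chunk that contains it.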

Ingy döt Net gave a mostly incoherent lightning talk (by design, I'm sure), but amidst the swearing and random noise, there were some good pieces about an idea that he called CogBase. I didn't get a chance to talk with him about it afterwards, but his notion seems to coincide with an idea I've had for a while: a single, versioned repository for all my data like contacts, email, bookmarks, websites visited, …. I think that RDF is the solution to this universal database. Look at the Description section of the CogBase module and compare that with what the RDF data model provides. Eight of the 15 requirements are already met by RDF. Versioning and access control are still open questions within the RDF community, but I don't think they're insurmountable. With those two pieces solved, only the item about object-orientedness of schemas is still missing.

There are just too many cool apps that can be built once the tools are in place. I just need the time and resources to work on it. Let's see if I can implement it in less than the three years these ideas have been rolling around in my head. Maybe next year I'll submit a collection of talks about RDF:

  • Introduction to RDF

  • Using RDF as a Data Model

  • Extensible Applications with RDF

or some such.
