Gene Smith's Lifestream

About Gene Smith's Lifestream

Hello, my name is Gene Smith. You can follow me on Twitter, where I'm constantly dropping indispensable knowledge, such as this nugget. Whew! Can you stand it!?! I'm also on Facebook and LinkedIn. Plead your case and tell me why I should add you to my growing and exclusive network/list of friends. Um, on second thought, don't bother, I'll probably just add you anyway. I also "blog" at my place of employment, Ignite Social Media. I'm working on a bunch of web applications, which I'll probably mention here at some point whenever I get around to finishing them.


Graph Databases and Social Data

September 28, 2009 7:46 pm

Recently, Ben Scofield, Viget's Technology Director, gave a great presentation on "The Future of Data." In giving everyone a general overview of the database landscape, he pointed out that traditional relational database systems, such as Postgres, Oracle, Microsoft SQL Server, and MySQL, are being assaulted from two different directions. On one side you have browser-based HTML5 DOM storage, and on the other side you have a whole slew of non-SQL/post-relational systems, such as CouchDB, MongoDB, and graph databases (such as Neo4j and HyperGraphDB), among others.

Fascinating discussion, but what really caught my attention was the mention of the "graph database," which was a new one on me as well as others in the room. Graph databases have been around since the 1980s, but have found new life given the rise of social data. Emil Eifrem of Neo4j sums up the advantages that graph databases have over a traditional RDBMS when dealing with social data:

Most applications today handle data that is deeply associative, i.e. structured as graphs (networks). The most obvious example of this is social networking sites, but even tagging systems, content management systems and wikis deal with inherently hierarchical or graph-shaped data.

This turns out to be a problem because it’s difficult to deal with recursive data structures in traditional relational databases. In essence, each traversal along a link in a graph is a join, and joins are known to be very expensive. Furthermore, with user-driven content, it is difficult to pre-conceive the exact schema of the data that will be handled. Unfortunately, the relational model requires upfront schemas and makes it difficult to fit this more dynamic and ad-hoc data.

A graph database uses nodes, relationships between nodes and key-value properties instead of tables to represent information. This model is typically substantially faster for associative data sets and uses a schema-less, bottoms-up model that is ideal for capturing ad-hoc and rapidly changing data.
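To make that a bit more concrete for myself, here's a rough sketch of the idea in Python. It is not Neo4j's API or how any real graph database is implemented; the `Graph` class, the `friendship` table, the sample names, and the `since` property are all made up for illustration. The first half finds "friends of friends" the relational way, where the second hop costs a self-join. The second half stores the same data as nodes, relationships, and key-value properties, and just walks the links.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: each pair is a mutual friendship.
FRIENDSHIPS = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]


def friends_of_friends_sql(person):
    """Relational version: each extra hop along a friendship is another join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friendship (person TEXT, friend TEXT)")
    # Store every friendship in both directions so it can be followed either way.
    rows = FRIENDSHIPS + [(b, a) for a, b in FRIENDSHIPS]
    conn.executemany("INSERT INTO friendship VALUES (?, ?)", rows)
    query = """
        SELECT DISTINCT f2.friend
        FROM friendship AS f1
        JOIN friendship AS f2 ON f2.person = f1.friend
        WHERE f1.person = ? AND f2.friend != ?
        ORDER BY f2.friend
    """
    result = [row[0] for row in conn.execute(query, (person, person))]
    conn.close()
    return result


class Graph:
    """Toy in-memory property graph: nodes, typed relationships, properties."""

    def __init__(self):
        self.nodes = {}                 # node id -> key-value properties
        self.edges = defaultdict(list)  # node id -> [(type, other id, properties)]

    def add_node(self, node_id, **properties):
        # No upfront schema: each node carries whatever properties it is given.
        self.nodes[node_id] = properties

    def relate(self, source, rel_type, target, **properties):
        # Record the relationship on both ends so it can be walked either way.
        self.edges[source].append((rel_type, target, properties))
        self.edges[target].append((rel_type, source, properties))

    def neighbors(self, node_id, rel_type):
        return [other for kind, other, _ in self.edges[node_id] if kind == rel_type]


def build_graph():
    graph = Graph()
    for a, b in FRIENDSHIPS:
        for name in (a, b):
            if name not in graph.nodes:
                graph.add_node(name, name=name.title())
        graph.relate(a, "FRIEND", b, since=2009)  # hypothetical property
    return graph


def friends_of_friends_graph(graph, person):
    """Graph version: follow FRIEND links two hops out, no joins involved."""
    found = set()
    for friend in graph.neighbors(person, "FRIEND"):
        for friend_of_friend in graph.neighbors(friend, "FRIEND"):
            if friend_of_friend != person:
                found.add(friend_of_friend)
    return sorted(found)


if __name__ == "__main__":
    print(friends_of_friends_sql("alice"))
    print(friends_of_friends_graph(build_graph(), "alice"))
```

Both functions should give the same answer for this tiny data set. The difference shows up when you want three or four hops: the SQL version needs another join per hop, while the graph version just keeps following links.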

Very interesting/cool.

For a more in-depth write-up on graph databases, check out Lorenzo Alberton's article "Graphs in the database: SQL meets social networks" and Neo4j's "Social networks in the database: using a graph database".

You can watch a presentation that Emil gave about graph databases on InfoQ.com.

And you can view Ben's slides on the "Future of Data" here:
