Graph Databases and Social Data
September 28, 2009 7:46 pm
Recently, Ben Scofield, Viget's Technology Director, gave a great presentation on "The Future of Data." In giving everyone a general overview of the database landscape, he pointed out that traditional relational database systems, such as Postgres, Oracle, Microsoft SQL Server, and MySQL, were being assaulted from two different directions. On one side you have browser-based HTML5 DOM storage, and on the other side you have a whole slew of non-SQL/post-relational systems, such as CouchDB, MongoDB, and graph databases (such as Neo4j and HyperGraphDB), among others.
Fascinating discussion, but what really caught my attention was the mention of the "graph database," which was a new one on me as well as others in the room. Graph databases have been around since the 1980s, but have found new life given the rise of social data. Emil Eifrem of Neo4j sums up the advantages that graph databases have over traditional RDBMSs when dealing with social data:
Most applications today handle data that is deeply associative, i.e. structured as graphs (networks). The most obvious example of this is social networking sites, but even tagging systems, content management systems and wikis deal with inherently hierarchical or graph-shaped data.
This turns out to be a problem because it’s difficult to deal with recursive data structures in traditional relational databases. In essence, each traversal along a link in a graph is a join, and joins are known to be very expensive. Furthermore, with user-driven content, it is difficult to pre-conceive the exact schema of the data that will be handled. Unfortunately, the relational model requires upfront schemas and makes it difficult to fit this more dynamic and ad-hoc data.
A graph database uses nodes, relationships between nodes and key-value properties instead of tables to represent information. This model is typically substantially faster for associative data sets and uses a schema-less, bottoms-up model that is ideal for capturing ad-hoc and rapidly changing data.
Very interesting/cool.
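To make Emil's point a bit more concrete, here is a minimal Python sketch (not Neo4j's API, and not anything from Ben's talk) that answers the same "friends of friends" question two ways. The relational version needs one self-join per hop, while the graph version just follows relationships from node to node. The `friendship` table, the `Graph` class, the `FRIEND` relationship type, and the sample names are all hypothetical and exist only for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical friendship data: (person, friend) pairs, stored in both directions.
FRIENDSHIPS = [
    ("alice", "bob"), ("bob", "alice"),
    ("bob", "carol"), ("carol", "bob"),
    ("carol", "dave"), ("dave", "carol"),
]


def friends_of_friends_sql(person):
    """Relational version: each hop along a friendship link is one self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friendship (person TEXT, friend TEXT)")
    conn.executemany("INSERT INTO friendship VALUES (?, ?)", FRIENDSHIPS)
    rows = conn.execute(
        """
        SELECT DISTINCT f2.friend
        FROM friendship AS f1
        JOIN friendship AS f2 ON f2.person = f1.friend
        WHERE f1.person = ? AND f2.friend != ?
        """,
        (person, person),
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}


class Graph:
    """Toy property graph: nodes, typed relationships, key-value properties."""

    def __init__(self):
        self.properties = {}            # node id -> dict of key-value properties
        self.edges = defaultdict(list)  # node id -> list of (rel_type, target)

    def add_node(self, node_id, **props):
        # No upfront schema: each node can carry whatever properties it likes.
        self.properties[node_id] = props

    def relate(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        return [t for r, t in self.edges[node_id] if r == rel_type]


def build_graph():
    graph = Graph()
    for name in {p for p, _ in FRIENDSHIPS}:
        graph.add_node(name, kind="person")
    for person, friend in FRIENDSHIPS:
        graph.relate(person, "FRIEND", friend)
    return graph


def friends_of_friends_graph(graph, person):
    """Graph version: each hop is a direct lookup of a node's relationships."""
    result = set()
    for friend in graph.neighbors(person, "FRIEND"):
        for fof in graph.neighbors(friend, "FRIEND"):
            if fof != person:
                result.add(fof)
    return result
```

Going one hop further in the SQL version means adding yet another join to the query, whereas the graph version only needs another loop (or a recursive walk). This toy says nothing about real-world performance; it only illustrates the difference in shape between the two models.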
For a more in-depth write-up on graph databases, check out Lorenzo Alberton's article "Graphs in the database: SQL meets social networks" and Neo4j's "Social networks in the database: using a graph database".
You can watch a presentation that Emil gave about graph databases on InfoQ.com.
And you can view Ben's slides on the "Future of Data" here: