Handling large volumes of semi-structured data with acceptable performance can be a challenge for the traditional RDBMS approach. Relation-rich data schemas with many interconnections map naturally onto graph databases.
Neo4j is a leading graph database with full OLTP/ACID support. It is built around the notion of connected data and schema-free data modeling, where relations (connections) are first-class citizens of the model, as opposed to traditional RDBMS systems like Oracle or some NoSQL data stores like Cassandra. It provides good performance and flexibility, and it facilitates agile development.
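To make the idea of relations as first-class citizens concrete, here is a minimal in-memory sketch in Python. It is not Neo4j code and does not use any Neo4j API; the `Graph` class, the node identifiers, and the relationship types are all hypothetical names chosen for illustration. It only shows that a relationship is stored as data in its own right, with a type and properties, rather than being derived through a join.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph (illustrative only, not the Neo4j API)."""

    def __init__(self):
        self.nodes = {}                    # node id -> property dict
        self.outgoing = defaultdict(list)  # node id -> [(rel type, rel props, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def relate(self, source, rel_type, target, **props):
        # The relationship is a stored record with its own type and properties.
        self.outgoing[source].append((rel_type, props, target))

    def neighbours(self, source, rel_type):
        # Follow only relationships of the requested type from `source`.
        return [t for (r, _, t) in self.outgoing[source] if r == rel_type]


# Hypothetical master data: one piece of equipment linked across two domains.
g = Graph()
g.add_node("pump-17", kind="Equipment")
g.add_node("plant-a", kind="Site")
g.add_node("acme", kind="Supplier")
g.relate("pump-17", "INSTALLED_AT", "plant-a", since=2012)
g.relate("pump-17", "SUPPLIED_BY", "acme")

sites = g.neighbours("pump-17", "INSTALLED_AT")
```

Adding a new kind of connection later only means calling `relate` with a new relationship type; existing nodes and relationships stay untouched.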
An extensible, multi-connected master data management system for a large industrial company spanned multiple functional domains and complex cross-domain relations. Proper data modeling and proper use of the underlying technologies are the key factors in a successful Neo4j implementation.
Below are some observations on data modeling and prototyping with Neo4j (version 2.0.0; both the Community and Enterprise editions were used):
- Easy setup, simple and clear architecture
- Good query performance, with heavy internal use of Lucene
- REST API layer is a good entry point for simple use cases
- Domain-specific APIs can be built on top of the Neo4j REST API as managed extensions
- Very good performance with the native traversal framework
- Powerful Cypher query language with simple syntax
- Proper transaction demarcation is crucial for batch operation performance and memory consumption
- The data model is very flexible and can be extended for newly emerging use cases without refactoring the existing structure
- Some performance tuning possibilities are available in configuration
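
The point about transaction demarcation can be sketched as follows. This is a hedged illustration, not the setup used in the project: it assumes the load goes through the transactional Cypher HTTP endpoint introduced in Neo4j 2.0 (`/db/data/transaction/commit`), and the `Equipment` label, the property names, and the batch size of 1000 are hypothetical. The code only builds the request bodies, one per transaction, so that each commit covers a bounded number of statements instead of one row or the whole data set.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batches(rows, batch_size):
    """Return one JSON request body per transaction, one Cypher statement per row."""
    # Neo4j 2.0 Cypher writes parameters as {name}; label and properties are hypothetical.
    statement = "CREATE (e:Equipment {id: {id}, name: {name}})"
    return [
        json.dumps({
            "statements": [
                {"statement": statement, "parameters": row} for row in batch
            ]
        })
        for batch in chunked(rows, batch_size)
    ]


# 2500 hypothetical rows split into transactions of at most 1000 statements each.
rows = [{"id": i, "name": "item-%d" % i} for i in range(2500)]
bodies = build_batches(rows, batch_size=1000)
```

Each body would then be sent as a single POST, so one transaction is committed per batch. Batches that are too small pay the commit overhead for every few rows, while one huge transaction keeps all pending changes in memory until commit; a suitable size has to be found by measurement.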