Saturday, February 8, 2014

The opinion on NoSQL

A general impression among the cool kids on the block is that traditional RDBMSs really stink, and that organizing data as columnar structures or key-value pairs is incredible and provides the functionality and performance that traditional RDBMSs have lacked.
In analyzing the NoSQL evolution, there are 3 main paradigms under which the products can be classified:

  1. Key-value pair - e.g. Redis
  2. Document style - e.g. MongoDB, CouchDB
  3. Column family - e.g. Cassandra
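
To make the three shapes concrete, here is a minimal sketch using plain Python dictionaries as stand-ins. It is not the API of Redis, MongoDB, CouchDB or Cassandra; the store names, keys and values are all made up for illustration, and real products add persistence, distribution and indexing on top.

```python
# Plain Python structures standing in for the three NoSQL data models.
# Every name and value below is hypothetical, for illustration only.

# 1. Key-value: an opaque value looked up by a single key.
key_value_store = {"user:42:name": "Ada"}

# 2. Document: a nested, self-describing record stored under an id.
document_store = {
    "42": {
        "name": "Ada",
        "emails": ["ada@example.com"],
        "address": {"city": "London"},
    },
}

# 3. Column family: row key -> column family -> column name -> value.
column_family_store = {
    "42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2014-02-08"},
    },
}


def name_from_key_value(user_id: str) -> str:
    # The application builds the key itself; the store knows nothing about fields.
    return key_value_store[f"user:{user_id}:name"]


def name_from_document(user_id: str) -> str:
    # The whole document comes back and the caller picks the field it wants.
    return document_store[user_id]["name"]


def name_from_column_family(user_id: str) -> str:
    # The caller addresses a row, then a column family, then a column.
    return column_family_store[user_id]["profile"]["name"]
```

The same fact (a user's name) lives in all three, but how you address it changes, and that difference in addressing is what the data modeler has to internalize.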

The underlying filesystem / storage structures span a wide variety, from the Bigtable variants and Dynamo to BSON (Binary JSON, used in MongoDB), ColumnFamily as in Cassandra, and B-trees as in CouchDB.

The main argument is that NoSQL is schema-less: there is no concept of a table with a set number of columns, and the data can instead be unstructured.
I say fair enough, it's a good fit for such data.
The problem I have is that this movement, which is NO-SQL for a reason, is marching very fast towards SQL.
The classic constructs of a SQL database, such as primary keys and a query language, are showing up all over again.
Right from Cassandra's CQL to Hive or even the dead UnQL, the NoSQL community is struggling to provide SQL-like interfaces to query the NoSQL stores.
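
A rough sketch of how this happens, under my own assumptions: `TinyDocumentStore` below is a hypothetical in-memory toy, not how CQL, Hive or any real product is implemented. The point is only that once a schema-less store needs to be queried, an id that behaves like a primary key and a `select` that behaves like SELECT ... WHERE tend to appear.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


class TinyDocumentStore:
    """Hypothetical in-memory store; not the API of any real product."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def put(self, doc_id: str, doc: Document) -> None:
        # Schema-less: any set of fields is accepted.
        # doc_id still acts like a primary key: a second put with the
        # same id overwrites the earlier document.
        self._docs[doc_id] = dict(doc)

    def select(
        self, fields: List[str], where: Callable[[Document], bool]
    ) -> List[Document]:
        # Roughly "SELECT fields FROM docs WHERE where(doc)".
        # Fields a document lacks come back as None, much like SQL NULL.
        return [
            {field: doc.get(field) for field in fields}
            for doc in self._docs.values()
            if where(doc)
        ]


store = TinyDocumentStore()
store.put("1", {"name": "Ada", "city": "London"})
store.put("2", {"name": "Linus", "twitter": "@linus"})  # different fields, no error

rows = store.select(["name", "city"], where=lambda doc: "name" in doc)
```

Nothing forced a schema on the two documents, yet the query side already has a key, a projection and a predicate, which is the SQL vocabulary under another name.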

Does that not reveal a systemic issue? Converting relational data modelers into NoSQL data model experts is no overnight task. Organizations that have taken this path the right way first take the RDBMS experts who understand the business, data and relationships and put them through a strong NoSQL journey. This includes formal training, in-house presentations, vendor demos, vendor talks, local user group participation, attending conferences, knowledge sharing with other organizations doing similar things, etc.
This creates good energy at the grass roots of an organization and builds the appetite to take on workloads and model them in the NoSQL paradigm where it's an appropriate fit.
The most common failures in NoSQL implementations I have been seeing are in projects where an Oracle RDBMS expert is asked to build a Cassandra CF overnight, or a J2EE development lead is asked to design a MongoDB or CouchDB schema.
Both are recipes for failure. What is needed is serious investment in hiring talent, cross-pollinating existing DBMS talent to appreciate the NoSQL model, and organically creating the transformation for new workloads, migration of old workloads, etc.

One note though: the NoSQL paradigm is very interesting for good use cases, but I feel the NoSQL movement is in a state of denial, as the problems of traditional DBMS are re-manifesting themselves in the NoSQL world.