Many of the NoSQL tools out there, such as MongoDB, Couchbase, Hadoop, and others, purport to be leading a revolution and breaking the bonds of servitude to the restrictive, inflexible, established, relational market. They claim users need more, users need better … and they are there to help. Of course, when speaking about those relational flaws, the comments always focus on problematic aspects of a DBMS’ physical implementation.
One vendor lauds striping data across storage devices as a NoSQL improvement, completely ignoring the fact that several relational DBMS vendors employ that tactic as well. Are these tools offering any theoretical foundation, some version of logic behind them, to provide some substance to the case for usurping set theory or predicate calculus? No. OK, so there is graph theory behind the graph databases, but that graph theory already existed behind hierarchical databases when relational databases replaced most of the hierarchical databases back in the day.
Many of these NoSQL rebels are also retrofitting SQL or some variant into their solutions in order to reassure users that they have a familiar and comfortable interface. Admittedly, SQL has always had some controversy surrounding how well it really supports relational theory or even how tight a language SQL may or may not be. But those thoughts aside, as these NoSQL vendors keep harping on the performance of their physical implementations over traditional relational databases’ physical implementations, are they really doing anything more substantial than rehashing old arguments over C# versus Visual Basic? At times, it seems the vendors are simply caught up in their own rhetoric.
If any of these sales teams truly understood relational theory, they would understand that relational theory is not about any physical implementation. Relational theory is about the users of the data.
Users need to be able to consider the data as if it were tables with columns and rows. There is no requirement that the internals be implemented as columns and rows. Key-value pairs, documents, column stores, graphs, or hierarchies may or may not exist underneath; relational theory does not care, because any physical implementation is fine. As each of these tools incorporates a SQL-type interface, it allows users to start conceiving of these data stores as tables with identifiable rows and columns, which means these tools are relational, at least as far as relational theory is concerned.
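To make the logical-versus-physical distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the STORE dictionary stands in for a key-value or document store, COLUMNS is an assumed mapping from column names to paths inside each document, and select is a toy stand-in for a SQL-type interface. It is not how any particular NoSQL product works; it only shows that users can be handed rows and columns regardless of what sits underneath.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical "physical" layer: a key-value store holding nested documents.
STORE: Dict[str, Dict[str, Any]] = {
    "cust:1": {"name": "Ada", "address": {"city": "London"}},
    "cust:2": {"name": "Grace", "address": {"city": "New York"}, "vip": True},
    "cust:3": {"name": "Edgar"},  # no address recorded
}

# Hypothetical "logical" layer: column name -> path into the document.
COLUMNS: Dict[str, Tuple[str, ...]] = {
    "name": ("name",),
    "city": ("address", "city"),
}

Row = Dict[str, Any]


def lookup(doc: Dict[str, Any], path: Tuple[str, ...]) -> Any:
    """Follow a path of keys into a nested document; None if any key is absent."""
    current: Any = doc
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def customer_rows() -> Iterator[Row]:
    """Present every stored document as a row with the same fixed columns."""
    for key in sorted(STORE):
        row: Row = {"id": key}
        for column, path in COLUMNS.items():
            row[column] = lookup(STORE[key], path)
        yield row


def select(columns: List[str],
           where: Optional[Callable[[Row], bool]] = None) -> List[Tuple[Any, ...]]:
    """Toy query interface: project the named columns from rows passing the filter."""
    return [
        tuple(row[c] for c in columns)
        for row in customer_rows()
        if where is None or where(row)
    ]


if __name__ == "__main__":
    # Roughly: SELECT name, city FROM customers WHERE city IS NOT NULL
    print(select(["name", "city"], where=lambda r: r["city"] is not None))
```

The caller of select never learns whether the data lives in a dictionary, a document store, or a conventional table; swapping out STORE would change nothing in the query. Surfacing a missing field as None is an arbitrary choice made for this sketch.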
Does this mean that these tools are implemented in the same restrictive, logical-must-be-exactly-physical way that most SQL DBMS vendors have chosen? Thank goodness, no, and vive la différence! The relational database world has been held hostage by too many vendors who could see only a world where logical and physical must be one and the same thing. By embracing that merged logical-physical view, the industry has denied itself the possibility of what might otherwise have been.
Long term, I believe many of these new tools will merge with, and into, the existing SQL DBMS market, and we all will have much stronger and more flexible relational tools as a result.
Relational theory says that the content of a unique column within a unique row must be “atomic.” But atomic is one of those fuzzy words. Similar to the way that one man’s ceiling is another man’s floor, one person’s atomic value can be another person’s universe. Atomic does not necessarily mean a single string, number, or Boolean. Atomic could be interpreted to mean that a value needs to be singular from some perspective, not atomic from any and all perspectives.
Users do need more, and users do need better. The established SQL DBMS vendors have grown soft by focusing on rationalizing their own shortcomings, and in so doing, they have lost their innovative edge. The NoSQL/big data efforts have provided a seriously needed shot across the bow; we can only hope that the established vendors respond to this call to arms appropriately.