Have you noticed that relational "technology" seems to be taking a beating these days? With new “data models” and types of database systems purported to be better than relational, it seems that the industry is all too ready to stick a fork in relational. So, is relational technology outdated? There are all sorts of pretenders to the throne for data management including Hadoop, XML, and the various types of NoSQL database products. But why? Are there truly any shortcomings in the relational way that require its wholesale replacement?
Before we move ahead to consider these questions, let’s first admit that there are no “true” relational database management systems. Although IBM Db2, Oracle Database, Microsoft SQL Server, PostgreSQL, and many others are based on the relational model, none of them support all of the features that would make them truly relational. It is probably better to refer to them as "SQL DBMS products."
This raises the question: Prior to sounding the death knell for relational, shouldn't we at least try a fully relational DBMS first? Keep in mind that the three largest software companies in the world today are the three biggest providers of relational—well, SQL?database technology: IBM, Microsoft, and Oracle. These companies’ database products are successful because they are useful and, in fact, help companies manage and access their data.
You may be asking: Why is it important for my DBMS to be relational? My DBMS works just fine as it is?how could relational help? Let's look at some issues (although we will not cover all of them).
With SQL, you have a language that is built by committee—with all of the problems that this implies. One issue arises due to the fact that you can write the same data request in many different ways. This can cause performance problems and confuse programmers. Another problem is that current SQL database systems permit the creation of tables that can contain duplicate rows. The relational model is based on set theory, and a set cannot contain duplicate members.
Even more importantly, there are significant relational features that current SQL database systems omit, such as domains. A domain is basically the set of all valid values for a particular column (or attribute). It is flexible, so you can define the valid values more precisely than a mere data type. But a domain actually defines the valid operations and comparisons that can be performed on columns assigned to that domain. Because SQL DBMS products lack domains, it is completely legal to code the following predicates:
- WHERE AVG(SALARY) > PHONE_NUMBER
- WHERE SHOE_SIZE = IQ
- WHERE AGE = FIRST_NAME
These are not useful predicates. But if each of these columns were assigned to a different domain, comparisons between them would be disallowed because comparisons between incompatible domains would not be permittedand values outside the scope of the domain could not be assigned to the column. All of these capabilities would serve to boost data integrity and data quality.
Then there is the whole missing data and nulls issue. Although nulls originally were defined in the relational model by its inventor, Ted Codd, subsequent research by Chris Date and others has shown the problems that nulls can create. Better solutions for handling missing data have been proposed.
This short diatribe should show you that there are some problems in existing SQL DBMS implementations. But, instead of throwing the baby out with the bathwater, wouldn't it be better to see what a truly relational DBMS could be capable of providing?
So, what is a relational DBMS? At a high-level, it is one that is implemented based on Codd's 12 rules. Relational is traditionally defined as adhering to these rules, as well as Codd's original 1970 paper. (For those interested in additional research, an interview with Chris Date offers a good discussion of relational concepts: www.red-gate.com/simple-talk/opinion/opinion-pieces/chris-date-and-the-relational-model.)
What about NoSQL?
There are four different types of NoSQL database systems: wide column stores, key/value, document stores, and graph database systems. A fifth option is the multi-model DBMS which spans multiple types of NoSQL and SQL database engines. The NoSQL options evolved over time to support the data needs of specific development niches. Most NoSQL offerings do not require ACID transaction integrity or a fixed schema, which allows them to deliver easier development for mobile applications, content management, profile management, and other real-time big data use cases.
But do not let anybody tell you that NoSQL will overtake relational/SQL or that ACID is not important for most business applications. NoSQL does not replace relational/SQL, it augments it. It is also important to remember that many of the NoSQL offerings are relatively immature, especially when compared to the 40-year history of relational/SQL database systems. Businesses whose mission-critical applications rely on battle-hardened SQL databases should be loath to replace them wholesale with NoSQL. NoSQL has its place … just not as a replacement for most existing relational and SQL database system implementations.