Expect Increasing Market Consolidation in Database Management

In the relational arena, Oracle is still the undisputed champion, with Microsoft running a strong second. SAP and IBM also have significant share, as do the open-source databases MySQL and PostgreSQL.

In the Hadoop space, key vendors include Cloudera, Hortonworks, and MapR. In the nonrelational (“NoSQL”) market, there are a few strong players, including MongoDB, which continues to grow its community at a very impressive rate. With 30 million downloads to date, MongoDB claims to have the most widely deployed NoSQL solution; Cassandra has far fewer installs but can point to more massive global deployments. MarkLogic, Redis, and Couchbase also remain active players.

Graph database vendor Neo4j remains strong in the dedicated graph database segment. Increasingly, however, graph compute engines are also being integrated into the major nonrelational and relational databases.

The new database segments that were created in 2008/2009 seem to have largely solidified by now, and there don’t seem to be any significant new niches to be filled. The future can always hold surprises, but I don’t expect to see any really significant new database vendors emerging over the next few years.

Technology Consolidation

Choosing a DBMS today is a sort of Hobson’s choice between multiple not-quite-right technologies. Whichever database you choose is probably missing at least one desirable feature found in another, mostly incompatible, system.

“Which is the best database for me?” Often, there is no right answer to the question, because each choice implies some form of compromise. An RDBMS may be superior in terms of query access and transactional capability, but fail to deliver the network partition tolerance required as an application grows. A popular NoSQL database system may deliver the best cross-data-center availability, but fail to integrate with BI systems. And so on …

The database vendors are feeling this pressure. Relational database customers ask for features from the NoSQL systems, while NoSQL vendors are under constant pressure to close the functionality gap with the more traditional RDBMS. What database buyers want is a future in which these features become configuration options within a single database architecture rather than forcing a choice between multiple not-quite-right systems.

The vendors are addressing these challenges. Oracle, for example, has responded to this pressure in two ways. First, it offers nonrelational technologies within its portfolio, such as the Oracle Big Data Appliance, which embeds Cloudera Hadoop. More significantly, Oracle is layering nonrelational features onto the RDBMS engine itself. These include JSON storage and JavaScript querying capabilities that resemble those of MongoDB; a sharded distributed database option; and a graph compute engine.
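As a rough illustration of document-style features inside a relational engine, the sketch below stores JSON documents in an ordinary table column and queries their fields with SQL. It uses SQLite (through Python's built-in sqlite3 module) purely as a stand-in, and it assumes a SQLite build that includes the JSON functions. It is not Oracle's syntax, and the table name, sample documents, and helper functions are hypothetical.

```python
import json
import sqlite3

# SQLite stands in for "a relational engine" here; this is not Oracle syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, doc TEXT)")

# Hypothetical sample documents, stored as JSON text in an ordinary column.
docs = [
    {"name": "Ada", "city": "London", "orders": 3},
    {"name": "Linus", "city": "Helsinki", "orders": 1},
    {"name": "Grace", "city": "London", "orders": 7},
]
conn.executemany(
    "INSERT INTO customers (doc) VALUES (?)",
    [(json.dumps(d),) for d in docs],
)


def customers_in(city):
    """Return names of customers whose JSON document has the given city."""
    rows = conn.execute(
        "SELECT json_extract(doc, '$.name') FROM customers "
        "WHERE json_extract(doc, '$.city') = ? ORDER BY id",
        (city,),
    )
    return [name for (name,) in rows]


def total_orders():
    """Aggregate a JSON field with plain SQL."""
    (total,) = conn.execute(
        "SELECT SUM(json_extract(doc, '$.orders')) FROM customers"
    ).fetchone()
    return total
```

The point of the sketch is only that schema-free documents and SQL filtering or aggregation can coexist in one engine, which is the kind of convergence described above.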

The Hadoop vendors are constantly adding more richness to their SQL query engines, for instance by adding transactional capabilities to the Hive SQL layer. The operational NoSQL vendors also invest heavily in SQL or SQL-like languages. Cassandra has CQL (Cassandra Query Language), MongoDB offers a SQL bridge (the BI Connector), and Couchbase offers an almost complete SQL dialect (N1QL).

Microsoft is arguably making the strongest explicit claim for a converged database system with its Azure Cosmos DB Database as a Service (DBaaS). Cosmos DB claims to support four data models: key-value, column-family, document, and graph. Data in Cosmos DB can be queried using SQL, JavaScript, the MongoDB API, or graph query languages, and the database supports multiple consistency levels, ranging from strong to eventual consistency.
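To make the two ends of that consistency range concrete, here is a toy model of a single value held on a primary copy and a lagging replica. It is a minimal sketch and not the Cosmos DB API: the class name, its methods, and the "strong" and "eventual" labels are assumptions made for illustration.

```python
class ReplicatedRegister:
    """A single value with one primary copy and one lagging replica."""

    def __init__(self):
        self.primary = None
        self.replica = None
        self.pending = []  # writes not yet applied to the replica

    def write(self, value):
        # Writes are acknowledged by the primary immediately.
        self.primary = value
        self.pending.append(value)

    def replicate(self):
        # Apply all queued writes to the replica, in order.
        for value in self.pending:
            self.replica = value
        self.pending.clear()

    def read(self, consistency="eventual"):
        if consistency == "strong":
            # Strong: always reflects the latest acknowledged write.
            return self.primary
        if consistency == "eventual":
            # Eventual: may return stale data until replication catches up.
            return self.replica
        raise ValueError(f"unknown consistency level: {consistency}")
```

In this model, a strong read after a write always returns the new value, while an eventual read returns the old value until replicate() runs. Real systems trade latency and availability for that guarantee, which is why a configurable level is attractive.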

Database as a Service Comes of Age

The next-generation databases that broke the dominance of the relational model were inspired by the cloud but were not initially true cloud databases. While these databases were based on technologies invented at cloud mega-companies such as Amazon and Google, they were initially run on-premises or within an infrastructure cloud environment that emulated a traditional data center architecture.

Databases have been slower to migrate to the cloud than other elements of computing infrastructure for two main reasons:

  1. A database running in the cloud may need to expose its ports to the public internet. Typically, databases have been buried deep behind firewalls, and the administrators have been reluctant to abandon the border security that, to date, has been deemed necessary.
  2. Databases often transfer significant amounts of data across the network. Moving a database into the cloud introduces latency that is not usually tolerable unless the application also moves into the cloud. Therefore, databases rarely move into the cloud ahead of the applications they serve.

These two factors combined prevented a mass exodus to DBaaS systems in the first few years of the cloud. But the indications are that DBaaS adoption is poised to accelerate.
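The latency argument in the second reason can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes an application that issues its queries one after another, so every query pays a full network round trip. The query count and the round-trip times are assumed figures chosen for illustration, not measurements.

```python
def total_wait_seconds(round_trips, round_trip_ms):
    """Time spent waiting on the network for sequential database calls."""
    return round_trips * round_trip_ms / 1000.0


# Assumed figures, for illustration only.
ROUND_TRIPS_PER_REQUEST = 200  # sequential queries issued by one request
SAME_DATA_CENTER_MS = 0.5      # application and database co-located
ACROSS_WAN_MS = 40.0           # application on-premises, database in the cloud

local = total_wait_seconds(ROUND_TRIPS_PER_REQUEST, SAME_DATA_CENTER_MS)
remote = total_wait_seconds(ROUND_TRIPS_PER_REQUEST, ACROSS_WAN_MS)
print(f"co-located: {local:.2f} s, across WAN: {remote:.2f} s")
```

Under these assumptions the same request spends a tenth of a second waiting on the network when the application sits next to the database, and eight seconds when only the database has moved, which is why the two tend to migrate together.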
