
NoSQL, NewSQL, and Emerging Blended Enterprise Data Environments

By Joe McKendrick

May 19, 2015

NoSQL and NewSQL database types now seen across enterprises include key-value stores, which store schema-less data as pairs of a key and its associated value; graph databases, which employ structures with nodes, edges, and properties to represent and store data; column family databases, which store and access data by column rather than by row, as relational databases do; document databases, which facilitate simple storage and retrieval of document aggregates; and object databases, which represent information in the form of the same objects that are employed in object-oriented programming. These emerging databases are often built on internet protocols and formats: typically, REST APIs for access and JavaScript Object Notation (JSON), a lightweight data interchange format, for the data itself.
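To make the distinctions above more concrete, here is a minimal sketch of how the same kinds of information might be shaped in each model, using only plain Python structures. All names and sample values (such as `order-1001` or `WORKS_AT`) are invented for illustration and do not reflect any particular product's API.

```python
import json

# Key-value store: an opaque value looked up by a unique key.
key_value = {"session:42": "opaque-session-blob"}

# Document store: a self-contained aggregate, typically expressed as JSON.
document = {
    "_id": "order-1001",
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Column-family store: row key -> column family -> column -> value.
column_family = {
    "user:1": {
        "profile": {"name": "Ada"},
        "activity": {"last_login": "2015-05-19"},
    },
}

# Graph store: nodes and edges, each of which can carry properties.
nodes = {"ada": {"label": "Person"}, "acme": {"label": "Company"}}
edges = [("ada", "WORKS_AT", "acme", {"since": 2012})]


def neighbors(node_id, relation):
    """Return the nodes reachable from node_id over edges of the given type."""
    return [dst for src, rel, dst, _props in edges
            if src == node_id and rel == relation]


# JSON is the interchange format: serialize the document and read it back.
wire = json.dumps(document)
restored = json.loads(wire)
```

Real products add indexing, persistence, and distribution on top of these shapes; the sketch only shows how the data is organized.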

A number of notable database projects have arisen under the aegis of the Apache Software Foundation, starting with Hadoop and including HBase, Cassandra, and Hive. Other influential NoSQL systems originated outside Apache, among them Google’s BigTable, Amazon’s DynamoDB, and Voldemort, which was developed at LinkedIn. Commercial NoSQL databases have also evolved, while the big RDBMS vendors, including IBM, Microsoft, and Oracle, have responded to these changing markets with their own features to compete with the new offerings, from in-memory technology to support for lightweight data interchange formats such as JSON.

In many cases, new database technologies are coexisting with traditional ones, and increasingly being integrated within blended database environments. Organizations aren’t necessarily going with one type of data environment exclusively, but instead are opting for best-of-breed approaches that match the right type of database to the right requirement. This adds a new layer of complexity to data environments as well, requiring architectural design and integration work to bring these various data environments together. Ultimately, enterprise data architectures will need to be designed and configured to support these blended approaches, with the acknowledgement that each type of database has its strengths, as well as specific roles based on the workloads being handled.

The factors that influence which type of database—or data environment—is best suited for an organization’s requirements include issues such as time and resources, the availability of skills to support new technologies, and the need for self-service, as well as more data-driven requirements in terms of scaling and support for atomicity, consistency, isolation, and durability (ACID).

Keeping Pace With New Data Sources

Today’s data managers and professionals may simply not have the time or resources to keep pace with all the new data sources being added or taken offline on a daily basis. Still, the pressure to deliver data and reporting as needed, often on a real-time basis, keeps growing. Enterprises need to tap into, integrate, and explore the vast array of unstructured data flowing through their systems. At the same time, maintaining database performance is becoming more complex and difficult. A fresh perspective is needed, and NoSQL and NewSQL environments provide new ways to approach these problems.

Data, and the technology that supports it, is no longer confined to the domain of the IT department. Marketing, finance, and operations increasingly deploy their own solutions, whether delivered from the cloud, via mobile, or as shadow IT. However, an enterprise foundation is still needed to enable sharing of these resources. NoSQL databases allow decision makers from across the enterprise to identify and bring in new data sources without the need to change database configurations.

Weighing the Pros and Cons of Data Management Systems

NoSQL databases are highly accessible and available, but it is often said that many of these products do not emphasize consistency and the other ACID guarantees, which are second nature for relational databases. Recently, however, there has been movement within the vendor community to embrace and support ACID, hence the arrival of “NewSQL” databases to address this requirement.
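As an illustration of what the ACID guarantees mean in practice, the following sketch uses Python's built-in `sqlite3` module as a stand-in for a relational engine. The `accounts` table, the `transfer` helper, and the sample balances are assumptions made for this example. It shows atomicity: when the second step of a transfer violates a constraint, the first step is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; nothing was applied.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Alice only has 100, so this transfer fails and Bob's credit is undone too.
ok = transfer(conn, "alice", "bob", 500)
print(ok, balances(conn))
```

A store without multi-record transactions would leave it to the application to detect and repair the half-applied update, which is the gap that ACID support is meant to close.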

In addition, many of today’s applications require rapid turnaround of code and greater flexibility to deliver across multiple platforms. Developers often find NoSQL and NewSQL databases quicker to implement and build applications on, and many NoSQL products do not require predefined schemas, tables, or SQL queries.

Large datasets, as well as expanding user bases, often inhibit the performance of RDBMSs, and additional licenses must often be purchased as businesses grow. Addressing such performance issues can also call for expensive consulting engagements. NoSQL databases, which often run on clusters of commodity servers, are built to scale out incrementally as business demands increase.
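One common technique behind incremental scale-out is consistent hashing, which spreads keys across servers so that adding a server relocates only a fraction of the data. The sketch below is a simplified illustration and not the implementation of any specific product; the `HashRing` class, the node names, and the choice of 50 virtual points per node are all assumptions.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), replicas=50):
        self.replicas = replicas  # virtual points per node, for smoother balance
        self._points = []         # sorted ring positions
        self._owners = {}         # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        """Return the node owning the first ring position after the key's hash."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one commodity server
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

Every key that changes owner lands on the newly added node, and the rest stay where they were, which is what makes adding capacity one server at a time practical.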

The key is to match the type of database and format to the type of data in question. Relational database formats work well for datasets requiring tabular structures, such as transaction records. More complex data forms, such as simulations or 3D modeling, may fit better into NoSQL settings.

Emerging Business Demands

These days, data models are continually shifting as business demands change. In relational environments, data schemas need to be carefully considered and planned, and changes in schema often require that the database itself temporarily go offline, thereby complicating development and deployment work. In the NoSQL world, such planning may be a moot exercise: when ever-changing and unpredictable forms of data are flowing through enterprises from a multitude of sources, any effort to design a database schema up front may be for naught.
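The contrast between a fixed schema and schema-less storage can be sketched in a few lines. The example below again uses Python's `sqlite3` as a stand-in relational engine; the `readings` table and the sensor records are invented for illustration. Note that in SQLite, adding a column is a lightweight operation, whereas comparable changes on large production systems can be far more disruptive.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temperature REAL)")


def insert_row(conn, record):
    # Column names come from trusted, hard-coded keys in this sketch only.
    columns = ", ".join(record)
    marks = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO readings ({columns}) VALUES ({marks})",
        tuple(record.values()))


old_shape = {"sensor": "s-1", "temperature": 20.0}
new_shape = {"sensor": "s-2", "temperature": 21.5, "humidity": 0.4}

# Relational side: a new field is rejected until the schema is changed.
insert_row(conn, old_shape)
try:
    insert_row(conn, new_shape)
    rejected = False
except sqlite3.OperationalError:
    rejected = True

conn.execute("ALTER TABLE readings ADD COLUMN humidity REAL")
insert_row(conn, new_shape)

# Document side: both shapes are stored as-is, and readers tolerate gaps.
documents = [json.dumps(old_shape), json.dumps(new_shape)]
humidity = [json.loads(doc).get("humidity") for doc in documents]
```

The flexibility is not free: with documents, the burden of handling missing or unexpected fields shifts from the database to every piece of code that reads the data.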
