With NoSQL, a DBA’s Future Shines Brightly From the Past

Just when you thought NoSQL meant the end of SQL, think again, and realize why you need to hold on to your relational database administrator like it was 1999. NoSQL has proven to be a resilient next-generation database technology for increasingly common internet-era specialized workloads. Now approaching a decade after its arrival on the scene, NoSQL is moving beyond architectural marvels to practical tools in the software development toolkit and, in that process, unveiling tried-and-true capabilities formerly known to be the scalpels of the enterprise relational database. Let’s go back to the future and take a look at how the DBA is becoming as relevant as ever while NoSQL evolves for the enterprise.

Agility and the Hype of Schema-Less

In the early days of NoSQL, no characteristic was as inextricably tied to the technology as its “schema-less” nature. After all, what could be less a relational database than one which had no schema enforcement? In the relational database world, schema is commonly known as the realm of the database administrator (DBA), that place where a developer had to pick up the phone and seek assistance before moving forward with development. Not having to pick up that phone introduced a new level of agility into the development process and thereby propelled projects forward and the “schema-less” NoSQL database into the developer heart and mind.

It turns out, though, that schema provides a kind of structure for data, and through structure, data becomes easier to process in an optimized manner and to maintain over longer periods of time, longer than the typical developer stays on a particular software project. Looking at NoSQL today, we observe that the vast majority of NoSQL databases now offer schema in the form of tables. What once were schema-less technologies have evolved to bring the core values of relational schema to the architectural wonder of a NoSQL database. Along with table-based schema returns a scenario in which DBA activities such as indexing, security, and data governance are a normal function within the enterprise organizations leveraging this class of technology.
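To make the value of schema concrete, here is a minimal sketch of schema enforcement. It uses Python's built-in sqlite3 module purely as a stand-in for any table-based database; the customer table, its columns, and the try_insert helper are hypothetical names invented for this illustration, not taken from any particular NoSQL product.

```python
import sqlite3

# In-memory database used as a stand-in for any table-based store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,       -- required and unique
        age   INTEGER CHECK (age >= 0)    -- simple validity rule
    )
    """
)


def try_insert(email, age):
    """Return True if the row satisfies the schema, False if it is rejected."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO customer (email, age) VALUES (?, ?)", (email, age)
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("ann@example.com", 34),  # valid row
    try_insert(None, 20),               # missing required email
    try_insert("ann@example.com", 51),  # duplicate email
    try_insert("bob@example.com", -5),  # violates the CHECK constraint
]
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(results, row_count)
```

Only the first row is stored. The other three are rejected by the database itself rather than by application code, which is the kind of long-lived guarantee that outlasts any one developer's time on a project.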

The lessons of the past come full circle as early NoSQL adopters come to appreciate that a data lifecycle does not stop at the first release. Fortunately, all was not lost in the schema-less excursion, as the need for agility has come to the forefront and is making its way into the feature list for leading relational databases. Database technology leaders are absorbing new lessons while remembering the values on which they were founded, and advancing the industry for a new generation of developers and DBAs alike.

Eventual Consistency Leaves Room for Database Transactions

Another characteristic of early NoSQL involved shunning the sacred ACID properties of the relational database in favor of “eventually consistent” persistence. These ACID relational database properties of atomicity, consistency, isolation, and durability formed the foundation of the relational database transaction, providing the resiliency that drove relational technology deep into the enterprise for mission-critical software.

It was said in the early days of NoSQL that the transaction had to be sacrificed in order to achieve the NoSQL promise of performance at scale for internet-era application workloads. Dropping the database transaction meant that some very difficult data integrity problems were left unattended, and expensive, complex, and error-prone alternatives such as manual log auditing and compensating operations became a necessary part of business operations. Today, we find the classic database transaction making its way back into the NoSQL database solution. Nearly all NoSQL databases now offer some form of transaction, as the brittle nature of the transactionless system has proven too unwieldy for enterprise-class solutions, and user demand has driven the transaction requirement back into the NoSQL product line.
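What the transaction buys can be shown in a few lines. The following is a sketch only, again using Python's built-in sqlite3 module as a stand-in; the account table and the transfer function are hypothetical names for this illustration. A transfer consists of two updates, and when the second one fails, the first is undone automatically, with no compensating operation to write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # no overdrafts
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move funds atomically: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("alice", "bob", 30)       # both updates succeed
failed = transfer("alice", "bob", 500)  # debit violates CHECK; credit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In a transactionless system, the credit to bob in the failed transfer would persist, and someone would have to detect and reverse it after the fact.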

Here again, though, incumbent relational database leaders have taken a new lesson from a younger, ambitious generation of developers. They have recognized that not all data operations are equal and have modernized their systems to accommodate these less stringent, eventually consistent operations. It is true that critical transactional updates of things such as an account balance are often intertwined with mundane profile updates, and providing the hooks to mix these operations opens the door to more efficient database systems. While NoSQL evolves to provide transactional semantics, schema and transactions come together again as a place where developers often engage with their DBA colleagues to plan how to design effectively for the needs of mixed transactional data semantics combined with NoSQL eventual consistency.

Specialized APIs Yield to Standards-Based Query

At its roots, what is a NoSQL database if not a database that does not have the DBA’s SQL? Indeed, we could have seen as many specialized APIs developed for accessing NoSQL databases as there were NoSQL database products, with each vendor having a proprietary, non-standards-based way to store and retrieve data.

First, this absolutely reeked of vendor lock-in, and even the enticed NoSQL developer could appreciate that problem. In addition, the lack of an industry-standard way to store and request data meant that an entire world of tools and advanced data processing solutions could not access the NoSQL data, leaving it siloed within the enterprise. Finally, the extensive investment in skills made over the last decade, and the transferability of those skills as technologists move between organizations, could not be preserved for the enterprise. And while all those things could be reason enough to consider standardization, the NoSQL vendors themselves had yet another reason, productivity: in the face of continual change in the landscape of software components, it was practically impossible to create custom interfaces to each and every new product arriving on the market.

It just all made sense, and so today, we find that leading NoSQL database products have embraced some subset of the SQL language to store and retrieve data. The standardization ushers in a new level of productivity for NoSQL vendors and is making way for extended drivers based on SQL (e.g., ODBC/JDBC), so that hordes of existing tools can now be used to “de-silo” NoSQL data. Long after the developer has moved on, the organizational need for the data that landed in these NoSQL systems will endure and the enterprise DBA armed with his tools of choice will unlock the value in the data within those systems.

Database Tuning by the Experts

Now let’s get down to the nitty-gritty and think about what a DBA does with database schema, now that it’s showing up in our NoSQL databases. The DBA masterfully manages and optimizes access to data by performing audit logging of database operations and making advanced decisions such as indexing to speed up queries, adding constraints to guard against rogue operations that would compromise data integrity, changing configuration settings to allocate memory or to facilitate larger numbers of connections, and more. As NoSQL systems have evolved to incorporate relational database characteristics such as schema, standardized query languages, and transactional semantics, the traditional skills of the DBA are coming to the forefront to deal with long-term manageability. We as enterprise professionals should never forget that data lives longer and has more value than the application from which it is born. New patterns of access will emerge and new external needs for data will arise, and in these contexts, the enduring skills of the enterprise DBA are instrumental in delivering organizational agility and extracting value.
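As one small illustration of that tuning work, the sketch below shows how adding an index changes the way a query is executed. It assumes SQLite, through Python's built-in sqlite3 module, as a stand-in for whichever database is in use; the orders table, the idx_orders_customer index, and the plan helper are hypothetical names for this example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql):
    """Return the database's own description of how it will run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


before = plan(QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY)   # the planner can now search via the index

print("before:", before)
print("after: ", after)
```

The query itself never changes. The same kind of decision, made by someone who understands both the data and its access patterns, applies to a NoSQL system once it exposes tables and a SQL-like query language.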

Database Administration Now and in the NoSQL Future

NoSQL is often thought of as easy-to-use technology because it is typically very easy to get up and running and to create a first prototype application. However, that ease of use is largely from the developer’s point of view, an artifact of how comparatively simple it is to install a single-instance database server, connect to it, and start writing some schema-less data storage and retrieval. The realities of managing NoSQL in larger-scale production are commonly deferred in the belief that NoSQL will simply “expand” as needed to handle larger data and more concurrent user requests.

NoSQL is database technology with new architectural principles that enable it to scale out across many machines while remaining highly available in the face of unexpected hardware failures. This distributed nature of NoSQL’s architecture makes it more complex to manage than the typical scale-up process model of the relational database, in the same way that managing an Oracle Real Application Cluster is significantly more complex than managing a single-instance Oracle Database. Because NoSQL is designed to scale out immediately for availability reasons, however, the additional complexity of managing it comes to light very soon after moving toward production. In fact, many of the leading NoSQL products don’t scale up well, having been purpose-built to scale out horizontally.

Care and feeding of a growing NoSQL cluster requires a very different skill set than that of the software developer who designed it into the application. It’s the skill set of the DBA: placing processes on specific servers, opening ports for monitoring, setting up cluster-wide security and user management, dealing with disaster recovery, backup and restore, and all of those things the DBA has been doing to keep the enterprise’s mission-critical relational database healthy and serving the organization.

In this new era of “right tool for the job” software development, the data management tool chest has expanded and technologies such as NoSQL and Hadoop are serving to complement the proven practice of transactional data processing and analytics using relational databases. Over the last several years, these new technologies have evolved to take on more of the capabilities proven in the evolution of relational technology. And, in that process, traditional relational database management skills associated with DBAs have become increasingly relevant. Further, these different database technologies serve different data processing workloads, and future applications will necessarily leverage all of these individual technologies together in order to handle workload diversity when bringing new solutions to market. A convergence of capabilities around proven practices and standards will support the DBA’s existing skill set, serving a critical role in managing the entire stack of database technology for a future generation of applications.


