The Game-Changing Technologies Powering the Data-Driven Enterprise in 2021 and Beyond

<< back Page 3 of 3


Graph databases, which store data along with its connections to ensure rapid access to all relationships, are emerging as a key component of real-time environments. Jim Webber, chief scientist at Neo4j, has seen a surge of use cases for supply chain applications. “As global businesses slowed production due to COVID-19, we saw a rise in demand from organizations that wanted to implement graph databases into their supply chain strategies and ensure business continuity.” The advantage of graph databases, Webber added, is that they allow arbitrary numbers and types of relationships between data items, rather than just joining tables on keys. “Since graphs are a more useful data structure than tables, they are better equipped to process connections in real time.” The biggest challenge with moving to graph databases, he pointed out, is “unlearning everything we have learned about relational databases and accepting that graphs are different.”
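To make the contrast with key-based joins concrete, here is a minimal sketch of a graph held as typed edges and walked hop by hop. It is only an illustration: the node names, the relationship types, and the helper functions `build_adjacency` and `downstream` are all hypothetical, and the code does not represent how Neo4j or any other graph database is implemented.

```python
from collections import deque

# Hypothetical supply-chain graph: each edge carries its own relationship type.
EDGES = [
    ("FactoryA", "SUPPLIES", "WarehouseX"),
    ("FactoryB", "SUPPLIES", "WarehouseX"),
    ("WarehouseX", "SHIPS_TO", "RetailerQ"),
    ("WarehouseX", "SHIPS_TO", "RetailerR"),
    ("FactoryB", "BACKUP_FOR", "FactoryA"),
]


def build_adjacency(edges):
    """Index outgoing edges by source node so each hop is a direct lookup."""
    adjacency = {}
    for source, rel_type, target in edges:
        adjacency.setdefault(source, []).append((rel_type, target))
    return adjacency


def downstream(adjacency, start, rel_types):
    """Breadth-first walk from start, following only the given relationship types."""
    seen = {start}
    queue = deque([start])
    reached = []
    while queue:
        node = queue.popleft()
        for rel_type, target in adjacency.get(node, []):
            if rel_type in rel_types and target not in seen:
                seen.add(target)
                reached.append(target)
                queue.append(target)
    return reached


adjacency = build_adjacency(EDGES)
# Which sites are affected if FactoryA slows production?
affected = downstream(adjacency, "FactoryA", {"SUPPLIES", "SHIPS_TO"})
print(affected)
```

A new relationship type, such as the `BACKUP_FOR` edge here, is simply another entry in the edge list. In a relational schema the same change would typically mean a new join table or column.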


Vectorized databases, typically columnar databases that process data in vectors to use the CPU cache more rapidly and efficiently, are seeing increased demand, as they “deliver next-generation performance on behavioral data from sensors and machines,” said Nima Negahban, chief technology officer and co-founder at Kinetica.

Vectorized databases “offer orders of magnitude performance improvements on common big data analytic workloads like aggregations, predicate joins, equijoins, derived columns, window functions, graph solvers, and certain GIS [geographic information system] functions,” Negahban said. “Traditional databases have their internal data structures and their query processing logic designed for using as little compute as possible to process a given query or accept a mutation. This works well for problems where the actual amount of data that needs to be analyzed can be minimized using indexes. However, modern data-driven decision making requires the ability to aggregate, filter, and sort large amounts of data, which does not lend itself well to traditional indexing techniques.”
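The difference in data layout behind this claim can be sketched in a few lines. The example below is an assumption-laden illustration, not Kinetica's design: the sensor readings and the `mean_temperature` helper are hypothetical, and plain Python does not issue CPU vector instructions. It only shows the columnar arrangement, in which a filter-and-aggregate query scans two contiguous typed arrays rather than looking up individual records through an index.

```python
from array import array

# Hypothetical sensor readings in a row-oriented layout: one record per reading.
rows = [
    {"sensor_id": 1, "temperature": 20.5},
    {"sensor_id": 2, "temperature": 31.0},
    {"sensor_id": 1, "temperature": 22.5},
    {"sensor_id": 2, "temperature": 29.0},
]

# The same data in a columnar layout: one contiguous typed array per column.
sensor_ids = array("i", (r["sensor_id"] for r in rows))
temperatures = array("d", (r["temperature"] for r in rows))


def mean_temperature(sensor_ids, temperatures, wanted_id):
    """Scan two columns in lockstep: filter on one, aggregate the other."""
    total = 0.0
    count = 0
    for sid, temp in zip(sensor_ids, temperatures):
        if sid == wanted_id:
            total += temp
            count += 1
    return total / count if count else None


print(mean_temperature(sensor_ids, temperatures, 1))
```

Because each column holds values of a single type side by side in memory, a real engine can hand whole blocks of a column to the CPU at once, which is where the cache and vectorization benefits described above would come from.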


While many breeds of next-generation databases (graph databases, vectorized databases, time series databases, and streaming databases) are seen as must-haves for 2021 and beyond, the tried-and-true relational database management systems will continue to play a critical role in data environments. “Even with the growing popularity of these next-gen technologies, the traditional relational database management systems will still hold sway in the database marketplace,” said Sri Raghavan, data science and advanced analytics product marketing for Teradata.

Still, with a wider variety of database types available, data managers can be selective in their deployment choices, Raghavan said. “Most enterprises today are strong on RDBMSs, with a number of them also having other database types such as graph or NoSQL,” he said. “In fact, applications developed on one are also in some cases supported by the other.” However, he continued, while there is some compatibility among database types, there are specific tasks that one will be better suited for than another. “NoSQL and graph databases make it easy to retrieve vast volumes of data that are analyzed natively and are then made available for access to a wide range of third-party tools for further access and analysis. While this is possible with RDBMSs too, the ease of retrieval, analytics, and sharing across a wide ecosystem of tools is better with next-generation solutions.”

It’s also important to note that commodity hardware is more compatible with NoSQL databases than RDBMS alternatives, Raghavan cautioned. RDBMS environments “typically need purpose-built, optimized hardware for data access and storage. NoSQL databases are designed for access and expansion across multiple cheap, commodity servers.” In addition, “NoSQL databases are more developer-friendly; they are more amenable to frequent code changes and modifications that can be done across shorter development sprints.”


While the RDBMS will continue to play an important role for many organizations, clearly, the one-size-fits-all approach of years past is well behind us. A new breed of data systems is taking on more workloads and, as the 2020s progress, we are likely to see additional data management technologies and techniques come to the fore, allowing for more immediate and ubiquitous data access and enabling companies to truly compete on data.
