10 Predictions for Big Data in 2020

The big data space is constantly in flux. Some new and emerging technologies aim to disrupt reliable legacy solutions, while others hope to work in harmony with tools already on the market. What are the next explosive trends for 2020?

Here, executives of leading companies provide 10 predictions for what's ahead in 2020 for big data. From blockchain to business intelligence, these are just some of the many solutions that could take the big data world by storm.

Big Data is well and truly dead, but the data lake looms large. “Large-scale, feature-rich data warehouses, in the cloud and on premises, have improved radically to provide multi-petabyte scale using MPP architectures. That scale is made effectively possible by pushing compute and data closer together, and allowing SQL to express JOIN semantics and aggregations, which can be optimized by the database. These factors have killed ‘big data’ as we knew it, but one element of big data lives on: the data lake.” - Brian Bulkowski, CTO, Yellowbrick Data

Decentralization will become the new buzzword in 2020. “We are centralized now with large tech companies having control over a lot of the data. This next year we’ll see companies begin to position themselves as decentralized in one way or another. The control of data will begin to shift back to the consumer and end-user. People will stop becoming the product.” - Rich Chetwynd, product manager, OneLogin

Ransomware meets its nemesis – and it’s called blockchain. “While it may take a while for blockchain to be adopted in the financial markets or in other consumer applications, its real-world use case as a mechanism to prevent ransomware attacks will gain swift adoption. StorageCraft is way ahead of the rest of the industry with an already-implemented blockchain file system. I fully expect a ‘catch up’ scramble amongst the data management vendors. We’re the only ones to provide an immutable file system where data cannot be overwritten or deleted by ransomware. Our fully auditable, immutable and unchangeable view of the history of data at rest means organizations – even with distributed environments – know if, when and where a ransomware infection occurred. Our ability to also provide continuous, immutable snapshots of data means we can return data to its pre-ransomware state.” - Douglas Brockett, president, StorageCraft

Organizations will continue to investigate how blockchain can facilitate the creation of new business networks and (subsequently) new business models. “Blockchain will continue to be a supporting/enabling technology, but it will change from what we see today. Scalability will continue to be a challenge, as will blockchain interoperability (either between blockchain frameworks or between enterprise systems and a blockchain). However, the largest barrier to blockchain adoption will not be the technology itself, but rather the complexity involved with bringing together diverse networks of companies or users and the associated legal and regulatory impacts (e.g., Libra). Organizations will look at tokenizing certain assets, but the introduction of open cryptocurrencies will continue to be a challenge.” - TIBCO CTO Nelson Petracek

Apache Arrow becomes the fastest project to reach 10M downloads/month. “Apache Arrow (co-created by Dremio) has firmly established the industry standard for columnar, in-memory data representation and sharing, powering dozens of open source and commercial technologies and making data science 100 to 1,000x faster. Arrow has already achieved over 6M monthly downloads in the three years since release, with downloads continuing to grow exponentially. As a result, we predict Arrow will reach 10M downloads/month in 2020, faster than any other Apache project. And with the release of Apache Arrow Flight (also co-created by Dremio) this past October, the performance benefits of Arrow are being extended to the Remote Procedure Call (RPC) layer, further increasing data interoperability. While Arrow Flight is just getting started, we predict that by 2025 it will replace decades-old ODBC/JDBC as the de facto way in which all modern data systems communicate.” - Dremio's CEO Tomer Shiran

Big data democratization will make everyone a data analyst. “Big data has been a buzzword for so long, it has lost value. But, in 2020 and beyond, we’ll see it begin to provide real, tangible results. One reason for this is that data warehousing tools have improved and are no longer inhibitors to accessing enterprise insights in real time. Going forward, employees and stakeholders – from IT to the Board of Directors – will be able to more easily tap into the data well and become analysts themselves. And, with the democratization of data, the focus will shift from how to access data to: 1) asking the right questions of data, and 2) identifying who within your company is best positioned to analyze and glean answers from that data.” - Chris Patterson, senior director of product management, Navisite

Kubernetes for everything: “Kubernetes recently surpassed Docker as the most-talked-about container technology. In the future, every data technology will run on Kubernetes. We may not quite get there in 2020, but Kubernetes will continue to see rising adoption as more major vendors base their flagship platforms on it. There are still some kinks to be ironed out, such as issues with persistent storage, but those are currently being addressed with initiatives like BlueK8s. The entire big data community is behind Kubernetes, and its continued domination is assured.” - Unravel Data CEO Kunal Agarwal

Small data gets big: “Going forward, we’ll no longer require massive big data sets to train AI algorithms. In the past, data scientists have always needed large amounts of data to perform accurate inferences with AI models. Advances in AI are allowing us to achieve similar results with far less data.” - Zinier CEO Arka Dhar

DataOps will reign in the 2020s: “With an increasing number of systems, use cases and the sheer volume of data, data pipelines will be top of mind for organizations in 2020 and beyond. Businesses will continue to pursue more advanced data, analytics, and AI initiatives across their organization, which will necessitate DataOps sophistication to keep pace with the accelerating data development lifecycle. DataOps is by no means a new term or methodology, but increasingly, businesses will begin adopting DataOps practices to be able to scale and deliver on their investments in data, analytics and machine learning applications.” - Sean Knapp, founder and CEO, Ascend

BI tools go far and wide: “The march toward being a data-driven enterprise has become a self-fulfilling prophecy, and in 2020 it will mean more than ‘consume this report.’ We now have a generation of young business professionals who are used to asking questions and getting immediate answers. This dynamic has resulted in increased pressure on IT to make it easier and faster to get answers to questions, and that pressure is only going to multiply in 2020.” - Peter Guagenti, CMO, MemSQL

Image by xresch from Pixabay