Video produced by Steve Nathans-Kelly
Many organizations are continuously copying, transforming, and aggregating data into various large-scale, complex, proprietary systems for the purpose of digital transformation, hoping to gain a competitive edge.
Unfortunately, this process ends up exacerbating the problem by creating more silos, adding complexity, and making the process of extracting knowledge more difficult overall, according to Gemini Data's Matthew Deyette. He reviewed where we are, how we got here, and what must come next with regard to leveraging big data analytics in a keynote at Data Summit 2019.
Deyette identifies data availability as the primary roadblock to extracting knowledge in this video clip from his presentation.
"Most of you are suffering from the very conditions that are listed here," he said, referencing a chart on barriers to digital transformation with 91% of data lakes being delayed or over budget, 80% of legacy technologies preventing digital transformation, and 51% of organizations using less than 51% of the data collected. "There are a lot of data silos, but data availability is the primary roadblock. For those of you that are data scientists, you'll say, 'Well, data cleansing is one of the largest concerns.'"
However, he noted, without access to the data, whether it's in a standard format or transformed into a new data warehouse, it is not possible to have data management. "Data availability is where it's at."
DBTA’s next Data Summit conference will be held May 19-20, 2020, in Boston, with pre-conference workshops on Monday, May 18.
According to Deyette, "The real challenging statistic from this perspective is the last two numbers. If 51% of the people are using less than half their data, and we all know how expensive that is just to collect it, so, for that matter, you're collecting something and then putting it to no use at all. So, how valuable is it? Well, it's as valuable as what you're spending money on, but if then you're not extracting any ROI, you're suffering."
Data lakes can be very useful, he noted. "However, they take a long time to implement, they're very expensive, I personally have suffered as a technology manager where you spend a lot of money training people, and then at the end of that training a percentage of them leave for another opportunity, so either they pay a premium to keep the talent, or you lose the talent, which you spent money on, again, training. So it can be a little frustrating. Some of these technologies are proprietary both in language and management, so there's a difficulty there in maintaining that."
There will always be a new silo, and it doesn't matter if it's a data warehouse, a data lake, cloud, another Internet of Things device, or some other Technology 4.0 thing that has yet to be branded by Gartner, said Deyette.
"It doesn't matter; they're all silos, they're all proprietary. Extracting the information is the challenge, and it will not--with current technology--go away. We have to address something. There has to be some new approach."
Many presenters have made their slide decks available on the Data Summit 2019 website at www.dbta.com/DataSummit/2019/Presentations.aspx.
To access the full presentation, "The Evolution of Big Data Analytics," go to https://datasummit.brightcovegallery.com/detail/videos/data-summit-2019-keynotes/video/6040667400001/sponsored-keynote-presented-by-gemini-data---the-evolution-of-big-data-analytics?autoStart=true#links.