VIDEO: Data Lakes Are Growing While Hadoop Is Shrinking

Video produced by Steve Nathans-Kelly

Data lakes are a convenient and cost-effective way to store large amounts of data with widely varying structures, without having to build a data model ahead of time, explained Paul Sonderegger, senior data strategist, Oracle, during Data Summit 2018.

"You just pour in extracts out of other databases. You pour in X amount, you pour in JSON documents, you pour in documents, just regular long-form documents, whatever you got. And, that's great. This storage mechanism holds all this great diversity of data. And then, you put on top, some set of tools that allow you to maybe transform it, change its shape, and do some analysis on it. That's the basic purpose of a data lake," said Sonderegger.

Data has been called many things, such as the new oil and the new electricity, but, said Sonderegger, it is really the new capital, on a par with financial and human capital for creating new products and services. Calling data a kind of capital is not a metaphor; it is literal, he said, explaining, “In economics, capital is an asset produced through some process and is then a necessary input to some other good or service. Data fulfills this definition.”

To access more Data Summit 2018 videos, go to

Many PowerPoint presentations from Data Summit 2018 have been made available for review at

Data Summit 2019, presented by DBTA and Big Data Quarterly, is scheduled for May 21-22, 2019, at the Hyatt Regency Boston, with pre-conference workshops on May 20.