How to Choose a Scalable Data Management Solution (VIDEO)

Video produced by Steve Nathans-Kelly

A.M. Turing Award Laureate and database technology pioneer Michael Stonebraker delivered a welcome keynote at Data Summit 2019, titled “Big Data, Technological Disruption, and the 800-Pound Gorilla in the Corner.” 

In his presentation, Stonebraker—who is an MIT Adjunct Professor and Tamr co-founder—outlined data management solution considerations as part of a larger discussion of the big data challenges facing enterprises today.

“Traditionally, vendors will sell you extract, transform, and load packages, and on top of that, they will tell you that you need master data management tools, and that’s available from the usual suspects, and that stuff does not scale. Just does not scale; that will work great on small problems,” he said.

Stonebraker continued, “ETL is used in front of the load for data warehouses all over the world, and I’ve asked many, many data warehouse administrators, ‘How many data sources are you integrating into your data warehouse?’ The typical answer is ‘10, I’ll give you 20, twist my arm hard, I’ll give you 50.’ There’s no way this scales to the 250 data sources that Toyota has. So there’s too much manual effort, it simply doesn’t scale. MDM doesn’t scale either. If I had more time, I would tell you exactly why, but this stuff won’t scale.”

If an organization needs to work at scale, said Stonebraker, “It’s a machine learning problem, has to be; because at scale, you can't have lots of manual effort, so this overcomes ETL issues.”

To access Stonebraker’s full presentation of “Big Data, Technological Disruption, and the 800-Pound Gorilla in the Corner,” go to

DBTA’s next Data Summit conference will be held May 19-20, 2020, in Boston, with pre-conference workshops on Monday, May 18.

Many presenters have made their slide decks available on the Data Summit 2019 website at