Key Takeaways about Data and Analytics from Data Summit 2019

Data Summit 2019 in Boston drew industry experts with deep knowledge spanning all areas of enterprise IT, including AI and machine learning, analytics, cloud, data warehousing, and software licensing. They presented three days of thought-provoking sessions, keynotes, panel discussions, and hands-on workshops.

Data Summit 2020, presented by DBTA and Big Data Quarterly, will be held May 19-20, 2020, with pre-conference workshops on Monday, May 18. The conference will return to the Hyatt Regency Boston.

Be sure to check the Data Summit website for upcoming details on conference registration and program information.

Here are some key takeaways from Data Summit 2019.

Data integration and data cleaning remain a persistent problem in the enterprise, and it is the 800-pound gorilla in the room. To do their work, data scientists must locate huge quantities of relevant data, then integrate and cleanse that data for it to be useful, said Stonebraker. He cited the example of a data scientist who said that they spend 90% of their time finding and cleaning data and then 90% of the other 10% checking on the cleaning, because you can’t analyze dirty data. - A.M. Turing Award Laureate Michael Stonebraker, adjunct professor at MIT and co-founder/CTO, Tamr

Knowledge graphs are on the rise and for good reason. “Graphs are an excellent way to store and query information that’s full of relationships that can’t be explicated in traditional databases. The goal is to be able to ask questions about data and draw inferences about the content; RDF is an excellent candidate because that’s what it’s designed for.” - Bob Kasenchak, director of business development, Access Innovations, Inc., USA

Hybrid cloud is the next evolution for legacy systems. Legacy enterprise data systems hinder business growth. “Time marches forward and these systems begin to slow down. If you roll forward a bit the cloud comes into play.” - Paul Wolmering, VP worldwide sales engineering, Actian

DataOps has resulted in a transformative improvement in software development. DataOps can provide continuous delivery of analytics. Customers can deliver insights faster, ensure high quality, add features at the speed of business, and automate/orchestrate the complex environment of people and technology. - Christopher P. Bergh, CEO, head chef, DataKitchen

Containers are the fastest-growing cloud-enabling technology. Operational benefits of containerization include portability, scalability, fast deployment, and more. The business benefits from the agility, cost savings, and customer satisfaction. - Jeff Fried, director of product management, InterSystems

While companies may hesitate to implement more data governance strategies, data governance provides substantial benefits. It delegates responsibility and enforces accountability; fosters collaboration; protects, enables, supports, and defends employees, consumers, stakeholders, and the business; allows for openness, flexibility, and extensibility; and pushes data decisions to the lowest level of autonomy possible. “Data governance provides the rigor to answer everything consumers are concerned about.” - Anne Buff, director of data governance

ML offers organizations many advantages. The ability to automatically build models that can analyze huge volumes of data and deliver lightning-fast results has also led to a growth in the availability of both commercial and open source frameworks, libraries, and toolkits for engineers. “Machine learning is a data analytics technique that teaches computers to learn from experience. From this algorithmic learning we will be able to either predict or understand.” - Chelsey H. Hill, assistant professor of business analytics, Feliciano School of Business, Montclair State University

There are plenty of DevOps-type products available to help orchestrate and integrate what is being done on the development side, but more products are needed for the database side. "What we need is the same thing on the database side: the database change management product, the database SQL performance-testing product, the test data management product. And they all need to be automated in such a way that the DBA understands and blesses what's going on here, and that it's actually going to work. And the developers can just integrate it, and work with it, and it moves forward. That's where it needs to go, but we're not there yet. Dev is there; Ops needs work." - Craig S. Mullins, president & principal consultant, Mullins Consulting

Big data requires a mix of technologies. There are more options than ever for data management, including relational, columnar, and in-memory. “Relational doesn't solve everything, but neither does Hadoop, so you need the combination of what works best for you.” - Richard Sherman, managing partner, Athena IT Solutions

SQL is far from over. “The data warehouse market is moving to cloud-native, platform-as-a-service data warehouses. And SQL is not dead by a long shot. One of the biggest drivers for moving the data warehouse to the cloud is to be able to allow bring your own analytics. Bring your own analytics starts almost initially in every single case with SQL-based analytics.” - Lynda Partner, VP, Pythian Group

It was hard enough to manage IT infrastructures when everything was on-premise only. But today, with combined on-premise and cloud deployments, it can seem impossible. “Customer goals to keep costs down are in direct conflict with software vendors’ desire to protect their revenue stream.” There are many ways to get into licensing trouble, including users downloading software without the organization's knowledge, upgrades turning on features that the organization does not want, and third-party software turning on features the organization does not know are being used. - Michael Corey, co-founder, LicenseFortress, and Don Sullivan, system engineer database specialist, VMware

AI has reached a turning point, where companies are not considering whether to use AI, but when and where. Organizations just getting started with AI should spend time defining a problem where technology can help. They should think about where AI can be injected to make decisions faster, and what could be accomplished if they had more time, more people, or their best expert on the problem, or if they were able to do something faster, bringing it closer to real time. - Amy Guarino, COO, Kyndi

Information is driving huge advantage. However, according to Harvard Business Review, 69% of companies have not created a data-driven organization. In addition, according to HBR, 53% don’t treat data as a business asset, and 52% are not competing on data and analytics. And, according to IDC, only 30% are almost always successful in finding the data they’re looking for and only 20% are almost always successful in preparing data for analysis. The answer is to focus on outcomes, build a culture of curiosity, and build bridges between data silos. Without this approach, there will be a lot of activity, but not successful outcomes. - Lee Levitt, business strategist at Oracle

Augmented intelligence is necessary for gathering insights from the huge volumes of data coming into organizations. “You can't have enough analysts to go through that data, you can't have enough people working on it, analyzing it, making decisions with it.” The only approach for scaling to keep up is to deploy the new architectures that work with AI. “AI can then be deployed thousands of times all over the place, working and learning, and improving on the spot because we can't scale.” - John O’Brien, principal advisor and chief researcher, Radiant Advisors

With edge computing becoming the next big thing, AI on the edge is quickly following suit. AI on the edge can minimize delay, improve privacy, conserve bandwidth in the IoT system, and deliver personalization/self-improvement. - Wolf Ruzicka, chairman, and Polina Reshetova, data scientist, EastBanc Technologies