The annual Data Summit conference went digital earlier this year, becoming Data Summit Connect. The online event in June featured live presentations by executives from leading IT organizations, who engaged attendees in spirited discussions on a variety of topics, including data analytics and privacy, knowledge graphs, and AI and machine learning.
The following are some key points distilled from the 3-day webinar series, which was preceded by a day of workshops. Full videos of Data Summit Connect 2020 presentations are available at www.dbta.com/DBTA-Downloads/WhitePapers.
Join us again October 20-22 for Data Summit Connect Fall 2020. The call for speakers is now open.
The use of AI is expanding rapidly. By 2021, 75% of enterprise applications will use AI. However, with AI-fueled dreams comes the reality that deploying AI isn’t easy. Fewer than 14% of AI projects move into production. Companies need to identify the right business problem and deploy the right people to dig through the data, Ning said. In order to simplify the AI process, businesses need to study the proper use case, organize the correct data, and utilize the right tools, platforms, or services. Structured data is likely to drive most of AI’s impact. Because structured data is already processed, organized, and clean, users don’t need a separate processing step and can run queries on it quickly.—Elliott Ning, cloud advisor, Google
Data is unlike other valuable resources because it is "non-rival." Unlike food, which we eat, or oil, which we burn, or our accountant's consulting time, it is a resource that does not get "used up." Food can be consumed, we can run out of time, and oil can be depleted. Data is not like that. Any number of people or firms can use data simultaneously without diminishing it. Any number of algorithms can be applied to data simultaneously without reducing its availability or value to anybody else. In addition, more data makes other data more valuable, and having other people in an ecosystem or a company who are trying to create value with that data doesn't hold anyone else back. This makes data fundamentally different from other resources.—Bryan Kirschner, vice president, DataStax
Public distrust of data collectors has become more pronounced. By and large, people don’t feel like they have control over the data collected about them by companies and the government. In addition, roughly half of the public believes that major tech companies should be regulated more than they are now.—Lee Rainie, director, internet and technology research, Pew Research Center and coauthor of the book, Networked
What you do is much less important than how you do it in data analytics. What you do (the system: the model, the algorithm, the data pipeline, visualization, data governance, and even the data itself) is actually less important in some ways than how you do it (the development, deployment, monitoring, iterating, collaborating, and measuring). In other words, the machine that makes the machine is more important than the machine itself. From a DataOps perspective, you want to focus on decreasing the cycle time required to get ideas from people on your team through their keyboards and into the eyes of your customer, and then iterate and improve. You also want to lower your error rates, hold fewer meetings, and allow people to collaborate and innovate. Last, you want to measure your processes and be data-driven about your work. The focus is on rapid cycles, lower error rates, collaboration, and clear measurement.—Chris Bergh, CEO and head chef, DataKitchen
To be successful, data initiatives must be understandable, measurable, and linked to an outcome. Data is a critical input to strategic imperatives. The problem is that many organizations are finding that their data and infrastructures are not ready for newer applications such as conversational AI and predictive analytics. When initiating a data project, it is important that advances be measurable and linked to business value because leadership is often reluctant to make the necessary investments unless there is clearly demonstrable ROI.—Seth Earley, CEO, Earley Information Science, author of The AI-Powered Enterprise
Data quality is the foundation for doing anything else. Challenges today in realizing the potential benefits of machine learning in the enterprise include data access issues (agility and security), data quality issues (disaggregated data with errors), lack of governance for validating model accuracy, and lack of collaboration between business and IT. If the underlying data is not accurate, then the organization will not be able to reach its goals with machine learning.—Rashmi Gupta, director data architecture, KPMG LLC
Simplifying how data is found and accessed is critical to creating an actionable roadmap for data and analytics. The business needs to understand customer behavior and product usage, increase operational efficiency, and then look for ways to innovate the business model with these considerations in mind. Real-time data combined with predictive analytics can tell companies more accurately where things are heading.—John O'Brien, principal advisor and CEO, Radiant Advisors
Digital transformation is a business imperative. It is a fundamental stage and it is nowhere near over. In order to transform the organization, every company has to become its own software company and innovate in order to compete with others.—Bruno Kurtic, founding vice president, strategy and solutions, Sumo Logic
Be sure to keep a human in the loop. When dealing with machine learning, have someone take a look at every step of the process.—Eric Schiller, senior data engineer, Excella
A knowledge graph system fuses and integrates data, not just in representation but also in context and time. Data-centric organizations that are adopting knowledge graphs value the use of an enterprise ontology and taxonomy as a shared model of the objects and words that are important to the business. They see the value of eliminating silos and using one data infrastructure for all their applications, and they also value the ability to connect everything that happens to and with their patients, customers, and products as events off the core entity.—Jans Aasman, CEO, Franz
A simple way to think of a knowledge graph is: ontology + data/content = knowledge graph. Knowledge graphs are built on graph databases, which are very good at modeling relationships. What they excel at is aggregating multiple different types of information, including datasets, categorizing them, identifying relationships, and then integrating them, without necessarily moving the information. You can take information from many systems that you already have.—Joe Hilger, COO and co-founder, Enterprise Knowledge, LLC and Sara Nash, technical analyst, Enterprise Knowledge, LLC
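As a rough illustration of the "ontology + data/content = knowledge graph" idea, here is a minimal Python sketch. It is not taken from the presentation and does not use any real graph database. The entity names, relationship names, and the helper functions `build_graph` and `follow` are all hypothetical, chosen only to show an ontology constraining which facts are allowed and relationships being followed from one entity to another.

```python
from collections import defaultdict

# Ontology (hypothetical): each relationship links one entity type to another.
ONTOLOGY = {
    "purchased": ("Customer", "Product"),
    "made_by": ("Product", "Supplier"),
}

# Data (hypothetical): entities with their types, plus facts stored as triples.
ENTITY_TYPES = {"alice": "Customer", "laptop": "Product", "acme": "Supplier"}
FACTS = [
    ("alice", "purchased", "laptop"),
    ("laptop", "made_by", "acme"),
]


def build_graph(facts, ontology, entity_types):
    """Combine ontology and data: keep only facts the ontology permits."""
    graph = defaultdict(list)
    for subject, relation, obj in facts:
        expected = ontology.get(relation)
        actual = (entity_types.get(subject), entity_types.get(obj))
        if expected != actual:
            raise ValueError(f"fact {(subject, relation, obj)} violates the ontology")
        graph[subject].append((relation, obj))
    return graph


def follow(graph, start, relations):
    """Follow a chain of relationships from a starting entity."""
    frontier = {start}
    for relation in relations:
        frontier = {
            obj
            for node in frontier
            for rel, obj in graph.get(node, [])
            if rel == relation
        }
    return frontier


graph = build_graph(FACTS, ONTOLOGY, ENTITY_TYPES)
# Which suppliers made the products that alice purchased?
suppliers = follow(graph, "alice", ["purchased", "made_by"])
print(suppliers)
```

The facts could come from different source systems. In this sketch, the ontology is what lets them be checked and connected in one place.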
Relationships are first-class citizens in a graph database. There is a need to build on existing analytics capabilities with faster, automated data prep, explainable AI and machine learning, improved algorithms and analytics, and more cost-efficient operations, which graph databases support. As with all databases, graph databases store facts, but they also keep track of how those facts are connected. Graph databases can combine structured and unstructured data, offer a flexible data model that can evolve as data changes, and provide rich insights on relationships.—Thomas Cook, director of sales, Cambridge Semantics
Location, location. Deployment and service models are key to deciding where to securely store sensitive data. A public cloud is the least expensive option. However, it can be the least secure. A community cloud is more secure than a public cloud, but there may be some limits, including the cost. With a private cloud, users can choose between storing data on-premises or remotely. It is the most secure environment, but it is the most costly.—Clay Jackson, senior database systems sales engineer, Quest Software