Conference Program

Data Summit 2020 is a unique conference that brings together IT practitioners and business stakeholders from all types of organizations. Featuring workshops, panel discussions, and provocative talks, the conference gives attendees a comprehensive educational experience designed to guide them through all of today’s key issues in data management and analysis. Whether your interests lie in the technical possibilities and challenges of new and emerging technologies or in using Big Data for business intelligence, analytics, and other business strategies, Data Summit 2020 has something for you!

Access to all tracks plus the AI & Machine Learning Summit, Data Lake Boot Camp, and DataOps Boot Camp is included when you register for an All-Access Pass or Full Two-Day Conference Pass. Attendees may switch between tracks as they choose. Only interested in the two-day Summit or our one-day Boot Camps? Stand-alone registration for these events is also available.

Monday, May 18

Preconference Workshops

 

W1. Introduction to Knowledge Graphs

Monday, May 18: 9:00 a.m. - 12:00 p.m.

Knowledge graphs are a valuable tool that organizations can use to manage the vast amounts of data they collect, store, and analyze. An enterprise knowledge graph’s representation of an organization’s content and data creates a model that integrates structured and unstructured data. Knowledge graphs have semantic and intelligent qualities that make them “smart.” Attend this workshop to learn what a knowledge graph is, how it is implemented, and how it can be used to increase the value of your data.

Speakers:

, COO and Co-founder, Enterprise Knowledge LLC

, Senior Consultant, Enterprise Knowledge

 

W2. Data Ops 101

Monday, May 18: 9:00 a.m. - 12:00 p.m.

DataOps has emerged as an agile methodology to improve the speed and accuracy of analytics through new data management practices and processes, from data quality and integration to model deployment and management. By leveraging automation, data democratization, and greater collaboration among data scientists, engineers, and other technologists, DataOps can help organizations improve the time-to-value of their data. Attend this workshop to hear about the key supporting technologies, real-world strategies, and success stories, and learn how to get started on your DataOps journey.

Speaker:

, Head of Product, Tamr

 

W3. Building Actionable Roadmaps for Data and Analytics

Monday, May 18: 9:00 a.m. - 12:00 p.m.

The starting point in developing and launching an enterprise data and analytics strategy is to understand the interrelationships that are necessary to deliver analytics capabilities. These relationships also account for the skills and roles of everyone who works with data, from business executives and business analysts to data scientists. To measure and drive success, an actionable road map, with each phase focused on being lean with a business impact, is required. Attend this workshop to learn how to translate business drivers into analytic capabilities and data priorities, create and implement a road map, and deliver a compelling executive briefing.

Speaker:

John O'Brien, Principal Advisor and CEO, Radiant Advisors

 

W4. Data Science Best Practices

Monday, May 18: 1:30 p.m. - 4:30 p.m.

Data science, the ability to sift through massive amounts of data to discover hidden patterns and predict future trends, may be the “sexiest” job of the 21st century, but it requires an understanding of many different elements of data analysis. Extracting actionable knowledge from all your data to make decisions and predictions requires a number of skills, from statistics and programming to data visualization and business domain expertise. Attend this workshop for a deep dive into the fundamentals of data exploration, mining and preparation, and applying the principles of statistical modeling and data visualization in real-world applications.

Speaker:

, Founding President, Caserta

 

W5. Machine Learning Best Practices

Monday, May 18: 1:30 p.m. - 4:30 p.m.

Machine learning is on the rise at businesses hungry for greater automation and intelligence, with use cases spreading across industries. At the same time, most projects are still in their early phases. From selecting datasets and data platforms to architecting and optimizing data pipelines, there are many success factors to keep in mind. The advantages that machine learning offers organizations—the ability to automatically build models that can analyze huge volumes of data and deliver lightning-fast results—have also led to a growth in the availability of both commercial and open source frameworks, libraries, and toolkits for engineers. Attend this workshop for a hands-on course in the enabling technologies, techniques, and applications you need to know to succeed in today’s environments.

Speaker:

, Assistant Professor of Analytics, Information Management/Business Analytics, Montclair State University and Drexel University

 

W6. Scaling Self-Service Analytics Across the Enterprise

Monday, May 18: 1:30 p.m. - 4:30 p.m.

Realizing more value from data for transformation projects requires a strategy for establishing self-service data analytics. Insights from companies that have made this transformation, combined with industry best practices, provide guiding principles and recommendations to establish vibrant data communities that are intent on extending the value of analytics. Attend this workshop to determine the best self-service adoption strategies for your organization, prioritize key factors and actions, and set the right metrics for measuring and communicating your growth.

Speakers:

John O'Brien, Principal Advisor and CEO, Radiant Advisors

Julie Langenkamp, Director, Editorial & Content Strategy, Radiant Advisors

Tuesday, May 19

Keynotes

Moderator:
Tom Wilde, CEO, Indico
 

WELCOME & KEYNOTE - Disturbances in the Data Ecosystem: How Techlash Will Become a Force in Your Lives

Tuesday, May 19: 8:45 a.m. - 9:30 a.m.

Lee Rainie discusses public attitudes about data, machine learning, privacy, and the role of technology companies in society. He covers how those will be factors shaping the next stages of the analytics revolution as politicians, regulators, and civic actors start to focus their sights on data and its use.

Speaker:

Lee Rainie, Director, Internet and Technology Research, Pew Research Center and Author of the book "Networked: The New Social Operating System"

 

Sponsored Keynote presented by Oracle

Tuesday, May 19: 9:30 a.m. - 9:45 a.m.

 

Sponsored Keynote

Tuesday, May 19: 9:45 a.m. - 10:00 a.m.

 

Tuesday, May 19

Track A: Moving to a Modern Data Architecture

Moderator:
John O'Brien, Principal Advisor and CEO, Radiant Advisors
 

A101. Agile Data Mastering at Scale

Tuesday, May 19: 10:45 a.m. - 11:45 a.m.

The Promise of Data Mastering at Scale

10:45 a.m. - 11:45 a.m.

For years, independent business units have operated in organizations with substantial freedom of action, allowing them to be agile but also creating “silos” of information, relevant to their specific needs, according to Stonebraker. There can be considerable business value to integrating certain entities from these silos, for example, to facilitate cross-selling or to eliminate duplication of effort. However, while traditional data mastering is suitable for small or simple problems, a new approach is needed for bigger and more complex projects.

Speaker:

, Adjunct Professor, MIT, & Co-Founder/CTO, Tamr

 

A102. The New World of Database Technologies

Tuesday, May 19: 12:00 p.m. - 12:45 p.m.

Database technologies are constantly changing and adding new options for enterprises. To take advantage of the new world of database technologies, enterprise and database managers need to be open to new possibilities.

MongoDB: A Practical Approach to Building a High Performance Data Platform

12:00 p.m. - 12:45 p.m.

Though MongoDB is capable of incredible performance, it requires mastery of design to achieve such optimization. Kankipati discusses the practical approaches to optimization and configuration for the best performance. He provides a brief overview of the new features in MongoDB, such as ACID transaction compliance, and then moves on to application design best practices for indexing, aggregation, schema design, data distribution, data balancing, and query and RAID optimization. Other areas of focus include tips to implement fault-tolerant applications while managing data growth, practical recommendations for architectural considerations to achieve high performance on large volumes of data, and best deployment configurations for MongoDB clusters on cloud platforms.

Speaker:

, Lead Information Architect, Florida Blue

 

A103. Understanding Cloud Licensing

Tuesday, May 19: 2:00 p.m. - 2:45 p.m.

Organizations' software infrastructure today is commonly spread across on-premise and multiple cloud deployments. While having the flexibility to maximize the advantage of each approach offers benefits, it also presents challenges.

Straight Talk on the Software License Landscape

2:00 p.m. - 2:45 p.m.

It is difficult to get straight answers from vendors on the proper way to license software in the complicated world of hybrid and multi-cloud software deployments. Adding to the challenge is the fact that many vendors have turned to software license audits as an easy way to generate additional revenue. Our popular speakers focus on current software licensing trends and the lessons learned from the field. They also identify the steps every organization should take to avoid becoming a victim of a software licensing audit.

Speakers:

, Co-Founder, LicenseFortress

, System Engineer Database Specialist, VMware

 

A104. Modernizing Your Data Warehouse

Tuesday, May 19: 3:15 p.m. - 4:00 p.m.

For decades, companies have made large investments in enterprise data warehouses to fuel their BI systems. Increasingly, enterprises are turning to cloud platforms for their modern data warehouses, which are enabling next-generation initiatives.

YellowBricks

3:15 p.m. - 4:00 p.m.

Learn directly from a customer using a new modern data warehouse built for the hybrid cloud to solve traditional data management challenges and make decisions faster. 

 

A105. Data Governance in the Cloud Era

Tuesday, May 19: 4:15 p.m. - 5:00 p.m.

Organizations are collecting large quantities of disparate data from a range of sources, using a variety of cloud and on-premise data management platforms. Adding to their challenges are increasingly stringent regulations about how data is handled.

The Changing Role of Data Governance in a Cloud-First World

4:15 p.m. - 5:00 p.m.

Is it time for data governance to become more agile to keep pace in a cloud-first world? Partner explores the thesis that data governance can and should change to keep pace with a data world that is vastly different than it was even a few years ago. Come learn how agile methods can be applied to data governance.

Speaker:

, Director of Solutions Architecture, The Pythian Group

 

Tuesday, May 19

Track B: Competing on Analytics

Moderator:
Julie Langenkamp, Director, Editorial & Content Strategy, Radiant Advisors
 

B101. Becoming an Analytics-Driven Enterprise

Tuesday, May 19: 10:45 a.m. - 11:45 a.m.

Competitive pressure is escalating as more organizations use technology and analytics to identify opportunities and address customers' needs more thoroughly. There are a variety of approaches to enable better decision making and enhance customer experience.

Getting Executive Buy-In for Digital and Analytics Transformation

10:45 a.m. - 11:45 a.m.

Like many other industries, the insurance sector is changing rapidly due to evolving customer expectations and disruption caused by the emergence of non-traditional, tech-savvy new entrants. Arbella engaged its executives to begin the long journey toward transforming the company into an analytics-driven enterprise, addressing concerns and overcoming initial resistance along the way. The presentation concludes by showcasing the proof of concept for the initial project.

Speakers:

, Head of Analytics & Data Science, Arbella Insurance Group

, Senior Actuarial Analyst, Arbella Insurance Group

Case Study: How Accelerated Analytics of Massive Data Propelled This Enterprise’s Business

10:45 a.m. - 11:45 a.m.

Speaker:

, CMO, SQream

 

B102. IoT, Analytics, and Streaming at Scale

Tuesday, May 19: 12:00 p.m. - 12:45 p.m.

According to Cisco, 500 billion devices are expected to be connected to the internet by 2030. Organizations are using new technologies to capitalize on this wealth of IoT data by analyzing it rapidly for timely insights.

IoT Sensor Analytics With Apache Kafka, KSQL, and TensorFlow

12:00 p.m. - 12:45 p.m.

Large numbers of IoT devices are leading to big data and the need for further processing and analysis. Learn how to leverage Kafka and KSQL in an IoT-sensor-analytics scenario for predictive maintenance. A live demo shows how to embed and deploy machine learning models—built with frameworks such as TensorFlow, DeepLearning4J, or H2O—into mission-critical and scalable real-time applications.

Speaker:

, Technology Evangelist, Confluent

 

B103. Analytics Best Practices Today

Tuesday, May 19: 2:00 p.m. - 2:45 p.m.

The speed and volume of data flowing into organizations are intensifying, but many companies are only able to leverage a relatively small portion of it to better serve customers, identify trends, and improve operations. Implementing best practices is critical to succeeding with data analytics today.

Trials and Tribulations: How to Build and Scale a World-Class Analytics Team

2:00 p.m. - 2:45 p.m.

Want to build a world-class in-house analytics team? MINDBODY is an American SaaS company headquartered in San Luis Obispo, Calif. The company provides cloud-based business management software for the wellness services industry. Learn how its data science department scaled up from a small collection of SQL reporting analysts to a successful, centralized team of data analysts, engineers, and scientists that produces high-quality and heavily adopted analytical models for all areas of its global SaaS company.

 

B104. Enabling Real-Time Analytics

Tuesday, May 19: 3:15 p.m. - 4:00 p.m.

The ability to quickly act on information to solve problems or create value has long been the goal of many businesses. However, it was not until recently that new technologies emerged to address the speed and scalability requirements of real-time analytics, both technically and cost-effectively.

Building a Real-Time Multi-Tenant Data Processing and Model Inferencing Platform

3:15 p.m. - 4:00 p.m.

Each week, 275 million people shop at Walmart, generating interaction and transaction data. Learn how the company's customer backbone team enables extraction, transformation, and storage of customer data to be served to other teams. At 5 billion events per day, the Kafka Streams cluster processes events from various channels and maintains a uniform identity of each customer.

 

B105. Succeeding With Data Science in the Real World

Tuesday, May 19: 4:15 p.m. - 5:00 p.m.

There are many considerations that factor into data science success in the real world. Ultimately, success is about the right combination of technology, processes, and people.

You Built It, But They Didn't Come: How to Make Your Data Product Indispensable

4:15 p.m. - 5:00 p.m.

Executives are worried about having an AI strategy. Data scientists worry about getting their models to be as accurate as possible. IoT teams stay busy juggling telemetry, alerts, and APIs. Report developers do their best to visualize the data, and engineers try to glue it all together and ship it. However, if business value is dependent on specific users engaging successfully with a decision support application or data product, then teams must design these solutions around the people using them—not the data or technology. Learn how human-centered design provides a process to help teams discover, define, and fall in love with customer problems.

Speaker:

, Founder & Principal Designer, Designing for Analytics

 

Tuesday, May 19

Track C: Data Lake Boot Camp

Moderator:
Abhik Roy, Lead Big Data Architect, Transamerica Data Engineering
 

C101. Modern Data Lake Essentials

Tuesday, May 19: 10:45 a.m. - 11:45 a.m.

There are certain critical elements to a data lake; without them, implementations often fail.

Essentials of a Modern Data Lake

10:45 a.m. - 11:45 a.m.

Roy describes the essential capabilities and architecture patterns that any modern data lake should possess, as well as the real challenges and opportunities to confront while building an enterprise data lake. Issues include on-premise and cloud integration, performance-tuning and scaling-up strategies, audits and controls, storage layers, compute patterns, presentation layers, data discovery, and governance.

Speaker:

Abhik Roy, Lead Big Data Architect, Transamerica Data Engineering

 

C102. Drilling Down on Data Lake Architecture

Tuesday, May 19: 12:00 p.m. - 12:45 p.m.

Data lake adoption is increasing to support initiatives such as data science, data discovery, and real-time analytics.

Cruising the Data Lake: From Zero to Scale

12:00 p.m. - 12:45 p.m.

The Highly Automated Driving (HAD) group at HERE Technologies is building the High-Definition Map (HDMap) of the real world to power autonomous-car-driving use cases. With the complexity of pipelines for data enrichment and the petabyte scale of the content, the company needed a mechanism to avoid data silos and achieve a centralized way to analyze, predict, and evaluate the data. Chaphalkar highlights the principles and technology behind the company's data lake architecture and presents strategies that can serve as guidelines to others seeking to stand up and run a data lake at scale.

Speaker:

, Senior Engineering Manager, HERE Technologies

 

C103. Cloud Data Strategy and Data Lakes

Tuesday, May 19: 2:00 p.m. - 2:45 p.m.

Across all industries, companies are investing heavily in modernizing data infrastructures to manage big data by creating “data lake” environments that process large volumes and varieties of data for reporting and analytics.

Cloud Data Strategy: Multiple Lakes Versus One Enterprise Lake

2:00 p.m. - 2:45 p.m.

Organizations with a strong growth focus in their corporate strategy allow internal business divisions to build their own lakes to enable faster time-to-market for analytics, resulting in a myriad of data lakes and a complex enterprise data landscape. Conversely, organizations with a strong defend focus tend to build a highly bureaucratic, centralized data lake to enable regulatory and compliance analytics. Learn why a balanced focus is essential to define a robust data strategy.

Speaker:

, Senior Manager, Data Management & Governance, Vanguard

 

C104. Building a Transactional Data Lake

Tuesday, May 19: 3:15 p.m. - 4:00 p.m.

With the proliferation of data in the past years, business-critical decision making is now heavily influenced by deep data analysis.

Building Large-Scale, Transactional Data Lakes Using Apache Hudi

3:15 p.m. - 4:00 p.m.

Hudi, which stands for "Hadoop Upserts Deletes and Incrementals," is a storage abstraction library that improves data ingestion. Our Uber speakers explain what Hudi offers and why it is needed, including how Hudi can provide ACID semantics to a data lake, and some of the basic primitives required to achieve acceptable latencies in ingestion, while also providing high-quality data by enforcing schematization on datasets.

Speakers:

, Engineering Manager, Uber

, Senior Software Engineer, Uber

 

C105. Securing the Modern Data Lake

Tuesday, May 19: 4:15 p.m. - 5:00 p.m.

The transition from storing data in an on-premise data warehouse to using a hybrid infrastructure has enabled tremendous agility and scale, but has also created a security and privacy risk.

Overcoming Challenges to Securing Modern Data Lakes

4:15 p.m. - 5:00 p.m.

Organizations that are concerned about the quality of their data, protecting their brand and intellectual property, and complying with evolving privacy regulations must understand how the modern infrastructure has broken the relationship between data and metadata, and how this in turn impacts the quality and security of their data. What's needed is a new approach that sits in the “data plane” and enforces metadata creation on write, manages user access, and performs data transformations.

Speaker:

, CEO & Co-Founder, Okera

 

Tuesday, May 19

Track AI: AI & Machine Learning Summit

 

AI101. The Journey to AI

Tuesday, May 19: 10:45 a.m. - 11:45 a.m.

AI is getting a lot of attention these days. Your journey to choosing and making the most of AI technologies starts with building high-value products.

What It Takes to Build a High-Value AI Product

10:45 a.m. - 11:45 a.m.

AI, ML, and analytics have become a standard stable of tools for organizations that are keen on accelerating their value creation and growth. Many companies have adopted and invested heavily in these capabilities. Some of the early adopters, those with the right culture and data strategy, are reaping wholesome benefits. Conversely, other organizations continue to endure costly resource misalignments. During this session, attendees get an opportunity to discuss what it takes to build a high-value AI product, leverage the company data asset and best practices to unleash their firm’s full productive potential, and accelerate innovation to grow wallet-share and win new markets.

Speaker:

, Director, Advanced Analytics, Genesis Capital - Goldman Sachs

 

AI102. AI in the Real World

Tuesday, May 19: 12:00 p.m. - 12:45 p.m.

Lots of theories about the value of AI and its relevance to problem-solving exist, but how does it work in the real world?

Enterprise AI and the Paradox of Accuracy

12:00 p.m. - 12:45 p.m.

One way to think about AI is as a very powerful “mimic.” If you show it example data or model an existing process, AI will be able to perform at massive scale, delivering dramatically increased throughput and reduced processing times. However, the most common concern with implementing AI revolves around the notion of “accuracy”: Does it perform the task accurately enough to be useful? When AI uncovers significant inconsistencies, the conclusion is often “AI is not yet smart enough to perform this task,” when in reality AI is simply uncovering existing inconsistencies in the human processes being automated. Wilde discusses the inflated expectations for AI and the need for careful definition of the inputs and outputs required for success. He outlines a set of guidelines to help users overcome these inflated expectations to take advantage of the real value that AI has to offer.

Speaker:

Tom Wilde, CEO, Indico

 

AI103. Building Machine-Learning Apps

Tuesday, May 19: 2:00 p.m. - 2:45 p.m.

The slogan “There’s an app for that” is as true for ML as it is for many other activities. Our app-centric world keeps expanding.

Orchestrating a Serverless Machine Learning Web App in AWS

2:00 p.m. - 2:45 p.m.

AutoDesk is leveraging a serverless AWS stack to build fully customizable web apps. Arora provides a detailed overview of what each technology can be used for and then describes a particular use case that AutoDesk solved by combining three different ML techniques to fabricate an NLP pipeline. The use case is focused on how the company used product usage/commands to identify what types of users existed for its major products like AutoCAD, Inventor, and MAYA. Arora also touches upon how his team built an app to give product managers the capability to run the ML models based on changing use cases and requirements.

Speaker:

, Data Scientist, AutoDesk

 

AI104. Optimizing Machine Learning for the Enterprise

Tuesday, May 19: 3:15 p.m. - 4:00 p.m.

ML has many uses within the enterprise as the technology matures and becomes more valuable to a variety of departments within organizations.

Using Machine Learning to Mine Competitive Intelligence Insights

3:15 p.m. - 4:00 p.m.

Large organizations have access to a trove of unstructured data—research and news reports generated within and outside the enterprise—that can be mined for competitive intelligence (CI) insights necessary for product development, marketing, and sales. It’s a massive undertaking, typically managed by a small team, with each member serving hundreds or thousands of internal clients. Now machine learning (ML) can be applied to read and analyze these documents to expedite the process of mining valuable nuggets of insight from vast and varied content collections. Learn about applying ML to distill and report search results, overcoming “training set” challenges, and optimizing delivery of curated insights.

Speaker:

, CEO, Northern Light

 

AI105. New Technologies for Finding Information

Tuesday, May 19: 4:15 p.m. - 5:00 p.m.

We hear a lot about how various AI and ML technologies are being deployed today. But what should we expect in the future? What new technologies will come into play? What new applications will we find for existing technologies? Our panel of forward thinkers, who may have their eyes on the stars but maintain their feet on the ground, shares their predictions about new technologies that will impact your organizations and your jobs going forward. Join this exciting panel to stimulate your thoughts about the future.

Moderator:

, President, Synthexis and Cognitive Computing Consortium


Speakers:

Tom Wilde, CEO, Indico

, COO, Co-Founder, Basis Technology

Wednesday, May 20

Keynotes

Moderator:
Edee Edwards, Ontology Architect, National Fire Protection Association
 

OPENING KEYNOTE - A 2020 Vision for AI - Setting Up AI Projects to Succeed

Wednesday, May 20: 8:45 a.m. - 9:30 a.m.

As technology and techniques get sharper, AI capabilities are becoming more practical. But how can data professionals get business leaders to sign off on AI initiatives when past disappointments have made them skeptical? Jason Hein describes three key strategies to keep your projects on track to win.

Speaker:

Jason Hein, Director, Delivery Services, Earley Information Science

 

Sponsored Keynote presented by DataStax

Wednesday, May 20: 9:30 a.m. - 9:45 a.m.

 

Sponsored Keynote presented by Sumo Logic

Wednesday, May 20: 9:45 a.m. - 10:00 a.m.

 

Wednesday, May 20

Track A: Building the Data-Driven Future

Moderator:
John O'Brien, Principal Advisor and CEO, Radiant Advisors
 

A201. Winning With a Modern Data Strategy

Wednesday, May 20: 10:45 a.m. - 11:30 a.m.

New approaches exist to streamline and accelerate data management processes. It is critical to stay up-to-date with new ways to extract value from data with a modern data strategy.

Evolving Your Data Platform

10:45 a.m. - 11:30 a.m.

Being a data-driven company, Coupang, the largest online retailer in South Korea, relies heavily on data decisions, from the customer journey to the best algorithm to optimize space in a fulfillment center. The company uses data to find bottlenecks at each step of the process and move faster as a company. To keep up with the constant demand for excellence in scale, availability, latency, concurrency, and fast data growth, the company must constantly improve its data platform. Parihar explains how the data platform at Coupang has evolved and provides a glimpse of what is planned for the future.

Speaker:

, Director, Data Engineering, Coupang

 

A202. Building Smarter Systems

Wednesday, May 20: 11:45 a.m. - 12:30 p.m.

Smarter capabilities are being sought by organizations to improve speed, flexibility, and scalability. AI and machine learning are on the rise at businesses seeking greater automation and intelligence.

Multilingual ‘Did You Mean’ Recommendation System in Big Data Search

11:45 a.m. - 12:30 p.m.

Online communities and multinational corporations maintain multimillion object data repositories/data lakes in multiple languages. The semantic diversity of data makes it difficult to ask “correctly phrased” questions in such large heterogeneous repositories. Often, it is not possible for users to ask questions in their native language, which exacerbates the problem. Komissarchik and Jones address different aspects of a “Did you mean” recommendation system that was built for the 300-plus languages of Wikipedia search and share the three major mechanisms that Wikipedia “Did You Mean” contains.

Speakers:

, VP, Product, & Co-Founder, Glenbrook Networks, Inc

, Senior Software Engineer, Wikimedia Foundation

 

A203. The Rise of Containers

Wednesday, May 20: 2:00 p.m. - 2:45 p.m.

The rise of containers and orchestration technology enables applications and data to be transferred to the locations, platforms, or environments where they are needed or best-suited. However, there are important considerations that need to be addressed for successful implementations.

Understanding Database Containerization

2:00 p.m. - 2:45 p.m.

Containers are now being used by organizations of all sizes, from small startups to huge established microservices platforms. Our experts are here to help practitioners navigate the minefield of database containerization and avoid some of the major problems that can occur. The presentation covers key issues, such as container configuration and homogeneous-versus-heterogeneous node types.

Speakers:

, Director of Product Management, InterSystems

, Product Manager, InterSystems

 

A204. Graph Databases to the Rescue

Wednesday, May 20: 3:00 p.m. - 3:45 p.m.

Graph databases are becoming more widely used for a variety of use cases such as customer profiling, fraud detection, medical research, and recommendation engines. Graph databases are particularly well-suited for storing information about relationships and connecting that information to illuminate context in data.

Fraud Investigation Using a Graph Database

3:00 p.m. - 3:45 p.m.

Is fraud affecting your wallet? Kavala explains how Capital One is reimagining the way it tackles fraud investigations by utilizing the AWS Neptune GraphDB to capture every possible relationship between customers and associates through multiple degrees of separation. Using a rich web UI tool, the company has built a one-stop shop to easily start, investigate, and conclude a fraud case, while also allowing operations teams to save and share the case details with partners for further investigation.

Speaker:

, Master Software Engineer, Capital One

 

Wednesday, May 20

Track B: Digital Transformation

Moderator:
Julie Langenkamp, Director, Editorial & Content Strategy, Radiant Advisors
 

B201. Charting Your Course to Cloud Analytics Success

Wednesday, May 20: 10:45 a.m. - 11:30 a.m.

Organizations have talked for a long time about putting data workloads in the cloud for better flexibility and scalability and to enable innovative new use cases. But what does it take to turn that talk into action and get real value from running analytics in the cloud?

Charting a Course to Cloud Analytics Success: What Cricket, Luxury Fashion, and Mining Have In Common

10:45 a.m. - 11:30 a.m.

Learn how three companies achieved real value from running analytics in the cloud. Each was at a different stage of the cloud data maturity continuum. One was focused on getting better insights, another on embedding machine learning and AI into daily operations, and a third was looking to create entirely new business opportunities using data. This presentation uses real-world examples to illustrate that while the driving forces that are moving analytics to the cloud are varied, there are common denominators, methodologies, and best practices that consistently lead to successful business outcomes with data.

Speaker:

, VP, Marketing & Analytics as a Service, The Pythian Group

 

B202. PANEL: The Impossibility of Predicting the Future

Wednesday, May 20: 11:45 a.m. - 12:30 p.m.

As part of Data Summit’s deep dive into digital transformation, this panel discussion focuses on issues related to data quality, model creation, model training, proper sequencing of data, manipulating data blind spots—and more.

Speakers:

, Emerging Tech Strategist

, Principal Researcher, Unisphere Research, A Division of Information Today, Inc.

 

B203. Tapping Into the Full Value of IoT

Wednesday, May 20: 2:00 p.m. - 2:45 p.m.

In the past, there have been limitations to organizations’ ability to harness machine data. These roadblocks have been removed, and innovative companies are tapping into their machine data for product innovation and revenue growth.

Data for a Transformational Business Advantage

2:00 p.m. - 2:45 p.m.

Machine data intelligence is the ability to gather and analyze vast amounts of machine-generated data, including data from sensors, systems, and connected devices, to achieve new levels of insight that drive smarter operations and better decision making. Learn to identify the steps that enterprises need to take, and the systems they must have in place, in order to see a return on their IoT investments and thrive—not just survive.

 

B204. Analytics in Action: Driving Real-World Change

Wednesday, May 20: 3:00 p.m. - 3:45 p.m.

Today, collecting large quantities of data is simply not enough. The ability to achieve business-changing outcomes is what is needed. Predictive analytics can help organizations avoid problems and take advantage of opportunities.

Fueling a Turnaround With Team-Based Analytics

3:00 p.m. - 3:45 p.m.

A little more than 6 years ago, Florida Atlantic University ranked second from the bottom out of all state schools. To make matters worse, under a performance-based funding model, the state withheld $7 million in allocations. But, 2 years later, the university garnered the #1 rank in performance, reversing its standing in the state system while also eliminating the common achievement gaps by race/ethnicity and income. By integrating predictive analytics solutions with a commitment to targeted interventions, the university cultivated a robust portfolio of best practices, all of which were rooted in an inclusive and team-based philosophy.

Speakers:

, Assistant Provost, Florida Atlantic University

, Director, Academic Planning, Academic Affairs, Florida Atlantic University

 

Wednesday, May 20

Track C: DataOps Boot Camp

Moderator:
Joe McKendrick, Principal Researcher, Unisphere Research, A Division of Information Today, Inc.
 

C201. Succeeding With DataOps Today

Wednesday, May 20: 10:45 a.m. - 11:30 a.m.

There have been many new frameworks aimed at improving data management by combining what have traditionally been siloed functions and teams. One of the newest, DataOps, is expected by many to substantially improve data analytics.

Data Engineering at the Speed of Software Development: Introducing the DataOps Manifesto

10:45 a.m. - 11:30 a.m.

Traditional methodologies for handling data projects are too slow for the teams working with the technology. The DataOps Manifesto was created as a response, borrowing from the Agile Manifesto. This talk covers the principles of the DataOps Manifesto, the challenges that led to it, and how and where it’s already being applied.

Speaker:

, Senior Data Engineer, Excella

 

C202. DataOps in Action

Wednesday, May 20: 11:45 a.m. - 12:30 p.m.

Today, data is being created and collected at unprecedented speed. But organizations don’t want to simply capture and store all this data. They want to draw insights from these vast data stores to improve their decision making and rapidly execute changes for competitive business advantage. That’s where DataOps comes in.

DataOps as a Service

11:45 a.m. - 12:30 p.m.

DataOps as a service can provide DataOps methodologies, tools, and processes to enable teams in R&D to deliver high-quality solutions that leverage automation and collaboration with agility. AstraZeneca’s Viswanathan demonstrates a real-time DataOps-in-action case study for ETL automation in regression testing.

 

C203. DataOps and Data Governance

Wednesday, May 20: 2:00 p.m. - 2:45 p.m.

DataOps emphasizes communication and collaboration among many diverse stakeholders, such as data scientists, analysts, engineers, IT, and quality assurance and governance teams.

The Rise of DataOps (From the Ashes of Data Governance)

2:00 p.m. - 2:45 p.m.

Currently, data governance teams attempt to apply manual control at various points for consistency and quality of the data. By thinking of our machine learning data pipelines as compilers that convert data into executable functions and leveraging data version control, data governance and engineering teams can engineer the data together, filing bugs against data versions, applying quality control checks to the data compilers, and other activities. This talk illustrates how innovations are poised to drive process and cultural changes to data governance, leading to order-of-magnitude improvements.

Speaker:

, VP, Emerging Technology, Pariveda Solutions

 

C204. DataOps Strategy for ML/AI

Wednesday, May 20: 3:00 p.m. - 3:45 p.m.

Timely and frictionless access to data is a critical requirement for the expanding use of data science and AI/machine learning. New approaches help address the problem of skilled practitioners spending too much of their precious time managing data rather than building models.

Data in Mind, Data in Hand: Frictionless Provisioning for Data Science and ML/AI

3:00 p.m. - 3:45 p.m.

“Data in mind, data in hand” is a concept that shrinks the effort and latency between conceiving of a model and having the data to run it. While DataOps promises to streamline analytics of Big Data, it comes at a cost. The architecture to materialize this has many components and is complex. For the data scientist and AI/ML engineer, the “Data in Mind, Data in Hand” concept demands that all of this complexity is not hidden, but rather, is exposed in such a way that all of the capabilities of the DataOps architecture are there for them to exploit.

Speaker:

, Principal Analyst, Hired Brains Research

 

Wednesday, May 20

Track AI: AI & Machine Learning Summit

 

AI201. Unlocking the Power of Machine Learning

Wednesday, May 20: 10:45 a.m. - 11:30 a.m.

A critical component of unlocking the power of ML is neural networks.

Using Machine Learning and Artificial Neural Networks to Optimize a Critical Utility Plant

10:45 a.m. - 11:30 a.m.

The National Institutes of Health Central Plant Optimization Software uses NOAA weather forecasting data to predict the future campus chilled water load demand. It optimizes operating decisions within a 36-hour time frame to minimize the total operating cost. The software’s decisions include choosing when and how to operate the Thermal Energy Storage (TES) tank and free cooling. The overall process is broken into two parts: Topological TES Dispatch Optimization (TTDO) and Chiller Fleet Management Optimization (CFMP). The component models are trained by the data-driven ML models. Parallel computing is implemented to provide timely results. Chill out and learn from this ML use case.

 

AI202. Diving Into NLP

Wednesday, May 20: 11:45 a.m. - 12:30 p.m.

Although natural language processing (NLP) is far from a new technology, companies are finding new and interesting approaches to it.

The Power of Word Embeddings in NLP: Transfer Learning With BERT to Build a Text Classification Model

11:45 a.m. - 12:30 p.m.

NLP has seen a tremendous amount of research and innovation in the past couple of years. Text classification is extremely important in all industry sectors. Building up a text classification system from scratch for every use case can be challenging in terms of cost as well as resources, even assuming there is a good amount of data to begin training with. That’s where transfer learning comes in. Using models that have been pre-trained on terabytes of data and fine-tuning the base model based on the problem at hand is the new way to efficiently implement ML solutions without spending months on the data cleaning pipeline. This talk highlights ways of implementing the newly launched BERT and fine-tuning the base model to build an efficient text classification model.

Speaker:

, Data Scientist, Indellient US Inc.

 

AI203. The Rise of Cloud Services and AI

Wednesday, May 20: 2:00 p.m. - 2:45 p.m.

Most companies are in the early stages of AI initiatives. This shift will be dramatic as more move forward in the AI journey.

Cloud AI: From Data to Insight

2:00 p.m. - 2:45 p.m.

Although companies are at different stages of their AI journey, most agree that finding or developing analytic talent is a key concern and bottleneck for doing more. What if your business could take advantage of the most advanced AI platform without the huge upfront time and investment inherent in building an internal data scientist team? Google’s Ning looks at end-to-end solutions, from ingest, process, and store to analytics and prediction, with innovative cloud services. Knowing the options and criteria can accelerate the organization’s AI journey without significant investment.

Speaker:

, Cloud Advisor, Google

 

AI204. The Future of AI

Wednesday, May 20: 3:00 p.m. - 3:45 p.m.

Sensors, IoT devices, and high-frequency transactional systems collect and store huge volumes of very granular time series data often collected in nanoseconds, milliseconds, and seconds. This data is highly valuable to business as it contains patterns (also called motifs) that, if monitored in real time, can lead directly to significant cost savings, revenue optimization, and other beneficial outcomes. Motif governance will have an impact across all industries—patterns for machine failure in manufacturing, patterns of anti-money laundering, healthcare diagnostics, and many more. Chtilianov provides an introduction to motifs, examples of managing by patterns, the challenges in doing so, and the need for creating governed libraries of patterns.

Speakers:

, Principal Consultant, Capco

, Managing Principal, Capco

Wednesday, May 20

Keynotes

 

CLOSING KEYNOTE - The Speed of Change Is a Myth

Wednesday, May 20: 4:00 p.m. - 5:00 p.m.

Companies are racing to develop AI and analytic models and empower self-service analytics on modern data platforms to get the most value from their data. There is an overwhelming sense of urgency as more stories come out about digital transformation, disruptive innovations, improved customer engagements, product advances, and operational efficiencies at scale. O’Brien addresses the reality of challenges that companies face, the mindset needed to achieve an analytics-oriented future, and strategies for playing the long game.

Speaker:

, Principal Advisor and CEO, Radiant Advisors
