Conference Program

Data Summit 2024 is a unique conference that brings together IT practitioners and business stakeholders from all types of organizations. Featuring workshops, panel discussions, and provocative talks, the conference gives attendees a comprehensive educational experience designed to guide them through today’s key issues in data management and analysis. Whether your interests lie in the technical possibilities and challenges of new and emerging technologies or in using Big Data for business intelligence, analytics, and other business strategies, we have something for you!

Access to all tracks, including the AI & Machine Learning Summit and the Data Mesh & Data Fabric Boot Camp, is included when you register for an All-Access Pass or Full Two-Day Conference Pass. Attendees may switch between tracks as they choose. Only interested in the two-day AI & Machine Learning Summit or our one-day Boot Camp? Stand-alone registration for this content is also available.

 

Tuesday, May 7

Workshops

 

W1. Enterprise Data & Analytics Architecture Road Maps That Scale

Tuesday, May 7: 9:00 a.m. - 12:00 p.m.

To develop and implement a successful data and analytics strategy, it is essential to understand the interdependencies required to enable data and analytics capabilities and deliver ongoing business impact. These interdependencies also include the skills and roles of everyone involved in working with data, such as business executives, business analysts, and data scientists. A practical road map focusing on being lean and significantly impacting the business is essential to measure and drive success. Attend this workshop to learn how to identify business drivers and convert them into analytic capabilities and data priorities. You will also learn how to create and execute a road map and deliver a compelling executive briefing.

Speaker:

John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors

 

W2. Building the Semantic Layer of Your Data Platform

Tuesday, May 7: 9:00 a.m. - 12:00 p.m.

Semantic layers stand out as a key approach to solving business problems for organizations grappling with the complexities of managing and understanding the meaning of their data. A semantic layer, also called a context layer, is a business representation of data that allows organizations to quickly map various data definitions from multiple data sources to familiar business terms, offering a consistent and consolidated view of data. Join our workshop to gain insights into the foundations of semantic/context layers, their implementation, and the business value they provide by enhancing the utility of your data. The workshop promises an interactive experience, offering participants the opportunity to both understand the nuances of semantic/context layers and actively engage in constructing one.

Speakers:

, COO, Enterprise Knowledge

, Principal Consultant, Enterprise Knowledge LLC

 

W3. Constructing Your Data Strategy: A Business & Technical Foundation for Success

Tuesday, May 7: 1:00 p.m. - 4:00 p.m.

In an information economy, data is the currency of business. Digital upstarts are disrupting every industry, making it imperative that organizations have a strong data strategy that turns data into insights and profitable activity. Organizations need data to streamline operations and reduce costs, improve decisions and plans, and grow revenues and profits. A data strategy is an enterprise-wide plan to harness data and analytics to achieve business goals. At a high level, it is a blueprint for creating a data-driven organization; at a low level, it is a set of blueprints for designing a data architecture to acquire, transform, and deliver data to business users and applications. Eckerson explains the keys to developing an actionable data strategy, how to build an executable road map, and how to create a data strategy that aligns with business needs based on an organization’s unique circumstances, culture, and data maturity.

Speaker:

, President, Eckerson Group

 

W4. Taking Advantage of Machine Learning & Natural Language Processing

Tuesday, May 7: 1:00 p.m. - 4:00 p.m.

Learn from an experienced developer how to use an open source vector database to power your GenAI chatbots, ecommerce recommenders, or similarity search-based apps. Learn the fundamentals of vector embeddings, vector indices, vector databases, and vector search, and gain hands-on experience using them in an application. The workshop starts with a brief introduction to neural networks, setting the stage for understanding vector embeddings. Next, we delve into the workings of vector indices and how to make informed choices among them. For the rest of the workshop, Bergman walks you through a practical project, embedding GitHub project documentation, storing vectors in Milvus, and conducting structured queries to answer questions based on the documentation. No prior experience required! Just bring your own laptop, with Python 3.10 or higher installed, and your favorite IDE, such as VS Code.

Speaker:

, Developer Advocate, Zilliz

Wednesday, May 8

Keynotes

 

Welcome & Keynote: A New Look at Infonomics in the Era of AI

Wednesday, May 8: 8:45 a.m. - 9:30 a.m.

IT and business executives frequently talk about information as one of their most important assets. But few behave as if it is. Even today, executives report on their financials, their customers, and their partnerships, but rarely the health of their data assets. And corporations typically exhibit greater discipline in managing and accounting for their office furniture than their data. The arrival of generative AI (GenAI) is sparking a discussion of how to adopt AI in measuring, monetizing, and managing data assets. Laney shares insights from his best-selling book, Infonomics, about how organizations can treat information as an actual enterprise asset. He discusses why data both is and isn’t an asset and property and what this means to organizations, particularly as they prepare to put AI to work broadly. He also covers well-honed approaches to and examples of organizations managing, monetizing, and measuring their data assets.

Speaker:

, Innovation Fellow, Data & Analytics Strategy, West Monroe and Author of "Infonomics" & "Data Juice", visiting professor at University of Illinois Gies College of Business

 

Keynote: Data Security in the World of AI

Wednesday, May 8: 9:30 a.m. - 9:45 a.m.

Jain and Das discuss how organizations should secure their AI applications and the critical data they are feeding into these systems to ensure compliance and prevent damaging data leaks.

Speakers:

, Co-founder & Chief Product Officer, Acante

, Co-founder & VP, Engineering, Acante

 

Keynote: How to Create a Collaborative Platform for Data Management and Governance

Wednesday, May 8: 9:45 a.m. - 10:00 a.m.

Learn how National Student Clearinghouse (NSC) created an operational MDM platform, giving access to a large volume of streamlined, high-quality data. With billions of records, a legacy IT system, and an enterprise focus on moving to the cloud, NSC focused on modernization for the cloud data ecosystem, adhering to compliance regulations and enhancing matching across the enterprise. Discover how NSC is now empowered with a single platform to support and facilitate customer requests with one source of truth while benefiting from a collaborative hub for data management and governance.

Speaker:

, Managing Director of Information as a Product, National Student Clearinghouse

 

Wednesday, May 8

Track A: Modern Data Strategy Essentials Today

Moderator:
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors
 

A101. Overcoming Data Silos

Wednesday, May 8: 10:45 a.m. - 11:45 a.m.

The management and governance of data assets are important to an overall data strategy.

How to Create, Govern, & Manage Data Products: Practices & Products You Need to Know

Data products promise to deliver high-quality datasets to business users on demand, fostering greater trust in data and higher levels of empowerment and self-service. But many companies struggle to understand not only what data products are, but how to create, govern, and manage them. Thought leader Eckerson dives into the practical implications of running an organization using data products, describing how a data product is different from a data asset and how to create data products from data assets using a variety of tools and techniques. He addresses organizational, architectural, and process considerations for delivering data products at scale.

Speaker:

, President, Eckerson Group

 

A102. Strategy Informed by Data

Wednesday, May 8: 12:00 p.m. - 12:45 p.m.

Looking at data strategy, a number of elements combine to inform strategic decisions, including repositories and data products.

Get Better Analytics by Putting Less Data in Your Database

A recent survey showed that 67% of companies had their software budgets cut during 2023. SaaS databases are easy to use and powerful, but they put a strain on budgets. Still, no one can afford to skimp on smart data analytics. How do you get more analytics out of your SaaS data warehouse/lakehouse, without spending more money? Treat incoming data streams as a graph. Relationships and categories of data can immediately be seen and acted upon. Duplicate entities can be resolved. Key pattern signals in noisy data streams can be pinpointed and the noise that you don’t need tossed out. By putting only relevant and clean data into analytical repositories, tons of useless data never have to be stored in pay-per-use systems, vastly reducing costs. You get smarter answers on clean, pre-filtered data in real time.

Speaker:

, Director of Product Innovation, thatDot, creators of Quine OSS

Unveiling the Business Value of Data Products: A Paradigm Shift in Data Utilization

In today's dynamic and data-centric business environment, organizations increasingly recognize the critical role of data products in extracting maximum value from their expansive data landscapes. This session explores what data products actually are beyond the buzzwords, why data products are becoming indispensable in data-driven business strategies, and what the best practices are for adopting data products. Join Denodo to better understand how data products can be a transformative approach in helping to democratize data access and revolutionize your decision-making processes.

Speaker:

, Director of Product Marketing, Denodo

 

A103. Becoming an Insights-Driven Enterprise

Wednesday, May 8: 2:00 p.m. - 2:45 p.m.

As organizations concentrate on being data-driven, let’s not forget the importance of becoming insights-driven as well.

Charting a New Course: The Essential Guide to Insights-Driven Transformation

As the digital landscape evolves at an unprecedented pace, the ability to leverage data for strategic decision making has become essential for staying competitive and innovative. Thai provides a road map for harnessing the power of data and analytics to drive business success and examines the key components of becoming an insights-driven organization. Included are building robust data infrastructure, fostering a culture that values data literacy and insights, and implementing tools and technologies for data analytics and interpretation. Finally, he looks ahead at emerging trends and future possibilities in the realm of business insights and analytics.

Speakers:

, Head, Innovation & Data Science, Arbella Insurance Group

, Sr. IT Manager, Arbella Insurance Group

 

A104. Building Effective Data Products

Wednesday, May 8: 3:15 p.m. - 4:00 p.m.

Many elements of datasets need to be considered when creating data products that are effective.

Bridging the Gap Between Data & the Real World

What datasets do you really need to be successful? The need is for consistent, clean, and curated datasets. Trusted data means being able to act on the data for critical business decisions. Bridging the gap between data and the real world empowers your data community to act on the data and derive monetary value from it. How well is your organization providing trusted datasets to feed your AI and ChatGPT? How are datasets synthesized, scored, and shared? Find out how organizations can benefit from a data product, value scoring, and marketplace approach.

Speaker:

, Field Chief Technology Officer, Quest

Empowering Data Excellence: Leveraging Quest IM Solutions Within a Unified Data Management Framework

In today's digital landscape, data is key to decision making and planning. Sandhill Consultants, a certified Quest partner, leads in integrating data management practices. Our approach unifies data modeling, governance, catalogs, and operations and follows industry standards. Giles introduces the audience to the transformative potential of Quest IM Solutions for data management and showcases how the partnership not only accelerates the delivery of trusted data assets, but also optimizes business strategies through enhanced data governance, management, and utilization.

Speaker:

, Principal Architect, Sandhill Consultants

 

A105. Competing on Analytics

Wednesday, May 8: 4:15 p.m. - 5:00 p.m.

Designing a pragmatic approach to competing on analytics relies on using strong analytic methods to get the most out of your data.

Let’s Get the Basics Right

Drawing on her 25 years of experience with data science, Chase lays out a simple, effective way to get the right analysis to drive effective data-driven plans. She guides us through the TAP method using simple, understandable, and engaging examples. She brings to life the method for measuring all the various types of data organizations look at in experience management. Gain practical knowledge to accurately determine metrics, along with a new way of looking at your data.

Speaker:

, Director, Quality Analytics and Reporting, Alexion, AstraZeneca Rare Disease Business Unit

Introduction to MySQL HeatWave

MySQL HeatWave is a service that not only speeds up analytics on data stored in a MySQL database, but also allows users to run analytics on data stored in an object store. Sundara covers key features of MySQL HeatWave, including some of the machine learning-based automation features offered.

Speaker:

, Technical Architect, MySQL HeatWave, Oracle

 

Networking Reception in the Data Solutions Showcase

Wednesday, May 8: 5:00 p.m. - 6:00 p.m.

 

Wednesday, May 8

Track B: What’s Next in Data & Analytics Architecture

Moderator:
Joe McKendrick, Principal Researcher, Unisphere Research
 

B101. Moving to a Modern Data Architecture

Wednesday, May 8: 10:45 a.m. - 11:45 a.m.

One important aspect, when moving to a modern data architecture, is an equally modern approach to data governance.

Data Governance Begins With Data Architecture

One of the challenges facing enterprise architecture is maintaining consistency across the enterprise. This is complicated by the fact that data comes from numerous disparate sources and systems that represent different purposes, focuses, and objectives. On top of that, there is considerable confusion as to terms such as data lake, data warehouse, and operational data store. To overcome these challenges at an enterprise level, a framework is necessary to apply data uniformly, consistently, and in a meaningful manner. The framework transforms data from multiple, disparate sources into an operational data store: a consistent, homogeneous environment that serves as the single dependable source of truth for all reporting and analytics.

Speaker:

, Enterprise Data Architect, Healthfirst, Inc.

Data Frenzy to Data Freedom: How Centralized Data Powers Modern-Day Data Analytics & AI

In today’s modern, data-rich environment, companies face the challenge of centralizing large volumes of diverse data to drive insights and operational efficiency. Without a comprehensive data centralization strategy, organizations risk missing out on significant opportunities for revenue growth and competitive advantages with data trapped in silos. By freeing data from silos and establishing a single source of truth, companies can accelerate their innovation to stay competitive.

Speaker:

, Lead Sales Engineer, Fivetran

 

B102. Enabling Real-Time Analytics

Wednesday, May 8: 12:00 p.m. - 12:45 p.m.

Real-time analytics contributes to building scalable and fault-tolerant data processing pipelines.

Building Real-Time Pipelines With FLaNK

The combination of Apache Flink, Apache NiFi, and Apache Kafka for building real-time data processing pipelines is extremely powerful, as demonstrated by this case study using the FLaNK-MTA project. The project leverages these technologies to process and analyze real-time data from the New York City Metropolitan Transportation Authority (MTA). FLaNK-MTA demonstrates how to efficiently collect, transform, and analyze high-volume data streams, enabling timely insights and decision-making.

Speaker:

, Principal Developer Advocate, Streaming, Cloudera and Future of Data meetup, startup grind, AI Camp

 

B103. Improving Data Discovery

Wednesday, May 8: 2:00 p.m. - 2:45 p.m.

Finding needed data requires more than a user-friendly interface; it also needs good metadata and innovative uses of LLMs.

Navigating the Data Jungle: Strategies for Effective Data Discovery

In the contemporary data-driven landscape, businesses are inundated with vast amounts of data, necessitating sophisticated data management strategies. However, complexities arise in data management, particularly in large-scale environments. Key challenges include tracing data lineage, determining data freshness, identifying personally identifiable information (PII), and locating responsible data custodians, especially in scenarios where ownership is ambiguous due to staff turnover or lack of clear accountability. This presentation delves into the methodologies employed to integrate metadata into Acryl and explores the innovative use of large language models (LLMs) in responding to natural language queries about data. Knowledge graphs, in conjunction with LLMs, facilitate complex inquiries related to data discovery, thereby advancing our data discovery capabilities.

Speaker:

, Lead Software Engineer, Chime

 

B104. Modern Data Ecosystems

Wednesday, May 8: 3:15 p.m. - 4:00 p.m.

Infosec teams and data teams are naturally at odds because they have competing agendas, but there are ways to meet the needs of both without compromising the requirements of either.

Can’t We All Get Along? Effectively Managing the Demands of InfoSec Teams & Data Teams About Sensitive Data

In today's digital world, the integration of data governance and data security is critical. Security threats continue to evolve, while the sources and end points of an organization’s data continue to grow exponentially. For organizations to gain rapid access to usable data, they must first prioritize fostering a healthy relationship between their data governance and infosec teams. The chief data officer and chief information security officer approach data with the same end goal in mind, but often with different tooling and systems. The rise of SaaS-based automation and simplified data tools is paving the way to unified security and governance efforts that provide a common language and framework for CISOs and CDOs to join forces.

Speaker:

, VP, Product, ALTR Solutions, Inc.

 

B105. Architecting for Speed & Scale

Wednesday, May 8: 4:15 p.m. - 5:00 p.m.

Ideas for saving time, enhancing data analytics, and adding business value begin with actual success stories.

Navigating Data Harmony by Exploring the Power of Apache Iceberg

Explore the potential of Apache Iceberg in the world of structured data. Uncover its unique features, including schema evolution and ACID transactions, making it an ideal solution for large-scale datasets. See how Apache Iceberg seamlessly fits into your data architecture, providing flexibility, scalability, and top-notch performance for analytics and data warehousing. Steinkamp shares real-world success stories where organizations have saved time and supercharged their business value with Apache Iceberg. Delve into how it enhances data relationships and analytics, making structured datasets more insightful. Get ready for an insightful exploration, where practical insights, success stories, and strategies for leveraging Apache Iceberg in structured data management and analytics are shared.

Speaker:

, Developer Advocate, InfluxData

 

Networking Reception in the Data Solutions Showcase

Wednesday, May 8: 5:00 p.m. - 6:00 p.m.

 

Wednesday, May 8

Track C: Data Mesh & Data Fabric Boot Camp

Moderator:
Edee Edwards, Taxonomy & Ontology Manager, FM Global, USA
 

C101. Data Mesh in the Real World

Wednesday, May 8: 10:45 a.m. - 11:45 a.m.

Using data mesh to improve decision-making involves accelerating the adoption of many data elements.

From Boardroom to Battlefield: Accelerating Adoption of Data, Analytics, & AI Using Data Mesh

The Department of Defense (DoD) initiated efforts toward advancing the agency's goal to improve decision making across all DoD entities. This goal rests under the foundational principle of accelerating DoD's adoption of data, analytics, and AI by prepositioning a common frame of reference for all DoD entities to converge and share data and AI models. Under the auspices of the DoD Chief Digital and Artificial Intelligence Office (CDAO), this effort will create an enterprise-level infrastructure of services intended to drive an integrated data, analytics, and AI strategy, while maturing a responsible DoD-wide AI ecosystem. This presentation presents a case study of DoD’s efforts to establish a data mesh construct based on the following four elements: domain-oriented/decentralized data ownership and architecture, data as a product, self-service data infrastructure as a platform, and federated computational governance.

Speaker:

, Director, Business Intelligence and Metrics, U.S. Department of Defense (DoD)

Data Fabric Realized Sooner Than You Think

Advancements in data automation, low-code/no-code platforms, and APIs make it quicker and easier for organizations to start their data fabric projects, often in just a few months. Learn how these advancements enable smoother integration and management of data across the enterprise, leading to faster decision making, efficiency, AI readiness, and increased profits. Discover how leveraging data automation can accelerate your data fabric strategy so your data is more effectively fueling significant growth and sparking innovation.

Speaker:

, VP, Marketing and Partnerships, Syncari

 

C102. Taking Data Fabric to the Next Level

Wednesday, May 8: 12:00 p.m. - 12:45 p.m.

With the rise of generative AI (GenAI) and large language models (LLMs), the data fabric can add a range of new facilities to accelerate data democratization.

Using Data Fabrics With GenAI to Automate Data Management

The data fabric architecture has been steadily gaining traction in the enterprise to unify data across disparate sources into coherent data services. By leveraging the power of GenAI models in conjunction with smart data fabrics, organizations can automate the integration of data, provide natural language access to data and analytics, improve data quality while decreasing the need for labor-intensive data cleansing, and secure and govern data in real time. Fried explores the benefits of using data fabrics and GenAI to improve data management practices and provides examples of how these technologies can be used in real-world scenarios. He also notes the risks and lays out a practical path for applying this technology safely.

Speaker:

, Director, Platform Strategy and Innovation, InterSystems

 

C103. Data Mesh Best Practices

Wednesday, May 8: 2:00 p.m. - 2:45 p.m.

Data mesh is evolving due to changes in data architecture and technological advances.

From Daunting to Doable: The Evolution of Data Mesh

There is no doubt that data mesh principles resonate with many data professionals, particularly those looking to move beyond brittle, monolithic architecture. However, adopting data mesh can seem daunting because of both a scarce but improving ecosystem of tools and the demands of organizational change management. Luckily, data mesh lends itself to evolutionary adoption, helping organizations to leverage existing platform investment and gain incremental value. Cordo reviews architectures and best practices from real-world experience, grounded by the stories of two organizations.

Speaker:

, CEO/Founder/Builder, Data Futures, LLC

 

C104. Data Fabric Key Enablers

Wednesday, May 8: 3:15 p.m. - 4:00 p.m.

New technologies can solve organizations’ operational problems.

Unlocking Data Agility; Powering Data Fabric Architectures

In this session, Bagnall explores ETL's role in seamlessly integrating with data fabric architectures to empower organizations with the ability to efficiently manage, integrate, and analyze their data from diverse sources. He delves into real-world use cases, best practices, and the key features that make any ETL process a valuable ally in your journey toward a more agile and unified data ecosystem.

Speaker:

, Senior Product Manager, Matillion

 

C105. Simplifying Your Data Mesh Journey

Wednesday, May 8: 4:15 p.m. - 5:00 p.m.

Data mesh plays a pivotal role within modern cloud architecture, while a semantic layer acts as a cohesive force within the data mesh framework.

Unlocking the Power of Data With a Semantic Layer

Data mesh is swiftly gaining traction as an innovative strategy for expediting data and analytics advancements. It achieves this by distributing data product development through domain-oriented, self-service methods. Crucial to the success of this approach is the emergence of the semantic layer, serving as a foundational catalyst supporting composable model design, enhanced collaboration, and decentralized ownership. This enlightening session delves into the integral role of the semantic layer within a contemporary analytics architecture, elucidating its interconnectedness with the data mesh concept.

Speaker:

, Director of Business Development, AtScale

 

Networking Reception in the Data Solutions Showcase

Wednesday, May 8: 5:00 p.m. - 6:00 p.m.

 

Wednesday, May 8

Track AI: AI & Machine Learning Summit

Moderator:
Ed Dale, Emerging Technology Associate Director, EY
 

AI101. Succeeding With Generative AI

Wednesday, May 8: 10:45 a.m. - 11:45 a.m.

Generative AI (GenAI) is all the rage these days, but finding effective and realistic uses for it is still elusive.

Shatter the Seven Myths of GenAI to Operationalize Impact

The vast majority of current GenAI projects will fail, not because of inherent flaws in large language models (LLMs), but because of misconceptions about how to use them and the lack of capabilities needed to successfully design, develop, and operationalize GenAI-driven applications. Carlsson debunks the most harmful myths that set up projects for failure and looks at case studies of how advanced AI teams in industries ranging from pharma to food delivery are shattering these myths and delivering transformative outcomes.

Speaker:

, Head of AI Strategy, Domino Data Lab

Integrating LLMs With a Private Knowledge Platform

In this era where AI is reshaping industries, the integration of large language models (LLMs) like ChatGPT with private knowledge platforms is a groundbreaking development. Datavid shares experiences and lessons learned from both internal R&D and the benchmarking of several LLMs with customers and subsequent integration with existing KM platforms. Take a deep dive into the synergistic potential of combining the advanced natural language processing capabilities of LLMs with the rich, domain-specific data housed in private knowledge platforms. Come explore how this integration can revolutionize AI applications in your industry!

Speakers:

, Chief Revenue Officer, Sales, Datavid Limited

, Director, Sales & Consulting North America, Datavid Limited

 

AI102. Navigating the Landscape of AI Techniques

Wednesday, May 8: 12:00 p.m. - 12:45 p.m.

When talking about AI, it’s important to recognize that many technologies, not just GenAI, need to be considered.

Exploring the Interconnected World of Logistic Regression, Neural Networks, & Computer Vision

Chen explores the inherent connection among logistic regression, neural networks, and computer vision using mathematical structures as a lens. Drawing parallels between the construction of logistic regression functions and mathematical representations uncovers the foundational role of abstract mathematical concepts in shaping these methodologies. In logistic regression, the linear function, dynamically shaped by a combination of various features, emerges as a visual metaphor—a plane in the mathematical fabric. In neural networks, weights and nodes form a space surrounded by multidimensional planes, aligning closely with mathematical principles. In computer vision, filters function as weighted combinations of pixel features, extending the mathematical concept to image processing. This presentation illuminates the harmony and shared essence of mathematical principles across diverse machine learning and computer vision paradigms.

Speaker:

, Financial Analytic Manager, Freddie Mac

Vector Databases: Innovating Data Management in the AI Era

In the rapidly evolving landscape of AI, the ability to efficiently handle and process vast amounts of complex data is paramount. Vector databases and vector search have emerged as critical components in this domain, offering a specialized approach to managing multidimensional datapoints, or vectors, that are essential for advanced AI applications. Agarwal gives a comprehensive exploration of vector databases, their role in AI solutions, and the emerging trends and technologies that are shaping their development.

Speaker:

, Director and Global Practice Leader, Site Reliability Engineering, Cloud & NoSQL Databases, Datavail

 

AI103. Putting Generative AI to Work

Wednesday, May 8: 2:00 p.m. - 2:45 p.m.

A well-known drawback to using generative AI (GenAI) is its tendency to produce false information.

Strategies to Mitigate Hallucinations in LLMs

A crucial aspect of constructing and applying GenAI for enterprise-level applications is mitigating hallucinations. The generation of factually inaccurate information can occur both during the initial development of large language models (LLMs) and the subsequent refinement of existing model responses through prompt engineering. Bhattacharya explores diverse approaches to mitigate these issues, including the introduction of new decoding strategies, optimizations based on knowledge graphs, the incorporation of innovative components in loss functions, and supervised fine-tuning. She also addresses methods such as retrieval augmentation, feedback-based strategies, and prompt tuning, which can be implemented during the prompt engineering phase.

Speaker:

, Senior Data Scientist, BNY Mellon

Is Your Data Ready for AI?

AI has the power to help your organization disrupt, innovate, generate faster insights, cut costs, and increase productivity. But responsible and successful AI use demands high-quality, trusted data and transparent, observed, and accessible data intelligence. See firsthand how taking a model-to-marketplace approach to managing and leveraging your organization's data can help you gain the footing needed to get the AI results you desire.

Speaker:

, Director for Professional Services and Presales, Quest

 

AI104. The Rise of Vector Databases

Wednesday, May 8: 3:15 p.m. - 4:00 p.m.

The vector database has fast emerged as a preferred platform for GenAI applications.

Whither the Vector Database? Why & How These New Platforms Support GenAI

While companies have long used vector databases to recognize patterns and support machine learning recommendation engines, now they are using them to support GenAI initiatives by storing, modeling, and searching tokenized data documents. Vector databases feed relevant content to language models (LMs), helping enrich prompts, fine-tune models, and govern outputs. Petrie defines vector databases and how they help companies boost productivity and gain competitive advantage with domain-specific GenAI initiatives. He looks at market requirements, adoption trends, challenges, benefits, use cases, and architectural approaches.
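For readers new to the idea, the retrieval step that lets a vector database feed relevant content to a language model can be sketched in a few lines. This is a minimal illustration, not any vendor's API: the document ids and three-dimensional embeddings are hypothetical, and the search is brute-force cosine similarity, whereas production systems generally rely on approximate nearest-neighbor indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings; real ones would come from an embedding model.
index = {
    "pricing-faq": [0.9, 0.1, 0.0],
    "returns-policy": [0.1, 0.9, 0.1],
    "shipping-times": [0.2, 0.8, 0.3],
}
query = [0.0, 1.0, 0.2]

# The retrieved ids identify the passages that would be added to a prompt.
print(top_k(query, index))
```

The ids returned by `top_k` stand in for the passages an application would add to a prompt before calling a language model.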

Speaker:

, VP Research, BARC

Approaches to Application Modernization in the Emerging Era of Mainstream AI/ML

Today's data-driven strategies for boosting revenue, reducing costs, and minimizing risk often begin with modernizing applications. According to Gartner, 68% of executives globally plan to up their spending on modernization technologies; 90% see AI and ML as likely additions by 2026; and key focus areas include low-code/no-code, GenAI, and distributed cloud. See how companies like Visa, BNP Paribas, Standard Chartered, and Deutsche Bank are achieving quicker feature rollouts and reducing total cost of ownership.

Speaker:

, Sr. Director Solution Architecture, Hazelcast

 

AI105. Improving the Accuracy & Performance of AI Models

Wednesday, May 8: 4:15 p.m. - 5:00 p.m.

Models can be structured and designed in a variety of ways to enable them to provide valuable insights.

Empowering AI Through Time Series Analysis

Time series analysis plays a crucial role in enhancing the capabilities of AI by providing valuable insights into temporal patterns, trends, and dependencies within datasets. Oad explores the synergies between time series analysis and AI, showcasing how the integration of temporal data can significantly improve the performance and accuracy of AI models. Key points to cover include temporal context in data, enhanced predictive modeling, improved anomaly detection, dynamic feature engineering, optimizing AI for time-varying data, and forecasting and trend analysis.
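Two of the points listed above, temporal feature engineering and anomaly detection, can be illustrated with a short sketch. It is not taken from the session; the function names, window sizes, threshold, and sample readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def lag_and_rolling_features(series, lag=1, window=3):
    """Build one feature row per time step that has enough history."""
    rows = []
    start = max(lag, window)
    for t in range(start, len(series)):
        history = series[t - window:t]  # the `window` values before step t
        rows.append({
            "t": t,
            "value": series[t],
            "lag": series[t - lag],
            "rolling_mean": mean(history),
        })
    return rows

def zscore_anomalies(series, window=5, threshold=3.0):
    """Flag steps whose value is far from the trailing-window mean."""
    flagged = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[t] - mu) / sigma > threshold:
            flagged.append(t)
    return flagged

# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10, 11, 10, 12, 11, 10, 11, 40, 11, 10]

features = lag_and_rolling_features(readings)
anomalies = zscore_anomalies(readings)
print(features[0])
print(anomalies)
```

The lag and rolling-mean columns give a model the temporal context that a single reading lacks, while the trailing-window z-score is one simple way to surface outliers before training.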

Speaker:

, ML Engineer, U.S. Xpress, Inc.

 

Networking Reception in the Data Solutions Showcase

Wednesday, May 8: 5:00 p.m. - 6:00 p.m.

Thursday, May 9

Keynotes

 

Keynote: Mastering the Data Evolution: AI, Graph Modeling, & Tactical Curation

Thursday, May 9: 9:00 a.m. - 9:45 a.m.

Confronting the toughest data management challenge head-on, Rudden dissects the complexities of AI-driven versioning and presents a road map for navigating this intricate landscape. She delves into the strategic application of taxonomies and ontologies within the realm of graph modeling—heralding a new era of data structuring that boosts analytics, foresight, and decision making. Her approach provides attendees with the acumen to select, organize, and manage the right datasets, fortifying their data architecture against the rapid evolution of technology. Geared for a diverse array of data professionals, from strategists and scientists to engineers and BI experts, Rudden's insights are set to empower the audience with practical tools and methodologies. This keynote is your key to demystifying data management and embracing its future with confidence and expertise.

Speaker:

, CEO, Bast.ai

 

Keynote: Modern Data & Analytics Architecture: Solving the Real-Time Challenge

Thursday, May 9: 9:45 a.m. - 10:00 a.m.

Although typical data architectures are able to process streaming data, more often than not, the analytics are performed offline in batch mode. The real-time data is available for analysis, but the benefits of real time are lost the instant the data lands in a datastore or lakehouse for analysis. Ahuja delves into a modern data and analytics architecture—the Unified Real-Time Data Platform—that solves the real-time challenge. He shares details and use cases on how to process streaming data, enrich it with contextual historical data, and execute advanced analytical workloads—all at ultra-low latencies and massive scale.

Speaker:

, Chief Technology Officer, GridGain Systems, Inc.

 

Thursday, May 9

Track A: Emerging Technologies & Trends in Data & Analytics

Moderator:
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors
 

A201. Modernizing Your Data: New Platforms, Tools, & Trends

Thursday, May 9: 10:45 a.m. - 11:30 a.m.

Join our panel of industry experts as they discuss various aspects of data modernization. Representing a variety of approaches, the panel considers how emerging technologies, platforms, and architectures affect data management, organization, and analytics. What will the future bring? Our panel gives their opinions.

Speakers:

, Director of Product Management, MySQL HeatWave and Cloud Observability, Oracle

, Technical Director of Alliances, VAST Data

, CEO, Neural Magic

 

A202. Unlocking the Power of Data Science

Thursday, May 9: 11:45 a.m. - 12:30 p.m.

Data science is behind many different functions benefitting the enterprise.

Uncovering Behavioral Segments by Applying Unsupervised Learning to Location Data

Location data is a powerful tool. To help marketers understand and best meet holiday shoppers’ needs, Foursquare applied unsupervised learning methods to location data to derive meaningful segments of individuals based on their demographics and shopping behaviors. Rather than invest in reaching more general segments like moms or Millennials, marketers are now able to focus their efforts by targeting these data-driven segments. Dimensionality reduction methods such as principal component analysis (PCA), combined with clustering methods such as k-means, can isolate which features describe the most variance among users. Those features can then be used to group like users together in an unsupervised manner to analyze results. This information empowers marketers to determine which segments present the largest opportunity and what strategies to use to best target them.
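The combination of PCA and k-means described above can be sketched as follows. This is an illustrative example only, not Foursquare's pipeline: the visit counts are hypothetical, only the first principal component is estimated (by power iteration, to avoid external libraries), and k-means runs on the resulting one-dimensional scores.

```python
def first_principal_component(rows, iterations=100):
    """Estimate the top PCA direction by power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Uneven starting vector; a sketch-level guard against starting
    # orthogonal to the dominant direction.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered row onto the component to get one score per user.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

def kmeans_1d(values, k=2, iterations=20):
    """Lloyd's algorithm on scalars, seeded with evenly spaced centroids (k >= 2)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels

# Hypothetical weekly visit counts per user: [grocery, electronics, toys]
visits = [
    [8, 1, 0],
    [7, 2, 1],
    [9, 1, 1],
    [1, 6, 7],
    [2, 7, 6],
    [1, 8, 8],
]

loadings, scores = first_principal_component(visits)
segments = kmeans_1d(scores, k=2)
print("loadings:", [round(x, 2) for x in loadings])
print("segments:", segments)
```

The loadings indicate which features contribute most to the variance among users, and the segment labels group users with similar scores. A real analysis would typically keep several components and use a tested library implementation.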

Speaker:

, Senior Data Scientist, Foursquare

 

A203. Adopting a DataOps Strategy

Thursday, May 9: 2:00 p.m. - 2:45 p.m.

Data is recognized as a critical asset for all organizations, so leaders look to leverage generative AI (GenAI) capabilities.

DataOps Can Build the Foundation for Your GenAI Ambitions

GenAI cannot function without a bedrock of data, yet traditional data management is failing to meet businesses’ new demands, especially the need for real-time, consistent, and self-service data for applications, insights, and analytics. DataOps provides a set of strategies that allow organizations to harness data to enable solutions, develop data products, and activate data for business value across all technology tiers, from infrastructure to experience, evolving to help organizations move from analytics to real-time operational use cases such as GenAI.

Speaker:

, Vice President, Data Strategy and Management, EXL

 

A204. Accelerating the Time to Value of Data

Thursday, May 9: 3:00 p.m. - 3:45 p.m.

Data analytics is rapidly evolving. Stay ahead of the curve with an understanding of the direction and evolution ahead.

Fast-Track Your Data Exploration & Preparation

Data exploration and preparation are core to gaining insights from data. In this session, attendees learn how to fast-track exploration and preparation efforts. Using existing skill sets in SQL and cloud-native tools, you can accelerate your time to insights with the framework delivered in this session. Learn how GenAI and advancements in platform tools and automation will streamline and supercharge your data analytics efforts.

Speaker:

, Head of Customer Engineering, Google LLC

 

Thursday, May 9

Track B: Navigating the Hybrid & Multi-Cloud Future

Moderator:
Joe McKendrick, Principal Researcher, Unisphere Research
 

B201. Taking Your Data & Analytics to the Cloud

Thursday, May 9: 10:45 a.m. - 11:30 a.m.

Succeeding in the cloud goes beyond simply adopting cloud-native technologies; costs must also be considered.

Cloud Cost Optimization: Strategies, Tips, & Best Practices

Are you taking a hard look at your cloud costs this budgeting season? Is your IT budget getting eaten up without a good ROI to show for it? Do you need to justify your cloud expenses and business cases to leadership? Join Agarwal for a practical discussion on cloud cost optimization strategies, tips, and best practices. Learn about right-sizing your cloud resources, monitoring and managing your cloud usage effectively, identifying and eliminating wasteful spending, leveraging automation for better cost control, exploring cloud pricing models and modernization opportunities, and achieving the right balance of price and performance for your business requirements.

Speaker:

, Director and Global Practice Leader, Site Reliability Engineering, Cloud & NoSQL Databases, Datavail

 

B202. Building & Managing Cloud Database Products

Thursday, May 9: 11:45 a.m. - 12:30 p.m.

As cloud computing increasingly dominates the data world, it’s good to pay attention to products that are cloud-based.

Self-Service & Hyper Automation of Cloud Infrastructure

For any modernization and digital transformation initiative to be successful in today's rapidly changing landscape, enterprises need a lean engineered product model that empowers customers to easily build and manage cloud database products themselves while meeting all enterprise control objectives. In this presentation, Roy discusses how a hyper-automation framework could be utilized for both stateful and stateless database automation, thus enabling full automation of all aspects of a product management cycle. This framework, in addition to database and analytical products, can also be applied to cloud data governance, data movement, and data enablement products.

Speaker:

, Engineering Principal, Wells Fargo

 

B203. Data Strategies for a Hybrid & Multi-Cloud World

Thursday, May 9: 2:00 p.m. - 2:45 p.m.

Key elements to consider when strategizing about putting data in the cloud include meeting business objectives.

Architecting Oracle Workloads on VMware Multi-Clouds

It is a formidable task to ensure business-critical applications meet business service level agreements within the given RTO/RPO. This requires simplifying, standardizing, and automating the deployment and configuration along with providing HA and DR to these critical applications to meet business objectives. VMware multi-cloud solutions, including VMware Cloud on AWS, Oracle Cloud VMware Solutions, and others, provide consistent and interoperable infrastructure and services between VMware-based data centers and the public cloud, which minimizes the complexity and associated risks of managing diverse environments.

Speaker:

, Co-Founder, LicenseFortress

 

B204. Changing Landscape of Oracle Auditing

Thursday, May 9: 3:00 p.m. - 3:45 p.m.

When auditors come calling, it’s necessary to be prepared well ahead of time.

Surviving an Oracle Java Audit

One year ago, Oracle dramatically changed how businesses can license Java moving forward. In effect, Oracle moved the goalposts such that companies can no longer license Java by the processor or named-user-plus model. Instead, Oracle now utilizes an employee-based licensing model in which every employee, full-time or part-time, must be licensed, regardless of whether they use Java. Unsurprisingly, Oracle’s definition of what constitutes an employee for licensing purposes is breathtakingly broad. Gartner has been quoted as saying 1 in 5 users will be audited in the next 3 years. With Oracle now aggressively pursuing companies for their Java usage, this session explores what you need to know when Oracle comes knocking on your door.

Speakers:

, Co-Founder, LicenseFortress

, Co-Founder/Chief Operating Officer, LicenseFortress

 

Thursday, May 9

Track C: Database Management in the Era of DevOps, Microservices, & the Cloud

Moderator:
Heather Hedden, Senior Consultant, Enterprise Knowledge
 

C201. Modern Apps: Achieving Agility, Scalability, & Reliability

Thursday, May 9: 10:45 a.m. - 11:30 a.m.

Move into the future with data management tips and techniques.

A Step Toward Agility: MassMutual’s Evolution to Containerization

Developers in general are dealing with several challenges in managing data pipelines. Unpredictability and Inconsistency: Inconsistent tool usage leads to pipeline performance and reliability issues. Time to Market: Understanding of tools is needed for timely pipeline deployment while maintaining quality. Security: A least privilege model is required for strict access control to pipelines and data. Dependency Management: On-the-fly dependency installation causes compatibility issues and vulnerabilities. Scaling: Pipelines must scale to handle rapid data job submission without compromising performance or reliability. As MassMutual data pipelines move towards becoming centrally managed, owned, and versioned, the organization is working towards a secure, governed, reliable, and scalable run-time for all applications. Join Anand and Vijay as they talk about the Next-Gen Data Pipeline Solution at MassMutual and how they achieved a seamless, containerized solution for managing their workloads and execution environment.

Speakers:

, Lead Software Engineer, MassMutual

, Lead Software Engineer, MassMutual

 

C202. Transforming Your Apps With Real-Time Analytics

Thursday, May 9: 11:45 a.m. - 12:30 p.m.

One approach to digital transformation involves data analytics and real-time data.

Elevate Your Data: Real Analytics on Kubernetes via ClickHouse

Looking to transform your apps with real-time analytics? Kubernetes is an outstanding platform for operating high-performance databases, and ClickHouse runs well on it. This talk begins with the basics of Kubernetes and introduces an operator that enables you to stand up ClickHouse clusters. Hodges walks through the installation process and brings up a ClickHouse cluster in real time. He then shows how running on Kubernetes enables emergent behavior like independent scaling of compute and storage, server fault tolerance, cross-AZ high availability, and rolling upgrades. You'll have enough guidance from this talk to start your journey to real-time data on a robust, cloud-native architecture.

Speaker:

, CEO, Altinity

 

C203. Bridging the Gap Between Databases & DevOps

Thursday, May 9: 2:00 p.m. - 2:45 p.m.

Databases and DevOps do not stand in isolated silos.

A New Era Has Come & So Must Your Database Observability

If you think metrics are all you need to build proper observability and monitoring, Furmanek has a differing view. Although we have graphs, charts, and diagrams in place, we later learn that metrics are not enough. We don’t know how to configure thresholds, and we can’t troubleshoot issues when they appear. In this talk, Furmanek shows how to build proper observability and what’s needed in the modern world.

Speaker:

, DevRel, Metis

 

C204. Creating Knowledge Graphs

Thursday, May 9: 3:00 p.m. - 3:45 p.m.

A knowledge graph is commonly understood to be a knowledgebase that uses a graph database (a graph-structured data model) to integrate and link data in a form that is understood by both humans and machines.

Enterprise Knowledge Graphs: The Importance of Semantics

Hedden explores all the components of an enterprise knowledge graph and provides further insight into the semantic layer or knowledge model component, which includes an ontology and controlled vocabularies, such as taxonomies, for controlled metadata. Further, she looks at the relationship knowledge graphs have with AI, including generative AI. While data experts tend to focus on the graph database components (an RDF triple store or a labeled property graph), they should not overlook the importance of this semantic layer.

Speaker:

, Senior Consultant, Enterprise Knowledge and Author, The Accidental Taxonomist

 

Thursday, May 9

Track AI: AI & Machine Learning Summit

Moderator:
Ed Dale, Emerging Technology Associate Director, EY
 

AI201. Making AI Ethical & Explainable

Thursday, May 9: 10:45 a.m. - 11:30 a.m.

No more black box AI implementations—the technology needs to be ethical and explainable.

Explainable AI Component of Responsible & Ethical AI Development

As AI becomes increasingly integrated into our lives, it is important to ensure that it is developed and deployed in an ethical and responsible manner. One key component of this is explainable AI, which provides transparency and accountability by enabling users to understand how AI systems make decisions and identify potential biases or errors. This can help mitigate risks and build trust in AI, leading to more ethical and responsible use. In this presentation, Moharir explores the importance of explainable AI and its role in promoting responsible and ethical AI development.

Speaker:

, Lead Data Scientist, Microsoft

 

AI202. Incorporating GenAI in Enterprise Apps

Thursday, May 9: 11:45 a.m. - 12:30 p.m.

Take advantage of machine learning and NLP within the organization.

Making Every Word Count: Using NLP to Make GenAI More Efficient in Enterprise Applications

Generative AI (GenAI) is proving useful in the enterprise, but in many applications, it can't be used "off-the-shelf." For instance, deploying GenAI to answer business research questions from long text documents—primary and secondary market research reports, journal articles, thought leader white papers—requires several adaptations to make the process (and processing) efficient and effective. One of those adaptations is optimizing the document text with natural language processing (NLP) to accommodate the text capacity limitations of large language model APIs. Seuss explains and demonstrates how to use NLP to feed the GenAI only "summary worthy sentences" that are rich in meaning and help ensure the GenAI response is as accurate and meaningful as possible.
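The session abstract does not say how "summary worthy sentences" are chosen, so the sketch below swaps in a simple stand-in: score each sentence by the average document-wide frequency of its non-stopword terms, then keep the best sentences that fit a word budget. The stopword list, the scoring rule, the sample text, and the use of word count as a proxy for a model's text capacity limit are all assumptions for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for",
             "on", "that", "this", "it", "with", "as", "by"}

def content_words(sentence):
    """Lowercased words in the sentence, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]

def select_sentences(text, max_words=40):
    """Keep the highest-scoring sentences that fit within a word budget."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())
        if used + n <= max_words:
            chosen.append(i)
            used += n
    # Restore original order so the condensed text still reads naturally.
    return " ".join(sentences[i] for i in sorted(chosen))

# Hypothetical report excerpt.
report = (
    "Cloud spending rose sharply last year. "
    "Analysts expect cloud spending to keep rising as cloud adoption grows. "
    "The weather was nice. "
    "Cloud cost controls remain a priority for cloud buyers."
)

condensed = select_sentences(report, max_words=20)
print(condensed)
```

The condensed text, rather than the full document, would then be sent to the language model along with the research question.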

Speaker:

, CEO, Northern Light

 

AI203. Transforming Tools in the World of AI

Thursday, May 9: 2:00 p.m. - 2:45 p.m.

Introducing AI into the enterprise is top of mind for many these days.

Putting Generative AI to Work

Probstein introduces the concept of metasearch as a transformative tool in the business world, akin to a master key unlocking various treasure chests. This analogy aptly describes the modern enterprise landscape, where numerous cloud-based applications, each with their unique datasets, are seamlessly accessible. He emphasizes the practicality of this approach, highlighting the efficiency of using metasearch over traditional methods that often involve heavy data amalgamation. By keeping the data in its original “chests” and using metasearch as the unifying tool, businesses can enjoy a more streamlined and agile data management process. The talk further delves into the synergy between metasearch and AI technologies like ChatGPT. AI, when applied to the rich and varied internal data of a company, can act as an intelligent guide, making sense of the vast information treasures. This approach not only simplifies data interaction, it also unlocks deeper insights, enhancing decision making and strategic planning.

Speaker:

, President, Swirl

 

AI204. Transformative Potential of Knowledge Graphs

Thursday, May 9: 3:00 p.m. - 3:45 p.m.

Knowledge graphs are now independent entities capable of continuous self-improvement.

Knowledge Graphs Revolutionize Data Management

Recent advancements in large language models (LLMs) have spearheaded the development of self-sustaining knowledge graphs. Aasman focuses on four important aspects essential for knowledge graphs to autonomously synthesize and manage information: Intuitive Query Primitives, which allow effortless extraction of data from LLMs; Natural Language to Structured Query Translation, which translates natural language queries into structured queries across various languages; Integrated Vector Store, which facilitates seamless interactions between internal, private data and external, public data; and Neuro-Symbolic Framework, which synergizes rule-driven logic, constraint-based reasoning, description logic, Graph Neural Networks (GNN), Machine Learning, and LLM inferences. The presentation showcases practical applications.

Speaker:

, CEO, Franz Inc

Thursday, May 9

Closing Keynote

 

Closing Keynote: 2024 Trends in Data Management & Data Fabric

Thursday, May 9: 4:00 p.m. - 5:00 p.m.

Many companies have prioritized various data management trends this year to meet their increasing demands in data and AI initiatives. To help guide people, Radiant Advisors and Database Trends and Applications magazine conducted a market survey in Q1 2024, which analyzed what companies are doing beyond the hype. This survey focused on companies' perceptions, planning, and adoption of current data management practices, such as data fabric and active metadata, multi-domain master and reference data management, data quality, data observability, and data catalogs. The study covered a range of industries and company sizes. Following your participation at Data Summit 2024, you can compare your enlightened perspectives with the survey findings.

Speaker:

, Principal Advisor & Industry Analyst, Radiant Advisors

Don't Miss These Special Events