8:45 AM
Keynotes
Length: 45 Minutes
Speaker(s):
David Weinberger, Senior Researcher, Harvard's Berkman Center for Internet & Society
Description: AI and the internet are transforming our understanding of how the future happens, enabling us to acknowledge the chaotic unknowability of our everyday world.
Back when we humans were the only ones writing programs, data looked like the oil fueling those programs. But now that machines are programming themselves, data looks more like the engine than its fuel. This is changing how we think about the world from which data arises, a world that data is now shaping as never before. We’ve accepted that the intelligence of machine intelligence resides in its data, not just its algorithms, and particularly in the countless, complex, contingent, and multidimensional interrelationships of data. But where does the intelligence of data come from? It comes from the world that the data reflects. That's why machine learning models can be so complex that we can't always understand them. The world is the ultimate black box. Weinberger looks at the implications of this for people who work with data.
10:45 AM
Modern Data Strategy Essentials Today
Length: 1 Hour
Description: Collecting, querying, and manipulating data are important, but analyzing it well provides a competitive advantage.
Title: Making Self Service Work: How to Empower, Align, & Retain Data Analysts
Time: 10:45 AM - 11:45 AM
Description: Organizations employ many data analysts embedded in various departments and business units. These data analysts cost organizations millions of dollars in wages annually. Surprisingly, corporate data teams don’t know most of the data analysts in their organization, nor do they have a strategy to align them or optimize their organization’s investment in them. Eckerson presents a comprehensive strategy for empowering data analysts; describes how to make a business case for developing a self-service strategy that optimizes their time and output; and explains how to motivate and retain data analysts (people), how to organize and manage data analysts (organizations), how to govern data analyst output (process), and how to select tools and products that enable them to work as efficiently and effectively as possible (technology).
What’s Next in Data & Analytics Architecture
Length: 1 Hour
Description: Legacy data architecture needs to be modernized to meet today’s data and analytics needs.
Title: Lessons Learned From Moving to a Modern Data Architecture
Time: 10:45 AM - 11:45 AM
Description: Capital One’s move to the cloud required it to modernize its data operations for this new environment. This meant learning how to balance the flexibility and efficiency of managing data in the cloud in order to generate the most value from its data. Bharathan shares more on the decisions Capital One made throughout this journey around monitoring, access, schema management, resiliency, cost, security, load patterns, and governance. He shares lessons learned—what worked, and what didn’t—and more on the tools Capital One developed to ensure a well-managed, well-governed cloud data platform.
Data Mesh & Data Fabric Boot Camp
Length: 1 Hour
Description: Data silos continue to impede access to needed information. Solutions are on the horizon.
Title: Data Mesh Is Not Only an Architecture
Time: 10:45 AM - 11:45 AM
Description: With a 133-year history, Northern Trust has a backbone of IT infrastructure built decades ago, when on-premises solutions dominated the technology landscape. Due to the complexity of global and regional regulatory requirements and the limitations of legacy systems, valuable data assets are maintained and isolated only in online transactional processing (OLTP) databases. The company faced challenges in data sharing, management, and governance in supporting enterprise-level analytics projects to meet business needs and growth. In response, it undertook a digital modernization initiative with a data mesh ecosystem as a critical component, leveraging cloud services on Azure and other modern technologies.
AI & Machine Learning Summit
Length: 1 Hour
Description: There’s no stopping the introduction of AI-based technologies into the enterprise.
Title: Using Data Management Methodologies to Foster Development of Transformational AI/ML Tradecraft
Time: 10:45 AM - 11:45 AM
Description: Data science methods provide a means to establish analytic tradecraft capable of managing a large amount of data, allowing for full characterization of actor behaviors, and providing valuable insights. As data volume increases, AI/ML plays a significant role in this "high data entropy" space, giving users the means to combine multi-sourced datasets in order to learn and identify patterns and develop actionable insights while ensuring they stay within the organization’s legal and policy boundaries. Rodriquez presents a case study on how the intelligence community (IC) is addressing these challenges by establishing innovative AI/ML governance and data management methodologies, supporting the development of a policy-compliant AI governance ecosystem, predicating strategies to enforce legal and policy considerations, and establishing data controls.
12:00 PM
Modern Data Strategy Essentials Today
Length: 45 Minutes
Speaker(s):
Ruben Ugarte, Decision Strategist, Practico Analytics
Description: We’re all looking for insights that can be gleaned from our data. In addition to being data-driven, consider being insights-driven.
Title: Unlocking More Insights Through Better Decisions
Time: 12:00 PM - 12:45 PM
Description: Collecting more data doesn't always translate into better business outcomes. The best companies are shifting their entire culture to obsess about generating more insights that can be turned into tangible results in profits, revenue, and growth. Ugarte shares strategies for how teams can start to make the shift to being insights-driven and how to turn those insights into profitable decisions. Learn how to determine the ROI of your data and why more insights are the key to increasing the value of data. Look at your decisions as a process instead of as one-offs and start optimizing this internal process. Connect the dots on how data becomes an insight and then a winning decision.
What’s Next in Data & Analytics Architecture
Length: 45 Minutes
Description: Modern data architectures optimize for quick delivery at scale.
Title: Unifying Data & Models for Cross-Domain Personalized Fashion Recommendations
Time: 12:00 PM - 12:45 PM
Description: Zielnicki explores how Stitch Fix evolved its large suite of recommender models into a novel model architecture that unifies data from client interactions to deliver a holistic and real-time understanding of their style preferences. Stitch Fix’s Client Time Series Model (CTSM) is a scalable and flexible sequence-based recommender system that models client preferences over time, based on event data from various sources, to provide multi-domain, channel-specific recommendation outputs. The model has enabled Stitch Fix to continuously provide personalized fashion at scale, like no other apparel retailer.
Data Mesh & Data Fabric Boot Camp
Length: 45 Minutes
Description: Enable your data team to get the most value from their time and quickly deliver needed business insights through a next-generation data fabric methodology.
Title: Next-Generation Data Fabric Methodology
Time: 12:00 PM - 12:45 PM
Description: MacWilliams introduces a technology-agnostic methodology that solves the common challenges facing data teams and focuses on the processes among technologies: onboard data faster, flex automatically when data changes, create solutions that are manageable across technologies, and provide the foundation to monitor and maintain your data fabric for the future. The methodology integrates with existing technologies. Starting with the end in mind, MacWilliams covers how to best monitor and maintain your platform, how to maximize data team capacity, how to use metadata to streamline your team’s development, and where to build custom solutions versus leverage modern technologies.
AI & Machine Learning Summit
Length: 45 Minutes
Description: MLOps can streamline ML development, thus increasing operational effectiveness.
Title: Building MLOps Organizations for Scale
Time: 12:00 PM - 12:45 PM
Description: Jablonski looks at the journey to defining and implementing an MLOps solution for your organization, beginning with the metrics necessary for successful model lifecycle measurement, then discussing the technology stack to be deployed and the operational model necessary for success at scale. These three must all be defined and built collectively to ensure alignment between operational needs, technology capabilities, and success metrics. High model performance, elimination of bias, and predictability are all key elements of an MLOps strategy.
2:00 PM
Modern Data Strategy Essentials Today
Length: 45 Minutes
Speaker(s):
Richard Huffine, Chief, Library and Public Information, Federal Deposit Insurance Corporation
Description: A key to succeeding with digital transformation is obtaining commercial data services and using their capabilities to the fullest.
Title: Intricacies of Acquiring Data Services
Time: 2:00 PM - 2:45 PM
Description: Acquiring commercial data services is not as simple as signing a standard contract. It requires a thorough preliminary investigation and an understanding of how to broker, evaluate, and integrate the data into a corporate or governmental data lake, data warehouse, or data mesh. Huffine calls your attention to the complexities of licensing, negotiations, usage, and ROI. Additionally, he covers environmental, financial, and core data (ZIP codes, GIS layers, etc.) as well as use cases for modeling, dashboards, research, and regulatory activities.
What’s Next in Data & Analytics Architecture
Length: 45 Minutes
Description: DataOps affects everyone involved in the data ecosystem, which pretty much encompasses all employees, so adopting a strategy for agile data delivery is important.
Title: Composable Design Patterns
Time: 2:00 PM - 2:30 PM
Description: Industry publications and thought leaders have been touting the benefits of composable design for both business and architecture. For roughly 10 years, Composable Analytics has been ahead of that curve. The company was founded on using composable design strategies to get actual projects up and running and provide value for clients. Vidan shares some of the real-world lessons learned over that time and explores some of the more common usage patterns that Composable Analytics has found help put theory into practice and composable architecture into production.
Title: Ransomware & Recovering Your Data/IDOL Unstructured Data Analytics
Time: 2:30 PM - 2:45 PM
Description: IDOL Unstructured Data Analytics is an advanced search, knowledge discovery, and analytics platform that uses AI and ML to leverage key insights stored deep within your unstructured data, including text analytics, audio analytics, video analytics, and image analytics. Some use cases include safe city analytics, law enforcement media analysis, open source intelligence, marketing and brand monitoring, brand and reputation protection, and digital assistant chatbots. Ciesliga and Drewry discuss some of the current major ransomware attack types and how to detect them, protect against them, and recover from them. While they focus on recovery, they also highlight common data backup protection options with their advantages and drawbacks.
Data Mesh & Data Fabric Boot Camp
Length: 45 Minutes
Description: Moving to the cloud is now routine but presents interesting new challenges.
Title: Why Operationalizing Data Mesh Is Critical for Operating in the Cloud
Time: 2:00 PM - 2:45 PM
Description: As companies look to scale, they face new and unique challenges related to data management in the cloud. Data mesh offers a framework and a set of principles that companies can adopt to help them scale a well-managed cloud data ecosystem. Learn how Capital One approached scaling its data ecosystem by federating data governance responsibility to data product owners within their lines of business and hear how companies can operate more efficiently by combining centralized tooling and policy with federated data management responsibility.
AI & Machine Learning Summit
Length: 45 Minutes
Description: Neural networks can be used for many applications.
Title: BMW's Iconic Industrial Dataset Breaks New Ground
Time: 2:00 PM - 2:45 PM
Description: In 2022, BMW Group, Nvidia, and Microsoft teamed up to create the world's largest synthetic image dataset (800,000 images) for training AI in industrial environments. The dataset, called SORDI, was created using data from the BMW iFactory and rendered in Omniverse. The speakers demonstrate SORDI-next groundbreaking enhancements. Object classes in SORDI will no longer be static but will show signs of use and aging. They detail the benefits of these new features, especially with respect to the training of deep learning networks for industrial object recognition.
3:15 PM
Modern Data Strategy Essentials Today
Length: 45 Minutes
Speaker(s):
David Adler, President/Founder, Adler Law Group
Description: Some data is meant to be shared, but other data requires being secured so it doesn’t get into the wrong hands.
Title: Own Your Data: Key Issues for Contracts
Time: 3:15 PM - 4:00 PM
Description: One of the biggest concerns of small, middle-sized, and enterprise companies involves securing data and ensuring integrity when collaborating or integrating with other services. Seasoned executives know that issues of IT security, compliance, regulations, cyber liability insurance, supply chain requirements, incident response, and forensics should all be addressed in the contracts. However, it is easier to identify contractual protections than to obtain them. Agreeing to contractual commitments is a function of both liability to the enterprise and negotiating leverage.
What’s Next in Data & Analytics Architecture
Length: 45 Minutes
Speaker(s):
Andy Li, Senior Software Engineer, Rippling
Description: Tools to collect, organize, store, and analyze data are part of a modern data stack that can transform data.
Title: Diving Into Rippling’s Data Platform: Powering Customer Experiences With Presto
Time: 3:15 PM - 4:00 PM
Description: A key part of the product at Rippling, a workforce management platform, is enabling customers to run automations and analytics on their HR/IT/finance data. Its data stack is therefore critical to making sure customers have the best experience when doing so. Li shares details about Rippling’s data architecture and discusses why the company chose Presto and Apache Pinot to underpin its data infrastructure. Today Presto powers the Rippling data platform and customer-facing scripting language, RQL (Rippling Query Language), to run arbitrary customer queries. Presto helps enable diverse, federated querying at scale.
Data Mesh & Data Fabric Boot Camp
Length: 45 Minutes
Description: New technologies can transform companies’ data journeys.
Title: Data Fabric or Data Mesh? Find the Happy Medium
Time: 3:15 PM - 4:00 PM
Description: Data fabrics and data meshes are promising paradigms for helping organizations on their data journeys. Data fabric is a new approach that complements the existing infrastructure and data management technology, accessing the data on demand as it’s needed by the consumers of the data, with centralized metadata and governance. Data mesh also accesses the data on demand but provides the metadata and governance capabilities at the edges of the organization, where the data resides, enabling agility and autonomy throughout the organization. While much of the conversation around data fabrics and data meshes has been primarily about which approach or architecture is “better,” Fried discusses how the real value of these concepts isn’t rooted in an “either/or” approach and why they must be viewed as complementary.
AI & Machine Learning Summit
Length: 45 Minutes
Description: Knowledge graph technology expands by employing neural networks.
Title: Build Predictions With Machine Learning & Graph Neural Networks
Time: 3:15 PM - 4:00 PM
Description: Probably the most important reason for building knowledge graphs has been to answer this age-old question: “What is going to happen next?” Given the data, relationships, and timelines we know about a customer, patient, product, etc. (“the entity of interest”), how can we confidently predict the most likely next event? Graph neural networks (GNNs) have emerged as a mature AI approach for knowledge graph enrichment. GNNs enhance neural network methods by processing graph data through rounds of message passing. Aasman describes how to use graph embeddings and regular recurrent neural networks to predict events via GNNs and demonstrates creating a GNN in the context of a knowledge graph for building event predictions.
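Illustration: The "rounds of message passing" mentioned above can be sketched in a few lines. This is a minimal, hedged example and not the method presented in the session: the toy graph, the feature vectors, and the function names message_passing_round and propagate are all hypothetical, and plain averaging stands in for the learned weights a real GNN would use.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]        # node -> list of neighbor nodes
Features = Dict[str, List[float]]   # node -> feature vector


def message_passing_round(graph: Graph, features: Features) -> Features:
    """One round: each node averages its own vector with its neighbors' vectors.

    Assumption: a real GNN applies learned weights and a nonlinearity here;
    plain averaging is used only to show how information spreads.
    """
    updated: Features = {}
    for node, neighbors in graph.items():
        vectors = [features[node]] + [features[n] for n in neighbors]
        dim = len(features[node])
        updated[node] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return updated


def propagate(graph: Graph, features: Features, rounds: int) -> Features:
    """Apply several rounds so information travels several hops."""
    for _ in range(rounds):
        features = message_passing_round(graph, features)
    return features


# Hypothetical toy knowledge graph: customer -- order -- product
graph = {
    "customer": ["order"],
    "order": ["customer", "product"],
    "product": ["order"],
}
features = {
    "customer": [1.0, 0.0],
    "order": [0.0, 0.0],
    "product": [0.0, 1.0],
}

after_one = propagate(graph, features, rounds=1)
after_two = propagate(graph, features, rounds=2)
print(after_one["customer"], after_two["customer"])
```

After one round the customer node has seen only its direct neighbor, the order. After two rounds it also carries a trace of the product, which sits two hops away. The resulting vectors act as simple graph embeddings that a downstream model could take as input.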
4:15 PM
Modern Data Strategy Essentials Today
Length: 45 Minutes
Description: Challenges in licensing compliance have not diminished; in fact, they are greater than ever.
Title: Stealth Audits & Other Trends in Software Licensing Compliance
Time: 4:15 PM - 5:00 PM
Description: Unisphere’s report, “Managing the Software Audit: 2022 Survey on Enterprise Software Licensing and Audits Trends,” surmised that, due to lost revenue as a direct result of COVID-19, major software vendors increased the pace of their software licensing audits to generate additional revenue. Your risk of a software audit is greater now than at any point in time. Corey and Sullivan discuss the significant findings of the survey and explore the stealth audit, the newest tool in the vendor audit toolbox. They explain the difference between vendor policy and contractual obligation and expose how software licensing trolls have weaponized software vendor audits. Don’t be the next victim!
What’s Next in Data & Analytics Architecture
Length: 45 Minutes
Description: Putting data first rather than taking an application-centric approach is the mark of a modern data architecture.
Title: Solving Complex Data Problems by Treating Data as a Supply Chain
Time: 4:15 PM - 5:00 PM
Description: Emerging modern business-oriented data architectures such as data hubs, data lakehouses, data fabrics, and knowledge graphs put data first. By treating data as a supply chain, from which applications hang, as opposed to traditional application-centric approaches, where data is confined to silos, Bentley explains how to solve complex data problems simply. Data unity, data security, data governance, data context, and data quality are all ensured throughout the data lifecycle without the complexity of multiple integrations from multiple vendors and components. Through real-world case studies, Bentley highlights the advantages of this approach, lessons learned, and practical advice on implementing these modern architectures that augment your existing data ecosystem without the need for “rip and replace.”
Data Mesh & Data Fabric Boot Camp
Length: 45 Minutes
Description: Data architectures tend not to be static and new approaches are always welcome.
Title: Enabling Data Mesh With Event-Driven Data Architecture
Time: 4:15 PM - 5:00 PM
Description: The concept of data mesh has resonated strongly with both data professionals and the broader engineering community. Loose coupling, enablement of federated development, and data sharing ease the difficulty of data management in both large and small organizations and bring data systems closer to parity with modern microservice-based systems. Cordo explores how the adoption of an event-based data architecture can enable an organization's sustainable transition to data mesh. This includes an overview of event-based architecture, architectural patterns for event-based data systems, and organizational considerations.
AI & Machine Learning Summit
Length: 45 Minutes
Title: Garbage In, Garbage Out: Guaranteeing Data Quality for Scaling ML Applications
Time: 4:15 PM - 5:00 PM
Description: Buying a home is one of the largest financial decisions people can make. Yet the industry has remained relatively unchanged for decades. Opendoor is changing that by making hundreds of data-informed decisions every day through bespoke applications of AI, ML, and data science. The team uses everything from heavily parametrized, linear econometric models to deep neural networks. Drawing from Stone’s expertise in AI, ML, and data science, his talk focuses on developing a data-first approach to finance, the integration of algorithms and humans, and improving AI and ML through better data.