“XOps” is emerging as a key automation strategy to operationalize business value from data and AI/ML workflows in the broader enterprise technology stack. XOps was named one of the Top Data and Analytics Trends of 2021 by Gartner, and examples include DataOps, MLOps, DevOps, BIOps, PlatformOps, and ModelOps. Because many tools and platforms have evolved to solve niche problems, they remain fairly disconnected, making it harder for data leaders to decide which tools and platforms are needed to support end-to-end business needs.
In simple terms, XOps can be broken down into “X,” which covers data, infrastructure, business intelligence (BI), and machine learning (ML) models, and “Ops,” the automation of operations through code. The individual components have existed for years, but the difference now is that they are interconnected to drive agility and innovation by removing silos.
Data, BI, and ML Ops are the primary focus of this article because they are interrelated in creating business value. They can be considered functional verticals, with DevOps as the interconnecting glue that ensures consistency of tools, security, and governance requirements to drive continuous integration (CI), continuous delivery (CD), and continuous training (CT).
What’s Common Between Data, BI, and MLOps?
Organizations build their strategy around data and technology, and IT is the main driver of those initiatives. IT teams collaborate with various business functions to gather requirements and implement BI and ML solutions. The adoption and success of these solutions are often measured by the accuracy and timeliness of data products, while consumption patterns, such as which users, queries, tables, and attributes are used most frequently, are often overlooked. In addition, a lineage view is only partially complete without metadata from BI and ML applications. Metadata from those tools must be integrated with the DataOps tool to provide full visibility and to satisfy governance requirements covering data sources, business rules, consumption, users, access levels, PII data, business owners, model endpoints, and potential data leaks.
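As an illustration of the integrated lineage view described above, the sketch below merges per-tool lineage edges from the data, BI, and ML layers into one downstream graph. The table, dashboard, and model names are hypothetical, and real catalogs model far richer facets; this only shows the shape of the integration.

```python
def merge_lineage(edges_by_layer: dict[str, list[tuple[str, str]]]) -> dict[str, set[str]]:
    """Merge per-tool lineage edges (source -> target pairs) into one
    downstream view keyed by source asset."""
    graph: dict[str, set[str]] = {}
    for edges in edges_by_layer.values():
        for src, dst in edges:
            graph.setdefault(src, set()).add(dst)
    return graph

# Hypothetical edges reported by each layer's tooling.
lineage = merge_lineage({
    "data": [("raw.orders", "trusted.orders")],
    "bi":   [("trusted.orders", "sales_dashboard")],
    "ml":   [("trusted.orders", "churn_model:v3")],
})
```

With the BI and ML edges merged in, a single lookup on `trusted.orders` now surfaces both the dashboard and the model endpoint it feeds, which is exactly the visibility a data-only lineage view lacks.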
As organizations continue to embrace best-of-breed solutions as part of their transformation journey, application and technology migration are two necessary components to consider. Having an integrated metadata and unified observability solution allows them to define the migration roadmap and evaluate business impact quickly without investing significant resources to conduct a months-long discovery process.
What’s Different Between Data, BI, and MLOps?
DataOps focuses on reading structured data from SaaS applications and databases, and unstructured data from documents and images, to provide business and technical metadata. It also provides orchestration and transformation capabilities to create a trusted data store, forming the foundation layer for downstream BI and ML applications. Many organizations have established ETL/ELT tools; in those cases, lineage can be registered in the DataOps tool via API integration. DataOps should be a strategic asset that provides trusted, verified data in a data marketplace, enabling data citizens to request access through the tools of their choice. The primary users are business analysts, data stewards, and data engineers. The metadata captured as data moves from raw to trusted format includes entities, tables, attribute mappings, conformance rules, business and technical contacts, business glossary terms, data quality rules, quality scores, and runtime users.
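A minimal sketch of the raw-to-trusted metadata record listed above, assuming a quality score computed as the percentage of data-quality rules that pass. The field names and scoring rule are illustrative, not any specific DataOps product’s schema.

```python
def quality_score(rule_results: dict[str, bool]) -> float:
    """Percentage of data-quality rules that passed (0.0 if none defined)."""
    if not rule_results:
        return 0.0
    return 100.0 * sum(rule_results.values()) / len(rule_results)

# Hypothetical metadata captured for one table as it is promoted to trusted.
trusted_table = {
    "entity": "customer",
    "table": "trusted.customer",
    "attribute_mapping": {"cust_nm": "customer_name"},
    "conformance_rules": ["trim_whitespace", "standardize_country_codes"],
    "data_quality_rules": {"not_null_id": True, "unique_id": True, "valid_email": False},
}
trusted_table["quality_score"] = quality_score(trusted_table["data_quality_rules"])
```

Publishing a record like this to the marketplace lets data citizens judge trustworthiness at a glance before requesting access.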
BIOps focuses on formatting data into a logical structure compatible with the BI tool, where BI developers and business users build reports and dashboards. Most DataOps tools provide native integration with BI tools and pull metadata such as tables, queries, runtime user names, metric definitions, derived attributes and rules, and record counts.
MLOps tools are used by ML engineers and data scientists. Data is formatted through feature engineering for model development, training, and experimentation. Metadata integrated into DataOps tools includes model names, registered models, model versions, experiments, metrics, scores, and endpoints.
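As a hypothetical sketch of the model metadata above, each experiment run can be recorded with its version, metrics, and serving endpoint, so the catalog can answer questions like “which version is best on this metric?” The run records and field names are invented for illustration.

```python
def best_run(runs: list[dict], metric: str) -> dict:
    """Pick the experiment run with the highest value for the given metric."""
    return max(runs, key=lambda run: run["metrics"][metric])

# Hypothetical experiment records pushed to the DataOps catalog.
runs = [
    {"model": "churn_model", "version": 1, "metrics": {"auc": 0.87}, "endpoint": None},
    {"model": "churn_model", "version": 2, "metrics": {"auc": 0.91},
     "endpoint": "https://models.internal/churn/v2"},
]
```

Because the endpoint is part of the same record, governance reviews can trace a deployed model back to its training run without leaving the catalog.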
Are DataOps and DevOps the Same?
They differ in tools, skill sets, and focus. However, the agility DevOps brings to building product features, testing, deploying, and delivering value to users quickly should be applied to the data ecosystem. Ideally, the two teams are part of the same organization, or stakeholders from both teams are aligned on product and project goals. Most organizations have mature CI/CD processes for infrastructure provisioning, user access, and security-control configuration; these should be extended to business application changes such as workflows, artifacts, and metadata scripts. Application-level change-management scripts can be invoked from Git or through tools such as Terraform and Ansible. Each application should provide a code base and test plan to ensure that changes promoted to a higher environment are applied and validated correctly, and can be rolled back without data loss if something goes wrong during deployment.
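The apply-validate-rollback flow described above can be sketched as a small control structure. In practice the callables would shell out to Terraform, Ansible, or an application’s own change-management scripts; here they are plain Python functions so the control flow is easy to see.

```python
from typing import Callable

def deploy(apply_change: Callable[[], None],
           validate: Callable[[], bool],
           rollback: Callable[[], None]) -> bool:
    """Apply a change to the higher environment, validate it against the
    test plan, and roll back to the previous state if validation fails."""
    apply_change()
    if validate():
        return True       # change is live and verified
    rollback()            # restore the last known-good state
    return False
```

The key design point is that the rollback path is exercised by the same automation as the happy path, rather than being a manual afterthought, which is what makes "roll back without data loss" a testable property.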
What are the Potential Barriers to Adoption?
For the most part, people and processes are the main reasons for lack of adoption. When alignment of resources across the organization and buy-in from stakeholders are not part of the initial strategy, commitment suffers because those stakeholders are brought into the engagement or program at a late stage. Ideally, every data leader wants to own the budget, build a team, and be responsible for end-to-end delivery of programs and products, but in reality, the key stakeholders may sit in IT security, governance, and DevOps teams. Aligning those stakeholders, securing their active participation, and clarifying roles are critical for success.
Organizational maturity in embracing change at scale, driving culture change, and hiring people who have done this for a living is important for a successful rollout.
Establishing data literacy is a key part of successful adoption. Data literacy includes user onboarding, training (in-person or on-demand), documentation, and short how-to videos. A dedicated team needs to stay focused on creating this content and delivering it to users with every product release. In most cases, documentation is outdated and goes unreviewed because the development team lacks bandwidth. User documentation needs to be part of the sprint plan for every developer and reviewer.
Create a centralized tool that serves as a home base where business users can get most of their questions answered and navigate to other applications for detailed analysis through single sign-on, ensuring a seamless experience.
Define a clear role for each tool in the broader data ecosystem. Each tool should be aligned to business value: new revenue, cost savings, operational efficiency, or compliance requirements. Interoperability between tools should be part of the core design and selection criteria to drive an efficient and mature data organization.
At its core, XOps reinforces the idea that different development teams are cross-functional and work with one another. As data, BI, and ML solutions increasingly shift to becoming core business functions, the information silos that have previously acted as barriers to the data-driven enterprise may finally begin to break down.