TRAINING AND PREP
The rise of XOps practices within data environments means new ways of working and new supporting roles, and, as a result, training and preparation are required to ensure success. One such emerging role is that of the DataOps engineer. “If we think of data operations as a factory, then the DataOps engineer is the one who owns the factory assembly line that builds a data and analytic product,” said Bergh. “A good DataOps engineer introduces automation into a data organization that can improve the productivity of data scientists and analysts by seven to 10 times.” DataOps engineers need a broad range of skills, from experience with scripting languages such as Python to familiarity with DataOps or DevOps tools, along with knowledge of agile methods and workflow tools, he continued. They also need “familiarity with the toolchain your data engineers, scientists, analysts, and governance team use. DataOps engineers are similar to DevOps engineers but with skills that specifically address data teams’ problems. There is a critical shortage of professionals with DataOps skills.”
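To make the “assembly line” idea concrete, here is a minimal Python sketch of the kind of automation Bergh describes: each stage of a data product is an ordinary function, and the pipeline runs them in sequence with logging, so an automated check replaces a manual review step. The stage names and the validation rule are hypothetical illustrations, not a specific product’s design.

```python
# A minimal sketch of a DataOps "assembly line": stages run in order,
# with logging and an automated data check built into the flow.
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("dataops")

def extract() -> list[dict]:
    # Stand-in for pulling rows from a source system.
    return [{"id": 1, "revenue": 120.0}, {"id": 2, "revenue": -5.0}]

def validate(rows: list[dict]) -> list[dict]:
    # Automated check in place of a manual review: drop and flag bad rows.
    good = [r for r in rows if r["revenue"] >= 0]
    if len(good) < len(rows):
        log.warning("dropped %d rows failing revenue >= 0", len(rows) - len(good))
    return good

def publish(rows: list[dict]) -> list[dict]:
    # Stand-in for loading rows into the analytics store.
    log.info("publishing %d rows", len(rows))
    return rows

def run_pipeline(stages: list[Callable]) -> None:
    data = None
    for stage in stages:
        log.info("running stage: %s", stage.__name__)
        data = stage() if data is None else stage(data)

run_pipeline([extract, validate, publish])
```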
Another role arising with the emergence of XOps is the “Kubernetes operator,” said Rishi. “The Kubernetes operator pattern enables the site reliability engineer role. SQL or NoSQL databases like PostgreSQL, MySQL, and MongoDB are some of the most popular workloads in Kubernetes today, and organizations don’t have—or even need to have—SRE [site reliability engineering] experts for each one of these databases.” With the operator pattern, setup and day-to-day functions, such as deploying the databases, keeping them up-to-date, and protecting data, become much simpler.
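As a rough illustration of the pattern Rishi describes, the sketch below uses kopf, an open source Python framework for writing Kubernetes operators. The “PostgresCluster” custom resource, its fields, and the reconcile interval are assumptions made for illustration; a production operator would also manage StatefulSets, Services, storage, and backup jobs.

```python
# A minimal operator sketch with kopf (Kubernetes Operator Pythonic
# Framework). The custom resource and fields are hypothetical.
# Run with: kopf run operator.py
import kopf

@kopf.on.create('example.com', 'v1', 'postgresclusters')
def on_create(spec, name, namespace, logger, **kwargs):
    # Day-1 work the SRE would otherwise do by hand: provision the database.
    replicas = spec.get('replicas', 1)
    version = spec.get('version', '15')
    logger.info("provisioning Postgres %s with %s replicas in %s",
                version, replicas, namespace)
    return {'phase': 'Provisioned'}  # recorded under the resource's status

@kopf.timer('example.com', 'v1', 'postgresclusters', interval=300)
def reconcile(spec, status, logger, **kwargs):
    # Day-2 work: periodically compare desired state (spec) with actual
    # state and repair drift, e.g., re-create a lost replica.
    logger.info("reconciling: desired replicas=%s", spec.get('replicas', 1))
```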
A crucial first step is “defining and writing down processes,” said Harper. “Recording how things get done is key as we’re able to create how-to guides instead of relying on singular people to carry out processes.” This step is often overlooked, Gwirtz noted. “Amazingly enough, organizations don’t always understand the required processes and the do’s and don’ts to be applied. Specifically, roles and responsibilities should be well-defined,” she said. It all starts with clearly defining the proper processes at a minimum of two levels, Gwirtz explained: “the organization-wide level, and at the application level specific to each application’s data flows.”
Still, there are lessons to be learned from what has occurred on the DevOps side of the house. “Early XOps models were based on agile development combined with operations automation,” said Allan. “Experience and familiarity with the principles of agile development and operational automation will significantly accelerate the adoption of XOps.”
TOOLS AND TECHNOLOGIES
Tools and platforms are also a critical part of the equation. The tools required for successful implementation “include a framework for data scientists to log relevant metrics depicting the eligibility of the operational model over time—drift identification,” said Gwirtz. Additional tools include “low and high environments—dev, test, accept/stage, production—to support the different stages of development; DevOps tools to support the automated promotion of artifacts; and security scanning automation as a gatekeeper for new or updated code introduction. Consider escalation automation as part of the process. It’s not a must, but it certainly helps in reducing time to resolution.”
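As a minimal sketch of the drift-logging framework Gwirtz mentions, the following Python example compares a feature’s live distribution against its training baseline with a two-sample Kolmogorov-Smirnov test and logs the result for monitoring to alert on. The feature, sample sizes, and cutoff are illustrative assumptions, not a prescribed standard.

```python
# A minimal drift check: log a distribution-distance metric at scoring time
# so monitoring can flag when the model's inputs no longer match training.
import logging

import numpy as np
from scipy.stats import ks_2samp

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("model-monitor")

rng = np.random.default_rng(seed=42)
training_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)  # baseline
live_feature = rng.normal(loc=0.4, scale=1.0, size=2_000)       # shifted data

stat, p_value = ks_2samp(training_feature, live_feature)
log.info("drift check: KS statistic=%.3f, p-value=%.4f", stat, p_value)

if p_value < 0.01:  # illustrative cutoff; tune per model and feature
    log.warning("feature distribution has drifted; consider retraining")
```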
There is also a risk of employing the wrong tools for the wrong type of XOps, Bergh cautioned. “DataOps is not just DevOps for data. There are hundreds of tools for data engineering, data science, analytics, self-service, governance, and databases. DataOps enables meta-orchestration of the toolchain across end-to-end data analytics processes.”
DataOps tools need to address data-related automated orchestrations, Bergh continued. These capabilities include “built-in connectors to the complex chain of data engineering, science, analytics, self-service, governance, and database tools; meta-orchestration—a hierarchy of orchestrations; and integrated production testing and monitoring—one environment to manage tests across the heterogeneous toolchain.”
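A toy sketch suggests what meta-orchestration with integrated testing might look like in practice: a parent pipeline sequences child orchestrations, each a stand-in for a tool-specific pipeline such as an ingestion job or a transformation run, and checks each one’s output before letting the next proceed. All names here are hypothetical illustrations.

```python
# Meta-orchestration sketch: a parent pipeline drives child orchestrations
# and runs a production test between them, halting on failure.
from typing import Callable

def ingestion_pipeline() -> int:
    # Stand-in for kicking off the ingestion tool's own orchestration.
    return 5_000  # rows loaded

def transformation_pipeline() -> int:
    # Stand-in for triggering the transformation tool's job.
    return 4_990  # rows produced

def row_count_test(rows: int) -> None:
    # A production test in the flow of the pipeline, not an afterthought:
    # failure stops downstream orchestrations instead of publishing bad data.
    assert rows > 0, "no rows produced; halting downstream orchestrations"

def meta_orchestrate(children: list[Callable[[], int]]) -> None:
    for child in children:
        rows = child()
        row_count_test(rows)
        print(f"{child.__name__}: {rows} rows, test passed")

meta_orchestrate([ingestion_pipeline, transformation_pipeline])
```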
When it comes to methodologies intended to protect sensitive data—such as DevSecOps—Rishi advises not only looking at tools such as Git and code scanners but also at operational blueprints. “A consideration that organizations need to keep in mind is that data created during application run time is usually the most critical resource for an organization and resides in a variety of databases—not Git. Organizations need to be mindful that database replication is not the same as data protection functions like backup and disaster recovery and invest in the right Kubernetes-native data protection tools.”
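A toy example makes the replication-versus-backup distinction concrete: an accidental delete on the primary replicates faithfully to every copy, while a point-in-time backup taken earlier still holds the lost row.

```python
# Replication is not backup: dicts stand in for database tables here.
primary = {1: "alice", 2: "bob"}
backup = dict(primary)            # last night's backup snapshot

del primary[2]                    # accidental delete on the primary...
replica = dict(primary)           # ...mirrored by the replica moments later

print("replica:", replica)        # {1: 'alice'}           -- row gone here too
print("backup:", backup)          # {1: 'alice', 2: 'bob'} -- recoverable
```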
OVERCOMING CHALLENGES
What challenges get in the way of fully realized XOps practices? Enterprise perception and buy-in are probably the most vexing, industry observers agree. An XOps-based approach is “often seen as disrupting one’s work, even though its ultimate goal is to make everyone’s life easier,” said Harper. “For example, it frees time from repetitive tasks so folks can focus on more important items. We need to give teams the tools, access, and budget they require to move the company forward. After all, no amount of training or preparation can make up for a lack of employee support.”
Ultimately, the most significant barrier to XOps is cultural. “XOps requires a mindset of extreme accountability and greater collaboration,” said Allan. “The shift in thinking means that teams can no longer point the finger and must take responsibility for all aspects of the service delivery. A second challenge, but perhaps a less significant one, is the requirement to automate the entire process. Automation of edge cases can lead to a long tail of tasks with diminishing value and return for the enterprise. XOps is most valuable when there is consistency and homogeneous service delivery within the organization. This often proves challenging for organizations with legacy services and leads to two modes of data-based services.”