Data is a Process, not a Project: Data Governance and Centralization at Data Summit 2024

The ongoing dive into modernity, and all the new technologies and hype trains that come along with it, requires a modern data architecture to support it. With this architecture comes a variety of other necessities, chief among them a modern approach to data governance.

Aaron Cutshall, enterprise data architect, Healthfirst, Inc., and Niraj Vora, lead sales engineer, Fivetran, joined the annual Data Summit session, “Moving to a Modern Data Architecture,” to examine the many nuances involved in cultivating a modern data architecture, focusing on the roles that clear frameworks and centralized data strategies have to play.

The annual Data Summit conference returned to Boston, May 8-9, 2024, with pre-conference workshops on May 7.

Consistency across the enterprise, as Cutshall explained, is a major challenge in enterprise architecture. The logical answer to this roadblock is data governance, yet it remains elusive for the many enterprises struggling to implement it.

“Data governance—for many people—is a mystery,” said Cutshall. “People want to do it, know they need to do it, but don’t know how.”

To offer clarity, Cutshall defined data governance as managing data assets through rules, processes, policies, and frameworks. Data governance also works to ensure data quality, integrity, security, and compliance, ultimately improving decision-making and reducing operational friction.

“It’s not just about what we call the data; that’s only part of it. It’s about how we’re going to use that data and what its significance is,” explained Cutshall.

A data architecture, on the other hand, is the design and structure of data systems and infrastructure. Ideally, a good data architecture maintains an accurate, consistent, and complete picture of an organization’s data assets.

Combining data governance and data architecture sets enterprises up for success, noted Cutshall. Data governance supplies the business focus and data architecture supplies the implementation focus; together, they promote guiding principles such as data quality, integrity, and accountability.

When should data governance begin? According to Cutshall, right now, and preferably at the beginning of data initiatives. Enacting data governance in the middle of an initiative is still better than never enacting it at all, though Cutshall warned, “without data governance, you’re hemorrhaging data. You’re hemorrhaging money.”

Data governance is needed everywhere, acting as the glue that binds all data sources together, explained Cutshall. It also identifies best practices and ensures they are followed.

Every data governance project should include:

  • A data dictionary with consistent definitions
  • A data lineage to show the migration of all data from source to target
  • The rules necessary for usage, including at a minimum

Many might be asking, how do I get started? According to Cutshall, start with the data architecture, capturing data definitions, flows, usage, and lineage. If you don’t already have one, implement a data catalog to assist in understanding and then documenting data flows—what data is stored where, how it gets there, and where it goes.
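As a rough illustration of what capturing definitions, flows, and lineage can look like in practice, the following is a minimal Python sketch of a toy data catalog. It is not a tool or method presented in the session; the class names (`DictionaryEntry`, `Catalog`), the dataset names, and the storage locations are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DictionaryEntry:
    """One data dictionary record: a name with its agreed definition."""
    name: str
    definition: str
    location: str  # where the data is stored (hypothetical label)


@dataclass
class Catalog:
    """Toy data catalog: definitions plus source-to-target flows."""
    entries: dict[str, DictionaryEntry] = field(default_factory=dict)
    flows: list[tuple[str, str]] = field(default_factory=list)  # (source, target)

    def define(self, entry: DictionaryEntry) -> None:
        self.entries[entry.name] = entry

    def add_flow(self, source: str, target: str) -> None:
        # Refuse flows that reference data with no dictionary entry.
        for name in (source, target):
            if name not in self.entries:
                raise KeyError(f"{name!r} has no data dictionary entry")
        self.flows.append((source, target))

    def lineage(self, target: str) -> list[str]:
        """Return every upstream source of `target`, nearest first."""
        upstream: list[str] = []
        frontier = [target]
        while frontier:
            current = frontier.pop(0)
            for source, dest in self.flows:
                if dest == current and source not in upstream:
                    upstream.append(source)
                    frontier.append(source)
        return upstream


# Hypothetical datasets, for illustration only.
catalog = Catalog()
catalog.define(DictionaryEntry("claims_raw", "Claims as received from the source system", "landing"))
catalog.define(DictionaryEntry("claims_clean", "Validated, de-duplicated claims", "warehouse"))
catalog.define(DictionaryEntry("claims_report", "Monthly claims summary", "reporting"))
catalog.add_flow("claims_raw", "claims_clean")
catalog.add_flow("claims_clean", "claims_report")
print(catalog.lineage("claims_report"))  # ['claims_clean', 'claims_raw']
```

Rejecting flows between undocumented datasets is one simple way to keep the dictionary and the lineage in step with each other; a real catalog product would track far more, such as owners, usage rules, and access policies.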

“Data is not a project, it’s a process,” said Cutshall. “And you can’t do it in isolation, or from the top-down.”

Echoing Cutshall, Vora explained that the effort of centralizing large volumes of diverse proprietary data is a major architectural challenge. With this challenge comes a lack of access to and availability of data, creating a chaotic data environment that erodes any potential opportunities for enterprise success.

The solution, Vora explained, is a comprehensive data centralization strategy; without one, enterprises miss opportunities for revenue growth and competitive advantage. As AI continues to pervade almost every facet of industry, data readiness for advanced analytics and AI begins with data centralization.

“Data is your competitive advantage,” said Vora. “Successful organizations have mastered data movement.”

Vora offered Fivetran’s principles of data centralization to power advanced analytics and AI:

  1. Data needs to move in and around the business.
  2. Data must be ready to use the moment it is centralized.
  3. Governance and security cannot be ignored.

Fivetran’s automated data movement platform, with over 500 pre-built connectors, enables data from every source to flow effectively throughout an enterprise. As a no-code, configuration-only platform with out-of-the-box deployment, Fivetran offers automated extract and load capabilities complemented with end-to-end reliability. Additionally, the platform offers enhanced visibility into what is happening with your data while allowing the client to maintain control over who has access to it.

Many Data Summit 2024 presentations are available for review at