Data-Driven in 2022: Data Management Opportunities in the Year Ahead


“The notion that ‘data is the new oil’ has many believing that every piece of data needs to be captured and stored in a big data technology of some sort,” said McGrattan. “Few are questioning the value of storing, managing, and securing it. We often hear people say that they’re storing data in case they may need it at some point in the future. As a result, this becomes a bit like the junk drawer in the kitchen, filled with stuff that we believe may be needed but that likely will get tossed during the next remodel.”

A better solution, McGrattan suggested, “is to leave the data assets where they naturally live and bring in just the data elements that are required to answer a particular question at the time the question is asked. Rather than pulling all of one’s Salesforce contact data, which is of dubious quality, into a data warehouse to drive a marketing campaign, it makes more sense to pull the specific contacts required for that campaign in real time. Integration vendors that provide that ability will definitely help solve the problem. Similarly, data catalogs will provide an inventory of data assets and help identify unused assets that could be cleaned up.” McGrattan urged the formation of a “value index” for data assets, and for organizations to “be ruthless in imposing rules around only storing data assets that meet a minimum threshold.”
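As a rough illustration of what a "value index" with a minimum threshold might look like in practice, consider the minimal Python sketch below. The article does not define the index, so the formula (recent reads weighted by a quality score, divided by storage cost), the field names, and the sample figures are all hypothetical assumptions, not McGrattan's method.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    # All fields are hypothetical inputs; a real catalog would supply its own.
    name: str
    reads_last_90_days: int      # how often the asset was actually queried
    monthly_storage_cost: float  # cost of keeping it, in dollars
    quality_score: float         # 0.0 (unusable) to 1.0 (clean and trusted)


def value_index(asset: DataAsset) -> float:
    """Assumed formula: quality-weighted usage per dollar of storage cost."""
    cost = max(asset.monthly_storage_cost, 0.01)  # avoid division by zero
    return asset.reads_last_90_days * asset.quality_score / cost


def assets_below_threshold(assets: list[DataAsset], threshold: float) -> list[str]:
    """Return the names of assets that fail the minimum value threshold."""
    return [a.name for a in assets if value_index(a) < threshold]


if __name__ == "__main__":
    inventory = [
        DataAsset("crm_contacts_full_dump", 2, 400.0, 0.4),
        DataAsset("campaign_contacts", 900, 50.0, 0.9),
    ]
    # Candidates for cleanup under an arbitrary threshold of 1.0
    print(assets_below_threshold(inventory, threshold=1.0))
```

The point of the sketch is only that a threshold rule needs a measurable score per asset; which signals feed that score is a decision each organization would have to make for itself.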


Without quality data, AI and machine learning cannot deliver on their promise; it’s as simple as that. The most pressing opportunity for organizations in the year ahead, then, “lies in the ability to discover and consume data which is clean and trustworthy,” said Manoj Karanth, vice president and global head of data science and engineering for Mindtree. “While big data paradigms have converged around lakehouse architectures, the ability to self-serve the data with provenance continues to be a challenge. In some sense, this is akin to the age-old data governance challenge. However, with the increasing speed and volume of data, this has emerged as a big impediment.”

Data quality will be “the most pressing issue shaping big data management in 2022,” agreed John Nash, chief strategy and marketing officer at Redpoint Global. “As we collect increasingly more data to utilize in business settings, the challenges around data quality multiply. While the concept of data quality is not new, its role in shaping customer experience has increased exponentially. Data hygiene, data governance, and data privacy have become issues beyond IT and are under the scrutiny of many enterprise stakeholders.”

The rise of “cloud-native unified data governance solutions will allow automated data discovery and data classification with end-to-end lineage,” resulting in sustainable ways to address the challenge, said Karanth. “Pairing this with business leaders who can own these data products will address the issues around trust and data usage.” To leverage this opportunity, “data and IT managers will need to look at data as an enterprise concern and put in governance mechanisms which help measure the state of the ‘data economy’ within the enterprise,” he added. With insight into how mature an organization is at making data-driven decisions, managers can prioritize the steps needed to improve the situation, Karanth noted.

Privacy and security are other areas that will increasingly be turned over to automation in the coming year. Data volumes are growing, but data is becoming harder to use and get value from due to heightened security and privacy requirements, according to Steven Touw, CTO at Immuta. This problem is exacerbated as organizations migrate to the cloud, because the homegrown controls they built up over the years around their on-premises systems are lost, he noted. Touw sees a rise in automated data governance technology, designed to enable organizations “to implement data access controls, ensuring sensitive data is only accessible by those authorized, meaning each user who queries data sees only the data they’re supposed to see for their approved purpose or role.” Such technology helps detect “sensitive consumer data such as first and last names, Social Security numbers, and address information, allowing companies to classify and tag the data.” This also helps organizations build data policies that comply with complex privacy regulations without having to manage them manually one table or column at a time, and to orchestrate those policies in a way that aligns with their existing DataOps workflows, Touw said.
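To make the idea concrete, the following Python sketch shows one simplified way that automated classification and role-based access might fit together: columns are tagged as sensitive, and a policy keyed on those tags decides what each role sees. It is an illustration under stated assumptions, not how Immuta or any specific product works. The detection rules (a Social Security number pattern plus column-name hints), the role names, and the masking value are all hypothetical.

```python
import re

# Assumed detection rules: a U.S. SSN format and simple column-name hints.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")
NAME_HINTS = ("first_name", "last_name")
ADDRESS_HINTS = ("address",)

# Hypothetical policy: which sensitivity tags each role may see unmasked.
ROLE_ALLOWED_TAGS = {
    "fraud_analyst": {"ssn", "name", "address"},
    "marketing": {"name"},
}


def classify_columns(rows: list[dict]) -> dict[str, str]:
    """Tag columns that look sensitive, based on values and column names."""
    tags: dict[str, str] = {}
    if not rows:
        return tags
    for column in rows[0]:
        values = [str(r[column]) for r in rows if r.get(column) is not None]
        lowered = column.lower()
        if values and all(SSN_PATTERN.match(v) for v in values):
            tags[column] = "ssn"
        elif any(hint in lowered for hint in NAME_HINTS):
            tags[column] = "name"
        elif any(hint in lowered for hint in ADDRESS_HINTS):
            tags[column] = "address"
    return tags


def apply_policy(rows: list[dict], tags: dict[str, str], role: str) -> list[dict]:
    """Mask every tagged column the role is not authorized to see."""
    allowed = ROLE_ALLOWED_TAGS.get(role, set())  # unknown roles see no tagged data
    return [
        {
            col: val if col not in tags or tags[col] in allowed else "***"
            for col, val in row.items()
        }
        for row in rows
    ]


if __name__ == "__main__":
    table = [{"first_name": "Ada", "ssn": "123-45-6789", "city": "Austin"}]
    column_tags = classify_columns(table)
    print(column_tags)
    print(apply_policy(table, column_tags, role="marketing"))
```

Because the policy is written against tags rather than individual tables or columns, a newly discovered sensitive column would be covered as soon as it is classified, which is the labor-saving effect Touw describes.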
