The Clash Between Data Quality and AI: Unisphere’s Latest Findings

Data quality issues have long loomed over enterprises of every kind, often surfaced by the proliferation of new data analytics and AI projects that depend on good data to succeed. In a recent survey, only 23% of DBTA subscribers expressed full confidence in their data, and close to one-third said that data quality is a constant, ongoing issue.

This statistic, paired with the continuing rise in AI adoption, points to the potential for massive failure on the part of data-driven enterprises, complicated further by organizational inability to manage complexity, secure internal support, and accurately determine ROI.

Joe McKendrick, research analyst at Unisphere, and Robert Stanley, senior director of special projects at Melissa Informatics, joined DBTA’s webinar, The State of Data Quality: The Elephant in the AI Room, to explore both the recent findings of Unisphere’s latest data quality survey and crucial insights that can dictate the success of data quality initiatives in the modern enterprise.

McKendrick emphasized the great disconnect of low-quality data: With low or no data quality, there is low or no data trust. Without trust, AI and intelligent operations cannot function.

“With AI, people are coming at it from a lot of different directions. [And] something very key to the success of AI [is] data quality,” said McKendrick. “Data quality has always been very important…now, it’s become incredibly important. In the past few years, it’s essentially become a necessity for any organization that wants to stay in business and wants to thrive.”

Stanley then defined data quality, explaining that it refers to the accuracy, consistency, and completeness of data.

Where does quality suffer? According to Stanley, it’s often negatively impacted by inaccurate information, outdated contacts, incomplete records, and duplicate records.
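The issue types Stanley lists can be made concrete with a small profiling check. The sketch below is illustrative only, not from the webinar; the field names, sample records, and staleness cutoff are all assumptions.

```python
# Minimal sketch of the data quality issues described above:
# duplicate records, incomplete records, and outdated entries.
# Records, field names, and thresholds are hypothetical.

from datetime import date

records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2023, 6, 1)},
    {"id": 2, "email": "bob@example.com", "updated": date(2019, 1, 15)},
    {"id": 3, "email": "ann@example.com", "updated": date(2023, 6, 1)},  # duplicate of id 1
    {"id": 4, "email": None, "updated": date(2023, 2, 10)},              # incomplete record
]

def profile(records, stale_before=date(2021, 1, 1)):
    """Count duplicate, incomplete, and stale records in one pass."""
    seen = set()
    duplicates = incomplete = stale = 0
    for r in records:
        key = r["email"]
        if key is None:
            incomplete += 1   # required field is missing
            continue
        if key in seen:
            duplicates += 1   # same email already recorded
        seen.add(key)
        if r["updated"] < stale_before:
            stale += 1        # record not refreshed since the cutoff
    return {"duplicates": duplicates, "incomplete": incomplete, "stale": stale}

print(profile(records))  # -> {'duplicates': 1, 'incomplete': 1, 'stale': 1}
```

In practice, checks like these run as part of a data pipeline rather than ad hoc, so degradation is caught before it reaches analytics or AI workloads.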

To support these claims, McKendrick introduced Unisphere’s latest data quality survey, conducted in September 2023 with 202 participants. Their roles ranged from IS/IT staff to CIOs, CTOs, VPs of IT, DBAs, and data architects, across enterprises of varying sizes and all industries.

McKendrick and Stanley highlighted various survey findings that illustrated the relationship between data quality, its degradation, and AI.

The report found that the discovery of data quality issues through AI projects has increased by 14% since 2021. Despite new analytics projects surfacing problems with data quality, about 22% of respondents reported that their last formal data quality initiative was 1-2 years ago. Furthermore, the share of respondents naming data quality a top priority for the ongoing success of projects or initiatives has decreased by 15% since 2021, now standing at a low 35%.

Various obstacles stand in the way of data quality initiatives in the enterprise. According to the survey, 58% of respondents cite project complexities and unknowns as their biggest data quality challenge, followed closely by lack of internal support (50%) and an inability to accurately calculate expected ROI (45%).

McKendrick and Stanley went on to discuss the implications that cloud and multi-cloud environments, automated data entry, third-party data, and external data have for data quality.

For an in-depth review of Unisphere’s latest data quality survey, you can view an archived version of the webinar here.