RESEARCH@DBTA: Survey Reveals the Pervasiveness of ‘Dark Data’

Existing data infrastructures are crumbling under the weight of data, leading to unsustainable energy consumption, increased management complexity, and declining security.

That’s the word from a recent survey of 1,288 executives, released by Hitachi Vantara. The survey also finds 76% feel their current data infrastructures “will be unable to scale to meet upcoming demands”—such as AI. Another 61% report they are simply “overwhelmed” by the amount of data they manage. By 2025, the report’s authors predict, large organizations will be storing more than 65 petabytes (PB) of data.

Defining Dark Data

IT executives estimate that they don’t have control of half of the data flowing through their enterprises.

This “dark data” is defined by the survey’s authors as data that’s collected and stored but is never used. Another 57% agree that more attention needs to be paid to the sustainability impacts—carbon footprints—of storing unused dark data.

The survey’s authors estimate the total amount of dark data in larger organizations at about 17PB. This estimate is related to the finding that 75% of data executives “save everything, then never touch half of it.” In a related statistic, 44% state that “no one in their organization knows all the data they are collecting and storing.”

Raw storage capacity is also hitting its limits, the survey shows. The Hitachi survey finds data storage needs are expected to double in 2 years. And while cloud is a natural solution to these requirements, it is “not a silver bullet,” the survey’s authors state. That’s because almost half of data center workloads, 49%, are expected to remain on-prem, either in traditional on-prem systems or in private clouds. Another 27% of data center workloads will be in public clouds by 2025, and another 21% will be co-located.

Security Concerns

Data silos remain a major problem across the enterprises in the survey. Close to one-third, 29%, state that much of their data is still locked away and inaccessible through various enterprise systems, such as CRM. The speed at which data is available also vexes many organizations: 33% say their users complain that it takes too long to access the information they need, and 34% say it takes too long to execute system processes.

Security is another concern associated with the data explosion, and the survey found yawning gaps in data protection. Seven in 10 executives admit they would not be able to detect a data breach if it happened. Another 68% state they lack the resilience to survive ransomware attacks. Even more surprising, 22% of executives admit their most critical data is not backed up.

Another 18% state they lost data over the past 2 years because it was corrupted. Furthermore, only 29% are confident their organization’s employees are following their security policies.

Data and System Modernization

“Data remains the number-one business asset,” the survey’s authors point out. “To put it to work, companies need to address some big challenges.” The key is to move forward with data and system modernization efforts. Data needs to be part of any digital transformation initiative, with 25% of executives agreeing that updating and modernizing their data infrastructures needs to be part of any data modernization effort. Practices such as DataOps and AIOps need to be part of this process.

As this survey shows, much work remains to be done to fully leverage the data required to succeed in today’s digital economy. Ultimately, data teams need to focus on providing greater visibility into, and integration of, the data that surges through their enterprises.