With Modernization Come Data Challenges

Modernization is driving many of today’s enterprise data strategies, and cloud stands out as the primary vehicle for attaining it. However, many enterprises are struggling with data quality issues, as well as with integrating cloud-based and on-premise data.

That’s the word from a recent survey of 1,840 data executives and professionals, released by Progress (“The 2020 Data Connectivity Survey Report”). The 2019 survey collected input from respondents across more than 13 distinct industries worldwide to identify patterns and insights for ongoing data management strategies.

More than eight in 10 respondents indicated they are in the midst of a modernization effort. The data world is evenly divided between cloud and on-premise systems, the survey found. A majority, 56%, indicated they have undergone or are undergoing a cloud migration to a public service provider as part of their data modernization effort. At the same time, close to half of respondents, 49%, are modernizing via a legacy systems migration or update. Another 36% are turning to a microservices architecture. At least 16% indicated they have no modernization projects underway or were not sure whether they did.

“As organizations are assembling their system modernization strategies, the boundaries around the enterprise information environment are dissolving,” the survey’s authors stated. The on-premise data center protected by systemic perimeter security is evolving into a hybrid environment spanning on-premise, multiple cloud platforms, and various SaaS and PaaS services, the report revealed, noting “that hybrid architecture is bound to be the norm for the next decade, if not longer.”

Modern Data Management

The survey identified the following as three key elements shaping modern data management:

  • Open source frameworks for high performance computing “leverage commodity components, including Apache projects Hadoop, YARN and Spark, drastically lowering barriers to entry for all businesses to implement big data infrastructures for operations and analytics.”
  • The economics and simplicity of cloud computing, “bolstered by the growing array of cloud provider host-native services, has influenced many organizations in migrating their data and applications to the cloud.”
  • Better end-user reporting and analytics tools, coupled with increased end-user sophistication, “have resulted in the emergence of the citizen analyst role. The citizen analyst is a business-oriented problem solver who is knowledgeable in the ways that analytics and machine learning algorithms are wielded to identify profitable business opportunities.”

Big Data Challenges

The reach of big data is creating difficulties for many enterprises, however. The mechanical aspects of modernization and cloud migration are still affected by familiar data issues, and data quality in particular is a growing challenge. The share of respondents citing data quality, veracity, inconsistency, and incomplete data rose from 14% three years ago to 44% in the current survey.

Data sprawl is another leading challenge, cited by 40%. At least 38% said it is problematic to integrate cloud data with on-premise data. Data volume, velocity, and variety are vexing for at least 30% of enterprises.

Interestingly, the survey found that 31% of the respondents are not using any “big data” platform, while 12% simply are not sure. “This suggests that there is still a large number of organizations that have not yet identified a significant role for big data as part of their data environment,” the report stated.

Among the platforms that were selected, Amazon S3 (17%) and Hadoop Hive (15%) were the most frequently chosen products in both the 2018 and 2019 surveys. The relative popularity of these choices reflects two different big data adoption strategies: taking advantage of scalable object storage (S3) or relying on traditional Hadoop ecosystem components for structured data analysis (Hadoop Hive).

Databases on the Rise

The Progress survey also tracked database brand adoption, which is led by Microsoft SQL Server (57%), followed by MySQL (41%), Oracle (37%), PostgreSQL (24%), and Microsoft Access (22%). At least 68% of respondents also use NoSQL databases, and 29% currently employ graph databases. Graph databases represent “another alternative NoSQL approach” that has gained attention, the report pointed out, explaining that “their storage models and data representation enable more sophisticated analytics.”

The survey also tracked time-series databases, another class of non-relational databases, which are still in their infancy, but used in some form by 25% of respondents. “As Internet of Things and other real-time data streams—either from automated machine-generated data, sensors, and devices embedded within both residential and commercial smart-environment technologies, autonomous vehicles and sensor-connected manufacturing machinery—become more commonplace, there will be a growing need for database systems that can scale to support real-time ingestion of high-speed streaming data,” the report’s authors concluded.

Regulatory Compliance Concerns

Another consideration having a high impact on database management is the growing number of data protection mandates and regulations. The percentage of respondents focusing on meeting the EU’s GDPR requirements has doubled over the past year, from 15% to 30%. “With GDPR’s meteoric rise coupled with increases in the ISO 27001 standard for information security, HIPAA/HITECH and the Payment Card Industry Data Security Standard (PCI DSS), it is clear that the criticality of sensitive data protection cannot be denied,” the survey report stated.