MarkLogic-Unisphere Research Study Examines Management of Big and Unstructured Data
As data continues to grow unabated, organizations are struggling to manage it more efficiently. By better leveraging their expanding data stores and making the information available more widely, organizations hope to put big data to work — helping them to achieve greater productivity and more informed decision making, as well as compete more effectively as a result of insights uncovered by analytics on their treasure troves of information.
Improving the management of big data is not something to consider addressing at some point in the hazy future — the big data challenge is already here, according to a new survey of 264 data managers and professionals who are subscribers to Database Trends and Applications. Organizations now have data stores ranging from the hundreds of terabytes to multiple petabytes, according to the study, which was conducted by Unisphere Research, a division of Information Today, Inc., in partnership with MarkLogic, in January 2012.
Respondents define big data as any large-size data store that becomes unmanageable by standard technologies or methods. Unstructured data — such as business documents, presentations, and social media data — is the most difficult part of the challenge, and respondents do not view traditional enterprise relational database systems as being up to that task. Three-fifths of respondents are examining new technologies to better capture and manage large volumes of data, while 42% are already investing in new solutions. Close to half are also reacting to emerging big data requirements by re-examining their data management processes. (See Figure 1.)
Figure 1: How Companies Are Addressing Big Data Challenges
Evaluating new technologies 60%
Re-examining data management processes 45%
Investing in additional technologies or infrastructure 42%
We are doing nothing at this time 15%
Hiring more or more specialized IT staff 12%
Creating cross-departmental team to address problem 12%
Don't know/unsure 10%
Hiring more consultants/outside contractors/support 9%
(Multiple responses permitted.)
In terms of data volumes that organizations are currently managing, the survey finds that 12% of respondents support more than a petabyte of data, and another 32% say they have data volumes in the hundreds of terabytes.
A key finding of the survey is that unstructured data in particular is growing rapidly and threatening to overwhelm organizations’ current data management systems. Respondents are also concerned that management does not fully understand the looming challenge and is failing to appreciate the significance of unstructured data assets to the business.
Asked how the volume of unstructured data across their enterprises will change over the next 3 years, 91% of respondents say they expect it to grow, and one-fifth expect this growth to exceed 50% of their current levels. Respondents are split as to whether unstructured data will eventually surpass the amount of structured or relational data in their enterprises. Half see this as happening either very soon or at some point over the next decade. About 16% say they are already managing more unstructured than structured assets, while at the other end of the spectrum, 40% say this probably will not occur at their organizations, and 10% don’t know if or when this might happen.
Respondents are confident that once big data is under control, the benefits to their organizations will be far-reaching. About three-quarters of the survey’s respondents (74%) say they would benefit from better reporting and analytics, and 63% say that predictive analytics would be enriched by big data. (See Figure 2.) The ability to address big data will also enhance decision making, say respondents. A majority agree that top-level business decisions, both strategic and operational, would be improved by getting a better handle on their big data assets.
Figure 2: Big Data Areas of Discovery of New Opportunities or Insight
Better reporting/analytics 74%
Predictive analytics 63%
Market research 34%
Don't know/unsure 12%
(Multiple responses permitted.)
As data volumes grow, three key strategies are being employed to better manage big and unstructured data, but with them come concerns. One is cloud computing, which is now in use or planned at roughly seven out of 10 respondents’ organizations. The leading cloud deployments in use or planned within the next 12 months are private clouds, cited by 43% of respondents; hosted solutions, cited by 26%; public cloud services, cited by 17%; and cloud-based database systems, cited by 16%. Overall, 68% of respondents are deploying at least one of these four cloud approaches. Security is cited by roughly half the respondents as a concern in moving to the cloud, and one-third say they have concerns about data availability.
A second strategy is the implementation of nonrelational databases. These are being deployed at the core, not just the fringes, of enterprise data sites, and most prevalent are NoSQL and in-memory databases. NoSQL or nonrelational databases or technologies are being adopted by more than two-fifths, or 41%, of the survey respondents, and one-fifth are opting for in-memory databases to address their growing data demands. Skill requirements represent respondents’ greatest concern about NoSQL and nonrelational deployments.
Finally, about seven out of 10 respondents (71%) say they use or plan to use open source technologies in their environments in the year ahead. Open source is seen as offering greater flexibility to enterprises with unrestrained licensing, more standardized and open solutions, and the support of an active community of developers and users. While open source presents a compelling value, respondents nonetheless voice apprehension about the possibility of relinquishing some of the support and security they have come to expect from commercial products.
To access the full 46-page report, “Big Data is Real and It is Here — 2012 Survey on Managing Big and Unstructured Data,” go to the MarkLogic website at http://info.marklogic.com/post-relational-reality-dbta-survey-2012.html.