Cloud technologies and frameworks have matured in recent years, and enterprises are starting to realize the benefits of cloud adoption, including savings in infrastructure costs and a pay-as-you-go service model similar to that of Amazon Web Services. Here is a look at the cloud market and its convergence with the big data market, including key technologies and services, challenges, and opportunities.
Evolution of ‘Cloud’ Adoption
The technologies, platforms, and services available in the 1990s and early 2000s resembled the “cloud” offerings of the last decade. We had distributed systems with Sun RISC-based servers and workstations, IBM mainframes, millions of Intel-based Windows desktops, Oracle database servers (including grid computing in Oracle 10g), and J2EE N-tier architecture. There were application service providers (ASPs), managed service providers (MSPs), and internet service providers (ISPs) offering services similar to today's cloud offerings. So what has changed?
Two significant events triggered the emergence of cloud offerings and their adoption. The first was the scale-out of Amazon Web Services (S3, EC2, RDS, SQS) and the development of IaaS (infrastructure as a service), once Amazon had realized the benefits of its offering for its own internal use. The second was Google's search engine, advertising platform, and Bigtable, together with the realization that millions of nodes built on cheap commodity hardware could be harnessed through MapReduce and other frameworks to distribute a search query and return results in milliseconds, a response time unheard of even with mainframes.
In the mid-2000s, traditional telecom and mobile phone service providers saw that they needed to move to scale-out platforms (the cloud) to manage their mobile customer base, which grew from a few million to a billion (a factor of up to 1,000). Mobile data grew from a few terabytes to petabytes, so these providers needed newer scale-out platforms and wanted on-premises as well as hybrid cloud deployments.
The creators of Hadoop ran TeraSort benchmarks on large clusters of nodes to determine the benefits of MapReduce frameworks. This led to the emergence of Hadoop cluster distributions; NoSQL data stores such as columnar, document, and graph databases; and massively parallel processing (MPP) analytical databases. An ecosystem of vendors emerged to reap the benefits of scale-out cloud infrastructure, MapReduce frameworks, and Hadoop and NoSQL data stores. Applications included data migration, predictive analytics, fraud detection, and data aggregation from multiple data sources.
This paradigm shift addressed the key issue of scale, as well as the handling of unstructured data, a capability that traditional relational databases lacked. It came about because of the availability of commodity hardware and of frameworks for running massively parallel data processing across clusters of nodes, including distributed file systems, high-performance analytical databases, and NoSQL data stores for handling unstructured data.
The Drivers for Hybrid Cloud Adoption
For enterprises adopting the hybrid (public/private/community) cloud pay-as-you-go model for IaaS, PaaS, and SaaS deployments, the key drivers are cost, flexibility, and speed (the time needed to set up hardware, software, and services). The primary use cases for the hybrid model include data migration, fraud detection, and the management of unstructured data in real time.