Hybrid Databases Rearrange the Data Deck

We’ve reached the point where hybrid cloud arrangements have become commonplace in enterprises, and with this trend come implications for databases and data management. Hybrid cloud scenarios are now seen at a majority of enterprises, according to a survey of 229 data managers and professionals conducted by Information Today, Inc., in partnership with Oracle. Of those surveyed, 54% are working with hybrid cloud environments, compared to 40% working exclusively with private cloud settings (“The New Data Management Landscape: 2018 IOUG Special Report on Data Management Trends”).

“Most commonly, now companies have a cloud-first strategy, in which all new applications should go to the cloud,” said Caio Milani, general manager of the cloud group at MarkLogic. “This means that most companies are moving to the cloud but not all legacy systems [can move] at once, which leads to a hybrid environment while they make this move.”

Hybrid clouds—which match the power and flexibility of public clouds with the security and local control of private clouds or on-premise systems—have become “the most popular form of deployment for new and re-deployed applications,” said Diane Clay, senior product marketing manager of cloud solutions and services for Hitachi Vantara. “Hybrid should be part of the cloud strategy and cloud roadmap for any decent-sized enterprise,” added Sash Sunkara, CEO and co-founder of RackWare.

However, there is also confusion in the market—enterprises may not recognize their implementations as hybrid clouds, said Ken Rugg, chief product and strategy officer at EnterpriseDB. “Nearly all enterprises of any size now use both public and private clouds of some sort,” he pointed out. Multi-cloud, or using multiple public cloud providers simultaneously, is in vogue, but may also defy standardized definitions.

It boils down to the fact that enterprises are no longer relying on one single platform or computer resource, explained Larry Socher, managing director and offerings lead of intelligent cloud and infrastructure at Accenture. “We have very few clients with 100% of their workloads on a public cloud,” he noted. “Most of our clients are adopting a multi-cloud strategy, which includes multiple SaaS, PaaS, and IaaS public providers with hybrid application and data placement across public and private cloud.”

Location, Location, Location

The rise of both hybrid and multi-cloud platforms means data needs to be managed in new ways, industry experts point out. There are lingering questions about which data should go into the cloud, and which should stay on-premise.

“This depends on the organization itself,” said Lyndsay Wise of Information Builders. “All data can be housed in the cloud. Some businesses worry about data privacy, security, or regulatory compliance. Cloud vendors build their platforms ensuring these concerns are addressed, whereas on-premise environments cannot always ensure that is the case. Organizations need to evaluate costs and levels of managed services in order to determine what works best in the cloud versus on-premise.”

Often, too much attention is paid to the application, and not enough to the data. “When organizations plan where to run their workloads—whether on a public cloud provider, on-prem, migrated to a third-party software service, or some combination thereof—the focus is often on the application itself,” said Gordon Haff, technology evangelist at Red Hat. “The more critical question frequently relates to the data the application uses.”

There’s a need to let data stay where it is as hybrid clouds develop. “The most efficient way to manage a hybrid data infrastructure is to leave the data where it naturally belongs, moving data processing and analysis to the data, rather than the other way around,” said Emma McGrattan, senior vice president of engineering at Actian. “This avoids unnecessary data duplication, simplifies data governance, and is cost-effective from a data storage, data transfer, and data management perspective. Organizations with data subject to security and compliance requirements, such as healthcare and financial services, often prefer to keep sensitive data on-premise to directly control the security and governance of those data assets.”

The location of data—or data gravity—is highly relevant to hybrid strategies. “Data locality is critical from a performance perspective, and the data should still be co-located as close as possible to the servers that are actually using it and updating it,” said Todd Matters, chief architect and co-founder of RackWare. “Putting data where it can be accessed by the most number of servers is really the most important consideration. With the hybrid cloud, you have plenty more options available to you—along with the opportunity to attune your storage options to your performance and your cost objectives.”

The data gravity issue will be a bottleneck for some time to come, “which means your data architecture needs to be adjacent,” said Socher. This may mean moving more resources to public clouds to achieve such locality. “A lot of the placement will be dictated by the network’s ability, or inability, to move data around. Some of our clients are building private clouds in colocation facilities to get out of data centers, enabling cloud adjacency and minimizing the impact of data gravity. If I put my private cloud with SAP HANA in an Equinix facility, now I have bigger pipes with lower latency to get to public cloud providers like Amazon, Azure, and Google Cloud Platform. It’s easier to share resources across different environments for analytics and AI.”

Part of this equation is the maintenance and eventual storage of data. “Hot data, or data that an organization would access over the course of a week, needs to be available either on-prem or nearby at the edge to ensure performance and availability,” said Laz Vekiarides, CTO and co-founder of ClearSky Data. “That doesn’t mean that the data has to live on-prem or at the edge permanently, but it does at least need to be cached there. There’s just not enough bandwidth in the world to eliminate latency if it has to be pulled down from a data center hundreds of miles away. The cloud is a great place to store cold data, of course, but it’s also perfect to store the master copy of data, due to its resiliency, redundancy, and infinite capacity, as long as data can be intelligently cached at the edge or on-prem.”
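The caching arrangement Vekiarides describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the `EdgeCache` class is hypothetical, a plain dictionary stands in for the cloud-hosted master copy, and least-recently-used eviction is one simple way to decide which data counts as "hot."

```python
from collections import OrderedDict


class EdgeCache:
    """Hypothetical read-through, write-through cache kept on-prem or at the edge."""

    def __init__(self, master, capacity):
        self.master = master        # dict standing in for the cloud master copy
        self.capacity = capacity    # maximum number of objects held locally
        self.cache = OrderedDict()  # ordered from coldest to hottest
        self.hits = 0
        self.misses = 0

    def _evict_if_full(self):
        # Drop the least recently used (coldest) entry; the master still has it.
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        # Slow path: pull the object down from the master copy and keep it nearby.
        self.misses += 1
        value = self.master[key]
        self.cache[key] = value
        self._evict_if_full()
        return value

    def write(self, key, value):
        # Write-through: the master copy in the cloud is always updated.
        self.master[key] = value
        self.cache[key] = value
        self.cache.move_to_end(key)
        self._evict_if_full()


master = {"a": 1, "b": 2, "c": 3}
edge = EdgeCache(master, capacity=2)
for key in ["a", "b", "a", "c"]:
    edge.read(key)
```

In this run the second read of "a" is served locally, while "b" is evicted once "c" arrives. Nothing is lost, because the master copy still holds every object.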

Clay suggested that data be managed as a single pool of resources through a single open platform. However, she added, “cloud management platforms had shown great promise for this task, but in practice added a very heavyweight management layer to IaaS and did little to support flexibility and agility.” Another option, she continued, “is to make the on-premise infrastructure identical to the public cloud PaaS environment so that there is little operational distinction between the two.” While this “may work well in a single vendor environment, it can add complexity in a multi-cloud environment.”

Deciding which data should go to the cloud and which should remain on-premise is a choice driven by a number of variables. For remaining on-premise, “the obvious cases include data sovereignty, compliance, or security issues,” said Kevin Hannah, director of product operations for Kazuhm. “Can I move my customer data across geographical borders? Am I able to do radiology image analysis anywhere else but in my hospital? Is the cloud going to protect my latest movie against piracy? But with the advent of the edge, then latency becomes a major consideration. This is the case, for example, in Industry 4.0 initiatives in manufacturing where assembly-line sensors are separated from public cloud, private cloud, or other on-premise systems to run machine learning inference locally.”

Aggregated machine learning processing is perhaps best done at the access edge (last mile) or infrastructure edge (local data centers) to feed decisions to local supply chains, Hannah said. Centralized aggregation of data, perhaps in the cloud, is better suited to the heavy lifting of machine learning model training, which carries an additional usage cost. “Data is going to be located where it can be best put to use. Location, location, location.”
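The placement considerations raised here can be expressed as a simple rule set. The Python sketch below is illustrative only: the `Dataset` fields, the `suggest_placement` function, and the order in which the rules are checked are all assumptions made for the example, and a real decision would also weigh cost, network capacity, and the other factors the experts mention.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # All fields are hypothetical flags chosen for this example.
    name: str
    sovereignty_restricted: bool = False  # cannot cross geographical borders
    compliance_restricted: bool = False   # regulated, e.g. healthcare or finance
    latency_sensitive: bool = False       # e.g. assembly-line sensor inference


def suggest_placement(ds: Dataset) -> str:
    """Return a suggested location: 'on-premise', 'edge', or 'cloud'."""
    # Assumption: sovereignty and compliance constraints take priority.
    if ds.sovereignty_restricted or ds.compliance_restricted:
        return "on-premise"
    # Latency-bound workloads stay close to where the data is produced.
    if ds.latency_sensitive:
        return "edge"
    # Everything else, such as aggregated training data, can go to the cloud.
    return "cloud"


placements = {
    ds.name: suggest_placement(ds)
    for ds in [
        Dataset("radiology-images", compliance_restricted=True),
        Dataset("sensor-stream", latency_sensitive=True),
        Dataset("training-archive"),
    ]
}
```

Checking the regulatory flags first is a design choice for the sketch; an organization with different priorities could reorder or extend the rules.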

The process of DataOps—the automation and provisioning of data on a continuous basis—is helping to change the equation, Clay said, adding that “enterprises need to be able to look at this through a DataOps lens,” meaning data and applications need to be close to where they are needed. They also should “be able to easily change when business requirements change—and different solutions may be better suited for certain tasks than others.”

Locking Down the Cloud

Hybrid cloud also changes the security equation for data, industry experts concur. In a hybrid environment “where you have more interconnection, you need to be more careful about your networks and how your data moves across those networks,” said Matters. “Back when we had regular data centers, there was one input and one output. But now, with hybrid cloud, there could be three different clouds that are all interconnected. You need a very good strategy to make sure your data is locked down, none of the entry points can be breached, and that everything is highly encrypted.”

Data security, always a key concern for IT, can become a larger issue with hybrid cloud scenarios. “The mix of on-premise data, private cloud, and public clouds, location and infrastructure beyond the firewall for corporate data by definition increases the IT complexity and risk, particularly for companies who haven’t yet automated monitoring for security and compliance.” Hybrid cloud “can make it a bit harder to manage security and governance since the tools and process may differ,” Milani added. “Because of the complexity that hybrid brings, some companies are moving everything to the cloud in order to have a single security paradigm.”

Where data resides “is sometimes framed as a security question,” Haff said. “But that’s mostly not the right way to look at it, at least if we’re talking about security in the narrow sense. Whether you’re running your own infrastructure or renting it by the minute from a cloud provider, you’re still responsible for the security at the application level. Furthermore, ‘classic’ infrastructure security (think networking configurations and access control policies) may actually be better handled by the specialists at a cloud provider than by a small IT team that lacks dedicated security expertise.”

As a result, the onus for security is increasingly moving to public cloud providers, who are better equipped to manage issues. In hybrid settings, much of the security may be moved to the public cloud side, Socher predicted. A few years ago, “our clients started to realize that public clouds can be as secure as the private cloud,” he said. “Given the talent and scale of the major providers, they have more cloud skills and talent than our clients. Security is becoming less of a factor. Regulatory reasons will play a larger role than security for applications and data remaining in private clouds.”

The Business Perspective

There are many business benefits now apparent as mixed-cloud and on-premise environments emerge. “You have the ability to match your IT needs to the environment that meets those needs, but nothing more,” said Matters. “In the past, a lot of environments were highly over-provisioned. If you wanted to buy a server then, you had to buy it powerful enough, just in case you needed it later. With hybrid cloud, you don’t need to do that anymore. The resources are much more fungible. You can deploy things on an as-needed basis. You can implement features like auto-scaling and auto-parking in those clouds. In the end, it takes more thought and more management, but it’s a much less expensive proposition.”

Another key advantage is the transitional role hybrid cloud plays. “A hybrid cloud architecture can help organizations transition toward the cloud without a major infrastructure overhaul or posing significant competitive risk,” said Helena Schwenk, global analyst relations manager at Exasol. “For example, organizations can use scalable public clouds for dynamic data workloads, while leaving less volatile or more sensitive data workloads for a private cloud or on-premise data center. Other architectural considerations also factor in the decision-making process. An organization may decide for performance reasons that it makes more sense for the analytics database to reside on-premise but for the analysis tool that connects to it to reside in the cloud.”

Disaster recovery is another area where hybrid cloud is playing a big role. “Our clients increasingly use private and public to back each other up,” said Socher. “Rather than have two private clouds on the east and west coasts back each other up, you could use AWS West to back up your data center in Maryland and close the DR data center. Examples like this illustrate why hybrid is now the norm.”

The flexibility that hybrid cloud enables can be a compelling value proposition, said Lyndsay Wise, director of market intelligence at Information Builders. “Organizations can now choose how they want to manage their data (costs, usage, storage, etc.) and which data should be in the cloud versus on-premise. For instance, many platforms now offer pay-per-use pricing so that costs and risks are lower than an on-premise environment. Organizations can create a development environment or sandbox without the risk of broader platform and/or infrastructure costs and can pay for the space they use when they use it.”

The newly found ability to “harness all data assets to provide a 360-degree view of the customer is a dream now being realized by enterprises that have integrated all their data assets across a hybrid data infrastructure,” McGrattan pointed out. “Hybrid data integration and hybrid data analytics technologies are the key to realizing this dream without breaking the bank and without sacrificing data security. Data assets that need to remain on-premise can be seamlessly merged with data generated or managed in the cloud.”

The future is “undoubtedly hybrid and enterprises need a clear set of blueprints for their future data architecture,” McGrattan added. “Balancing and integrating on-premise, private cloud, and public cloud data is fundamental to delivering a successful hybrid environment.”