Hybrid Cloud Analytics and Strategies: Q&A with Alluxio's Haoyuan Li

The prospect of managing data across the data center and public cloud platforms can be daunting to enterprises. There are concerns about hybrid latency affecting analytic workloads, and many companies can't store data in a public cloud for a variety of legal and regulatory reasons. However, the cloud offers flexibility, elasticity, and scalability benefits that are hard to beat, making migration of at least some workloads highly desirable. Recently, Haoyuan Li, founder, chairman, and CEO at Alluxio, Inc., shared his views on what's at stake now and how organizations can overcome hybrid and multi-cloud challenges to take advantage of cloud migration for cost benefits and agility.

DBTA: How do you see on-prem/hybrid/multi-cloud use now?
Haoyuan Li: Infrastructure is increasingly being spread across geographical locations and even multiple cloud providers. Cloud migration is ubiquitous in some shape or form, and environments completely on-premise are becoming hard to maintain. The flexibility of public cloud environments is driving migration in order to reduce operational costs and increase business agility.

DBTA: What are the key trends in adoption?
HL: There is a clear trend of increased adoption of hybrid and multi-cloud solutions, where customers run analytics and AI workloads in public cloud environments without abandoning their data lakes on-premise. However, the re-architecture and the initial toolset adopted often make this process slow. Container and data orchestration technologies are now being leveraged to abstract out the infrastructure so that applications can run in any environment without modifications.

DBTA: What is causing this?
HL: The hybrid and multi-cloud model is tailored to the needs of an organization for agility while keeping infrastructure costs low. There are a few reasons for increased adoption. One case is when the computing infrastructure on-premise is running out of CPU resources to meet user SLAs (service level agreements). A public cloud is attractive for elastic provisioning to reduce costs for ephemeral or bursty workloads. The data itself may stay out of public cloud storage, either to leverage the existing on-premise infrastructure or for compliance reasons. Another case is when storage clusters on-premise are overloaded and fail to scale. Partial migration to cloud storage addresses storage needs while enabling cloud native computing.

DBTA: What are companies’ concerns as they formulate their strategies for on-prem, in the cloud, and multi- and hybrid cloud deployments?
HL: Infrastructure, management, and operational costs are huge concerns in what can seem to be a complex environment. Network latency and the lack of data locality when compute and storage are spread across locations are reasons for skepticism. Security integration and access control at the desired granularity are another focus area.

DBTA: What are the solutions or technologies that are helping organizations successfully manage their data landscapes?
HL: Hybrid and multi-cloud environments for data analytics have data spread across storage silos being accessed by multiple compute clusters. When devising an infrastructure strategy, considerations include the need to re-write applications, managing the movement of data across silos, and future-proofing the architecture by preparing for an infrastructure with components on-premise and across multiple clouds. Abstraction at all layers of the technology stack is being employed. Container orchestration future-proofs the application layer so that workloads can be migrated across infrastructure providers when needed. However, moving data is not immune to network and storage costs. A data orchestration layer alleviates this issue by decoupling applications from the location of data.

DBTA: What new approaches do organizations need to embrace to be successful?
HL: The use of infrastructure policies across the board is key to keeping management costs low while optimizing for resource utilization. Each customer’s scenario is unique with different usage patterns. Elasticity in the cloud demands auto-scaling policies that control when and for how long compute resources are used. The use of data management policies is an approach to dictate the location and lifetime of data without manual intervention. This technique addresses the need for on-demand access across locations while minimizing network traffic to optimize cost and performance.

DBTA: How has the COVID-19 health crisis affected these trends over the last 6-9 months?
HL: Infrastructure costs have been even more of a focus over the last several months. For some of our customers, certain intermediate steps toward cloud migration were eliminated to further accelerate adoption of hybrid and multi-cloud models. The recent challenges have made resolution of the most prevalent infrastructure issues even more urgent, and have made a radical new approach more attractive than incremental improvements.

DBTA: Looking ahead, what are three key trends you see taking hold?
HL: The increased use of container and data orchestration as the new standard across hybrid and multi-cloud environments is one. A unified security model across a complex environment is another crucial layer of the technology stack. And then, with all these solutions in place, we will see even more automation of complex workflows to manage resources across regions with a unified view.