The Intersection of More Data and Less Complexity: Strategies for Optimized Cloud-Based Data and Analytics

Data, data, data: there’s more of it every day, and it’s more complex than ever before. Having the tools to manage and use that data is necessary; ensuring you extract maximum value from those systems and tools is critical.

DBTA recently held a webinar, “Unlocking the Value of Cloud Data and Analytics,” featuring Todd Wright, global product marketing for DataDirect Solutions at Progress; Rachel Pedreschi, VP of technical services at Ahana; Mike Frasca, field chief technology officer at Reltio; and Chris Santiago, VP of solutions engineering at Unravel Data. The speakers explored strategies and methodologies for optimizing the value and cost of cloud data while lightening the workload of analytics teams.

Wright highlighted Progress DataDirect Connectors as a comprehensive solution to common challenges in implementing BI and data analytics projects. The seemingly endless number of data sources available today brings a seemingly endless number of roadblocks. Progress DataDirect Connectors address the core challenges: a lack of built-in connectivity, exposure of private information, changing data source APIs and schemas, accessing on-premises data from cloud BI tools, missing audit trails, and inconsistent identity across data sources. A single Progress DataDirect Connector can access data securely through existing authentication systems, expose it safely to cloud applications, and help track data usage to keep analytics processes efficient and optimized.
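The payoff of standards-based connectors is that application code is written once against a uniform interface, and the driver underneath can be swapped per source. The idea can be sketched with Python’s DB-API, using the built-in sqlite3 driver as a stand-in for a vendor connector; the table, query, and function names below are illustrative assumptions, not part of any Progress API.

```python
# Conceptual sketch: report code talks to a uniform DB-API interface;
# the driver underneath (here sqlite3, standing in for an ODBC/JDBC
# connector) can change without touching the report logic.
import sqlite3

def run_report(connect, query):
    """Run a read-only query through any DB-API-compliant driver."""
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()
    finally:
        conn.close()

# Stand-in data source; in practice `connect` would come from the
# driver for Salesforce, Oracle, a REST endpoint, etc.
def make_demo_source():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 100.0), ("west", 250.0)])
    conn.commit()
    return conn

rows = run_report(make_demo_source, "SELECT region, amount FROM sales ORDER BY region")
print(rows)  # [('east', 100.0), ('west', 250.0)]
```

Because `run_report` only depends on the DB-API contract, adding a new data source means supplying a new `connect` callable, not rewriting analytics code.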

Pedreschi zeroed in on a particular strategy for saving money in cloud-based analytics: the open SQL data lakehouse. Ahana Cloud for Presto is an AWS-based service that deploys the open data lakehouse within an enterprise’s infrastructure, where the enterprise only needs to provide the S3 bucket. Ahana provides the rest: metadata tables through Hive Metastore or AWS Glue; authorization and access control through open source projects such as Apache Ranger or AWS Lake Formation; and Presto’s massively scalable, distributed, in-memory engine, which lets users write queries not just against data lake files but against other databases as well. Layer by layer, this approach lowers the cost of the cloud-based analytics that today’s overwhelming production of data demands. Ahana Cloud is built for data teams of varying experience, offering meaningful control over deployment without complexity, along with dedicated support from Presto experts.
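The lakehouse layering described above can be reduced to a small conceptual sketch: a catalog (standing in for Hive Metastore or AWS Glue) maps table names to file locations and schemas, and the query engine (standing in for Presto) consults the catalog before scanning the files. The names and structures here are illustrative assumptions, not Ahana’s implementation.

```python
# Minimal sketch of lakehouse layering: catalog -> location -> scan.
import csv, io

# "Metastore": table name -> (file location, column names).
catalog = {
    "orders": ("s3://demo-bucket/orders.csv", ["order_id", "total"]),
}

# Stand-in for the S3 object store: path -> raw file contents.
object_store = {
    "s3://demo-bucket/orders.csv": "1,19.99\n2,5.00\n",
}

def scan_table(name):
    """Resolve a table via the catalog, then read its backing file."""
    location, columns = catalog[name]
    reader = csv.reader(io.StringIO(object_store[location]))
    return [dict(zip(columns, row)) for row in reader]

rows = scan_table("orders")
print(rows)  # [{'order_id': '1', 'total': '19.99'}, {'order_id': '2', 'total': '5.00'}]
```

Keeping the catalog separate from the storage is what lets the engine federate: the same resolution step can point at data lake files or at other databases.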

Frasca opened with a staggering statistic: according to research from Gartner, the average enterprise runs almost 450 distinct applications, each with its own database, its own data, and its own version of the truth. That is an enormous number of data systems to manage and control, and doing so with efficiency and efficacy is an even more arduous goal. Frasca made clear that modern enterprises need to consolidate data, as well as have a means of understanding its context across enterprise functions. Unmanaged data leads to poor business decisions: data synchronization is irregular and resource-intensive, low-quality data creates inaccurate insights, and data aggregation is highly manual and error-prone. Frasca pointed to Reltio’s Cloud MDM (master data management) service as an avenue for unifying data sources, aggregating and consolidating them so that they are insight-ready for analytics systems. Through capabilities such as automated matching that builds and curates connections and groupings of entities, stewardship workflows and reporting that manage policy enforcement, and a central source of truth backed by a repository of lookup data and standard vocabulary, Cloud MDM lets data analysts focus on driving insights for better business decisions rather than on grueling, manual data tasks.
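The automated matching and consolidation described above can be sketched in miniature: records from different applications are normalized, grouped by a match key, and merged into a single “golden” record. The normalization rules and field names below are illustrative assumptions, not Reltio’s actual matching algorithm.

```python
# Hedged sketch of MDM-style matching: normalize, group, consolidate.
from collections import defaultdict

def normalize(record):
    """Build a match key from cleaned-up name and email fields."""
    # Illustrative rule: case-fold and trim; real MDM uses richer logic.
    return (record["name"].strip().lower(), record["email"].strip().lower())

def consolidate(records):
    """Group matching records and merge them, preferring non-empty fields."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec)].append(rec)
    golden = []
    for members in groups.values():
        merged = {}
        for rec in members:
            for field, value in rec.items():
                if value and not merged.get(field):
                    merged[field] = value
        golden.append(merged)
    return golden

# The same person appears twice with different formatting and a phone
# number only one source knows about; consolidation fills the gap.
sources = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": ""},
    {"name": " ada lovelace ", "email": "ADA@example.com", "phone": "555-0100"},
    {"name": "Grace Hopper", "email": "grace@example.com", "phone": ""},
]
print(consolidate(sources))  # two golden records; Ada's has the phone filled in
```

The merge step is where the “single source of truth” emerges: each golden record carries the best available value for every field, so downstream analytics reads one consistent entity instead of three conflicting ones.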

Santiago stated it plainly: every company is now a data company. With that sentiment in mind, he reminded the audience of the complexity this entails: companies burning through cloud budgets faster than expected, resource constraints around talent and expertise, and technology so difficult that workloads never reach production all combine to create a need for data observability. According to Santiago, traditional DevOps tools aren’t cutting it, offering only fragmented solutions that simply don’t tackle enough of these challenges. With Unravel, enterprises get performance management, cost governance, and data quality for data observability in a single platform designed entirely for the modern data stack. Data teams have different wants and needs when it comes to peak productivity, and Unravel is purpose-built to accommodate those varying requirements for optimized data performance.
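What “observability in a single platform” means in practice can be sketched simply: performance, cost, and quality signals for a pipeline run are evaluated against thresholds in one place instead of in three separate tools. The metric names and thresholds below are made up for illustration; they are not Unravel’s API.

```python
# Hedged sketch: one set of checks spanning performance, cost, and
# data quality, instead of fragmented per-domain tooling.
THRESHOLDS = {
    "runtime_seconds": 3600,   # performance: job should finish within an hour
    "cost_dollars": 50.0,      # cost governance: per-run budget cap
    "null_rate": 0.05,         # data quality: at most 5% null values
}

def check_run(metrics):
    """Return an alert string for every threshold the run breached."""
    alerts = []
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name}={value} exceeds limit {limit}")
    return alerts

# A run that is cheap and clean but slow trips only the performance check.
run = {"runtime_seconds": 5400, "cost_dollars": 12.75, "null_rate": 0.01}
print(check_run(run))  # ['runtime_seconds=5400 exceeds limit 3600']
```

Putting all three signal families behind one `check_run` call is the design point: a slow job, a budget overrun, and a quality regression surface through the same channel, so no single team’s tool silo hides them.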

To learn more about ways to increase the value obtained from cloud-based data and analytics, you can view an archived version of this webinar.