From data warehouses to data lakes, there is a growing array of choices when it comes to cloud platforms, deployment models, and features. At the same time, challenges remain, including data integration and governance, performance, and management and monitoring.
Nearly 80% of DBTA subscribers currently have digital transformation initiatives underway and the large majority of these projects are focused on two key areas: cloud solutions and analytics.
DBTA held a roundtable webinar featuring Anand Rao, director, product marketing, Qlik; Lewis Carr, senior director, product marketing, Actian; and Thomas Hazel, founder, CTO, and chief scientist, ChaosSearch, who discussed key solutions and best practices for succeeding with modern analytics today.
Rao explained that there are three ongoing trends in the data architecture modernization and automation space: cloud application development, data warehouse modernization, and next-generation cloud data lakes.
Cloud application development through CDC (change data capture) streaming ensures consistency and ease of use across all major platforms, both sources and targets, has no impact on production systems, and is easy to manage and automate with a low TCO, Rao said.
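The CDC pattern described above can be illustrated with a minimal sketch: a stream of logged change events from a source system is replayed, in order, onto a replica. This is plain Python with hypothetical event names, not any vendor's API; real CDC tools read these events from a database transaction log.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Change:
    op: str               # "insert", "update", or "delete"
    key: int              # primary key of the affected row
    row: Optional[dict]   # new row values; None for deletes

def apply_change(target: dict, change: Change) -> None:
    """Replay one logged change onto the target store."""
    if change.op == "delete":
        target.pop(change.key, None)
    else:  # insert or update both overwrite the row
        target[change.key] = change.row

# A stand-in for the ordered change stream a CDC tool would capture.
log = [
    Change("insert", 1, {"name": "Ada"}),
    Change("insert", 2, {"name": "Grace"}),
    Change("update", 1, {"name": "Ada L."}),
    Change("delete", 2, None),
]

target: dict = {}
for change in log:
    apply_change(target, change)

print(target)  # {1: {'name': 'Ada L.'}}
```

Because only changes move, not full table copies, the source system keeps running untouched, which is the "no impact to production" property Rao refers to.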
Data warehouse modernization reduces risk and saves time and money, with no scripting or coding required. A data warehouse can now be built in hours and changed in minutes. It can also be future-proofed for new requirements and new platforms.
Managed data lake creation allows users to quickly and easily create high-scale data pipelines. It eliminates risky, expensive, and complex custom coding, and it closes the "last mile" by provisioning analytics-ready data in real time.
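The pipelines mentioned above boil down to an extract-transform-load flow. A hedged sketch, with made-up stage functions and sample rows standing in for real source and target systems:

```python
def extract() -> list:
    # Stand-in for reading raw rows from a source system.
    return [{"id": 1, "amount": " 10 "}, {"id": 2, "amount": "5"}]

def transform(rows: list) -> list:
    # Produce analytics-ready records: trimmed, typed, consistent.
    return [{"id": r["id"], "amount": int(r["amount"].strip())} for r in rows]

def load(rows: list, sink: list) -> None:
    # Stand-in for writing to a warehouse or lake table.
    sink.extend(rows)

warehouse: list = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'id': 1, 'amount': 10}, {'id': 2, 'amount': 5}]
```

The point of the managed approach is that these stages are generated and orchestrated by the platform rather than hand-coded, which is where the "eliminates custom coding" claim comes from.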
Modern cloud analytics needs convergence, Carr said. A modern architecture needs a data hub, an analytics hub, a data lake, and a data warehouse.
The data hub connects to and ingests from diverse and disparate data sources, offers batch and streaming modes, and is used for data preparation to ensure data quality.
The analytics hub provides self-service data access and preparation for non-IT users. It eliminates spreadsheet silos and allows for advanced analytics, both canned and customized.
A data lake supports advanced analytics with Spark, Kafka, and other open standards. It offers cost-effective scale and can analyze semi-structured and unstructured data.
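Semi-structured data is what makes the lake tier distinct from the warehouse: records arrive as JSON with varying shape, and queries must tolerate missing fields. A plain-Python illustration of that kind of workload (field names are invented; at lake scale this would run on an engine like Spark rather than a loop):

```python
import json

# JSON lines of uneven shape, as they might land in a lake bucket.
raw = [
    '{"user": "a", "event": "click", "meta": {"page": "/home"}}',
    '{"user": "b", "event": "view"}',
    '{"user": "a", "event": "click", "meta": {"page": "/docs"}}',
]

records = [json.loads(line) for line in raw]

# Count clicks per user, tolerating records that lack optional fields.
clicks: dict = {}
for r in records:
    if r.get("event") == "click":
        clicks[r["user"]] = clicks.get(r["user"], 0) + 1

print(clicks)  # {'a': 2}
```

No upfront schema is imposed on the raw lines; structure is applied at read time, which is why this style of storage is cheap to scale.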
The data warehouse component offers sub-second queries on operational workloads, handles tens of terabytes of persistent data, reads structured relational data, and offers flexible deployment, according to Carr.
Hazel noted that a data lake is better suited to a big data world. According to Hazel, ChaosSearch provides a data lake platform for analytics at scale, enabling search, SQL, and machine learning workloads on cloud data at lower cost, with no data movement and faster time to insights.
An archived on-demand replay of this webinar is available here.