Enterprise Must-Haves for Modern Data Warehousing

To fit into modern analytics ecosystems, legacy data warehouses must evolve, both architecturally and technologically, to deliver the agility, scalability, and flexibility that businesses need to thrive in today's data-driven economy.

Alongside new architectural approaches, a variety of technologies have emerged as key ingredients of modern data warehousing, from data virtualization and cloud services to Hadoop and Spark to machine learning and automation.

DBTA recently held a webinar with Clive Bearman, director of product marketing, Qlik; Keith Lambert, VP, marketing and business development, Kore Technologies; and Brian Bulkowski, CTO, Yellowbrick, who discussed the must-have capabilities for modern data warehousing today—how they work and how best to use them.

According to Bulkowski, an enterprise data warehouse should be always on and available, support ad hoc SQL, return correct answers on any schema, process terabytes to petabytes of data, handle a mix of real-time inserts, ETL, batch, and interactive workloads, and support thousands of concurrent users.

Enterprises choose data warehousing solutions to discover data-driven opportunities, Bearman said, and it is time to rework the data warehouse architecture. Must-haves include:

  • Real-time updates: Architected for real-time change data capture and analytics-ready data delivery
  • Universal data: All types, sizes and velocities of enterprise data
  • Automation everywhere: End-to-end automation improves productivity and responsiveness
  • Self-service data marts: Curated, fit-for-purpose data
  • Smart catalog: Expand knowledge of usage, lineage, confidence and trust

Bulkowski suggested that companies use a hybrid cloud data warehouse, which offers several benefits, including the ability to easily shift locations, balance agility against cost, and keep a single database for secure datasets.

According to Lambert, enterprises should start off with best practices and standards. Businesses should define a data model and naming standards, create a data flow diagram, build a source-agnostic integration layer, adopt a data warehouse architecture standard, and consider an agile data warehouse methodology.

Enterprises should invest in multi-source and database aggregation, point-in-time snapshots, incremental database updates, automation with a message-based architecture, detailed transaction logging, the ability to analyze and profile data sources, template-driven ETL software, an easy-to-change development environment and tools, and a technology partner with reliable and flexible software, Lambert said.
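To make "incremental database updates" concrete, here is a minimal sketch of one common approach, a high-water-mark load, written in Python against in-memory SQLite databases. It is an illustration only and not a description of how any vendor in the webinar implements the feature; the orders table, its columns, and the incremental_load function are hypothetical names chosen for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical orders table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    return db


def incremental_load(source, warehouse, watermark):
    """Copy only rows changed since `watermark`; return the new watermark."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    with warehouse:  # commit on success, roll back on error
        # INSERT OR REPLACE overwrites an existing row with the same primary key.
        warehouse.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    # The highest updated_at seen becomes the starting point for the next run.
    return rows[-1][2] if rows else watermark


source, warehouse = make_db(), make_db()
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 1), (2, 20.0, 2)]
)

watermark = incremental_load(source, warehouse, 0)  # first run copies both rows
source.execute("UPDATE orders SET amount = 25.0, updated_at = 3 WHERE id = 2")
watermark = incremental_load(source, warehouse, watermark)  # copies only order 2
```

The design choice here is that each run reads only rows newer than the last recorded watermark, so the cost of a load depends on how much changed rather than on the size of the source table. A sketch like this does not capture deletes, which is one reason log-based change data capture is often preferred.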

Lambert proposed using Kourier Integrator, which is a multi-purpose solution. The platform provides:

  • Near Real-Time / CDC
  • Multi-Source / DBMS
  • Point in Time
  • SQL Automation
  • Real-Time via REST
  • API Subscriptions
  • Inbound/Outbound
  • Asynchronous/Batch
  • ERP Adaptors
  • Storefront/Portal
  • Marketplace
  • EDI
  • Data Sharing
  • Data Archiving
  • Data Migration
  • Data Cleansing

An archived on-demand replay of this webinar is available here.