How to Leverage the Power of Data Lakes

Hadoop adoption is growing and so is the commitment to data lake strategies. Data security, governance, integration, and access have all been identified as critical success factors for data lake deployments.

DBTA recently held a roundtable webinar featuring Carole Murphy, global product marketing at HPE Security - Data Security; Carlos Maroto, managing architect at Search Technologies; and Kevin Petrie, senior director and technology evangelist at Attunity, to discuss the enabling technologies and best practices for unlocking the power of the data lake.

Data lakes in today’s hybrid world demand that everything be analyzed, everywhere, and in real time, Petrie said.

Attunity can help enterprises simplify data integration with automation. Its approach requires no manual coding or scripting, is automated end to end, and can be optimized and configured. Universal data integration can ingest data in real time and at scale within Hadoop.

A key advantage is that Hadoop-based data lakes can ingest raw data, keep it forever, make it all available, analyze it only when needed, and do it on a massive scale, according to Maroto. However, enterprises struggle with how to find the exact data they need.

The process to uncover the insights businesses need includes ingesting data, researching the data, configuring search engines, parsing and indexing, searching and analyzing, and putting that data into production.
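As a rough illustration of the ingest, parse, index, and search steps in that process, the following Python sketch builds a tiny in-memory inverted index over raw records. The record contents, the function names, and the simple word tokenizer are all assumptions made for illustration and were not presented in the webinar; a real deployment would rely on a dedicated search engine rather than a toy index like this one.

```python
import re
from collections import defaultdict


def parse(raw_record):
    """Split a raw text record into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", raw_record.lower())


def build_index(records):
    """Map each token to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, raw in records.items():
        for token in parse(raw):
            index[token].add(record_id)
    return index


def search(index, query):
    """Return the ids of records that contain every token in the query."""
    tokens = parse(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical raw records, ingested as-is and parsed only at index time.
RAW_RECORDS = {
    "r1": "Customer complaint about late delivery",
    "r2": "Delivery confirmed for customer order 1042",
    "r3": "Quarterly sales report",
}

INDEX = build_index(RAW_RECORDS)
```

Calling `search(INDEX, "customer delivery")` returns the ids of the two records that mention both words, which is the kind of narrowing-down that helps users find the exact data they need in a large store of raw records.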

Murphy stressed the need to protect the information, since data lakes contain large volumes and many types of highly sensitive data.

Enterprises need to protect sensitive data so that it can be analyzed securely, both to satisfy regulatory mandates, including the upcoming EU GDPR, and to unlock the potential of their big data investments.

HPE can help secure this data, Murphy explained. Hyper format-preserving encryption and tokenization support data of any format, such as names, addresses, dates, and numbers; preserve referential integrity; preserve the meaning, logic, and value of data; and are used for both production protection and data masking.
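To make the ideas of format preservation and referential integrity concrete, here is a minimal Python sketch of deterministic, format-preserving masking. It is not HPE's algorithm and is not true format-preserving encryption: it is one-way (the original value cannot be recovered), it is not collision-free, and it is not suitable for production use. The function names, the demo key, and the sample values are hypothetical.

```python
import hashlib
import hmac
import string

# Hypothetical demo key; a real system would obtain keys from a key manager.
KEY = b"demo-key-not-for-production"


def _keystream(key, message, length):
    """Derive `length` pseudo-random bytes from the key and message via HMAC-SHA256."""
    out = b""
    counter = 0
    while len(out) < length:
        block = counter.to_bytes(4, "big") + message
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def mask(value, key=KEY):
    """Replace each ASCII digit or letter with one of the same class.

    Length, letter case, and punctuation are kept, so the masked value
    still fits the original field format. The same input and key always
    give the same output, so joins on masked columns still line up.
    """
    stream = _keystream(key, value.encode("utf-8"), len(value))
    masked = []
    for ch, byte in zip(value, stream):
        if ch in string.digits:
            masked.append(string.digits[byte % 10])
        elif ch in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[byte % 26])
        elif ch in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[byte % 26])
        else:
            masked.append(ch)  # keep separators, spaces, and other characters
    return "".join(masked)
```

For example, `mask("123-45-6789")` yields another string of the form `ddd-dd-dddd`, and every table that masks the same identifier with the same key receives the same masked value, which is what keeps references between data sets consistent.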

An archived on-demand replay of this webinar is available here.

