Oracle Big Data Portfolio Adds New Data Integration Technologies

Oracle has unveiled Oracle Data Integrator for Big Data to help make big data integration more accessible and actionable for customers. This news follows Oracle’s recent Enterprise Big Data announcement to advance Oracle’s goal of enabling Hadoop, NoSQL, and relational database technologies to work together and be deployed securely in any model.

According to Oracle, generating value from big data requires the right tools to move and prepare data to effectively discover new insights. To operationalize those insights, new data must integrate securely with existing data, infrastructure, applications, and processes. Oracle Data Integration is a key component of Oracle’s Big Data solutions, a group of Oracle technologies that work together to help companies securely reach across Hadoop, NoSQL, and relational databases to sift through massive and diverse data sets.

The goal of the new data integration capabilities is to bring together the disparate communities that have emerged within the Oracle client base, so that mainstream DBAs and ETL developers can collaborate with the big data development organization on a single platform, said Jeff Pollock, vice president of product management at Oracle.

Oracle Data Integrator for Big Data does this by allowing customers to take the mainstream skills they already have in-house and apply them directly to newer, more specialized big data languages, explained Pollock. One of the biggest challenges to big data adoption in organizations, he noted, has been the proliferation of languages and the hard-to-find skills needed to support them in big data environments. “This gives your mainstream DBAs and your mainstream ETL developers a way to work with a single logical design in a tool that they are comfortable with.”

A key aspect of Oracle Data Integrator for Big Data is that it is designed to leverage a customer’s existing investment in its Hadoop cluster, running natively without requiring proprietary code to be installed or a separate server to be run, said Pollock. As a result, customers can be big data ETL developers without having to learn Scala, Pig, or Oozie code, he said. “We can now generate for Apache Pig using Pig Latin, we can generate on Spark for using machine-learning based languages running on Spark, and we support Hive running MapReduce 2, as well as Hive running on Hortonworks Tez.”

For more information, go to the Oracle Data Integrator for Big Data data sheet.