Pentaho 5.1 Adds Features to Help Organizations Scale Big Data Operations

Pentaho has announced version 5.1 of its business analytics and data integration platform, which enables code-free analytics directly on MongoDB, simplifies the data preparation process for data scientists, and offers full support for YARN.

Analytics on NoSQL database MongoDB without manual coding

Pentaho version 5.1 enables MongoDB data collections to be analyzed directly ‘at the source,’ without hand-coding or the need to prepare data in a staging area. This helps companies reach reliable insights faster and reduces the need for specialist skills.

Operationalizing R and Weka for data scientists

Pentaho 5.1 also enables large-scale analytics through its new Data Science Pack, announced earlier this month. The Pack equips data analysts and scientists with a toolkit to build a ‘360 degree customer view’ that blends different data sources, such as social media and MongoDB, and enables advanced analytics such as churn prediction and customer sentiment analysis. By operationalizing two commonly used technologies, R and Weka, Pentaho relieves data scientists of much of the work of managing the data flow.

The Pack includes an R Script Executor for Pentaho Data Integration (PDI), which removes the burden of data preparation; Weka Scoring for PDI, which lets the user “score” data as part of a PDI transformation by applying classification, clustering, and regression models constructed in Weka; and Weka Forecasting for PDI, which uses forecasting models created in Weka’s time series analysis and forecasting environment to generate predictions from incoming data within a PDI transformation.

Full YARN support

Finally, YARN (MapReduce 2.0) integration, announced earlier this year by Pentaho Labs, is now available in Pentaho 5.1. Developers familiar with Pentaho Data Integration can exploit the computational power of Hadoop without having to write complex MapReduce code. With YARN, Pentaho Data Integration jobs can make flexible use of Hadoop resources, expanding and contracting as data volumes and processing requirements change.

More information is available about Pentaho 5.1 features.