IBM Netezza and Revolution Analytics Partner to Bring R Statistics Language to the Enterprise

Revolution Analytics, a commercial provider of software and services based on the open source R project for statistical computing, and IBM Netezza have announced a partnership to integrate Revolution R Enterprise with the IBM Netezza TwinFin Data Warehouse Appliance. According to the vendors, this will enable customers seeking to run high-performance, full-scale predictive analytics from within a data warehouse platform to directly leverage the capabilities of the open source R statistics language. Under the terms of the agreement, the companies will work together to create a version of Revolution's software that takes advantage of IBM Netezza's i-class technology so that Revolution R Enterprise can run optimally in-database. General availability of the integrated solution is currently planned for later this year.

The easiest way to think of R, "although it is not a perfect analogy," is that it is "an open source version of SAS," Jeff Erhardt, COO of Revolution Analytics, tells 5 Minute Briefing. A high-powered statistical programming language, "R has been around for about 15 years and was originally conceived of and written by Robert Gentleman and Ross Ihaka in New Zealand in the mid-1990s. Since then it has taken the academic and research world by storm, and really become the lingua franca of statistics," says Erhardt.

The community of researchers working with R not only relies on the core programming language itself but also creates what in the R world are called "packages" to solve specific statistical problems, explains Erhardt. As a result, R should be thought of not just as a programming language, but as a "giant ecosystem and community" of packages and applications designed to solve cutting-edge statistical problems in fields like biology, drug discovery, gene discovery, quantitative finance, supply chain, and social networking. "It is just remarkably diverse and remarkably powerful." Revolution Analytics' goal, says Erhardt, is to help drive R out of the academic and research environments and into the enterprise.

The new partnership integrates Revolution R Enterprise with IBM Netezza's high-performance data warehouse and advanced analytics platform to help organizations address the challenges that arise as the complexity and scale of data grow. By moving the analytics processing next to the data, this integration will minimize data movement, which is a significant bottleneck, especially when dealing with "big data."

The commercial, enterprise-ready version of R will be ported over and optimized to run advanced analytics in-database in the Netezza data warehouse appliance, says Matthew Rollender, director of technology and strategic alliances at IBM Netezza. "Just by having that analytics and that advanced processing in the database, you are removing a lot of data movement and latency and it also gives you the ability to better manage and provision data with respect to the advanced analytics." The mechanism on the Netezza side that enables this is the i-class technology, which is like an SDK that allows advanced analytics and computational functions to be moved down into the database in a way that is transparent to the user, explains Rollender.
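The general idea of in-database processing can be illustrated with a minimal sketch. It does not use Netezza, i-class, or Revolution R Enterprise; it assumes Python's built-in sqlite3 module as a stand-in database, and the table name `sales` and all function names are hypothetical. It compares pulling every row to the client with letting the database compute the aggregate and return only the result.

```python
import sqlite3


def build_demo_db(n_rows=10_000):
    """Create an in-memory SQLite database with a hypothetical 'sales' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [
        ("east" if i % 2 == 0 else "west", float(i % 100))
        for i in range(n_rows)
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def mean_client_side(conn):
    """Move every row out of the database, then aggregate in the client."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    means = {region: s / c for region, (s, c) in totals.items()}
    return means, len(rows)  # second value: number of rows moved


def mean_in_database(conn):
    """Let the database aggregate; only one row per region is moved."""
    rows = conn.execute(
        "SELECT region, AVG(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)  # second value: number of rows moved


if __name__ == "__main__":
    conn = build_demo_db()
    client_means, client_moved = mean_client_side(conn)
    db_means, db_moved = mean_in_database(conn)
    print("client-side rows moved:", client_moved)
    print("in-database rows moved:", db_moved)
    print("results agree:", all(
        abs(client_means[r] - db_means[r]) < 1e-9 for r in client_means
    ))
```

Both functions return the same per-region means, but the second transfers one row per region rather than the whole table. That gap in rows moved is the data-movement cost Rollender describes, shown here at toy scale.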

With Revolution R Enterprise for IBM Netezza, advanced R computations are available for rapid analysis of data volumes in the hundreds-of-terabytes class and, the vendors claim, can deliver 10-100x performance improvements at a fraction of the cost of offerings from traditional analytics vendors. According to the vendors, R is used by more than two million analysts in academia and at companies such as Google, Bank of America and Acxiom.

"We are entering the age of analytics, where companies have been collecting all of this data, and it is growing exponentially," says Erhardt. At the same time, companies are realizing that if they are not able to make business-driven decisions from this data, they are not competitive, he adds. In addition, there are smaller companies, or large companies' departments that have been priced out of the market because of legacy software costs they were not able to afford.

The current combination of massive amounts of data that companies need to analyze, students who were trained on R in school, and corporations demanding something more cost-effective is creating "a perfect storm" leading to R, asserts Erhardt.

The companies will collaborate on marketing and sales activities. More information is available from IBM Netezza and Revolution Analytics.