0xdata, a provider of open source machine learning and predictive analytics for big data, has announced general availability of the latest release of H2O, a prediction engine for big data users of Hadoop, R and Excel. H2O was developed to allow ordinary users to take advantage of the predictive power of big data through better algorithms, according to SriSatish Ambati, CEO and co-founder of 0xdata. “We are trying to bring Google-scale analytics to the world.”
The second generation H2O “Fluid Vector” release delivers new levels of performance, ease of use and integration with R. Early H2O customers, according to the company, include Netflix, Trulia and Vendavo.
With H2O, users can explore and model big data from within Microsoft Excel and RStudio and connect it with data from HDFS, S3, SQL and NoSQL data sources. H2O’s in-memory columnar compression and fine-grained parallelism via MapReduce provide speed, scale and extensibility for advanced algorithms on big data. Customers can extend the Lego-like architecture and run their own algorithms and models or take advantage of 0xdata’s latest algorithms for Distributed Trees and Regression, such as Gradient Boosting Machine (GBM), Random Forest (RF), Generalized Linear Modeling (GLM), k-Means and Principal Component Analysis (PCA).
According to the company, H2O does not make users wait for an entire job to finish. It provides approximate results at every step in the analysis process, so users can get a general idea of the outcome and, if the early approximate numbers fall outside an anticipated range, kill the job and start over quickly.
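The idea of watching approximate results and stopping early can be illustrated with a minimal Python sketch. This is not H2O's API or implementation. The function names `running_mean` and `monitor`, the chunked input and the choice of a simple mean as the statistic are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def running_mean(chunks: Iterable[List[float]]) -> Iterator[Tuple[int, float]]:
    """Yield (rows_seen, mean_so_far) after each chunk of data is processed.

    Hypothetical helper: stands in for any job that can report a partial
    estimate before it has seen all of the data.
    """
    total = 0.0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
        if count:
            yield count, total / count


def monitor(chunks: Iterable[List[float]], low: float, high: float) -> Tuple[bool, int, float]:
    """Consume partial estimates and abort once one leaves [low, high].

    Returns (completed, rows_seen, last_estimate). `completed` is False
    when the job was stopped early.
    """
    rows, estimate = 0, float("nan")
    for rows, estimate in running_mean(chunks):
        if not (low <= estimate <= high):
            return False, rows, estimate  # stop early; remaining chunks are never read
    return True, rows, estimate


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [100, 200, 300]]
    print(monitor(data, low=0, high=10))
```

Because `running_mean` is a generator, the caller decides after every chunk whether to continue, so an abandoned job does no further work on the remaining data.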
H2O is taking the fundamental algorithms that are required to scale data science and enabling users to build models that are easy to run and easy to deploy, says Ambati.
“What we envision is that most developers will build their own specific algorithms and models for their own domains and industries,” said Ambati. To support that capability, 0xdata has greatly simplified the API with this release and made ad hoc data exploration much easier, he said. In addition, the company is now working on an API for Scala, which is expected to be rolled out in the next few weeks. “The Scala framework is becoming more and more popular in machine-learning circles,” said Ambati. “Our entire roadmap is a function of our early adopters and customers.”
More information is available about 0xdata.