At Spark Summit in San Francisco this week, Microsoft announced an extensive commitment to Spark, which will power the company’s big data and analytics offerings, including Cortana Intelligence Suite, Power BI, and Microsoft R Server.
As part of the initiative, Spark for Azure HDInsight is now generally available, introducing a fully managed Spark service from Hortonworks that has been hardened for the enterprise and made simpler to use. R Server for HDInsight, previously announced as a public preview, will be generally available in the summer, making the Spark integration available both on-premises and in the cloud. This makes it easier to move code and projects to the cloud with a few clicks and within a few minutes, without buying the hardware or hiring the specialized operations teams typically associated with big data infrastructure.
In addition, R Server for Hadoop on-premises will be available in June and will support both the Microsoft R and native Spark execution frameworks. Microsoft says that combining R Server with Spark lets users run R functions over thousands of Spark nodes, so they can train models “on data 1000x larger and 100x faster than was possible with open source R and nearly 2x faster than Spark’s own MLLib.”
Microsoft is also announcing the Microsoft R Client, a new freely available tool for data scientists to build high-performance analytics using R. Spark support in Power BI, previously announced with Power BI general availability, is also being expanded to cover Spark Streaming scenarios.