Major Changes for SQL Server Analytics Products

Microsoft announced at the end of February 2022 that significant changes are in store across its SQL Server analytics products. To set the context, the broader enterprise IT marketplace was still primarily on-prem when Microsoft first introduced its cloud analytics products in 2017. Since that time, we’ve seen a flood of migrations to the cloud with analytical workloads at the forefront, because a cloud-based approach to analytics offers greatly improved manageability and deployment simplicity, as well as standard cloud benefits like flexibility, cost control, and scalability.

Fast forward to today. Microsoft now offers many new and powerful analytics products based solely in the cloud, like Azure SQL, Azure Machine Learning (ML), Azure Data Lake Storage (ADLS), and Azure Synapse Analytics. On top of that, the global pandemic has solidified and even accelerated enterprise analytics in the cloud, especially since the cloud facilitates quick prototyping and provisioning without requiring employees to visit the physical office.

SQL Server Big Data Clusters (BDC) Retires

Now that we have so many improved products available in the cloud, Microsoft has decided to “retire” SQL Server BDC. So what is different about retiring a product compared to announcing its end-of-life (EOL)? In this case, anyone licensed for SQL Server 2019 BDC with Software Assurance will continue to receive full support through February 2025, three years from now. Until that time, SQL Server BDC updates continue normally through the cumulative update process. You might also investigate whether this is a good time to migrate to Azure Synapse Analytics.

Change Is Coming to PolyBase Scale-Out Groups in SQL Server

Likewise, Microsoft has announced the retirement of PolyBase scale-out groups in SQL Server. That means when we receive the bits for SQL Server 2022, PolyBase scale-out groups will be excluded from that release. Earlier releases of SQL Server with PolyBase, ranging from SQL Server 2016 to SQL Server 2019, will continue supporting the feature set until EOL. Other PolyBase features, like scale-up groups and data virtualization, continue to be fully supported.

Here’s my analysis of the two changes above: BDC and PolyBase scale-out groups are not necessary if you properly activate and set up Azure Data Lake Storage. ADLS is more flexible, more scalable, and more widespread. Many vendors offer data lake products, so hiring or growing your own talent is easier than it is for BDC or PolyBase.

Change Is Coming for Hadoop Support

You might recall that Cloudera and Hortonworks merged a couple of years ago. Soon after, their independent product lines started to see a lot of change. Microsoft, in turn, has restricted support for external data sources to those versions that are still in mainstream support by their respective vendors. What that means for us is that SQL Server 2022 will not include support for these products, and Microsoft is also dropping support for them in SQL Server 2016 through 2019.

Instead, use the new object storage capabilities in SQL Server 2022. Along those lines, support for the Hadoop Distributed File System (HDFS) will ship with SQL Server 2022 using a new WebHDFS connector. Microsoft is also switching to the publicly documented REST APIs instead of a Java Hadoop client. That means, when using SQL Server 2022, you will need to switch from wasb[s] to abs to connect to Azure Blob Storage, and from abfs[s] to adls to connect to ADLS Gen2.
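If you have a number of existing external data source locations to update, the prefix switch described above can be scripted. The following is a minimal sketch in Python, assuming that only the prefix changes and the rest of the location string stays the same. The function name migrate_location, the mapping table name, and the sample storage account and container names are all hypothetical, so check the result against the SQL Server 2022 documentation before using it on real definitions.

```python
# Prefix mapping taken from the paragraph above:
#   wasb[s] -> abs  (Azure Blob Storage)
#   abfs[s] -> adls (ADLS Gen2)
# The names below are illustrative, not part of any Microsoft tooling.
LEGACY_TO_2022 = {
    "wasb": "abs",
    "wasbs": "abs",
    "abfs": "adls",
    "abfss": "adls",
}


def migrate_location(location: str) -> str:
    """Swap a legacy location prefix for its SQL Server 2022 equivalent.

    Assumes everything after '://' can be kept as-is. Locations with a
    prefix that is not in the mapping are returned unchanged.
    """
    scheme, separator, rest = location.partition("://")
    if not separator:
        raise ValueError(f"Not a prefixed location: {location!r}")
    new_scheme = LEGACY_TO_2022.get(scheme.lower(), scheme)
    return f"{new_scheme}://{rest}"


if __name__ == "__main__":
    # Hypothetical sample locations for illustration only.
    for old in (
        "wasbs://data@myaccount.blob.core.windows.net/",
        "abfss://lake@myaccount.dfs.core.windows.net/",
    ):
        print(old, "->", migrate_location(old))
```

A plain string swap is used here rather than a URL parser so that the container and account portions pass through untouched. Anything the mapping does not recognize is left alone for manual review.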


Also new in SQL Server 2022 for on-prem analytics are object storage integration connectors over REST APIs. (Notice the increased use of REST APIs?) We will see more investment in Azure Arc-enabled data services for hybrid capabilities, plus more investment in the Spark SQL connector to provide first-class connectivity between Apache Spark and the entire lineup of Microsoft data platform products.

Continuing the trend of strengthening SQL Server with its offering of cloud analytics, Microsoft is introducing Azure Synapse Link for SQL Server 2022, arriving in GA later this year. This feature is one of my favorites and means that you can have industry-leading near-real-time analytics without cumbersome ETL pipelines. Read more about Azure Synapse Link as well as the preview of SQL Server 2022 at