Cool Features of SQL Server 2014: Improved Cardinality Estimates

With all the cheerleading and the steady drumbeat of new features being released to Azure, it’s easy to lose track of the many cool and valuable new features released in the on-premises version of SQL Server. One of the crown jewels of SQL Server, the cardinality estimator (CE), underwent a large redesign for SQL Server 2014 to improve performance. Cardinality estimates are an extremely important part of query processing. In a nutshell, cardinality estimates are what the relation engine predicts for the number of rows affected by a given operation, including intermediate row sets like those created by filters, aggregations, joins and spool.

The Importance of Cardinality to Query Processing

Despite its importance and also a testimony to its original good design, the older CE remained fundamentally the same from its introduction with SQL Server 7 through SQL Server 2012. As time marched on, the SQL Server development team found situations where specific scenarios for the CE could be improved. The findings meant the introduction of a lot of fixes, patches and QFEs over the years. When the development team decided to overhaul the CE for SQL Server 2014, the goal was to create a cardinality estimator that provided better execution plans, especially for complex queries. At the same time, the team did not insist on that as an airtight requirement. Consequently, there may be a few rare but conceivable situations where a specific query and data distribution might regress in terms of performance.

Consequently, the new CE is something you can enable or disable at the database level. (And by ‘disable’, I mean keep the old CE logic in place). You do this either in the Database Properties Dialog or using the ALTER DATABASE statement. Setting the compatibility level to 120 activates the new CE, anything lower means to use the old CE logic. (The new CE can also be enabled or disabled at the server level using a startup trace flag).

The evidence in the broader SQL Server community is that SQL Server 2014, using the new CE in operation, is better performing with the performance benefit increasing with the sophistication of the queries in the workload. For example, the old CE had problems with ascending or descending key values where a query wanted to retrieve values above or below bounds of the statistics, respectively. If the old CE encountered a situation like that, it would estimate only one row would be returned – even if there were in fact 50k rows that existed above or below the statistical bounder. Bad performance ensues. That has now been fixed in the new CE. Wondering what other aspects of the CE have changed? The read more about it at the official SQL Server/SAP team blog HERE.

Test Before Deployment

So does this mean you should automatically implement the new CE if you’re upgrading an older SQL Server application that’s been running just fine? Please don’t rush! I strongly encouraged to test the new CE thoroughly before you enable it against your production workloads. If the testing shows no net change or a positive impact on performance, then enable it. Otherwise, sit on the sidelines a bit longer before taking the plunge.