Get to Know Azure SQL Database Hyperscale

About 5,500 of the SQL Server faithful came out for the PASS Summit 2018 in Seattle last November. As usual, the event had the feel of a great big family reunion. It’s always a fun time, with lots of socializing in the halls of the conference center and at the various restaurants and gathering places around town. The training was also top-notch this year with more than 16 tracks of sessions going on from early morning until evening—literally hundreds of SQL Server developer, DBA, architect, BI, and data science sessions!

What is Hyperscale and What are its Features?

One element that was particularly interesting, and the topic of many conversations afterwards, was the day-two keynote covering a new offering in public preview called Azure SQL Database Hyperscale (just Hyperscale from here on out). Hyperscale is an Azure SQL Database service tier built on a new set of storage technologies that enable high throughput and performance, support databases of up to 100 TB in size, and offer highly elastic, transparent scalability as workload needs change from minute to minute.

Hyperscale is intended for users whose requirements encompass very large databases and powerful, elastic connectivity and query processing. It sits squarely between the General Purpose service tier, which is ideal for most business workloads, and Business Critical, which is appropriate for situations where low I/O latency is the first priority. It is also a much easier alternative to heavily sharded database architectures. Other features, none of which require code changes, include:

  • Nearly instantaneous database backups and restores, regardless of database size. Backups and restores are based on new file snapshot technology but still support point-in-time recovery.
  • Very fast transaction log commit times and throughput, thanks to an all-new transaction log technology. Transaction log size is unlimited. (Yes, you read that right.)
  • Read scale-out to one or more read-only nodes, for hot standbys or to offload read workloads from the primary.
  • Rapid scale-up of compute power, in constant time, to accommodate a workload that shifts heavily upward or downward. For example, it might behave like scaling up and then down between a P6 and a P11 server, but at much faster rates. A scaling operation may go up or down by reconfiguring CPU, memory, or connections, or inward and outward across compute nodes. Scaling takes no more than about two minutes.
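As a concrete illustration of the read scale-out feature above, here is a minimal sketch of building a connection string that routes to a read-only replica via the `ApplicationIntent=ReadOnly` keyword. The server and database names are placeholders of my own, not from the article:

```python
def build_connection_string(server: str, database: str, read_only: bool = False) -> str:
    """Assemble an ODBC-style connection string for an Azure SQL database.

    With read_only=True, the ApplicationIntent=ReadOnly keyword asks the
    gateway to route the session to a read-only secondary node.
    """
    parts = [
        "Driver={ODBC Driver 17 for SQL Server}",
        f"Server=tcp:{server}.database.windows.net,1433",  # placeholder server
        f"Database={database}",                            # placeholder database
        "Encrypt=yes",
    ]
    if read_only:
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts)
```

The resulting string can be passed to a driver such as pyodbc; the primary still handles all write traffic.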

Hyperscale is available for individual databases using the vCore-based purchasing model. Because it is a single-database solution, it is not available for Azure SQL Managed Instance. Hyperscale is currently in public preview for single databases in the following regions: West US1, West US2, East US1, Central US, West Europe, North Europe, UK West, SouthEast Asia, Japan East, Korea Central, Australia SouthEast, and Australia East. You can also use the Azure Hybrid Benefit program to apply your on-premises or IaaS SQL Server licenses to Hyperscale databases.
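To give a flavor of provisioning under the vCore model, here is a hedged sketch: the T-SQL below (held in Python strings so it can be sent through a driver such as pyodbc against the server's master database) creates a Hyperscale database and later changes its service objective. The database name and the specific `HS_Gen5_*` objectives are illustrative assumptions, not values from the article:

```python
# Illustrative T-SQL for a Hyperscale database under the vCore model.
# MyHyperscaleDb and the HS_Gen5_* service objectives are placeholders.
CREATE_HYPERSCALE_DB = """
CREATE DATABASE MyHyperscaleDb
    ( EDITION = 'Hyperscale', SERVICE_OBJECTIVE = 'HS_Gen5_4' );
""".strip()

# Scaling up later is a single statement; the reconfiguration is fast
# because compute is decoupled from storage.
SCALE_UP = """
ALTER DATABASE MyHyperscaleDb
    MODIFY ( SERVICE_OBJECTIVE = 'HS_Gen5_8' );
""".strip()
```

Each statement would be executed in its own batch against master, for example via `cursor.execute(CREATE_HYPERSCALE_DB)` on an autocommit connection.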

What are the Limitations of Hyperscale?

Hyperscale, at present, has a few limitations compared to other SQL Server offerings. For example, it does not support long-term backup retention, nor the simple or bulk-logged recovery models. For that matter, you cannot take an old-fashioned backup at all, since it uses new snapshot-based recovery technology. If you want to build a copy of a Hyperscale database or move its data elsewhere, you will need to use other services such as Azure Data Factory, the export service, or even BCP scripts. It also does not support PolyBase, R or Python, or Transparent Data Encryption (TDE). You also cannot currently migrate a Hyperscale database to another service tier, and geo-replication and geo-restore are not supported. Consult the Hyperscale FAQ for more details.

Hyperscale brings several cool new technologies to bear. The primary conceptual difference from a regular SQL Server is that the compute-dependent components of the DBMS are decoupled from the storage-dependent components.

Here’s a very simplified overview. First, the compute nodes look like a traditional SQL Server, but without local data files or a log file. The compute nodes write OLTP activity to the Log Service and fetch data pages from the Page Servers when those pages are not already in the local data cache or the Buffer Pool Extension (now officially the Resilient Buffer Pool Extension, or RBPEX). Because compute writes transactions to a Log Service rather than directly to a log file, it can handle a huge number of concurrent connections and very high log throughput in ways unseen in previous versions of SQL Server. From there, the Page Servers and secondary compute instances consume activity from the Log Service, applying transactions to the data files. Large databases have many Page Servers, with new ones spun up as needed to ensure there are no local resource constraints.

Details for the new Hyperscale architecture are discussed on the official announcement page.

Want to Give Hyperscale a Try?

Many customers are actively engaged in proofs of concept (POCs) for production databases or are planning new deployments on Hyperscale. Want to try it too? Then sign up today and join the public preview!