Testing and tuning SQL is an often-overlooked aspect of incorporating the database into your DevOps pipeline. It is important to be able to analyze and optimize SQL performance within your application code—and to do so early in the development process.
If you have ever written SQL statements then you know it is a flexible language. There are usually multiple ways to write a SQL query to achieve the same results. For example, you can combine multiple tables using a join or a subselect and achieve the same output. But maybe not the same performance. And this is only one example of the various ways you can write SQL to achieve the same results.
Application developers typically do not spend a lot of time on optimizing performance. Their focus is on writing code that matches the project specifications and delivers the expected results. So, it makes sense that SQL performance testing should be conducted on all programs before they are migrated to a production environment, or you will likely experience performance problems.
DevOps and SQL Testing
DevOps provides an opportunity to integrate SQL performance testing into your continuous delivery pipeline.
In a DevOps environment, the best approach is to measure, analyze and improve SQL statements at all stages as your code progresses from development to testing to production. The more SQL performance testing that can be accomplished by developers, the earlier performance problems will be found and corrected. And that means the cost of delivering high-quality database applications will decline.
However, things are not as simple as just running your program and evaluating its performance metrics. The data that you use in your test environment will not be the same as your production data. Typically, you will have less test data than you do in production. And that means you will also get different access paths and performance results.
Part of SQL performance testing during development is the ability to mimic production scale in test. You can update database statistics to make the DBMS think that the test system has production levels of data. You may need to update test statistics multiple times as production data expands. The goal is for the test environment to always be set up to look as much like production as possible.
Keep in mind that it is not just the amount of data that differs between test and production systems. Usually the test environment will have different hardware, system parameters, and configurations than are used for production. Your test system will usually be running on a less powerful processor and using less memory. This can result in different access paths for test and production systems, which can cause performance problems that only show up after you move to production.
Some database systems and management products allow you to model the configuration and settings of your production environment in your test system, without deploying the same exact setup. This can be setup as part of the SQL performance testing in your DevOps pipeline to enable your SQL tests to be as close as possible to the production environment.
With the proper setup and tooling, developers can examine the access paths of their SQL statements to judge their efficiency.
You should look for flexible tools that can be configured for testing specific use cases. For example, skewed data can negatively impact SQL performance. When data is non-uniformly distributed a subset of the values occur much more frequently than others. A special case of non-uniformly distributed data is skewed data. When data is skewed, one value (or a very small number of values) occurs much more frequently than others. Testing such cases is important to avoid performance surprises in production systems.
Furthermore, you will need a tool that supports all of the different types of SQL that you will be delivering. It should support testing of both static SQL and dynamic SQL. It should be capable of testing procedural SQL used to write stored procedures and user-defined functions. Any type of and flavor of SQL that you are using should be testable and with performance results that can be compared across multiple test runs.
Furthermore, the tool should be capable of documenting the performance results achieved, and perhaps even suggesting multiple SQL variations to improve the results.
Tools that can help set up testing for various use cases and SQL variations should be integrated into your DevOps toolchain to enable SQL performance testing applications before they are moved to production status.