- Take interdependencies into account to avoid settings that do not make sense. Be mindful of the relationships among knobs when recommending a configuration, or you may end up with nonsensical settings. For example, suppose the DBMS does not "know" that it should not allocate the entire system's memory across the multiple buffers controlled by different knobs. In that case, a chosen configuration could cause the DBMS to allocate too much memory and make the system unresponsive due to thrashing. In other cases, a DBMS will simply refuse to start when the requested configuration uses too much memory.
- Learn the limitations of tuning a production database. It is hard to tune knobs without shutting down a production database, and downtime can be crippling for a business. There are ways to tackle the challenge, such as tuning against copies of the database, but they add complexity. When tuning a production database, it may only be feasible to tune within a particular time window each day. Keep in mind that some knobs require a restart to take effect, and with a production database there will likely be limited opportunities for restarts. Whether a restart is needed to see a significant benefit from performance tuning depends on which DBMS you are running, such as Oracle, PostgreSQL, or MySQL, and on whether it runs on-prem or in the cloud.
- Database tuning leads anyone into the great unknown. Once tuning begins, it is difficult to know why a DBMS's performance has improved or degraded. The cause could be the knob configuration changes, but it could also be workload pattern shifts, hardware degradation, or other environmental changes, and these causes are hard for humans to untangle. Thus, it is vital to observe the DBMS's performance over long periods of time to understand whether performance changes are due to tuning or to these other factors.
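The memory-interdependency problem in the first point above can be sketched as a simple sanity check. The knob names, sizes, and headroom fraction below are illustrative assumptions, not tied to any specific DBMS:

```python
# Hedged sketch: reject configurations whose memory-related knobs,
# taken together, would oversubscribe the machine's RAM.
# All knob names and values here are hypothetical.

TOTAL_RAM_MB = 16_384  # assumed machine memory


def validate_memory_knobs(knobs: dict, headroom: float = 0.2) -> bool:
    """Return True if the combined buffer allocation leaves enough headroom
    for the OS and other processes (here, 20% of total RAM)."""
    allocated = sum(knobs.values())
    return allocated <= TOTAL_RAM_MB * (1 - headroom)


# Each knob looks reasonable alone, but together they claim ~92% of RAM.
proposed = {
    "buffer_pool_mb": 12_000,
    "sort_buffer_mb": 2_048,
    "log_buffer_mb": 1_024,
}
print(validate_memory_knobs(proposed))  # False: 15072 MB exceeds the budget
```

A real tuner would encode many such cross-knob constraints; the point is that each must be checked against the combined configuration, not knob by knob.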
Using Machine Learning to Automate Database Tuning
The above challenges certainly sound overwhelming. The good news is that there has been a significant amount of research in the last decade on using machine learning (ML) algorithms to tune database configurations and overcome these challenges. The advantage of these approaches is that, unlike a human, an ML algorithm can watch your database all the time and ensure that it always uses the best configuration, no matter how the workload or environment changes.
The crux of these ML-based approaches is to train models on measurements collected from the DBMS and then use the models to recommend knob settings that improve a target objective, such as latency or throughput. To speed up the tuning process, some ML techniques reuse training data gathered from previous sessions to tune new DBMS deployments. Reusing past experience reduces the time and resources required to tune a DBMS for a new application. An important caveat: training data cannot yet be effectively reused across different DBMSs, or even across versions of the same DBMS. This is because knobs are not standardized across DBMSs or even across releases of the same DBMS, which often add new knobs and deprecate existing ones.
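The measure-train-recommend loop can be illustrated with a toy example. The benchmark function, the single knob, and the quadratic surrogate below are all simplifying assumptions; production systems model many knobs at once with far richer models (e.g., Gaussian processes):

```python
# Minimal sketch of model-driven knob tuning: collect measurements,
# fit a surrogate model, and recommend the setting the model prefers.
# measure_throughput is a synthetic stand-in for a real benchmark run.
import numpy as np

rng = np.random.default_rng(0)


def measure_throughput(buffer_pool_mb: float) -> float:
    """Stand-in benchmark: throughput peaks near 8 GB, with noise."""
    return 100 - (buffer_pool_mb - 8192) ** 2 / 1e6 + rng.normal(0, 0.5)


# 1. Collect measurements from the (simulated) running DBMS.
samples = np.linspace(1024, 16384, 12)
observed = np.array([measure_throughput(mb) for mb in samples])

# 2. Train a simple surrogate model (quadratic fit) on the measurements.
model = np.polynomial.Polynomial.fit(samples, observed, deg=2)

# 3. Use the model to recommend the knob setting it predicts is best.
grid = np.linspace(1024, 16384, 1000)
recommended = grid[np.argmax(model(grid))]
print(f"recommended buffer pool: {recommended:.0f} MB")
```

The recommendation lands near the true optimum without exhaustively benchmarking every setting, which is precisely why a trained model is cheaper than trial-and-error on the live system.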
Worth the Time
Database knob configuration is complex, and although ML can be a tremendous help, automating the process brings some unexpected twists and turns. There are things to know upfront before starting and "under the hood" things to learn once tuning begins. The most challenging part is predicting the best possible outcomes and knowing beforehand which recommendations to enact.
One approach is to use an algorithm to predict which knobs to tune for the greatest positive impact. This additional information helps the person in charge of tuning databases make better decisions. Solving the database tuning problem and mitigating its surprises is well worth the effort, and the industry is making significant strides toward it.