Baseline Aware Analysis: The Cornerstone of Database Continuous Delivery

Jun 9, 2015

By Uri Margalit

<< back Page 2 of 4 next >>

Simply comparing two environments does not provide insight regarding the nature of the differences. For example: (a.) a case where the difference conflicts with an emergency fix, (b.) the trunk/stash/QA environment was already updated with other changes from a different branch, or (c.) the later environment (trunk/stash/QA) is more up-to-date regarding specific objects - thus the difference should not be part of the delta changes script.

This missing information is only available with baseline aware analysis.

Simple Compare & Sync vs. Baseline Aware Analysis

The input for the database build phase should absolutely be taken from the source control repository, which includes only changes that were checked-in, and does not include changes that are still in work-in-progress mode. This brings us to the starting point of the process – source control and how to make sure the build process retrieves the relevant changes.

Develop Using a Reliable Database Source Control

In this phase, developers introduce changes to the database structure, reference lookup content, or logic in the database (procedures, function etc.).

The two common approaches of database development are: (1) using a shared database environment for the team, or (2) Using a private database environment for each developer. Both methods have unique advantages and challenges. Using a shared database environment reduces the code merges for the database code and also reduces the complexity and cost of updating the database structure based on the source control. Using private databases may necessitate more merges of the database code but reduces the potential for accidentally overriding another colleague’s changes. In addition, a private database environment has other factors to consider such as management overhead, licenses, hardware, and cost.

The primary reason the private environment method is not commonly used is related to how developers publish changes from their private (workspace) environment to the integration environment. Publishing changes should not revert changes made by someone else and updating the private environment from the source control repository should not revert work-in-progress.

The same process of building the native code using only changes which are documented in the source control repository, should be applied to database code changes. Developers work on native code in the IDE and then check-in the changes to the source control repository without any additional manual steps. Having a file-based script a developer is maintaining for his/her changes will create a few challenges that will be difficult to resolve and will require a lot of time:

How to guarantee the version control repository correctly represents the database structure that was tested.
Developer A made a number of changes and developer B made other changes to the script; none of the developers can execute his/her entire script because the script overrides (or reverts) the changes introduced by the other developer.

In addition, there are other challenges that occur in the deployment phase but originate in previous phases:

Controlling the order of the execution of scripts created by several developers.
Maintaining the change scripts on a release scope change.

These problems become more severe when there are many small changes to the same object. These changes can be reconciled into a single script, which would reduce the number of scripts that have to be run and may improve deployment times. However, this tends to be difficult to practice, as it will cost a lot of time to constantly regenerate the new script from scratch, test it etc.

Source Control – Single Source of Truth?

Anyone with sufficient database credentials who may login to the database introduces a change and can forget to apply the change in the relevant script of the file-based version control. This is what has reportedly happened in finance, insurance, online travel, algo-trading, gaming and other industries.

Database Deployment logic

Another unique challenge from the database point of view is how deployment is done. Can the database deployment process act as the native code – replacing the existing database/table in production with the new database/table from the development? Or does it have to alter the existing database structure in production from the current state to the target state by preserving the data?

<< back Page 2 of 4 next >>