I often work with student or junior DBAs who ask about the skills they will need to be effective in the modern technical landscape. And, as the technology changes, or course so do the required skill sets.
Sometimes I hear questions such as: “Should I learn Python or PHP to become an effective DBA?” My answer is usually neither. And to expand, I usually recommend that they don’t focus on languages but rather on tools. I don’t mean learning the intricacies of tool commands—those can be looked up—but more importantly tool concepts and fundamentals.
The next logical question is about which tools the modern DBA needs to know. In my opinion, the tool landscape that the DBA needs to understand is moving toward the DevOps realm. That doesn’t mean that DBAs need to be DevOps experts—they’re still different roles. But rather, there is some overlap and, at the very least, the DBAs need to be familiar with and, in some cases, use these tools.
In the remainder of this article, I’ll explore the tools I personally think are most relevant to today’s DBA.
If there were only one DevOps tool that the modern DBA should know, it would be Ansible. Often, I see DBA tasks that need to be performed against multiple databases in the environment. Maybe against all instances in the estate, maybe against just production or non-production, or maybe against a named group of related databases. When environments are large, I’ve seen a “divide and conquer” approach where a DBA team might split up the work saying that person A will handle this group of database; person B, this other group, and so on. However, Ansible is a much better solution for these situations.
Whenever the modern DBA needs to execute the same tasks against multiple databases, the answer should always be Ansible. Ansible is ideal for “automating manual tasks.” It is a free scripting language/tool but part of the beauty is that it’s completely agentless—meaning that the Ansible software itself is only installed on one central computer such as a shared DBA machine or even the DBA’s desktop/workstation. From there, it performs all of the remote commands against target machines via Secure Shell (SSH) protocol. There’s no software to install on the database servers—all that’s required is connectivity via SSH. In addition, targets can be easily grouped into an unlimited number of permutations. For example, “All”; “All prod”; and “All application X”—whatever is required.
There are many online resources and videos to help DBAs get started with Ansible. Once they see it in action and understand its simplicity and power, DBAs usually start to quickly see opportunities for how to use it in their day-to-day jobs.
The next logical tool that DBAs should understand is the concept of Git. Typically, this is in the form of GitHub (the largest online Git service, now owned by Microsoft). But GitHub isn’t the only Git implementation. GitLab, Atlassian Bitbucket, and local Git implementations are other similar options.
The idea behind learning Git for the DBA is script management. Recognizing that administrators are not developers, they will always have some number of scripts which they’ve developed or must maintain. And maybe/hopefully, some of those are Ansible scripts!
Traditionally, DBAs would store these in a shared file system or maybe an NFS mount or even a Windows shared drive. However, in the modern world, DBAs should be storing all of their tools and scripts in a Git repository such as GitHub.
This provides a central repository, change history, and version comparisons, as well as the ability to easily share externally if desired (repositories can be private or shared—functionality and costs depends on the service used). DBAs typically use Git in a simplified workflow, although proper development teams will use it in a more sophisticated manner that includes forks, branches, and other such advanced features.
In fact, DBAs are probably already recognizing that GitHub has become the standard location for external sharing of DBA tools and utilities. It is likely that their favorite public domain tuning script is already being hosted in and shared via GitHub.
The beauty of DBAs learning Git is not just that it allows them to do their own internal script management, but that it also allows them to provide feedback, suggestions, and updates to these public domain tools.
Another important recommendation from my perspective is to use a proper wiki-style documentation system. If you’re still documenting run books and other how-to types of documentation in Microsoft Word (and/or Excel), then I’m afraid you’re already a dinosaur. Such documents are very difficult to keep up-to-date, version properly, and even find. The solution is a proper wiki-style documentation system.
My personal preference is Atlassian Confluence. It’s a very easy to learn WYSIWYG wiki-ish system which allows for easy editing, collaboration, and sharing. With a little CSS customization, it even does a reasonably decent job of exporting content to PDF.
Even if Confluence isn’t an option for you, there are other wiki tools, including ones that are installed locally and others that are online services. Regardless, the point is that any documentation including run books and how-to manuals should be in a system that is easy to update, provides revision history, is easily shared, is easily searchable, and ideally also provides collaboration (i.e., commenting functionality). Document editors and spreadsheets do not meet these requirements.
Vagrant and Docker
Often DBAs want “test systems.” It’s easy to provision virtual machines either with or without software pre-installed online from cloud providers, but sometimes administrators are afraid of cloud costs or for other reasons want to do this locally.
So how can we bring some of the ease and speed of provisioning that we see with online cloud systems to our local environment? The answer is tools such as Vagrant and Docker. Both allow us to quickly spin up virtualized environments with consistency and automation. But which to use?
Deciding which is best, in my opinion, depends on your needs. Docker is based on the concept of containers and may be best when you need multiple copies of similar systems. For example, maybe you need eight environments that are almost exactly the same so you can test scalability, replication, or failover options between them. In a case such as this, Docker is quick and efficient—and usually using containers is significantly more resource-efficient on host systems than full virtual machines are.
But in other cases, DBAs need a mix of environments. They may need to easily create (and soon destroy and recreate) a variety of systems such as different flavors of Linux alongside a Solaris system and a Windows server so they can test how their platform-agnostic tools or scripts will work against all of these. In cases such as these, I tend to recommend Vagrant. Vagrant could almost be considered “a scripting tool for automated deployments of virtual machines.” This means that Vagrant allows for the easy creation and destruction of VM environments, including those for Oracle Virtualbox. Vagrant can start with publicly available VM images (called “boxes”) or you can create your own custom ones as a starting point. From there, Vagrant scripts can be used to customize the machines upon creation.
Other Honorable Mentions
If DBAs are responsible for provisioning cloud resources, then DBAs should also familiarize themselves with Terraform. Terraform is from the same maker as and has some similar concepts as Vagrant. However, Vagrant is for the provisioning of virtual machines whereas Terraform is for the provisioning of other infrastructure using code.
In my experience though, provisioning of cloud infrastructure is usually handled by system administrators, site reliability engineers, or DevOps teams and less commonly by DBAs. Consequently, it’s good for DBAs to familiarize themselves with Terraform, but they should leave the actual implementations to those other experts.
I sometimes hear comments such as: DBAs can automate their software installations and database creations with Puppet.While technically true and I’ve seen this done in real life, it’s not really the best tool for the job.
Puppet is best for enforcing consistent state. For example, for ensuring that a Linux iptables file is configured as per internal standards. If anyone changes the file, Puppet will revert it back. Unlike Ansible, Puppet also requires some additional components: a main server and agents on target machines.
So, for many reasons, I think Puppet is best left in the domain of system administrators. It is definitely useful but less so for DBA purposes. Most similar DBA-type activities that people may consider using Puppet for can often be implemented instead with Ansible.
Modern DBAs should be excited to see how easily they can learn and adopt modern DevOps tools and integrate them into their day-to-day activities. The landscape of options is broad, but the technology stack discussed here can provide a valuable toolkit to help DBAs increase their efficiency and productivity both locally and in the cloud.