The Changing Role of the DBA in a Data Protection-First World


DBAs have always been central to how organizations manage, store, and use data. First, they were gatekeepers, limiting access to production environments and carefully shepherding database changes through to avoid the risk of data loss or system downtime.

The rise of DevOps, however, changed the game and encouraged DBAs to become more open data enablers. With changes to front-end applications often requiring the database at the back end to be updated more frequently as well, there is a growing demand to share copies of databases with developers to test their changes against. Limiting access to production environments and excluding the database from DevOps hinder the faster pace of development that can otherwise be achieved.

Now, with the global rise of tougher regulations such as GDPR and CCPA, DBAs need to embrace a new role as guardians, protecting data yet still ensuring it is available in secure, anonymized forms to enable faster development without the risk of breaches.

With more than 62% of the world’s population set to be protected by more stringent data privacy laws, according to Redgate analysis, DBAs face a new set of challenges. Becoming a true data guardian now involves taking 10 steps within four areas:

  1. Identifying and cataloging your data
  • Where is your data?
  • What is your data?
  • Where are the risks to your data?

Data spreads across organizations, so DBAs need to create a record of every database, every instance of it, and who has access to it. This can be a bigger task than it first appears because data is used in so many ways: a remote office may have a copy of a customer database that is open to every employee, a large number of people might have historic access to your production database, and production data is often used in business analysis, sales, and marketing.
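As an illustration of what building that record can involve, here is a minimal sketch assuming a SQL Server estate; it lists the databases on one instance and who holds permissions inside a given database, with the idea that the results are collected per instance into a central catalog. The platform choice and the approach are assumptions for the example, not a prescribed method.

```sql
-- Minimal inventory sketch for one SQL Server instance (assumed platform);
-- repeat per instance and store the results in a central catalog.

-- Which databases exist on this instance?
SELECT name AS database_name,
       state_desc,
       recovery_model_desc
FROM sys.databases;

-- Within a given database: which principals hold which permissions?
SELECT pr.name       AS principal_name,
       pr.type_desc  AS principal_type,
       pe.class_desc,
       pe.permission_name,
       pe.state_desc AS grant_state
FROM sys.database_principals AS pr
JOIN sys.database_permissions AS pe
     ON pe.grantee_principal_id = pr.principal_id;
```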

The next task is to identify what that data is. It might be standard personal data, such as names and addresses or telephone numbers, or it could be more sensitive, such as a person’s ethnic origin or details about the individual’s health.

With that knowledge, DBAs will be able to spot any risks that exist and categorize the data with a taxonomy that allows them to differentiate between personal and sensitive data. Columns can then be tagged to identify what kind of data they contain, and therefore, which need to be protected.
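To make the tagging step concrete, the sketch below first hunts for column names that suggest personal data, then records a classification against one of them. It assumes SQL Server 2019 or later (or Azure SQL), where ADD SENSITIVITY CLASSIFICATION is built in; the table, column, and label names are illustrative.

```sql
-- Heuristic first pass: column names that hint at personal data
-- (a starting point for human review, not a complete discovery method).
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name                   AS table_name,
       c.name                   AS column_name
FROM sys.columns AS c
JOIN sys.tables  AS t ON t.object_id = c.object_id
WHERE c.name LIKE '%email%'
   OR c.name LIKE '%phone%'
   OR c.name LIKE '%birth%'
   OR c.name LIKE '%address%';

-- Tag a confirmed column so the classification travels with the schema
-- (dbo.Customer.Email is an assumed column for the example).
ADD SENSITIVITY CLASSIFICATION TO dbo.Customer.Email
WITH (LABEL = 'Confidential - GDPR',
      INFORMATION_TYPE = 'Contact Info',
      RANK = MEDIUM);
```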

  2. Protecting your data
  • Reduce the attack surface area.
  • Mask data outside production.

Once you have a picture of your data, you can take steps to protect it. As a default, aim to consolidate the storage of data into as few locations as possible, and only let individuals view, modify, or delete personal data that is relevant to their job roles.

Focus on reducing the attack surface area and masking data outside production. Bear in mind that most breaches are not caused by outside hackers, but instead are due to unauthorized access by contractors, third parties, and users without the appropriate permission. That’s why companies need to move to a default methodology of “least access,” whereby people are only allowed to access the data they need in order to perform their jobs.
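At the database level, a minimal "least access" sketch might look like the following, assuming SQL Server and illustrative role, table, and account names: permissions are granted to a role scoped to a job function, personal data is explicitly denied, and people are added through the role rather than individually.

```sql
-- Create a role scoped to one job function and grant only what it needs.
CREATE ROLE reporting_reader;
GRANT SELECT ON dbo.SalesSummary TO reporting_reader;

-- Explicitly deny the tables holding personal data.
DENY SELECT ON dbo.Customer TO reporting_reader;

-- Add people via the role, never by granting to individual accounts.
ALTER ROLE reporting_reader ADD MEMBER [CORP\analyst1];
```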

This, of course, raises another thorny issue, because developers have become accustomed to having access to production data to test their proposed changes against, yet production databases invariably contain exactly the kind of personal data that needs to be protected.

This is where data masking measures such as pseudonymization, encryption, anonymization, and aggregation should be adopted, preferably using a third-party tool to ease the process. These protect data while providing a realistic, accurate set of information that matches the size, distribution characteristics, and referential integrity of the original.
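As a deliberately simple illustration of pseudonymization, the sketch below runs against a restored copy of production rather than using a dedicated masking tool; the table and column names are assumptions. Deterministic hashing is used so the same input always produces the same token, which keeps joins between tables lining up.

```sql
-- Pseudonymize a restored copy (never production itself).
-- Deterministic replacement preserves referential consistency: the same
-- original value maps to the same token everywhere it appears.
UPDATE dbo.Customer
SET LastName = CONCAT('Name_',
                  CONVERT(varchar(16), HASHBYTES('SHA2_256', LastName), 2)),
    Email    = CONCAT(
                  CONVERT(varchar(16), HASHBYTES('SHA2_256', Email), 2),
                  '@example.invalid');
```

A purpose-built masking tool goes further than a simple hash like this, preserving the format and distribution of the original values so the masked copy still behaves realistically under test.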

  3. Bringing DevOps to your data
  • Standardize team-based development.
  • Version-control database code.
  • Automate where possible.

The rise of DevOps has seen developers being expected to develop the database alongside applications, switching from coding in Java one moment to using languages such as T-SQL the next. Because T-SQL is a looser, declarative language, as opposed to an imperative language such as Java or C#, there are many different styles in use. This can lead to confusion, especially over time, when multiple people have worked on the same code base.

To overcome this, development needs to be standardized—not by forcing developers to change how they work, which would be unpopular and counterproductive, but by adopting tools that can automatically change code to a team’s standard style in seconds and perform static code analysis as code is written. This makes the overall code base easier to understand and also flags errors earlier in the development pipeline.
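As a small illustration of the kind of issue static analysis catches (specific rule names vary by product, and the query is an assumption for the example):

```sql
-- Typically flagged: SELECT * hides schema changes, and NOLOCK risks dirty reads.
SELECT * FROM dbo.Orders WITH (NOLOCK);

-- The cleaner equivalent an analyzer would accept: explicit columns, no hint.
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders;
```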

Similarly, version control is becoming standard in DevOps, with developers checking their changes into a common repository so that one source of truth is maintained. The same approach can be used in database development, preferably using tools that integrate with those used for application version control.
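A sketch of what a version-controlled database change might look like as a script checked into the repository; the object names are illustrative, and the script is guarded so it can be re-run safely, allowing the same artifact to be applied in every environment.

```sql
-- Example migration script kept in version control alongside application code.
-- Guarded so re-running it in any environment causes no error.
IF COL_LENGTH('dbo.Customer', 'MarketingConsent') IS NULL
BEGIN
    ALTER TABLE dbo.Customer
        ADD MarketingConsent bit NOT NULL
            CONSTRAINT DF_Customer_MarketingConsent DEFAULT (0);
END;
```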

Once you have version control in place, you can then look to automate parts of the development process to make it more reliable. Every time a change is committed to version control, for example, a continuous integration process can be triggered to test the change and flag any errors in the code. Errors can be fixed immediately and tested again, before the change is then passed along to a release management tool, where it can be reviewed before being deployed to production.
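For example, the continuous integration step might run database unit tests on every commit. The sketch below assumes the open source tSQLt framework is installed in the build database, along with an illustrative dbo.Customer table whose MarketingConsent column defaults to 0.

```sql
-- A database unit test the CI pipeline can run automatically on each commit.
EXEC tSQLt.NewTestClass 'CustomerTests';
GO
CREATE PROCEDURE CustomerTests.[test MarketingConsent defaults to 0]
AS
BEGIN
    -- Isolate the table but keep its defaults, then check the default applies.
    EXEC tSQLt.FakeTable 'dbo.Customer', @Defaults = 1;
    INSERT INTO dbo.Customer (CustomerID) VALUES (1);

    DECLARE @Actual bit = (SELECT MarketingConsent
                           FROM dbo.Customer
                           WHERE CustomerID = 1);
    EXEC tSQLt.AssertEquals @Expected = 0, @Actual = @Actual;
END;
GO
EXEC tSQLt.Run 'CustomerTests';
```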

Going back to the guardian role, this approach aligns with data privacy requirements and helps with compliance as it enables database updates to be delivered in a consistent, repeatable, and reliable way and provides an audit trail of those changes.

  4. Monitoring your data
  • Back up every change.
  • Monitor for compliance.

Every DBA understands the importance of backups, but new data privacy and protection requirements add extra considerations. For example, businesses are expected to be able to restore availability and access to personal data, should any issues occur. Backup schedules will also need to accommodate additional requirements such as data being held for no longer than is necessary. Once the processing for which the data was collected is complete, it will need to be deleted from the backup along with the original database it was stored on.

Businesses therefore need to standardize backup regimes, centralize the management of backups, encrypt them, and have the ability to restore and validate backups when required.
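A minimal sketch of an encrypted, verifiable backup on SQL Server is shown below; the database name, file path, and certificate name are assumptions, and the certificate must already exist on the instance.

```sql
-- Encrypted, checksummed backup (BackupCert is an assumed, pre-created certificate).
BACKUP DATABASE CustomerDB
TO DISK = N'\\backupshare\sql\CustomerDB_full.bak'
WITH FORMAT, INIT,
     COMPRESSION,
     CHECKSUM,
     ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert);

-- Validate that the backup is readable without actually restoring it.
RESTORE VERIFYONLY
FROM DISK = N'\\backupshare\sql\CustomerDB_full.bak'
WITH CHECKSUM;
```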

DBAs have always been responsible for monitoring and improving performance, but cybersecurity regulations move this to the next level. For example, companies must now monitor and manage access and ensure data is available and identifiable. If a data breach does occur, they must report it, describing the nature of the breach, the categories and number of individuals concerned, the likely consequences, and the measures taken to address it.
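One concrete way to cover the access-monitoring part of this, assuming SQL Server's built-in auditing and illustrative object names, is to record every read and change of the tables that hold personal data:

```sql
-- Server-level audit destination (created in master; the file path is an assumption).
CREATE SERVER AUDIT PersonalDataAudit
TO FILE (FILEPATH = N'D:\SQLAudit\');
ALTER SERVER AUDIT PersonalDataAudit WITH (STATE = ON);

-- Database-level specification: capture who reads or changes the Customer table.
CREATE DATABASE AUDIT SPECIFICATION CustomerAccessSpec
FOR SERVER AUDIT PersonalDataAudit
ADD (SELECT, UPDATE, DELETE ON OBJECT::dbo.Customer BY public)
WITH (STATE = ON);
```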

This all makes having an advanced monitoring solution a necessity, enabling DBAs to keep track of the availability of servers and databases containing personal data, and be alerted to issues that could lead to a data breach before it happens.

The DBA’s New Role

In a security-led world, personal data has moved from being a business asset to a business risk. DBAs therefore need to embrace their data guardian role to safeguard privacy and focus on security, while still ensuring faster development by adopting the right processes, tools, and mindsets.


