A friend once told me the main difference between an IT administrator and a developer is that admins are all about stability, while developers are all about change—one wants as little change as possible, while the other wants to constantly deploy new code and use the latest languages and frameworks.
That’s why, as some say, “the world runs on legacy.” At the end of the day, IT admins are responsible for making sure applications run, and run well. If it’s not broken, why mess with it? However, life isn’t easy for admins nowadays because there is so much change. There is the usual change, such as replacing a host or upgrading to a new OS or DBMS. Then, there are the really big changes impacting every IT department.
The first of these is the move to virtualization and cloud. For years, DBAs resisted moving to virtualized environments because nobody could say how a database server would perform on a VM. There was no performance certainty. But today, 80% of databases are running in virtual environments. Amazon is making a billion dollars a year on DBaaS (RDS, Aurora, Dynamo, etc.), not including databases running on EC2 instances.
Now that databases are in a dynamic environment, performance can change at any time: a noisy neighbor shows up, an administrator moves the database to another VM, and so on.
The next big change is the migration to new storage systems: flash, hyper-converged systems, and intelligent storage that does hot/cold tiering (software dynamically changes the underlying storage system based on observed behavior).
The final change is the push for continuous development, which means application code is changing all the time, sometimes multiple times a day. This is on top of everything that can and will change in the database itself.
With so many changes and so many variables, the old way of figuring things out, trial and error, no longer works. As Yoda said, “There is no try.” What IT admins need, and what the business expects from IT, is performance certainty.
In today's software-defined dynamic environment, there is one more consideration: the direct relationship between performance and cost. Lower performance usually leads to provisioning more hardware, or faster hardware, which means higher cost.
So, how do you get to performance certainty? Here are a few ideas:
- Adopt performance as a discipline. This means uptime is no longer the key metric for how you measure quality of work; instead, uptime is assumed. The questions become: How fast can you make the system work? How often do the teams talk about performance? What tools do you have to understand and improve performance? Be proactive.
- Adopt a wait-time analysis mindset. The focus must shift from simple resource metrics to time: the time spent on every process, query, and wait state, and the contribution to that time from storage (I/O and latency), networking, and other components supporting the database and the application.
- Establish benchmarks and baselines. Define the key metrics to observe, which should ideally be application-, end-user-, and throughput-centric metrics (again, not CPU utilization or theoretical IOPS). Statistical baselines help you understand what is normal and how and when performance changes. Alerts built on baselines of relevant metrics then let you focus on what matters.
- Before moving to faster hardware or provisioning more resources, understand the performance contribution of each component, which will show how much improving it could actually help.
- Be the performance guru of the organization. Knowledge is power. With the shift in IT toward performance, the person who better understands performance, what drives it, and how to improve it, quickly becomes more valuable to the organization.
- Report on performance weekly or monthly, and take credit for performance improvements and cost savings resulting from reclaimed hardware or delayed investments. Share performance data. Be the authority. Report the performance impact and improvement (or not) of each infrastructure component and each team member: “Joe, the code you wrote this week sucks. It’s 25% slower than last week’s. Here’s the data.”
- Plan performance changes. You will know you have performance certainty, and have become a performance guru, when you can accurately predict application performance before changes occur and can guide your organization toward better performance.
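To make the wait-time and baseline ideas a little more concrete, here is a minimal Python sketch. It is only an illustration, not a monitoring tool: the function names, the component names, the sample numbers, and the three-standard-deviation threshold are all assumptions made for the example. The first function ranks components by their share of total response time; the second flags a measurement that falls far outside a statistical baseline.

```python
from statistics import mean, stdev


def time_share_by_component(wait_seconds):
    """Return (component, share of total time) pairs, largest share first."""
    total = sum(wait_seconds.values())
    if total == 0:
        return []
    shares = [(name, secs / total) for name, secs in wait_seconds.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


def deviates_from_baseline(history, current, threshold=3.0):
    """True if `current` is more than `threshold` standard deviations
    away from the mean of `history` (a simple statistical baseline)."""
    if len(history) < 2:
        return False  # not enough data to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold


# Hypothetical seconds of wait time attributed to each component for one query.
waits = {"storage_io": 6.0, "cpu": 1.5, "locks": 2.0, "network": 0.5}
for name, share in time_share_by_component(waits):
    print(f"{name}: {share:.0%}")

# Hypothetical end-user response times in milliseconds from previous days.
baseline = [120, 118, 125, 122, 119, 121, 123]
print(deviates_from_baseline(baseline, 124))  # False: within normal variation
print(deviates_from_baseline(baseline, 160))  # True: worth an alert
```

In this made-up case, storage accounts for most of the time, so faster CPUs would buy very little. The same baseline check works on any application-centric metric, such as response time or transactions per second.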
In summary: It’s all about the application. You must be proactive. Performance is the new black. Performance certainty, knowing how a system will perform and how to improve it, will very soon become a job requirement.