Next-Gen Data Management: The Problem With Defining Database Workload

Great! Wait ... how do we define database workload? Good question! I’ve researched many approaches to this question and still don’t have a definitive answer or a definitive way to measure it. Let’s dive into some of the research and personal thoughts I’ve had on this topic.

Here Are the Dictionary Definitions of Workload

  1. The amount of work or of working time expected or assigned
  2. The amount of work performed or capable of being performed (as by a mechanical device) usually within a specific period

I like the second definition a little more as it applies to IT concepts. Extrapolating for our purposes, we can say that database workload is the amount of work performed, or capable of being performed, by a database within a specific period. The “within a specific period” part is a good start. For our purposes, it is useful to stick to what can be observed or measured, so my definition goes with the actual amount of work performed rather than the potential work that could be done. My kids are capable of getting straight A’s, but that’s not what I’ve observed.

What is database work? Again, another good question! During my research, I found one concept I agreed with: database work should be looked at without regard to resources available or resources consumed. The reason I agree with this is that we should try to remove as many variables as possible. Every system is different, including the number of cores and clock speed of the chip architecture; the amount of RAM; disk speeds (spinning versus SSD); local disk versus SAN or NAS; network bandwidth; additional load on the system from non-database processes; virtualized (where other VMs are running on the same physical server) versus physical; and whether it is cloud-based.

Excluding external dependencies from the equation, what is the actual database work? We could say it is the aggregate “asks” placed on the database engine such as application calls (queries); maintenance jobs; internal database engine management (memory management, parses, optimizer activity, etc.); ad hoc work (e.g., administrative queries, data modifications, access control changes); release changes (e.g., schema changes, data migrations, data definition language); and anything else asked of the database.
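To make the “aggregate asks” idea a bit more concrete, here is a minimal sketch that buckets SQL statements into the categories above by their leading keyword. The statement format, keyword-to-category mapping, and category names are all my own illustrative assumptions, not taken from any particular database engine:

```python
from collections import Counter

# Assumed mapping of leading SQL keywords to the article's workload buckets.
CATEGORIES = {
    "SELECT": "application_query",
    "INSERT": "application_query",
    "UPDATE": "application_query",
    "DELETE": "application_query",
    "VACUUM": "maintenance",
    "ANALYZE": "maintenance",
    "GRANT": "ad_hoc_admin",
    "REVOKE": "ad_hoc_admin",
    "CREATE": "release_change",
    "ALTER": "release_change",
    "DROP": "release_change",
}

def classify(statement: str) -> str:
    """Bucket a SQL statement by its first keyword; unknowns fall into 'other'."""
    verb = statement.strip().split(None, 1)[0].upper()
    return CATEGORIES.get(verb, "other")

def tally(statements):
    """Count how many 'asks' land in each workload bucket."""
    return Counter(classify(s) for s in statements)
```

A real capture would of course come from the engine’s own instrumentation rather than parsing statement text, but the tally shape is the point: the workload is the full mix of asks, not just the application queries.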

Devilish Details

Conceptually, this is a decent proxy for database work. However, the devil really is in the details. What about this work do we want to capture or use to define database workload? Resources consumed? No, that’s not quite right, as that’s more of a byproduct of the workload playing out against our system and is subject to a lot of variables. A tally of database calls regardless of the type of call? That’s not quite right either, as all calls are not created equal. Number of active sessions? That’s still not quite right, plus how should we handle parallel executions? Statistics (logical reads, executions, physical reads, parses, etc.)? Maybe stats can be part of the overall workload definition and quantification, but stats would depend on the resources allocated to my system (for example, a physical read on one system could be a logical read on another system with enough memory to hold the needed data in cache). Some amalgamation of these things?
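If we did go the “amalgamation” route, one naive sketch is a weighted sum of per-interval stat counters. The metric names and weights below are placeholders I made up for illustration, not a vetted model, and they inherit exactly the resource-dependence problem described above:

```python
# Placeholder weights: purely illustrative, not calibrated against any system.
DEFAULT_WEIGHTS = {
    "executions": 1.0,
    "logical_reads": 0.001,
    "parses": 0.1,
}

def workload_score(stats, weights=None):
    """Collapse a dict of stat counters into one number via a weighted sum.

    Any counter without a weight contributes nothing, which is itself a
    modeling choice you would have to defend.
    """
    weights = weights or DEFAULT_WEIGHTS
    return sum(weights.get(name, 0.0) * value for name, value in stats.items())
```

The single number is convenient for trending, but the weights smuggle the hard question right back in: deciding how much a parse “costs” relative to a logical read is the very definition problem this article is wrestling with.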

Is Workload Scaling Linearly?

This may be late in the game, but some reading this article may ask: Why do we want to define workload? We can run a specific static workload set against different systems to establish benchmarks. We can amp up that workload to see where the system’s breaking point is. However, those scenarios use a known workload to define other things. What I’m talking about is measuring actual workload with an eye on the prize of knowing when that workload is no longer scaling linearly. This is likely a review, but just to level-set: an example of linear scaling is doing 100 units of work in 1 hour, then doing 200 units of work in 2 hours. You are no longer scaling linearly if it then takes 6 hours to do 300 units of work. Somewhere between 200 and 300 units, there was an elbow (it started taking more time than before to do each incremental unit). That’s where I’m headed: based on actual workload, I want to know where my elbow is, or where I should expect it to be.
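The elbow example above can be sketched numerically: compute the marginal hours per incremental unit between consecutive measurements and flag the first point where that marginal cost jumps. The jump threshold (`factor`) is an arbitrary assumption for illustration, not a principled cutoff:

```python
def marginal_hours(units, hours):
    """Hours per additional unit of work between consecutive measurements."""
    return [
        (hours[i] - hours[i - 1]) / (units[i] - units[i - 1])
        for i in range(1, len(units))
    ]

def find_elbow(units, hours, factor=2.0):
    """Return the unit count at the first interval whose marginal cost is more
    than `factor` times the previous interval's, i.e. where linear scaling
    broke down. Returns None if no such jump is found."""
    m = marginal_hours(units, hours)
    for i in range(1, len(m)):
        if m[i] > factor * m[i - 1]:
            return units[i + 1]
    return None
```

Running this on the article’s numbers (100 units in 1 hour, 200 in 2, 300 in 6) gives marginal costs of 0.01, 0.01, then 0.04 hours per unit, so the elbow is flagged at the 300-unit measurement, matching the intuition above.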

The Wrap-Up

Another approach could be simply to look for inefficient workload; I’ll save that for another article. So, here at the end of this post, there are a lot of ideas around database workload but no definitive answers (in my mind). Please comment with your thoughts on how to define database workload. I would love to hear your feedback!