Defining Database Performance

If you are a DBA, or a database performance analyst, chances are that you deal with performance-related issues regarding your database systems every day of the week. But have you ever stopped for a moment and tried to define what you mean when you say "database performance"? Doing so can be a worthwhile exercise, if only to organize your thoughts on the matter.

Think about it; don't we need a firm definition of database performance before we can attempt to manage the performance of our databases?

It can be useful to think about database performance using the familiar concepts of supply and demand. Users demand information from the DBMS. The DBMS supplies information to those requesting it. The rate at which the DBMS supplies the demand for information can be loosely thought of as "database performance." But let's dive a little deeper. There are five factors that influence database performance: workload, throughput, resources, optimization, and contention.

The workload that is requested of the DBMS defines the demand. It is a combination of online transactions, batch jobs, ad hoc queries, business intelligence queries and analysis, utilities, and system commands directed through the DBMS at any given time. Workload can fluctuate drastically from day to day, hour to hour, minute to minute, and yes, even second to second. Sometimes workload can be predicted (such as heavy month-end processing of payroll, or very light access after 7 pm, when most users have left for the day), but at other times it is unpredictable. The overall workload can have a major impact on database performance.

Throughput defines the overall capability of the computer to process data. It is a composite of I/O speed, CPU speed, parallel capabilities of the machine, and the efficiency of the operating system and system software. And don't just base your throughput assumptions on hardware capacity figures (e.g., MHz for Wintel boxes, MSUs and MIPS for mainframes). Perhaps you have installed hard or soft capping on your box, which can impact throughput. And don't forget about those specialty processors (zIIPs and zAAPs) if you are a mainframe DBA.

The hardware and software tools at the disposal of the system are known as the resources of the system. Examples include memory (such as that allocated to buffer pools or address spaces), disk, cache controllers, and microcode.

The fourth defining element of database performance is optimization. All types of systems can be optimized, but relational database systems are unique in that query optimization is primarily accomplished internal to the DBMS. Ensuring that you have provided up-to-date and accurate database statistics for the query optimizer is of the utmost importance in achieving optimal SQL queries. Keep in mind, though, that there are other factors that need to be optimized (database parameters, system parameters, etc.) to enable the relational optimizer to create the most efficient access paths. And there are optimization aspects that are outside the scope and control of the relational optimizer, too, such as efficient script coding, proper application design, coding efficient utility options, and so on.

When the demand (workload) for a particular resource is high, contention can result. Contention is the condition in which two or more components of the workload are attempting to use a single resource in a conflicting way (for example, dual updates to the same piece of data). When one program tries to read data that is in the process of being changed by another, the DBMS must prohibit access until the modification is complete in order to ensure data integrity. The DBMS uses a locking mechanism to enable multiple, concurrent users to access and modify data in the database. Using locks the DBMS automatically guarantees the integrity of data. The DBMS locking strategies permit multiple users from multiple environments to access and modify data in the database at the same time. As contention increases, locking ensues and throughput decreases.

So - putting all of these things together: database performance can be defined as the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.

In addition, database applications regularly communicate with other system software, which must also be factored into performance planning. Many factors influence not only the performance of the DBMS and applications accessing its databases, but also the performance of the other system components (e.g., transactions processor, network software, application servers, etc.)

The Bottom Line

Understanding what database performance is must be a prerequisite for properly assuring the efficiency of your database environment. With a sound definition in hand, managing database performance becomes an achievable task.