Managing Database Storage

Database systems require data files to store the data under management. These files, or data sets, reside on storage media. So storage management should be a key part of the database operations required of a database administrator (DBA). Unfortunately, storage is sometimes relegated to an afterthought; after all, don't we have storage administrators who deal with our disk arrays? But this way of thinking is misguided. To succeed, database administration and storage administration need to cooperate and work together.

For database systems, storage usually will be accomplished by means of fixed disk drives or disk subsystems. Today, that means RAID (redundant arrays of independent disks) in most cases. But database storage can also mean nonvolatile storage, solid state disk, removable storage, optical storage, and even trusty old tape devices.

Some part of the DBA's job must involve planning for the actual storage of database data. To adequately perform this part of the job, the DBA must be conversant in the actual physical storage mechanisms available to be used for data storage. Furthermore, the DBA must understand the ways in which the abstract concept of data within a DBMS interacts with the physical storage of data on persistent storage media.

Most disk storage technologies can work with most DBMS products. But, some storage technologies are better-suited to database processing than others in terms of performance, reliability, usability, and cost. Modern disk drives are more reliable than in years past with an ever-increasing Mean Time Before Failure (MTBF). It is not unheard of for disk drives to achieve in excess of a hundred thousand hours of availability before failing. But the mechanical nature of the disk drive renders them more vulnerable to failure than other computerized components. As the number of physical drives in a system increases so does its vulnerability. Certain RAID levels can be used to address some of the MTBF problems with techniques such as redundancy, striping, and mirroring.

For mission-critical applications data integrity can be more important than data availability. If the storage media is unreliable and a failure causes data corruption, the lost data can be more of a problem than the downtime. It is imperative, therefore, that database storage solutions protect the data at all costs.

Database performance is I/O-dependent-the faster the DBMS can complete an I/O operation the faster the database application will run. Remember that data retrieval from storage media takes much longer to complete than data retrieval from cache or memory. For this reason, most modern storage systems provide their own caching mechanism to prestage data in memory-thereby reducing the wait time associated with traditional disk I/O operations. The DBA needs to understand how the disk caching works with the database cache or buffers.

Multiterabyte databases and data warehouses are becoming more common these days. Storage hardware vendors are keeping up with large databases by providing larger and more capable storage devices. But for the DBA this growth in storage capacity further increases the complexity of managing data and databases. Understanding how the database grows, what that means in terms of disk storage, and avoiding downtime because of out of space conditions are all important DBA duties that require storage-related knowledge.

Managing storage in today's dynamic environment is a challenging DBA task. Goals to consider while building a storage system for your database include the following:

  • The number-one priority, of course, preventing loss of data
  • Assuring that adequate capacity is available and that the storage solution can easily scale as storage needs grow
  • Selecting a solution that provides fast access to data with minimal, or no interruptions to service
  • Choosing storage solutions that are fault-tolerant and that can be repaired rapidly when a failure occurs
  • Selecting a storage solution where you can add or replace disks without taking an outage
  • Combining all of the above into a cost-effective storage solution that is within a budget your company can afford

To accomplish these goals, DBAs and the storage administrators must cooperate with each other. DBAs need to remember that other types of data that are not stored in the DBMS are stored on disk too. Databases use storage differently than nondatabase data. Indexing, partitioning, clustering, and separation of data can cause the database to require more storage (and across more drives) than most storage administrators anticipate. The DBA will need to clearly explain how much storage is required for the database ... and why. The better the DBA communicates, and the better the relationship is between these two IT professionals, the better your database applications will perform.