The Importance of the Relational System Catalog

Bookmark and Share

The heart of any relational database management system is the system catalog that documents the database objects and system settings being used. The system catalog offers a wealth of information about your DBMS. You can think of it as the knowledge base of every piece of data known to the system. For this reason, it is important that DBAs understand what is in the system catalog, as well as how to access and manipulate the information it contains.

Of course, depending upon the DBMS you use, the system catalog may go by another name. Consult Table 1 below for a summary of the name used by the most popular relational systems.

Table 1: A Catalog by Any Other Name . . .

A relational DBMS requires a system catalog, but it may be called something different depending upon the particular DBMS you are using:

In Oracle it is referred to as the Data Dictionary.

In SQL Server it is referred to as the System Catalog.

In DB2 it is referred to as either the DB2 Catalog or System Catalog.

In MySQL it is referred to as the Information Schema.

You can think of the system catalog as a metadata repository for your databases. But it is an active repository. Unlike independent metadata repositories where much of the data must be manually entered, an active system catalog is built, maintained, and used as the result of the creation of the objects defined to the dictionary. In other words, as the user is utilizing the intrinsic functions of the DBMS, metadata is being accumulated and populated in the active repository that is the system catalog.

When you issue SQL against the database, the system catalog is either updated (for DDL) or accessed (for DML). All the information in the system catalog, however, may not completely up-to-date, and some of the tables for some DBMSs must be proactively populated, for example statistics or connection information. Therefore it is probably better to say that the system catalog is semi-active, instead of completely active.

Nevertheless, for the most part, the system catalog operates as an active data dictionary, particularly with regard to SQL. When a CREATE, DROP, or ALTER statement is issued, information is recorded or updated in the system catalog. For example, upon successfully issuing a CREATE TABLE statement, the DBMS populates the metadata for the table into tables named SYSTABLES and SYSCOLUMNS (or something similar). The same is true for authorizations using the GRANT and REVOKE statements. This will cause information to be added or removed from system catalog tables. Data manipulation language SQL (SELECT, INSERT, UPDATE, MERGE, DELETE) statements use the system catalog to ensure that the statements accurately reference existing database objects (such as column names and data types).

The physical location and setup of the system catalog will have an impact on the overall performance of the DBMS. The DBA must decide where it will be installed, on what type of disk, and how much space to allocate. These decisions typically are made at installation time.

As a rule of thumb, place the system catalog on a separate disk device so that it can be managed and tuned independently from other application data. If possible, consider completely dedicating a disk volume or two to the system catalog. Consider placing the indexes and tables on separate disk volumes. In addition, if the DBMS does not already provide a separate data cache (or buffer pool) for the system catalog, consider isolating the system catalog into its own dedicated data cache. Doing this makes it easier to track the efficiency of system I/O versus application I/O.

When changes need to be made to the system catalog database, utilities such as REORG, COPY, and RECOVER or file system commands need to be used. Changes may need to be made to increase the size of the system catalog, to add a new index, or to migrate to a new release of the DBMS. Usually a migration utility is provided to make system catalog changes.

DBAs should use the system catalog to actively manage their database environment. It is a good practice to actively monitor the database objects in the system catalog and delete obsolete objects. For example, in DB2 for z/OS, IBM delivers sample tables with every new version. The database object names have the DB2 version number embedded within them. Every new release causes new sample tables to be created. The DBA should delete the old sample tables when new ones are installed. This advice applies to all unused database objects, not just to sample tables. If a tablespace exists that is no longer used, it is consuming valuable resources (disk space, system catalog space, etc.) that can be freed up when the database object is dropped.

Although you can directly query the system catalog tables using SQL, many DBMS products provide catalog views to simplify writing such queries. For example, Oracle provides views for DBA usage and dynamic performance information.

The bottom line is that the system catalog contains a cornucopia of valuable system information about your data and DBMS. DBAs must take the time to understand what the system catalog contains, and manage it appropriately in terms of its setup and usage.