When a failure occurs, the DBA must ascertain whether recovery is required. It is possible, though not very likely for active databases, that a failure does not impact the integrity of your data. Assuming that recovery is required, the DBA will need to determine what resources (backup copies and log files) are available and how best to perform the needed database recovery. Several questions must be answered to figure out the type and extent of failure. The answers will dictate the steps you take to recover the system.
- What type of failure occurred: media, transaction, or database instance?
- What is the cause of the failure?
- How did the database go down: abort, crash, normal shutdown?
- Did any operating system errors occur?
- Was the server rebooted?
- Are there any errors in the logs: operating system log, alert log, database logs?
- Was a dump produced?
- Were any trace files generated?
- How critical is the lost data?
- Have you attempted any kind of recovery so far? If so, what steps have already been performed?
- What types of backups exist: full, incremental, both?
- What needs to be recovered: the full database, a tablespace, a single table, an index, or combinations thereof?
- Does your backup strategy support the type of recovery required (recover-to-current vs. point-in-time)?
- If you have cold backups, how was the database shut down when the cold backups were taken?
- Are all of the archived database logs available for recovery?
- Do you have a recent logical backup (EXPORT or UNLOAD)?
- What concurrent activities were running when the system crashed?
- Can you bring the DBMS instance up?
- Can you access the database objects?
- What are your system availability requirements?
- How much data must be recovered?
Additionally, DBMS version migration can impact recoverability. Perhaps, you have migrated to a new version of the DBMS since the last backup was taken, which can cause problems. Sometimes DBMS vendors change the format of data files, image copy backup files or log files, rendering any recovery using the old format impossible or problematic. Perhaps, the new version changed the functionality of the RECOVER utility, such that it could not read older backup files. In either case, the DBA is stuck because there is no valid backup that can be used for recovery. For these reasons, be sure to investigate the particulars of backup and recovery for each new DBMS version before migration, and take new image copy backups after migration if the old backups will be unusable.
Of course, these are just some of the questions that the DBA must be prepared to answer to effectively recover database objects. Additionally, the DBA will need to understand all DBMS-specific details—for example, information on the Oracle rollback segments or how to identify a specific log RBA (relative byte address) for DB2 recovery. When using Microsoft SQL Server, the DBA can select from several recovery models to use for database implementations.
General Steps for Database Object Recovery
At a high level, the following steps are common to most database recoveries:
- Identify the failure. The detection of an outage is usually simple: either the database is not responding to the application or the DBMS has displayed some type of error message. Some problems are more insidious, though, such as a corrupt control file. This type of problem take more skill to identify.
- Analyze the situation. The DBA must analyze the error to determine the cause, type, and scope of the failure. Based on the results of this analysis, the DBA will choose a recovery method. This is usually the most time-consuming recovery task.
- Determine what needs to be recovered. The DBA must determine which database objects (and perhaps other components such as logs) are failing and prepare a recovery script that is appropriate for each component. This task can also consume a significant amount of time, especially for larger systems.
- Identify dependencies between the database objects to be recovered. The failure of one database object can impact other database objects (e.g., indexes and referentially related tables). Loss of data or recovery to a prior point-in-time will most likely affect related database objects.
- Locate the required image copy backup(s). The closer the image copy backup is to the recovery point in time, the shorter amount of time it will take to recover. Keep in mind other factors such as the time it takes to find tapes in the library and the possibility of the tape being located at an off-site location.
- Restore the image copy backup(s). Restoration is accomplished using the database recovery utility or file system recovery command of choice
- Roll forward through the database log(s). To recover to current or to a point in time after the image copy backup was taken, the database logs will need to be processed.
At the very basic level, every database recovery will involve most of these seven steps. However, depending on the situation and the type of recovery requested, certain steps might be eliminated or changed significantly.