By Santiago Jaramillo
For Oracle database administrators (DBAs), one of the most critical yet routine tasks is managing backups. Recovery Manager (RMAN), Oracle's native tool for backup and recovery, remains the go-to solution due to its robustness and flexibility. However, as database sizes grow and uptime demands increase, optimizing RMAN performance becomes essential.
In this post, we’re sharing technical insights and a real-life use case showing how backup time was reduced from 65 hours to 18 hours using RMAN and Dell’s PowerProtect Data Manager (PPDM). Let's dive into the key concepts and practical adjustments that made it possible.
Understanding RMAN Backup Phases
An RMAN operation involves three main phases where performance bottlenecks may arise:
- Read Phase: Data blocks are read from the source disk.
- Copy Phase: Data is moved and processed between memory buffers.
- Write Phase: Data is written to the final backup destination (disk or tape).
1. Read Phase
To monitor and measure the performance of the backup process, the following SQL query can be used:
SELECT s.inst_id,
a.sid,
client_info AS channel,
a.status,
TO_CHAR(a.open_time,'YYYY/MM/DD HH24:MI:SS') AS open_time,
ROUND(bytes/1024/1024) AS sofar_mb ,
ROUND(total_bytes/1024/1024) AS total_mb,
io_count,
ROUND(bytes/total_bytes*100,2) AS "% Complete" ,
a.type,
ROUND(a.effective_bytes_per_second/1024/1024,1) AS effective_mega_bytes_per_second,
a.filename
FROM gv$backup_async_io a, gv$session s
WHERE
open_time > TO_DATE('&start_date_time_YYYYMMDD_HH24MISS', 'YYYYMMDDHH24MISS')
AND a.STATUS NOT IN ('UNKNOWN')
AND a.sid = s.sid
ORDER BY "% Complete";
To evaluate the effectiveness of the tuning changes, you can execute the following SQL query before and after applying them. This will help determine how much time is being saved in the backup process:
SELECT TO_CHAR(start_time,'YYYY/MM/DD HH24:MI:SS') AS start_time,
TO_CHAR(end_time,'YYYY/MM/DD HH24:MI:SS') AS end_time,
output_device_type,
autobackup_done,
input_type,
input_bytes_display,
output_bytes_display,
time_taken_display,
status
FROM v$rman_backup_job_details
WHERE start_time > (sysdate - INTERVAL '96' HOUR)
ORDER BY end_time;
- Count the number of CPU cores assigned to the database machines.
- Allocate a proper number of channels—generally between 50% and 75% of the core count. For small servers with a low workload, match the number of channels to cores.
- Check the `alert.log` file for messages related to the `LARGE_POOL_SIZE` parameter. Increase it if needed.
- Verify if the OS supports asynchronous disk I/O. If not, set `DBWR_IO_SLAVES` greater than zero.
- Monitor backup job performance via log files and the views `V$BACKUP_SYNC_IO` or `V$BACKUP_ASYNC_IO`.
- Use the `SECTION SIZE` option for bigfile tablespaces to distribute load across channels.
To limit the number of bytes RMAN reads, set the `RATE` parameter in the channel allocation.
Note: Always check the proper syntax. Sometimes commands don’t fail, but behave differently than expected. See Doc ID 2446980.1 for an example.
Do not overallocate channels—monitor overall server workload. Excessive channels may consume too many resources and overload the database server.
2. Copy Phase
Minimize unnecessary compression or encryption unless required. These processes can significantly increase CPU usage.
Be aware of the encryption algorithm and compression level being used, as both can significantly affect CPU consumption. The additional resource overhead depends on:
- The complexity of the encryption algorithm (e.g., AES128, AES192, AES256)
- The selected compression level (basic, low, medium, or high)
- The combination of both features when used together.
Each Oracle database environment has different capacities and workloads during business and off-business hours. Choosing an inappropriate setting may lead to excessive resource usage—potentially increasing CPU consumption by 50% and doubling backup duration (this is an approximate estimation).
Whenever possible, consider leveraging hardware or tape vendor-based compression, which can offer better performance and lower CPU impact.
3. Write Phase
- The same channel and memory parameters from the read phase typically apply here.
- For tape backups, check `BACKUP_TAPE_IO_SLAVES` to configure whether RMAN uses SGA or PGA memory.
- If writing to tape storage, consult with your vendor to:
- Determine limitations in the number of supported RMAN channels.
- Ensure support for Oracle RAC or multi-node configurations.
- Adjust any defaults that may not scale well for large environments.
One of the most effective ways to catch performance issues is by regularly checking and comparing job performance over time. While it may sound simple, tracking metrics from one execution to another can provide valuable insights for DBAs and help define the proper action plan.
All RMAN performance improvements can have a positive impact on various types of jobs, including (but not limited to):
- Backup
- Restoration
- Cloud-based backups
- Database cloning
- Database migration
- Data Guard configuration
RMAN Performance Tuning Real-World Example
The Challenge: A Level 0 (full) backup took 65 hours for a 140 TB Oracle RAC database (19c), running on two servers with 16 CPU cores each.
Vendor Product: Dell PowerProtect Data Manager (PPDM)
Issue: The default configuration only used four channels for the backup operation. After vendor consultation, multi-channel and multi-node support were confirmed.
Solution: An RMAN script was deployed directly on the servers to fully utilize available resources.
rman
CONNECT TARGET /
RUN {
ALLOCATE CHANNEL CH0 TYPE 'SBT_TAPE'
CONNECT='username/password@server1:1521/locallistenerservice as sysbackup'
PARMS 'SBT_LIBRARY=/u01/app/oracle/product/ddrman/lib/libddobk.so,
ENV=(RMAN_AGENT_HOME=/u01/app/oracle/product/ddrman,
SCHEDULED_BACKUP=BACKUP_STANDALONE,
STORAGE_UNIT=storage_value, BACKUP_HOST=ppdm_server)'
FORMAT './PLCTLP-f36ec87c-44c1-46e8-bdf5-d51d210c7564/%U';
ALLOCATE CHANNEL CH1 TYPE 'SBT_TAPE'
CONNECT='username/password@server2:1521/locallistenerservice as sysbackup'
PARMS 'SBT_LIBRARY=/u01/app/oracle/product/ddrman/lib/libddobk.so,
ENV=(RMAN_AGENT_HOME=/u01/app/oracle/product/ddrman,
SCHEDULED_BACKUP=BACKUP_STANDALONE,
STORAGE_UNIT=storage_value, BACKUP_HOST=ppdm_server)'
FORMAT './PLCTLP-f36ec87c-44c1-46e8-bdf5-d51d210c7564/%U';
BACKUP INCREMENTAL LEVEL 0 DATABASE INCLUDE CURRENT CONTROLFILE
FORMAT './PLCTLP-f36ec87c-44c1-46e8-bdf5-d51d210c7564/%U';
RELEASE CHANNEL CH0;
RELEASE CHANNEL CH1;
}
Key Takeaways:
- Eight channels were allocated on each server—50% of the available CPU cores.
- Resources were balanced across both servers.
- Vendor confirmed support for 16 channels against the backup server.
Result: Full backup time reduced from 65 hours to just 18 hours.
Note: The incremental level 1 and archived redo log files backup jobs were also benefited in around 50% compared to the initial time.
Enabling Multi-Channel from the PPDM Console
After a successful validation via script, the same configuration was applied through the PPDM UI:
- Go to Infrastructure > Assets > Oracle.
- Select the Oracle asset.
- Click More Actions > Set Stream Count.
- Set stream counts (1–255) for Full, Cumulative, Differential, and Log backups.
- Click Save.
- Go back to Infrastructure > Assets > Oracle.
- Select the asset and set Cluster Backup Nodes.
- Enter server names, ports, service names, and number of channels.
- Click Apply.
Other Tools with Similar Performance Features
If you're using Oracle RAC and looking for additional tools with support for parallel RMAN backups, consider these options:
- Veritas NetBackup: Supports multiplexed backups and restores.
- Commvault: Offers RMAN multistream and IntelliSnap technology.
- IBM Spectrum Protect: Supports RMAN channel-based scripting.
- Cohesity: Also leverages RMAN channel-based scripting.
Sometimes, DBAs must rely on trial and error, but having a strong starting point makes all the difference.
Explore more DBA resources at Datavail.com.