AWR Warehouse, the Basics and Beyond


The argument for everyone to have an AWR Warehouse as part of their IT infrastructure seems like a simple one to me.  We so often discuss its features, but rarely do we talk about why IT should take advantage of it, where the value lies, or how simple it is to use.

The DBA Should Never Be the Last to Know

Before joining Oracle, I spent a lot of time in meetings, as a DBA and as a consultant, convincing businesses that their DBAs deserved the tools and resources to answer all the questions posed to them, often on short timelines due to project and performance challenges.  Frequently, I'd find a DBA or developer scrambling not only to build a query to gather the data they needed, but also to find a tool that would present that data in a format the stakeholders could understand.

The introduction of the Automatic Workload Repository (AWR), along with its impressive reports, removed some of those challenges.  The new question became: when are database resources available for the DBA to run reports without affecting database availability for users?  The idea of a data warehouse storing performance information is not new, and I'm one of a number of DBAs who have traveled down this path.  To have this challenge taken on by Oracle is impressive, as the result is designed to work with the database and is supportable, something to be built on and enhanced.

AWR Warehouse

AWR Warehouse was created with the sole purpose of offloading AWR data and retaining it indefinitely in a separate repository.  No longer is the DBA or developer left to run reports or query the source database to look at historical performance.  A DBA's primary goal is to protect the database, the entire database and all of the data within it (so help me, God…), meaning that the idea that they would create a performance impact on users is an impossible expectation, making offloading this data a very logical conclusion.

The AWR Warehouse was released in July 2014 after receiving a lot of support from the database community.  Reporting on and querying AWR and ASH data had long been recognized as an intelligent way to answer business and performance questions, along with building data-driven analysis and trending of a company's individual database usage.  With the introduction of the AWR Warehouse, this was taken to a whole new level.

The AWR Warehouse design and configuration are well thought out, and after reading on, I expect many a DBA will start building it out as their next project.

Simple Setup

Once you've created an Oracle Database of a supported version to use for your AWR Warehouse repository, discovered it as a target in Oracle Enterprise Manager Cloud Control, and applied the necessary patches (see Master Doc ID: 19074335.1), go to Databases, AWR Warehouse.  Due to the different requirements and usage, it is also recommended to use a separate database for the AWR Warehouse repository and NOT Oracle Enterprise Manager 12c's Oracle Management Repository (OMR).  The first time it is accessed, the AWR Warehouse wizard will take you through the steps to configure the repository and the ETL loading process, finishing with an Enterprise Manager job that can be viewed via the EM Job Activity page under the naming convention CAW_LOAD_SETUP_*.

AWR Warehouse Configuration

You can view this configuration data via the console by logging into the AWR Warehouse and clicking on the icon.  All high-level data about configuration and space allocation is displayed on the right of the screen:


Figure 1

There is also a handy monitor displaying percentage used, percentage left and warning levels:

Figure 2

This gives the administrator a quick view of how much space is currently allocated to and used by the AWR Warehouse repository.  Thresholds are built in, along with rule sets, to alert at 80% for warning and 95% for critical space usage.  Once a threshold is reached, the AWR Warehouse administrator, or anyone assigned privileges to manage snapshots in the AWR Warehouse, can decide to purge specific snapshots that are no longer required if more space can't be allocated.
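As an illustrative sketch, the same 80%/95% logic the console monitor applies can be approximated by hand in the warehouse repository with a query against the standard usage-metrics view; the thresholds here are simply the ones cited above, and the output is for inspection, not a replacement for the built-in rule sets:

```sql
-- Hedged sketch: reproduce the warning (80%) and critical (95%) space
-- thresholds against the repository's tablespaces for a quick manual check.
SELECT tablespace_name,
       ROUND(used_percent, 1) AS pct_used,
       CASE
         WHEN used_percent >= 95 THEN 'CRITICAL'
         WHEN used_percent >= 80 THEN 'WARNING'
         ELSE 'OK'
       END AS alert_level
  FROM dba_tablespace_usage_metrics
 ORDER BY used_percent DESC;
```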

AWR Warehouse Schema

The schema contains the same objects as the source database.  As the original data was already identified by the DBID of the source database, few changes were required beyond partitioning the objects by DBID, SNAP_ID or a combination of both.  This partitioning comes as part of the limited EE license included with the AWR Warehouse, but keep in mind that this does not allow the DBA or developer to add any partitioning beyond what is included.
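If you're curious how that partitioning landed in your own warehouse, the data dictionary will show it; the query below is an illustrative sketch (the SYS owner and the WRH$ naming pattern for the underlying AWR tables are assumptions to narrow the result set):

```sql
-- Hedged sketch: list the partition key columns (e.g., DBID, SNAP_ID)
-- chosen for the warehouse's AWR history tables.
SELECT name,
       object_type,
       column_name,
       column_position
  FROM dba_part_key_columns
 WHERE owner = 'SYS'            -- assumed owner of the AWR tables
   AND name LIKE 'WRH$%'        -- assumed AWR table naming pattern
 ORDER BY name, column_position;
```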

The benefit of the schema staying as it was in the original source is threefold:

  1. Easier and less resource-intensive ETL loading.
  2. A mature understanding of the performance implications of long-term retention simplifies the partitioning requirements.
  3. Only minor changes are required to make existing AWR/ASH queries AWR Warehouse-compliant.

For those of us with a vested interest in advanced AWR/ASH queries, it is a great benefit that the schema is the same and only a small additional join is needed to update existing queries.
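To make that "small additional join" concrete, here is an illustrative example of a familiar AWR query adapted for the warehouse.  Because the warehouse holds snapshots from many databases, the only real change from the single-database version is carrying DBID through the join and filtering on it (the bind variable is a placeholder for the source database's DBID):

```sql
-- Hedged sketch: DB time by snapshot for one source database in the
-- AWR Warehouse.  The DBID predicates are the only warehouse-specific change.
SELECT s.snap_id,
       s.begin_interval_time,
       t.stat_name,
       t.value
  FROM dba_hist_snapshot s
  JOIN dba_hist_sys_time_model t
    ON t.dbid            = s.dbid          -- additional join condition on DBID
   AND t.snap_id         = s.snap_id
   AND t.instance_number = s.instance_number
 WHERE s.dbid      = :source_dbid          -- identify which source database
   AND t.stat_name = 'DB time'
 ORDER BY s.snap_id;
```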

Simple and Powerful ETL

Data loading processes are often under- or over-designed, yet the one for the AWR Warehouse is simple and clean.  The targets involved, source and destination, were clearly taken into consideration when the ETL jobs were planned out, ensuring as little resource impact as possible.

To upload the AWR snapshots from the target to the AWR Warehouse, an extract is added with a DBMS job as part of a new collection on a regular interval.  There is first a check to verify the snapshot hasn't already been added to the warehouse, and then the extract is identified by DBID to ensure unique information is extracted.  If this is a new addition to the AWR Warehouse, the ETL job runs every three hours and starts from the oldest snapshot first, to ensure that it offloads as much data as possible before it is lost to the source database's retention window.  There is also a "throttling" process that considers the size and amount of data to be uploaded in the ETL process, which helps ensure that there isn't an impact to the source database while the ETL load occurs.
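Conceptually, the duplicate check described above amounts to comparing snapshot IDs for a given DBID on both sides.  The sketch below is only an illustration of that idea, not Oracle's internal mechanism; the database link name is hypothetical and the bind variable stands in for the source database's DBID:

```sql
-- Hedged sketch: snapshots present on the source but not yet loaded
-- into the warehouse, for one source database.
SELECT snap_id
  FROM dba_hist_snapshot@source_db   -- hypothetical database link to the source
 WHERE dbid = :source_dbid
MINUS
SELECT snap_id
  FROM dba_hist_snapshot             -- run in the warehouse repository
 WHERE dbid = :source_dbid;
```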
