Inside Informix V11.5: IBM's smart and powerful OLTP cluster engine

Jun 15, 2008

By Eric Herber

IDS 11.5 - the newest version of Informix Dynamic Server - was officially announced by IBM at the International Informix User Conference in Kansas City in April 2008. The announcement created a tremendous interest from customers, analysts and the press.

IDS 11.5 - code-named "Cheetah 2" - is the successor of version 11.1 - code-named "Cheetah" - which had been released in June 2007. IDS 11.1 provided a whole bunch of new functionality. Based on this foundation, IBM offers with IDS 11.5 a data server, that delivers mainframe availability capabilities on the distributed platform. The term "distributed" needs a little more explanation in this context. It doesn't only mean LUW (Linux, Unix, Windows). IDS 11.5 has been ported to a platform that gains more and more popularity: MAC OS X 10.5, a.k.a., "Leopard." A professional data server available for a fast growing platform that is surrounded by innovative developers - a probably powerful combination.

MACH - The Foundation of IDS 11.5

The predessor IDS 11.1 delivered a high availability and scalability technology called MACH (Multi-Node Active Cluster for High Availability). MACH allows the creation of a database cluster based on IDS. The cluster consists of several IDS instances (nodes), that are capable of accessing the same data concurrently through a shared disk array or maintain a separate, synchronized copy of the data on dedicated disks. Data could be accessed from each cluster node, but only applications connected to the primary node were allowed to write data.

Beside the primary node, MACH clusters could consist of SDS (shared disk server), RSS (remote standalone server) and HDR (high availability data replication) nodes. Those secondary node types are synchronized with the primary to build a continuously available, scalable IDS data store. It doesn't matter if you prefer shared disk access or separated copies of your data - MACH gives you the freedom to choose the approach that fits your needs best. Even a combination of shared disk access and additional synchronized, local copies of the data is supported. Multiple geographically dispersed MACH clusters can even work together and
synchronize data thru ER (enterprise replication).

Write Access on Every Cluster Node

IDS 11.5 provides the next step in MACH technology and supports write access on every node in the cluster. It doesn't matter if the client is connected to the primary or any secondary node type (SDS, HDR or RSS). All nodes can be configured for read and write access and fully utilized by the application. MACH clusters don't require any special type of hardware like fast interconnects. They can be build upon standard hardware components like cost efficient blade servers powered by Linux. This makes IDS 11.5 attractive even for smaller businesses and installations that require high availability and dynamic scalability as well.

In order to enable write access on a secondary cluster node (SDS, HDR or RSS), the onconfig parameter REDIRECTED_WRITES must be enabled by setting it to a value other than zero. This value also specifies the number of network connections (SMX pipes) that will be established by IDS between this secondary and the primary cluster node.

Connecting to a Service

Applications can directly connect to a dedicated cluster node or benefit from the brand new connection manager available with IDS 11.5. The connection manager allows an abstraction of a MACH cluster. The application connects to a service instead of a concrete node. Connecting to a service allows more flexibility regarding the distribution of load and enables automatic failover. The connection manager itself connects to each node, monitors the current workloads and routes incoming client connections to the node with the lowest utilization in the cluster (see Figure 1).

fig1

Figure 1: IDS 11.5 - Database Cluster Architecture

Dynamic Re-Routing Through Connection Manager

The dynamic routing can be further customized by defining service level agreements in the connection manager configuration file. Different types of cluster nodes - Primary, SDS, HDR, RSS - can be combined in any order to build logical connection groups. Depending on the type of workload that an application primarily generates, it can be connected to a logical group that
optimally serves those requirements. For example a write-intensive application might connect to a logical group consisting of the Primary and several SDS nodes. In contrast a reporting application might favor a group with HDR and RSS nodes. The HDR and RSS node types currently support only the "dirty read" isolation level while accessing data. The SDS nodes support, analog to the Primary, "committed read" as well as "committed read last committed." "Cursor Stability" and "Repeatable Read" as well as all kinds of DDL statements are exclusively available on the Primary only.

fig2

Figure 2: OAT - Dynamically adding a new SDS node to the cluster

Automatic Failover Through Connection Manager

The connection manager also acts as an arbitrator responsible for automatic failover. If a node fails, the connection manager will be aware of the failure. It automatically routes incoming client connections to the surviving nodes based on the defined service level agreements. If the Primary fails, the connection manager informs another node to takeover the role. Which secondary node becomes the new Primary can be defined thru a FOC (fail-over configuration) policy. All remaining secondary node types inside the cluster - SDS, HDR and RSS - will automatically reconnect to the new primary node. ER nodes replicating data to the cluster will be reconnected as well.

Multiple connection managers can be configured and started on separate machines which further raises the level of high availability. Those separate connection managers can be combined to a single group in the "sqlhosts" communication configuration file. Applications connecting to this group entry will be routed to the available connection managers in round-robin mode, leaving out failed connection managers. It it important to notice that the connection
manager only reroutes an application to the most appropriate node. After this has been done the application talks directly to that specific node, i.e., IDS instance not relying upon the connection manager anymore.

The connection manager is part of the Informix Client SDK and can be downloaded for free from the IBM Web site. There will be no license cost, regardless of how many instances of the connection manager you are planning to deploy. All connections between clients, the connection manager and the cluster nodes as well as the communication inside the cluster between the nodes can be fully encrypted via SSL.

Version Columns

Additionally, IDS 11.5 contains another interesting feature called "version columns." Version columns can be added to existing tables using the 'alter table <tabname> add vercols' SQL statement. They are "invisible" because they will not be delivered to the application when performing a normal 'select *' SQL query. The application needs to explicitly ask IDS to retrieve those columns by naming them in the select statement: 'select *, ifx_insert_checksum, ifx_row_version from <tabname>'. The 'ifx_insert_checksum' column remains constant during the life of that particular row. If the row is deleted and re-inserted another checksum value will be assigned by IDS. IDS increases the value in column 'ifx_row_version' as soon as a row is updated. Applications - especially in a Web-based environment - can benefit from this new functionality. The version columns enable an efficient way for an application to check if a previous selected row is still valid or has changed in the meantime.

Administration via Browser

OAT - Open Admin Tool - is a Web interface written in PHP and was first released with IDS 11.1. It can be download at no cost. IBM continuously enhances OAT, adding new functionality to make the administration of IDS even smoother. In IDS 11.5, OAT has a new look and feel, menus are restructured and contextual help topics have been added. The configuration, administration and monitoring of a MACH cluster thru OAT has been enhanced. Nodes can be started and stopped, new SDS nodes can be dynamically added to the cluster (see Figure 2). OAT allows the definition of service level agreements as well as failover configurations for the connection manager.

fig3

Figure 3: OAT - Automatic Update Statistics Configuration

In addition to the enhanced administration of MACH clusters, the updated OAT includes:

    * Readonly Group Support
      Instances can be combined in a 'readonly' connection group. Users connecting thru OAT to this group will be able to monitor the members of the group but will not be allowed to perform any administrative tasks.
    * Managing User Privileges
      Database- and table-privileges as well as roles can now be viewed in a clearly arranged form. OAT supports granting and revoking privileges as well.
    * Automated Update Statistics (AUS)
      An automated update statistics task can be setup. The predefined AUS task includes some nice smarts and can be further customized thru OAT. It will be executed on a regular basis by the IDS database scheduler (see Figure 3).
    * Dynamic Update of Configuration Parameters
      Certain IDS configuration parameters can be dynamically updated thru OAT without the need to restart the data server.
    * System Validation
      The integrity of data and extents could be checked.
    * Creating new Tasks
      OAT allows the setup of new tasks that will be executed by the database scheduler.
    * New System Reports
      New predefined system reports have been added to OAT as well as the improved tracking of historical performance data.

Command Line and SQL Admin API

Software vendors, that embed IDS as part of their integrated solution, can utilize the SQL Administration API. It allows the administration of a MACH cluster via a set of SQL API calls that can be easily embedded in the application. (Unix) administrators, that prefer the administration via 'onmode' directly from the command line, can continue to do so. For maximum flexibility a combination of OAT, SQL Administration API and command line is supported.

The following example removes the RSS node 'gepard_rss1' from a MACH cluster:

* Delete RSS node thru 'onmode' onmode -d delete RSS gepard_rss1
* Delete RSS node thru SQL Administration API execute function sysadmin:task("ha rss delete", "gepard_rss1");

fig4

Figure 4: Installation Wizard - Sample Instance Configuration

What else is new in IDS 11.5 ?

Beyond the ability to write data at every node in a MACH cluster and the new connection manager, IBM introduced other new functionality in IDS 11.5. Below you will find a summary of the most notable features:

    * Instance Configuration Wizard
      A new instance configuration wizard allows the setup of a fully functional IDS instance at installation time. The configuration wizard automatically adjusts key server parameters based on user input and creates an optimally tuned configuration for the specific type of workload expected (see Figure 4).
    * Dynamic SQL in Stored Procedures
      The Informix Stored Procedure Language now supports the execution of dynamic SQL. SQL statements can be dynamically constructed during runtime, strengthen the power of stored procedures by giving application developers greater
      flexibility.
    * Improved Standard Configuration File
      The standard configuration file 'onconfig.std' has been completely restructured. For each parameter there is a description included in a comment section above the parameter. Parameters are grouped together in functional areas and have improved default values. The support of the environment variable $INFORMIXDIR in pathname parameters offers more flexibility.
    * Encryption via SSL
      In addition to the traditional Informix network encryption methods, IDS 11.5 supports encryption via SSL. The SSL method is mandatory if encryption between a DRDA client and IDS is needed. Traditional clients can use either SSL or native Informix network encryption. Connections between servers - for example in a MACH cluster - can also be enrypted via SSL.
    * SSO Support
      Single sign-on support in conjunction with Kerberos authentication is new in IDS 11.5.
    * XML Enhancements
      XML documents can be indexed thru BTS (Basic Textsearch Datablade - included in IDS 11.5). In contrast to IDS 11.1, this must not be a full text index, instead XML tag values can be indexed in separate fields either by tag name or by path. This allows for more flexible query options for example thru XPATH expressions. The transformation of XML documents thru XSLT is possible by utilizing the new xsltransform() UDR.

Conclusion

IDS 11.5 is a huge step forward for IBM's distributed database business. It allows the creation of high availability database clusters on cost-efficient hardware. Clusters that offer efficient workload balancing across all nodes and the ability to scale dynamically by adding additional nodes to the cluster without interruption of service. Continuous availability combined with an
impressive simple administration establishes IDS as an excellent alternative to the competition. With the delivery of IDS 11.5, IBM emphasizes its dual database strategy with DB2 as data warehouse and XML data store and IDS as the powerful cluster engine for critical OLTP workloads as well as preferred data server for integrated solutions.

Download the freely available IDS 11.5 Developer Edition for your favored OS platform and convince yourself, that IBM IDS provides real "cool technology" - technology that makes your critical business data available 24x7.

Newsletters

Inside Informix V11.5: IBM's smart and powerful OLTP cluster engine

White Papers

Sponsors