
Moving Beyond Relational With Data Integration

Another significant trend impacting data integration is the advent of NoSQL database systems. At a high level, NoSQL implies non-relational, distributed, flexible, and scalable. Many are also open source. Additionally, some common attributes of NoSQL DBMSs include the lack of a schema, data clustering, replication support, and an “eventually consistent” capability (instead of the typical ACID transaction capability).
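
To make the “eventually consistent” idea concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not how any particular NoSQL product replicates data: the class names, the single-threaded `sync` step, and the `cart:42` key are all hypothetical.

```python
class Replica:
    """One node holding a copy of the data (hypothetical, in-memory)."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: a write is acknowledged before every replica has it."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # replication work that has not been applied yet

    def write(self, key, value):
        # Update one replica immediately and queue the update for the others.
        self.replicas[0].data[key] = value
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, value))

    def read(self, key, replica_index):
        # A read is served by a single replica, which may be behind.
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Apply the queued updates; afterwards all replicas agree.
        for replica, key, value in self.pending:
            replica.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("cart:42", ["book"])
stale = store.read("cart:42", replica_index=2)  # replica 2 has not caught up
store.sync()
fresh = store.read("cart:42", replica_index=2)  # replica 2 now has the value
```

An ACID-style system would not let the first read observe the missing value; the trade-off here is that the write returns without waiting for every replica.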

There are four popular types of NoSQL database offerings: document stores, column stores, key/value stores, and graph databases. Each offers a different way of storing and accessing data. The typical use case for a NoSQL database is to support web and mobile applications whose requirements are difficult to meet, or can be met only with sub-optimal performance, using a traditional relational database.
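
The difference between the four types is easiest to see by shaping the same fact four ways. The Python sketch below uses plain dictionaries and lists as stand-ins; the identifiers (`cust:1`, `sku:A1`, `ORDERED`) are made up for illustration, and real products add indexing, distribution, and query languages on top of these shapes.

```python
# The same customer fact in four simplified, hypothetical shapes.

# Document store: one self-contained, nested document per entity.
document = {"_id": "cust:1", "name": "Ada", "orders": [{"sku": "A1", "qty": 2}]}

# Key/value store: an opaque value looked up by a single key.
key_value = {"cust:1:name": "Ada"}

# Column store (wide-column style): row key -> column family -> columns.
column_family = {"cust:1": {"profile": {"name": "Ada"}, "orders": {"A1": 2}}}

# Graph database: nodes plus labeled edges between them.
nodes = {"cust:1": {"name": "Ada"}, "sku:A1": {"title": "Widget"}}
edges = [("cust:1", "ORDERED", "sku:A1")]


def ordered_skus(customer_id):
    """Follow ORDERED edges outward from a customer node."""
    return [dst for src, label, dst in edges
            if src == customer_id and label == "ORDERED"]


skus = ordered_skus("cust:1")
```

Each shape favors a different access pattern: fetching a whole entity, looking up a value by key, reading a few columns across many rows, or traversing relationships.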

This brings us to the term “polyglot persistence,” which in the NoSQL community means using different types of databases for different applications and use cases, depending on their needs. When an organization embraces polyglot persistence, more types of data and databases must participate in data integration.

Finally, we have NewSQL, a newer type of relational/SQL DBMS designed to compete with NoSQL. Applications that run a large number of short, repetitive transactions, each accessing a small amount of indexed data, can benefit from NewSQL. The goal of NewSQL is to deliver high availability and performance for modern workloads without sacrificing the strong consistency and transaction capabilities that NoSQL frequently skimps on.

Impact of Data Trends on Data Integration

The most significant impact on data integration is the requirement to support more than just traditional relational DBMS sources and targets. And let’s face it, there are many data integration tools that fall short of being able to support the top five or six leading relational DBMSs. Does your vendor offer a single tool that can integrate data to and from Oracle, Microsoft SQL Server, IBM Db2, Sybase ASE, MySQL, and PostgreSQL? Now add to that list Hadoop, MongoDB, Cassandra, Redis, HBase, Neo4j, and the task gets even more difficult.
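
One common way to think about supporting many kinds of sources is a shared interface with one adapter per source type. The Python sketch below is an assumption-laden illustration, not a description of any vendor's tool: `Source`, `RelationalSource`, `DocumentSource`, and `integrate` are hypothetical names, SQLite stands in for a relational DBMS, and a list of dictionaries stands in for a document store.

```python
import sqlite3
from typing import Iterable, Iterator, Protocol


class Source(Protocol):
    """Anything that can hand back its rows or documents as dictionaries."""

    def records(self) -> Iterator[dict]: ...


class RelationalSource:
    """Adapter for a SQL table (SQLite used here as a stand-in)."""

    def __init__(self, conn: sqlite3.Connection, table: str):
        self.conn, self.table = conn, table

    def records(self) -> Iterator[dict]:
        # The table name is assumed to be trusted configuration, not user input.
        cursor = self.conn.execute(f"SELECT * FROM {self.table}")
        columns = [col[0] for col in cursor.description]
        for row in cursor:
            yield dict(zip(columns, row))


class DocumentSource:
    """Adapter for schemaless documents (a plain list stands in for the store)."""

    def __init__(self, documents: Iterable[dict]):
        self.documents = documents

    def records(self) -> Iterator[dict]:
        for doc in self.documents:
            yield dict(doc)  # shallow copy so callers cannot mutate the source


def integrate(sources: Iterable[Source]) -> list[dict]:
    """Pull every record from every source into one list."""
    return [record for source in sources for record in source.records()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
docs = [{"id": 2, "name": "Grace", "tags": ["vip"]}]

merged = integrate([RelationalSource(conn, "customers"), DocumentSource(docs)])
```

Note that the merged records do not share a schema: the document carries a `tags` field the table lacks. Reconciling such differences is where much of the real integration work lies.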

Along with this comes the need to support not just structured data but also unstructured data. NoSQL databases and Hadoop are often used for storing and managing non-traditional data. Unstructured data may be images, audio, or video, but it can also be large text documents and emails. The manner in which this data is stored, queried, and modified differs in many significant ways from that of structured data. This means that data integration tools need to be engineered to handle both structured and unstructured data appropriately.

Many organizations have adopted dashboards that can access data where it resides, whether in an RDBMS, in a NoSQL DBMS, or on a Hadoop platform. Such technology can remove the requirement to move the data, but it can introduce other issues, such as performance inefficiencies, when transactional and analytical processes are run against the same data. Running both OLTP and OLAP against the same database is known as hybrid transaction/analytical processing, or HTAP.

There are several approaches taken by different DBMS vendors to tackle HTAP without causing performance degradation. Let’s look at two disparate approaches. IBM Db2 for z/OS uses an accelerator appliance for analytical workloads that shuttles appropriate queries to the accelerator without requiring application changes. A different approach is taken by NuoDB, which separates transaction processing from data durability. This enables users to spin up separate transaction engines for OLAP and OLTP in NuoDB.
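
Both approaches share one idea: analytical work is sent somewhere other than the engine handling transactions, without the application having to choose. The Python sketch below illustrates only that routing idea. The keyword heuristic and the engine names are hypothetical, and this is not how Db2 or NuoDB actually decides where a query runs.

```python
import re

# Crude, illustrative signal that a statement is analytical in nature.
ANALYTICAL_HINTS = re.compile(r"\b(GROUP\s+BY|SUM|AVG|COUNT)\b", re.IGNORECASE)


def route(sql: str) -> str:
    """Pick a (hypothetical) engine name for a SQL statement."""
    if ANALYTICAL_HINTS.search(sql):
        return "analytics_engine"
    return "transaction_engine"


report = route("SELECT region, SUM(total) FROM orders GROUP BY region")
update = route("UPDATE orders SET status = 'paid' WHERE id = 7")
```

A production router would rely on optimizer cost estimates rather than keywords, but the effect is the same: the short update stays on the transactional path while the aggregate query is diverted.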

The Changing World of Data Analysis and Analytics

It is not just how and where we store data that is changing; the way in which we access and analyze the data is also shifting. While typical business intelligence applications query historical data with relatively simple SQL, modern applications with advanced analytics capabilities can forecast future performance and events using sometimes-complex predictive models coupled with data mining, data visualization, and artificial intelligence.

