Transitioning Big Data Processing for Quicker Decision-Making

Page 4 of 4

Fast access in-memory

With big data volumes, many technology solutions face additional processing bottlenecks such as memory space and network bandwidth. One possibility, however, is to start from the fastest response times achievable, i.e., in-memory access, and then ask whether that can be scaled up. In fact, for many years, technologists have believed that placing data in memory can solve the response-time problem. If, for example, we can carve out a huge memory area in DRAM, then response times can potentially be improved tremendously. So, in figure 7, an in-memory block holds RDBMS and data warehouse data, and interaction, real-time analytics, and record lookup can then be performed very fast, in nanoseconds at best. This would be a huge improvement if we can guarantee reliable and repeatable performance numbers that exceed the current status quo.
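The essence of the idea can be sketched in a few lines of Java, using a plain ConcurrentHashMap as a stand-in for the in-memory block in figure 7 (the record format and keys are purely illustrative): once the data is resident in DRAM, a record lookup is a hash probe rather than a network round trip to a database server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative in-memory record store: warehouse rows loaded into DRAM
// can be looked up with a hash probe instead of a database query.
public class InMemoryStore {
    private final Map<Long, String> records = new ConcurrentHashMap<>();

    // Load a row (e.g. from the RDBMS/warehouse) into memory.
    public void load(long key, String row) {
        records.put(key, row);
    }

    // Record lookup: a direct memory access, no network or disk involved.
    public String lookup(long key) {
        return records.get(key);
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.load(42L, "customer=Acme,region=West");
        System.out.println(store.lookup(42L));
    }
}
```

A real in-memory platform adds replication, eviction, and off-heap storage on top of this basic pattern, but the access path is the same: key in, record out, without leaving the process's memory.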

It is clear that without a big memory space, accessing data over network, server, and database connections is slow, expensive, and rather complicated. With in-memory access, data operations are fast, cost-efficient (since DRAM prices have fallen), and simple: the data is right there in memory. Our enterprise architecture can therefore include an in-memory server (figure 8) that keeps a large portion of the database/data warehouse data resident in memory.

However, housing large data sets reliably in a memory structure becomes tricky. Many current applications use the Java programming language to implement complex business programs, and a known limitation of the JVM is memory management on the node's heap: very large heaps lead to long garbage-collection pauses, which undermine the predictable response times we are after.
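One common workaround (a general JVM technique, not a feature of any product named here) is to keep bulk data off the JVM heap in direct buffers, so the garbage collector never has to scan it. A minimal sketch, assuming records can be serialized to bytes:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: hold a serialized record off the JVM heap in a direct
// ByteBuffer, so it adds no garbage-collection pressure.
public class OffHeapRecord {
    private final ByteBuffer buffer;

    public OffHeapRecord(String row) {
        byte[] bytes = row.getBytes(StandardCharsets.UTF_8);
        // allocateDirect places the bytes outside the garbage-collected heap
        buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes).flip();
    }

    // Deserialize the record back into a (heap-allocated) String on demand.
    public String read() {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Production in-memory platforms typically manage such off-heap regions for the application, trading a small serialization cost on each access for stable pause times on multi-gigabyte data sets.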

The migration of data from databases and warehouses brings up an interesting architectural scenario. Most business data, other than that serving the historical and exploratory analytics use cases, can effectively be managed in memory. An in-memory platform should provide a reliable, highly scalable foundation for this purpose. With easy-access APIs into traditional RDBMSs and MVC frameworks, this transition is highly feasible for existing business applications. Furthermore, APIs must be provided for HDFS connectors for Hadoop processing.
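What makes the transition feasible in practice is that most business applications already isolate data access behind an interface. A hypothetical sketch (the interface and class names are illustrative, not from any specific framework): business logic codes against the interface, so a JDBC-backed implementation can later be swapped for an in-memory one via configuration, without touching the callers.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical data-access interface: business code depends only on
// this, not on whether the store is an RDBMS or an in-memory platform.
interface CustomerRepository {
    Optional<String> findById(long id);
    void save(long id, String record);
}

// In-memory implementation; a JDBC-backed class would implement the
// same interface, so the two are interchangeable to the application.
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, String> data = new ConcurrentHashMap<>();

    public Optional<String> findById(long id) {
        return Optional.ofNullable(data.get(id));
    }

    public void save(long id, String record) {
        data.put(id, record);
    }
}
```

This is exactly the seam that MVC frameworks and dependency injection already encourage, which is why retrofitting an in-memory tier is usually a configuration change rather than a rewrite.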

It is also possible that, for certain use cases, enterprises can completely forgo their databases and warehouses, given the recent move toward NoSQL storage as the "system of record." A huge working data set can be held in memory and periodically flushed to NoSQL storage. A Hadoop process can run batch-oriented jobs for historical and exploratory analysis, and the results of those batch jobs can be re-incorporated into the existing in-memory working set.
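The periodic flush described above is essentially a write-behind pattern. A minimal sketch, with a plain map simulating the NoSQL system of record (in reality this would be a client for a store such as Cassandra or HBase, and the flush would run on a schedule):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Write-behind sketch: the working set lives in memory; dirty entries
// are periodically flushed to the NoSQL "system of record".
public class WriteBehindCache {
    private final Map<String, String> workingSet = new ConcurrentHashMap<>();
    private final Set<String> dirtyKeys = ConcurrentHashMap.newKeySet();
    private final Map<String, String> noSqlStore; // stand-in for the real store

    public WriteBehindCache(Map<String, String> noSqlStore) {
        this.noSqlStore = noSqlStore;
    }

    // Writes land in memory immediately; persistence is deferred.
    public void put(String key, String value) {
        workingSet.put(key, value);
        dirtyKeys.add(key); // remember what must reach the system of record
    }

    public String get(String key) {
        return workingSet.get(key);
    }

    // Called periodically (e.g. by a scheduled task) to persist changes.
    public void flush() {
        for (String key : dirtyKeys) {
            noSqlStore.put(key, workingSet.get(key));
        }
        dirtyKeys.clear();
    }
}
```

Re-incorporating Hadoop batch results is then just another series of put calls into the working set, after which the same flush path carries them back to the system of record.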

About the Author

Bala Chander is a Senior Solutions Architect with Software AG, with a special focus on big data. He has been a consultant, architect, and software engineering lead for various Silicon Valley, CA-based companies for 20 years. Recently, at Motorola Mobility, a Google company, he was involved in big data Hadoop processing of mobile phone data. Previously he was at Yahoo. Bala has an M.S. in Computer Science and a B.Tech in Electrical Engineering. Contact email:
