Delivering on Promise, Pivotal Open Sources Key Component of Its Big Data Suite

Pivotal has proposed “Project Geode” for incubation by the Apache Software Foundation (ASF). A distributed in-memory database, Geode will be the open source core of Pivotal GemFire and is now available for review. Pivotal plans to contribute to, support, and help build the Project Geode community while simultaneously producing its commercial distribution of Pivotal GemFire.

Geode consists of more than 1 million lines of code developed over 12 years by Pivotal and its predecessors. “Donating so much R&D is actually pretty unique in the industry. This doesn’t happen every day,” said Roman Shaposhnik, director of open source at Pivotal. “On top of that, we are not just donating it as a random GitHub project; we are not just opening it up and dumping source code on everybody. We are actually proposing GemFire as an Apache Software Foundation Project for incubation.”

Open Source Reference Implementations 

Open source reference implementations have essentially taken the place of the standards developed by standards committees in the 1980s and 1990s, Shaposhnik contends. The problem with standards committees is that producing a standard typically takes 3 to 5 years. Now, he said, “instead of locking people in a room for 5 years and expecting them to come up with a paper standard,” the industry collaborates on a common core, and that core becomes the de facto reference standard.

“The Linux kernel was basically made that way. CloudFoundry is being made that way, and we surely hope that this will follow in the footsteps of these great projects,” said Shaposhnik.


Like Pivotal GemFire, Geode is a distributed in-memory database for high scale custom applications. It provides in-memory access for all operational data spread across hundreds of nodes with a “shared nothing” architecture, enabling low latency data access to applications at massive scale with many concurrent transactions involving terabytes of operational data.
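To make the “shared nothing” idea concrete, here is a minimal sketch in Python. It is not Geode's API or implementation; the `Node` and `Cluster` classes, the node names, and the hash-based routing are all assumptions chosen for illustration. Each node keeps its own private in-memory store, and a stable hash of the key decides which node owns an entry, so no state is shared between nodes.

```python
import hashlib


class Node:
    """Hypothetical cluster member with its own private in-memory store."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # owned by this node only; nothing is shared


class Cluster:
    """Hypothetical router that sends each key to exactly one owning node."""

    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def _owner(self, key):
        # A stable hash means every caller routes a given key to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._owner(key).store[key] = value

    def get(self, key):
        return self._owner(key).store.get(key)


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.put("order:1001", {"symbol": "ACME", "qty": 50})
print(cluster.get("order:1001"))
```

A production system would also need replication, rebalancing when nodes join or leave, and failure handling, none of which this sketch attempts.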

Cross-Industry Collaboration

According to Shaposhnik, the move is a signal to the industry that Pivotal is committed to cross-industry collaboration on the GemFire open source technology. “This will enable different companies to basically collaborate on common code and then even compete with each other. Today, this is exactly the strategy that Pivotal has with CloudFoundry. We have the CloudFoundry Foundation and that is the entity that is managing cross-industry collaboration.” Shaposhnik said the project has not yet been accepted, and still has to get the blessing of the Apache incubator forum.

In addition, he noted, given how important the relationship with ASF is to Pivotal, the company is also announcing an increased level of sponsorship for the foundation, moving to platinum sponsorship. ASF now plays a critical role in the life of open source projects, said Shaposhnik. “ASF is in the business of guaranteeing that the communities that you build around your open source project will be viable, self sustained, openly governed communities that won’t disappear in a matter of days.”

Why ASF Matters

He pointed to the moment that the Heartbleed bug came to light and it was broadly understood that the support for OpenSSL, a key piece of technology, had long rested on one person’s shoulders. “ASF is precisely in the business of making sure this never happens to any piece of software that belongs to the Apache Software Foundation. The motto is ‘community over code’ because the belief is that a long-term, sustainable community can produce great software, but even if you start with great software and no community you are doomed.”

Designed for maintaining consistency of concurrent operations across its distributed data nodes, Geode can support ACID transactions for massively scaled applications such as stock trading, financial payments and ticket sales, already proven in customer deployments of more than 10 million user transactions a day. Originally developed to serve data for mission-critical applications in the financial industry, Geode offers built-in fail-over and resilient self-healing clusters to allow developers to meet the most stringent service level requirements for data accessibility.

More Open Source Components to Come

Geode represents the first step in open sourcing core components of Pivotal’s Big Data Suite. Key code in Pivotal HAWQ and Pivotal Greenplum Database will be made available to the open source community later this year.

Meanwhile, Pivotal GemFire will continue to feature enterprise-level support and maintenance as well as proprietary features available only from Pivotal, including native clients, continuous queries, and WAN connectivity between clusters.



Related Articles

Hadoop heavyweight Pivotal is open sourcing components of its Big Data Suite, including Pivotal HD, HAWQ, Greenplum Database, and GemFire; forming the Open Data Platform (ODP), a new industry foundation along with founding members GE, Hortonworks, IBM, Infosys, SAS, and other big data leaders; and forging a new business and technology partnership with Hortonworks.

Posted February 17, 2015

