Data continues growing rapidly, flowing into enterprises from traditional sources as well as new pipelines fueled by web and social media. Often presented in a range of formats and structures, this data onslaught phenomenon has come to be known as "big data." Companies, educational institutions, and government agencies are striving to meet the management challenge of this data deluge as well as mine this wealth of information for business advantage. In this special section, DBTA asks key vendors to explain their strategies for enabling customers to better handle ever-increasing data stores. Susie Siegesmund, Mike Ruane, Ken Dickinson, Doug Leupen, Mark Pick, Lee Burstein, Fred Owen, Doug Owens and Tim Spells share their viewpoints below.
Vice President and General Manager
U2 Brand, Rocket Software
The MultiValue data model was designed from the start to store and retrieve data efficiently. Data is stored in variable-length dynamic arrays with variable-length fields, values and multi-values for maximum efficiency. MultiValue databases can be used to store any type of data of any length, using hashing algorithms to quickly perform direct I/O reads and writes, thus eliminating the requirement for SQL-based access. The use of multivalues and subvalues, either individually or in associations, along with the other characteristics, means that data sets will be significantly smaller than in relational databases and will require significantly fewer resources to store, search and analyze.
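To make the dynamic-array idea concrete, here is a minimal Python sketch of a MultiValue-style record. It assumes the conventional delimiter characters (field mark 254, value mark 253, subvalue mark 252); the `extract` helper and the sample order record are hypothetical illustrations, not part of any vendor's API.

```python
# Conventional MultiValue delimiters (assumed here for illustration).
FM = chr(254)  # field (attribute) mark
VM = chr(253)  # value mark
SM = chr(252)  # subvalue mark


def extract(record: str, field: int, value: int = 0, subvalue: int = 0) -> str:
    """Return part of a record using 1-based positions.

    A position of 0 means "the whole level", loosely mirroring the
    REC<f,v,s> notation used in MultiValue BASIC dialects.
    Out-of-range positions yield an empty string.
    """
    fields = record.split(FM)
    if field < 1 or field > len(fields):
        return ""
    selected = fields[field - 1]
    if value == 0:
        return selected
    values = selected.split(VM)
    if value > len(values):
        return ""
    selected = values[value - 1]
    if subvalue == 0:
        return selected
    subvalues = selected.split(SM)
    return subvalues[subvalue - 1] if subvalue <= len(subvalues) else ""


# Hypothetical order record: a customer name plus two associated
# multivalued fields (part numbers and quantities) in one variable-length string.
order = FM.join([
    "ACME Corp",
    VM.join(["P100", "P200"]),
    VM.join(["3", "12"]),
])

part = extract(order, 2, 2)   # second part number
qty = extract(order, 3, 2)    # its associated quantity
```

Because the line items live inside the order record itself, no separate line-item table or join is needed to read them back.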
Combined with the smaller file size resulting from the use of multivalues and dynamic files, Rocket U2 supports distributed files to disperse large data sets across multiple file systems, which means that the theoretical data size is limited only by the capabilities of the operating system. Data can be distributed on a specific field value, and the separate "part" files can be indexed to allow efficient analysis of all or part of the data. This allows effective management of "big data" while maintaining high performance.
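The routing idea behind distributed files can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Rocket U2's actual interface: the `DistributedFile` class, the `rules` mapping, and the `region` and `status` field names are all hypothetical.

```python
from collections import defaultdict


class DistributedFile:
    """Toy model of one logical file spread across numbered part files."""

    def __init__(self, field, index_field, rules, default_part=0):
        self.field = field              # field whose value picks the part
        self.index_field = index_field  # field indexed within each part
        self.rules = rules              # field value -> part number
        self.default_part = default_part
        self.parts = defaultdict(dict)  # part number -> {key: record}
        # part number -> indexed value -> set of record keys
        self.indexes = defaultdict(lambda: defaultdict(set))

    def part_for(self, record):
        """Choose a part file from the distribution field's value."""
        return self.rules.get(record[self.field], self.default_part)

    def write(self, key, record):
        """Store a new record in its part and update that part's index.

        Updates and deletes are not handled in this sketch.
        """
        part = self.part_for(record)
        self.parts[part][key] = record
        self.indexes[part][record[self.index_field]].add(key)
        return part

    def lookup(self, value, part=None):
        """Find keys by indexed value in one part, or in all parts."""
        targets = [part] if part is not None else sorted(self.parts)
        keys = []
        for p in targets:
            keys.extend(sorted(self.indexes[p].get(value, ())))
        return keys


orders = DistributedFile(field="region", index_field="status",
                         rules={"EAST": 1, "WEST": 2})
orders.write("1001", {"region": "EAST", "status": "OPEN"})
orders.write("1002", {"region": "WEST", "status": "OPEN"})
orders.write("1003", {"region": "EAST", "status": "CLOSED"})
orders.write("1004", {"region": "NORTH", "status": "OPEN"})

east_open = orders.lookup("OPEN", part=1)  # query a single part
all_open = orders.lookup("OPEN")           # query every part
```

Keeping an index per part means a query that only concerns one region touches only that part's index, while a query across the whole file simply visits each part in turn.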
Rocket U2 offers extensive interoperability and integration capabilities. Numerous MultiValue vendors offer native ETL solutions for U2 databases, and U2 interfaces such as our ADO.NET provider allow relational tools such as Microsoft SSIS to pull U2 data into data warehouses. Our External Database Access technology allows a combined view of disparate data sources at the server level for application integration.
With U2 DataVu, our BI tool released in 2010, we offer a business intelligence and reporting solution that natively accesses our databases as well as a wide variety of relational and OLAP data stores. U2 DataVu can combine data from disparate data sources, allowing you to visualize operational data in graphical reports and interactive dashboards with drill-down capabilities. U2 DataVu offers zero-coding, drag-and-drop development with hundreds of analytic and graphical functions so that content authors can easily provide consumable, actionable business intelligence. Rocket U2 provides the total solution: the ability to store very large amounts of data and the ability to do real-time analysis to enable business decisions.
President and CEO
Revelation Software's OpenInsight and OpenInsight for Web (O4W) offer a number of solutions when we talk about big data. First, our own native data store, our linear hash filing system, easily handles millions of records. But for situations where our customers' systems require more storage, they can use other data stores through our connector technology. With data connectors such as our D3, U2, or SQL connector, we provide a bridge between our own data store and other MultiValue and non-MultiValue data stores.
Once these data stores have been attached to OpenInsight via our connector technology, we can make use of our entire tool set to analyze and manipulate data in a very efficient and cost-effective manner. With our connector and EngineServer technologies, we can easily span data over a broad network, distributing the processing load.
Are we part of the NoSQL movement? Depending upon whose definition of NoSQL you use, we either are or aren't, but in general I'd say we are a NoSQL database, as we do not need to normalize our data and we fit the ideas of key-value pairings and data sharding pretty well. Our development tools target both the experienced developer and the less tech-savvy user, as seen in our O4W product. Users can create browser-based applications using just wizards, while developers can augment the wizard output with Basic code.
MultiValue databases are extremely fast transactional environments and support many types of software applications, such as education, manufacturing and accounting. For the "big data challenge," Kore believes the continued use of MultiValue databases for data collection is ideal, as these embedded database environments are easily adaptable to changing application needs. However, when it comes to analytical and statistical reporting, Kore believes, as most business intelligence (BI) experts do, that it is best to move the data off the transactional system and optimize the data to better serve the organization's reporting needs. Kore recommends using a relational database like Microsoft SQL Server as this centralized repository. This enables data to be easily gathered, transformed, and federated into meaningful business metrics and key performance indicators. Key advantages include faster and easier data access with more choices for reporting and analysis tools.
The corporate challenge is integrating MultiValue with other disparate data into a common repository and then using business intelligence and analytics tools to produce actionable information. Therefore, Kore has been developing the next generation of Kourier Integrator, our integration tool suite designed specifically for MultiValue. Kourier Integrator has helped hundreds of companies integrate their MultiValue data with third-party applications and with Microsoft SQL Server to build robust data warehouses and business intelligence solutions, with near real-time refresh rates.
The new Enterprise Edition of Kourier Integrator will have an architecture that supports virtually any database source or application API. This will provide more speed, flexibility, and scalability for even the most complex BI or integration projects. Companies with an environment consisting of multiple heterogeneous databases and applications will be able to leverage the new technology as a single solution for handling all of their integration requirements.
Although Kore supports the use of any BI solution, Kore's partnership with Rocket Software and their CorVu BI suite allows us to recommend a BI reporting platform that is powerful and cost-effective. In addition to data integration, Kore can now develop reports and dashboards needed by our clients.
President and CEO
Organizational data volumes, from both internal and external sources, are growing rapidly. In today's "information world," organizations know that to grow revenue and profits they need to harness this information both to enable intelligent strategic decisions and to provide up-to-the-minute operational information to their front-line managers and staff. The current mix of operational spreadsheets, departmental reports, and custom applications offers limited, often conflicting views of the data. Lacking coherence in reporting, organizations struggle to enable employees to act on information and collaborate effectively with others.
Informer is a browser-based operational business intelligence (BI) solution that provides native access to multiple data sources (MV and SQL) with or without a data warehouse in place. Unlike many other BI solutions that require a separate data store to report off multiple databases, Informer leverages a powerful metadata model that creates consistency among disparate data descriptions and structures to provide a single point of secure information access. Informer offers powerful analytical and graphical capabilities to enable technical and business users to slice and dice data on the fly according to their own specific needs. Additionally with its Java-based calculated columns feature and SDK (software development kit), Informer can quickly be turned into a dynamic information resource center that becomes an important tool to link across organizational systems and disparate data sources.
Most importantly, Informer is an end-user intuitive tool that distributes reporting to the people who need it, freeing the IT staff to focus on other critical technical projects. And with fast, easy implementation and predictable cost structure, Informer delivers a quick ROI.
Vice President Reality
MultiValue has always been efficient in handling large data volumes. Since there's no fixed record size, as there is with traditional RDBMSs, MultiValue data just grows. We also align with the emerging NoSQL movement, which is widely accepted to be more efficient and cost-effective than relational databases.
Typically, MultiValue databases suffer from working with fixed file sizes, with performance curtailed if files aren't properly sized and the data grows too large. Reality resizes files in real time, with little overhead, while maintaining full access to your data.
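How a hashed file can grow without a full offline resize can be illustrated with linear hashing, a well-known general technique that splits one bucket at a time. The Python sketch below is not a description of Reality's internals, which are not detailed here; the `LinearHashFile` class, its parameters, and the use of CRC32 as the hash function are assumptions made for illustration.

```python
import zlib


class LinearHashFile:
    """Minimal in-memory linear hashing: grows one bucket per split."""

    def __init__(self, initial_buckets=4, max_load=2.0):
        self.n0 = initial_buckets   # bucket count at level 0
        self.level = 0              # number of completed doubling rounds
        self.split = 0              # next bucket due to be split
        self.max_load = max_load    # average records per bucket before a split
        self.buckets = [dict() for _ in range(initial_buckets)]
        self.count = 0

    @staticmethod
    def _hash(key):
        # CRC32 gives a stable hash across runs (illustrative choice).
        return zlib.crc32(str(key).encode("utf-8"))

    def _address(self, key):
        h = self._hash(key)
        size = self.n0 << self.level
        addr = h % size
        if addr < self.split:       # bucket already split in this round
            addr = h % (size * 2)
        return addr

    def _split_one(self):
        size = self.n0 << self.level
        old = self.buckets[self.split]
        self.buckets.append({})     # new bucket lands at index split + size
        keep = {}
        for k, v in old.items():
            if self._hash(k) % (size * 2) == self.split:
                keep[k] = v
            else:
                self.buckets[self.split + size][k] = v
        self.buckets[self.split] = keep
        self.split += 1
        if self.split == size:      # round complete: file has doubled
            self.level += 1
            self.split = 0

    def put(self, key, value):
        bucket = self.buckets[self._address(key)]
        if key not in bucket:
            self.count += 1
        bucket[key] = value
        # Split incrementally until the average load is back under the limit.
        while self.count / len(self.buckets) > self.max_load:
            self._split_one()

    def get(self, key, default=None):
        return self.buckets[self._address(key)].get(key, default)


f = LinearHashFile()
for i in range(1000):
    f.put(f"REC{i}", i)
```

Each insert triggers at most a small amount of redistribution work, so records stay readable and writable throughout the growth of the file, which is the property the paragraph above describes.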
Reality boasts some of the largest MultiValue sites in the world and is inherently scalable. From a few users to 10,000-plus, the performance of Reality is the same when running on scalable host systems. Reality's multiplatform design ensures low cost of ownership through efficient use of hardware resources.
Integrating disparate systems is easy with Reality as you can create a "view" to any RDBMS database. Once created, you have both read and write access to that data, making data warehousing a breeze.
Reality is one of the most resilient databases, with fault-tolerant capabilities built in to protect against server failure and encryption to address security concerns. This is coupled with the "Fast Backup" capability, which can back up a database in a fraction of the time of a traditional MultiValue save while the database remains accessible as the backup completes.
MultiValue Product Manager
As a provider of advanced technologies for breakthrough applications, InterSystems has long recognized the challenges of "big data" as well as their commonality across all markets, including the MultiValue arena. The functionality to effectively manage the constant, growing big data blitz has been addressed at a core level in all of the components of the InterSystems technology stack. The seamless integration of MultiValue capabilities with core engine sectors enables MultiValue applications to leverage those features throughout the entire technology stack.
High volumes of data can be inserted into the InterSystems Caché database from a wide range of sources including other computer systems, web pages, and multiple mobile and stationary data collection devices. The information can be persisted in Caché and performance is extremely rapid. For example, a front-end Java application can stream data into Caché at over 100,000 inserts per second.
Using Ensemble, our rapid development and integration platform, data from disparate sources can be transformed, when necessary, combined with data gathered from other systems, and stored in a consistent structure in alternative data repositories. And, storage is not restricted to the capacity of a single server. InterSystems Enterprise Cache Protocol allows for multiple application servers (distributing the workload) and multiple database servers (distributing the data load).
Finally, InterSystems DeepSee business intelligence software enables building analytical data cubes accessed via a web-based dashboard. And, DeepSee indexes can be automatically updated by data collection applications, rendering new information immediately available to the dashboard.
MITS-Management Information Tools, Inc.
The combination of exploding data volumes and increasing business complexity results in a greater need to provide meaningful summary metrics to the people called upon to lead their organizations. Our industry is in the infancy of using advanced analytics to compete in an increasingly complex, competitive world.
MITS Discover utilizes the flexibility of the MultiValue database to provide a rich, compelling end-user experience. It is commonly accepted that this database technology excels at the complex, rules-based data management necessary to support the core operation of the organization. It is less common to see this capability applied to data analysis initiatives. With MITS Discover, we provide a high-level dashboard view of the status of an organization, consolidating data from the central database and other outboard systems such as phone records, web visits, or competitors' published pricing, as well as spreadsheet data containing projections or targets. With the power and flexibility of a subroutine call, that information can be combined with complex rules to create a metric that is a launching point for detailed drill-down analysis.
In the realm of ad hoc reporting, the challenges of big data are different: how to provide high-performance access to larger data sets without requiring database query experts. With MITS Report, we offer a reporting solution that can provide nearly instantaneous ad hoc access to data from a variety of sources in a simple, browser-based environment suitable for nontechnical users.
Vice President, Sales
Ashwood Computer, Inc.
Ashwood's solutions for our customers' "big data" challenges have included combinations of systems hardware, software and our professional services. Most of our solutions address four basic needs that we have found our customers share: compliance, data retention, data availability, and data security.
Having successfully addressed big data challenges with some of our largest customers, we continue to build on the experience gained managing those projects as we assist new MultiValue database sites across the country with similar efforts.
We have found that MultiValue databases are excellent tools for managing big data and that, when hosted on performance-optimized systems hardware, our solutions can also be used to cost-effectively address our customers' high availability and business continuity needs.
An authorized reseller of enterprise-level systems from IBM, Oracle Sun, and HP, Ashwood has extensive sales, support and systems integration experience. We provide very powerful, robust and reliable Unix, Linux and Windows-based systems with the data storage capacity to address each customer's specific data needs.
The early development teams designed MV databases to also perform as operating systems, and as such, they were engineered to make efficient, effective use of very limited systems resources. At that time, computers and systems resources such as processor, memory, and disk were limited in supply and extremely costly.
Now fast forward to 2011. We are configuring 8-way 4GHz 64-bit processors with more on-board cache memory than most of those early computers could handle. We are utilizing up to hundreds of gigabytes of the fastest error-checking memory available, and we are configuring a new disk drive technology to obtain between 75,000 and 120,000 input/output operations per second. We then add a current version of a MultiValue database and tune the system and database until we obtain optimum performance.
Next, depending upon the "big data" challenge in question, we utilize tools like mv.NET, mv.SSIS, DesignBais, and O4W. With these, there is very little that we cannot accomplish for our customers.
Users are becoming more sophisticated and demand productivity tools that access information residing in disparate systems, including MultiValue and relational databases, through one common user interface. Business applications that have been successfully developed, deployed and maintained for many years have one thing in common: they leverage the intrinsic capabilities of their data models.
Developing software applications loosely coupled from the database engine imposes several restrictions that become apparent when performance and scalability are required. The farther the application is from the data, the more complex integration becomes, which results in poor performance and end-user dissatisfaction.
The Onsystex Application Server suite offers an agile approach to business application development where data integration across non-SQL and SQL databases is demanded. The new wave of software applications built for the cloud in SaaS offerings requires highly scalable and flexible environments with the ability to service thousands of users. These requirements alone can be very expensive. With Onsystex Application Server, SaaS applications for the cloud can be developed and deployed easily and cost-effectively.
The Onsystex Application Server suite makes use of the best capabilities of the multidimensional data model and of software application development technologies such as Microsoft .NET Framework and Java. The suite allows software developers to build applications quickly against a high-performance, customizable application server platform with built-in MultiValue and relational data model support, resulting in seamless coexistence, ease of use, and standards-based data access.