Q&A with Teradata’s Chris Twogood

Teradata made its fourth acquisition of 2014 in the big data space with the purchase of Rainstor, a privately held company specializing in online big data archiving on Hadoop. Chris Twogood, vice president of products and services marketing at Teradata, spoke with Big Data Quarterly about why the newly added technologies and services are important to Teradata’s big data portfolio.

How has Teradata’s mission changed and what are the goals with the new technologies that have been added?

Chris Twogood: Twenty years ago, Teradata came out with an architecture around an enterprise data warehouse [EDW], and we did very well in the marketplace with the idea of loading data once and using it many times. Now, what we are representing in the marketplace is our Unified Data Architecture [UDA], and that is as important to Teradata and the market as the EDW was 20 years ago. The UDA includes Hadoop as a core component, Aster Data, as well as the data warehouse, all knitted together through glue and software. To enable that knitting together, we have been acquiring companies and adding value-added IP on top of Hadoop.

What do the new technologies add?

CT: Revelytix, our first acquisition, was really all about metadata management, data lineage, and data wrangling, because one of the problems is that people are dumping data into Hadoop but they do not know where the data came from, whether it is ready for consumption, or what its lineage is. And so that helps us solve some key problems in Hadoop. And then we also acquired Hadapt. That is a company that is a pioneer in SQL on top of Hadoop, and in getting data out of Hadoop for more business-type users. That is a big need, especially as a broader set of users seek to have access to data.

And Think Big?

CT: The Think Big Analytics acquisition was all about helping companies make sense of all this complexity of open source to solve unique problems. And, now with Rainstor, we are showing we can add apps on top of Hadoop. Rainstor is basically a big data archive app that sits on top of Hadoop. All of this is focused on building out the value proposition around Hadoop that extends what we deliver within the UDA so I don’t think it changes our mission. Teradata’s mission has always been about helping companies extract value from data. I just think with this big data phenomenon, the architecture extends to more of an analytical ecosystem which requires a broader set of solutions and tools.

Rainstor has three key value-adds. One is extreme compression. Its patented algorithms can compress data by 10x to 40x. Being able to compress data by 40x helps companies reduce the costs of storing all this data. The second piece is that the data is immutable, which means that once you write the data to Rainstor, you cannot change it, and this is important for compliance and security regulations such as Sarbanes-Oxley. With database technology, you can update it and so data can be manipulated. In this environment, it cannot, so it meets all of your archive compliance needs.

And the third?

CT: The third dimension is that it is all accessible by SQL. So it is the compression, the fact that the data is immutable, and the fact that it has SQL. The other thing that is key for us in terms of this acquisition is that now Teradata can go sell an enterprise archive solution. This solution is not just for backing up Teradata databases; we can use this solution for backing up Teradata, Oracle, SQL Server, applications, tape—and so now there are new sales motions that Teradata can engage in around selling Rainstor for enterprise archive needs across a company’s datasets.
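To make the three properties concrete, here is a minimal Python sketch of a write-once archive that stays queryable through SQL, along with the storage arithmetic behind the 10x to 40x figure. It uses SQLite from the standard library as a stand-in and is not Rainstor's implementation; the table name `archive`, the triggers, and the helpers `open_archive`, `is_rejected`, and `compressed_size_tb` are all hypothetical.

```python
import sqlite3


def open_archive():
    """Create an in-memory table whose rows cannot be updated or deleted."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE archive (
            id      INTEGER PRIMARY KEY,
            source  TEXT NOT NULL,   -- hypothetical: originating system
            payload TEXT NOT NULL    -- hypothetical: archived record
        );
        -- Triggers abort any statement that tries to change existing rows.
        CREATE TRIGGER archive_no_update BEFORE UPDATE ON archive
        BEGIN
            SELECT RAISE(ABORT, 'archive rows are immutable');
        END;
        CREATE TRIGGER archive_no_delete BEFORE DELETE ON archive
        BEGIN
            SELECT RAISE(ABORT, 'archive rows are immutable');
        END;
        """
    )
    return conn


def is_rejected(conn, sql):
    """Return True if the archive refuses to run the given statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


def compressed_size_tb(raw_tb, ratio):
    """Storage needed after compression, given a ratio such as 10 or 40."""
    return raw_tb / ratio
```

Under these assumptions, inserts and `SELECT` queries succeed while `UPDATE` and `DELETE` are refused, and 100 TB of raw data would occupy 10 TB at 10x or 2.5 TB at 40x.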

What else is on the horizon?

CT: SQL is a big driver, and search is also coming up as an interesting technology on top of Hadoop. There are lots of competing approaches and technologies. I think in 2015 you will see a bit of a shakeout in the market as far as which of these are starting to gain traction, and I frankly think you will see some of them fall off the grid, and other technologies will take over. 2015 will be an interesting year to see what happens with SQL on Hadoop.

What are some of the challenges you are seeing as customers try to get the most from the greater range of data?

CT: A lot of it is trying to understand how customers include that data in with the rest of their operations so that they can extend their work. For example, if someone is querying the data warehouse in an effort to understand warranty information on electric vehicles as it relates to batteries, they might see an uptick in warranty usage during a certain month. They can analyze that in the warehouse, but if they then want to look at the sensors that are monitoring these batteries to predict a failure, they need to be able to join that data in its raw form with data that is outside of the data warehouse. This has been a challenge, but there are technologies such as Teradata’s QueryGrid that enable users to do that.
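The battery scenario can be sketched in a few lines of Python. This is only an illustration of joining warehouse results with raw data held elsewhere; it does not use or represent QueryGrid. It assumes a hypothetical `warranty_claims` table with `battery_id` and `claim_month` columns in a SQLite database standing in for the warehouse, and raw sensor readings supplied as dictionaries with hypothetical `battery_id` and `temp_c` keys.

```python
import sqlite3


def claims_by_battery(conn, month):
    """Count warranty claims per battery for one month (warehouse side)."""
    rows = conn.execute(
        "SELECT battery_id, COUNT(*) FROM warranty_claims "
        "WHERE claim_month = ? GROUP BY battery_id",
        (month,),
    )
    return dict(rows.fetchall())


def max_temp_by_battery(readings):
    """Reduce raw sensor readings to the peak temperature per battery."""
    peak = {}
    for reading in readings:
        battery_id = reading["battery_id"]
        temp = reading["temp_c"]
        peak[battery_id] = max(peak.get(battery_id, temp), temp)
    return peak


def join_claims_with_sensors(conn, month, readings):
    """Inner-join the claim counts with the sensor peaks on battery_id."""
    claims = claims_by_battery(conn, month)
    peaks = max_temp_by_battery(readings)
    return sorted(
        (battery_id, count, peaks[battery_id])
        for battery_id, count in claims.items()
        if battery_id in peaks
    )
```

Each result row pairs a battery's claim count for the month with its peak sensor temperature, which is the kind of combined view an analyst would need before looking for failure predictors.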

How so?

CT: Teradata plays a very strong role in allowing you to do analytics on data that is really well known or tightly coupled, and data that is more loosely coupled or more unknown, to extract value. That really becomes a big value proposition for our customers and it has been something they have been struggling with—what goes where and how do they integrate these technologies? The market has evolved to the point where there is not a single technology that is going to meet all enterprise data needs. It is an analytical ecosystem that pulls together and enables the real extraction of value from data of all sizes and shapes.

What are the technology components that you see playing a role in a big data scenario?

CT: When we look at this, you have multiple technologies within a single enterprise. For example, 1) core SQL and relational structures for the data warehouse; 2) data streaming technology for more real-time analysis in order to be able to extract some of the exhaust data; 3) operational systems which really tie together some of the NoSQL technologies; 4) big data systems like Hadoop; and 5) discovery capabilities for being able to run different analytic engines including graph, MapReduce, and SQL. It is all about how you tie those things together in a cohesive data fabric that enables users to leverage the power of all of that without being experts. That has really been part of the core DNA of Teradata—to reduce the complexity of these analytical ecosystems. That is a huge part of the continued value proposition that we bring to our customers.

Integration is one of the key objectives?

CT: Yes, but the integration is for making sure that you leverage each technology for what it is good at doing; it is not about tightly stitching everything together. Some of the integration might be very loosely coupled and some might be more tightly coupled. It depends on how often the data is used and how often it is leveraged. That determines how tightly you integrate the technologies. Having them integrated in a way that is transparent to the user is the most important dimension.

Is there a common thread running through projects that are successful?

CT: The common thread has been that the customers have a defined strategy as to what they want to accomplish. This idea—that you just build the technology, throw data in there, and the users will come—doesn’t work. Customers that have a defined strategy become more successful—whether it is to do better predictive parts maintenance, or to do sentiment analysis of how customers view their company in the open market, or behavioral analytics to understand time series data, path and pattern analysis, or social network analysis.

If you could offer one piece of advice to a company initiating a big data project what would that be?

CT: You need to understand your strategy and objectives, and understand that a technology platform is not a strategy. You need to think about whether your company and your culture are ready for big data, and how you are going to operationalize the data. And the biggest piece of advice is to work with a vendor that has experience in your industry. Especially with a lot of the open source technologies, there are a lot of failures and successes, but you don’t want to be on the failure side. Choosing somebody that has experience can help you navigate those choppy waters.


Interview conducted, edited, and condensed by Joyce Wells.

