While the influx of data provides many benefits to an organization, it also brings challenges. The volume of data and the number of data sources drive up costs and limit elasticity, and businesses are finding both difficult to manage. In a recent webcast, three Hadoop and cloud experts, Merv Adrian, Gartner research VP; Shaun Connolly, Hortonworks VP of strategy; and Lance Olson, Microsoft GM of platform and cloud, discussed key technology solutions such as Apache Hadoop and the cloud, along with emerging best practices for meeting today’s tough data challenges.
All three agreed that Hadoop is on the rise among companies dealing with big data. According to a Gartner survey, 44% of respondents have either installed Hadoop or plan to deploy it within two years. Initially Hadoop was used for basic ETL functions, but over time it has proved to have many more uses.
“Grabbing big data is where we started, but over the last few years, new use cases have evolved such as data reservoir creation, machine learning, and enriching the data and so on,” stated Adrian. Companies evaluating Hadoop must take management, automation, and governance into account.
Another aspect that companies are paying more attention to today is security. Hadoop security has begun to mature: data can be protected through encryption, and third parties increasingly support Hadoop distributions.
A business deploying Hadoop has three options: on-premises, the cloud, or a hybrid approach. One benefit of cloud deployment is that a business pays only for what it needs. Hybrid deployment is also attractive because it can provide some of the best of both worlds, although it raises concerns of its own, such as moving data back and forth between environments. Overall, Adrian recommends, “start with small achievable projects; confirm a team; and integrate with your data center specialists.”
Olson noted a few use cases for hybrid architecture: development testing, IoT applications, and business analytics and machine learning. “Doing the testing and piloting in the cloud makes a lot of sense. It allows users to change the hardware configurations and that lets you optimize the hardware for the workload,” he said.
But the question is not really where data should be, said Connolly. “The bottom line is it is not necessarily an ‘up there’ or ‘down here’ question. The reality is you want to have the ability to deploy everywhere.”
To view a replay of this webinar, go here.