An Introduction to Hadoop at Data Summit 2018

Hadoop has forever changed the economics and dynamics of large-scale computing, and its use among enterprises looking to augment their traditional data warehouses continues to grow.

During the first day of Data Summit, May 21, 2018, Marco Vasquez, senior technical director at MapR, held a pre-conference workshop exploring the basics of Hadoop.

“Hadoop doesn’t move the data, it brings a function to the data,” Vasquez said. “You don’t need a lot of clusters and infrastructure to get going.”

Hadoop assumes data is independent, and there are a variety of strategies for organizing data that can propel businesses forward, Vasquez explained.

“For MapR, Hadoop really means there’s a whole ecosystem,” Vasquez said.

The core of Hadoop is HDFS and MapReduce, he explained. MapR’s approach to Hadoop stands apart from competitors because of its file system components. The file system dictates how data is stored and retrieved.
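
For readers new to the model, the idea of bringing a function to the data can be illustrated with a minimal word-count sketch in plain Python. This is not code from the workshop and it does not use Hadoop or MapR APIs; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample blocks are hypothetical and exist only to show the map, shuffle, and reduce steps on independent chunks of data.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(blocks):
    """Apply the map function to each block independently, then reduce.

    Each block stands in for a chunk of a file stored on a different node.
    Because records are assumed independent, blocks could be processed in
    any order or in parallel; here they are processed sequentially.
    """
    mapped = []
    for block in blocks:
        for record in block:
            mapped.extend(map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data: two blocks, one record each.
blocks = [
    ["hadoop stores data in blocks"],
    ["hadoop brings the function to the data"],
]
counts = run_job(blocks)
print(counts)
```

In a real cluster the map step would run on the nodes that already hold each block, and only the much smaller intermediate pairs would travel over the network.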

Attendees of the workshop took part in exercises to simulate downloading data sets to get an idea of how MapR makes Hadoop easy to configure and run.

“The key thing is to figure out what you’re looking at,” Vasquez said.

MapR gives users a quicker way to tap into their files on whatever file distribution they are accustomed to, Vasquez explained.

Data Summit 2018 is taking place at the Hyatt Regency Boston, May 22-23, with pre-conference workshops on Monday, May 21. Cognitive Computing Summit will also be co-located at the event.

For more information about Data Summit 2018, and to register, go here.