The Elephant is coming back to NYC ...


Since its beginning as a project aimed at building a better web search engine for Yahoo – inspired by Google’s well-known MapReduce paper – Hadoop has grown to occupy the center of the big data marketplace. From data offloading to preprocessing, Hadoop is not only enabling the analysis of new data sources amongst a growing legion of enterprise users; it is changing the economics of data. Alongside this momentum is a budding ecosystem of Hadoop-related solutions, from open source projects like Spark, Hive and Drill, to commercial products offered on-premises and in the cloud. These new technologies are solving real-world big data challenges today.

Whether your organization is currently considering Hadoop and Hadoop-related solutions or already using them in production, Hadoop Day is your opportunity to connect with the experts in New York City and expand your knowledge base. This unique event has all the bases covered:

  • Enterprise Use Cases Today
  • Architecting a Scalable Hadoop Platform
  • Building Hadoop Applications
  • Data Warehouse Optimization with Hadoop
  • Troubleshooting Hadoop Performance Issues
  • Data Science with Hadoop
  • Machine-Learning with Spark
  • Data Analysis with Hive & Pig
  • Taking Advantage of SQL-on-Hadoop Solutions
  • Running Hadoop in the Cloud
  • Securing Data in Hadoop
  • Optimizing ETL with Hadoop
  • Diving into the Data Lake
Tuesday, May 16, 2017
8:00 a.m. - 9:00 a.m.
  • Continental Breakfast
9:00 a.m. - 9:45 a.m.
  • WELCOME & KEYNOTE: The Human Side of the Data Revolution
  • For more than a decade, “data” has been at or near the top of the enterprise agenda. A robust ecosystem has emerged around all aspects of data: collection, management, storage, exploitation and disposition. And yet, more than 66% of Global 2000 senior executives are dissatisfied with their data investments and capabilities. This is not a technology problem. This is not a technique problem. This is a people problem. In a highly interactive session, futurist Thornton May shares research results from his multi-institution examination of the human side of the data revolution.

    Thornton A May, CEO, FutureScapes Advisors, Inc.
9:45 a.m. - 10:00 a.m.
10:00 a.m. - 10:45 a.m.
  • COFFEE BREAK in the Data Solutions Showcase
10:45 a.m. - 11:45 a.m.
  • H101: Unleashing the Power of Hadoop
  • Hadoop is here to stay, but so are a host of other approaches. To be effective, they must all work together in the enterprise.

  • Accelerating Big Data Implementations Through Hadoop Interoperability

    During the last 10 years, Apache Hadoop has proven to be a popular platform among developers who require a technology that can power large, complex applications. For customers, partners, and application ISVs who write on top of Hadoop, there is still one huge issue that remains—interoperability. Steve Jones and John Mertic take a closer look at how Apache Hadoop can become more interoperable to accelerate Big Data implementations.

    John Mertic, Director, ODPi
    Steve Jones, Global VP, Capgemini
  • SQL on Hadoop & Big Data Systems

    SQL has been with us for more than 40 years, and Hadoop for about 10. Although Hadoop was born without a SQL interface, it has become imperative to bring SQL-on-Hadoop solutions to market. This talk provides an overview of SQL on Hadoop, including low-latency SQL on Hadoop for analytic workloads, and how SQL engines are innovating.

    Sumit Pal, Big Data and Data Science Architect, Independent Consultant
12:00 p.m. - 12:45 p.m.
  • H102: Harnessing Big Data With Spark
  • Open source platforms and frameworks such as Apache Spark have paved the way for commodity-priced processing on a massive scale.

  • Build Machine-Learning Algorithms Powered by Spark

    The journey of leveraging IoT starts by learning how sensors, robotic brains, and complex microcontroller networks can push data to a private or public cloud or to a centralized compute platform. Learn how to build a recipe from scratch using an electronic sensor, an open source microcontroller circuit, C programming, a few wires, and cloud programming.

    Abhik Roy, Database Engineer, Experian
12:45 p.m. - 2:00 p.m.
  • ATTENDEE LUNCH in the Data Solutions Showcase
2:00 p.m. - 2:45 p.m.
  • H103: The Streaming Future of Big Data
  • Real-time utilization of streaming data requires a modern architecture that can scale. Learn about the technologies that can help.

  • Event-Driven Microservices With Streams & Docker

    This presentation covers how to build a multi-location, event-driven architecture that uses streaming data to interconnect Docker-hosted microservices, enabling scalable, redundant, and highly available services across multiple data centers. Using Docker containers and single-purpose microservices, it demonstrates how these services are interconnected with event-driven streams and how this architecture can be deployed.

    Paul Curtis, Senior Field Enablement Engineer, MapR Technologies
2:45 p.m. - 3:15 p.m.
  • COFFEE BREAK in the Data Solutions Showcase
3:15 p.m. - 4:00 p.m.
  • H104: Building an Enterprise Data Lake
  • The concept of an enterprise data lake is enticing. Find what’s needed and the technologies available to help build a data lake for the enterprise.

  • Open Source, Code-Free Data Pipelines

    An enterprise data lake typically requires substantial effort to ingest, process, store, secure, and manage data from a variety of sources. Cask Data Application Platform (CDAP) is an open source solution that offers a self-service user interface for creating data lakes and simplifies building and managing production data pipelines on Spark, Spark Streaming, MapReduce and Tigon. This talk discusses how to achieve broad, self-service access to Hadoop while maintaining the controls and monitoring necessary within the enterprise.

    Jonathan Gray, CEO & Founder, Cask
4:15 p.m. - 5:00 p.m.
  • H105: Integrating Hadoop Into Your BI Environment
  • A recent Unisphere Research survey on data management found that Apache Hadoop is gaining significant traction. About 40% of respondents now have a Hadoop installation.

  • The Do’s & Don’ts for Success With BI on Big Data

    Think Hadoop is not in your future? According to a recent survey, 97% of organizations working with Hadoop anticipate onboarding analytics and BI workloads to Hadoop. When this happens, companies that have disregarded the Big Data opportunity may be left behind. The good news is that onboarding your business intelligence workloads to Hadoop is not as complicated as it used to be. If you understand some key concepts, the transition can be simpler and more successful, allowing you to reuse current skill sets while avoiding either a rip-and-replace of your technical stack or the elimination of business analysts in order to hire data scientists.

    Josh Klahr, VP, AtScale

Don’t Miss These Special Events