Apache Hadoop is an open-source platform designed to query and analyze big data distributed across large clusters of servers with a high degree of fault tolerance. In this course, students learn to write applications on Hadoop. Topics include creating a Hadoop project from start to completion, writing MapReduce programs, working with the Hadoop APIs, and HDFS training covering loading and processing data with both the CLI and the API. Other topics include the MapReduce mapper, reducer, combiner, and partitioner, as well as Hadoop Streaming. Workflow implementations are also introduced.
- Why Apache Hadoop?
- Hadoop: Basic Concepts
- Loading data into HDFS
- Writing a MapReduce Program
- Debugging MapReduce Programs
- Integrating Hadoop into the Workflow
- The Hadoop API
- MapReduce Algorithms
- Advanced MapReduce Programming
- Joining Data Sets in MapReduce
- Graph Manipulation in Hadoop
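To give a flavor of the streaming model covered in the course, here is a minimal word-count sketch in the Hadoop Streaming style: the mapper emits tab-separated key/value lines and the reducer sums counts per key, the same contract Hadoop enforces between streaming scripts. This is an illustrative simulation in plain Python, not Hadoop itself; the sorting step stands in for Hadoop's shuffle phase.

```python
from itertools import groupby

def mapper(lines):
    """Emit a "word<TAB>1" line for every whitespace-separated token,
    as a Hadoop Streaming mapper would write to stdout."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(pairs):
    """Sum the counts for each key. Input must be sorted by key,
    which Hadoop's shuffle-and-sort phase guarantees; here we sort
    explicitly to simulate it."""
    keyed = (p.split("\t") for p in sorted(pairs))
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    text = ["hadoop maps and reduces", "hadoop scales"]
    for line in reducer(mapper(text)):
        print(line)
```

In a real cluster, the mapper and reducer would be separate scripts reading stdin and writing stdout, launched with the `hadoop-streaming` jar; the in-process pipeline above only mirrors that data flow.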
Prior programming experience is required; prior Java programming experience is recommended.
24 Hours | 4 Days or 8 Nights