CDH305: Cloudera Hadoop Development

CDH305: Cloudera Hadoop Training for Developers



8 hands-on training course for Hadoop developers (Spark + MapReduce)

  • Basic Concept and HDFS
  • Introduction to Hadoop Ecosystem
  • Introduction to MapReduce
  • Write a MapReduce Program in Java
  • Practical Development Tips and Techniques
  • Data input and output for Hadoop (Data Engineer Focus)
  • Integrating Hadoop into the Enterprise Workflow
  • Modeling and managing Hive, Impala, and Pig
  • Spark Basics
  • Working with RDDs in Spark
  • Writing and Deploying Spark Applications
  • Spark Caching and Persistence
  • Common Patterns in Spark Data Processing (Data Engineer Focus)
  • Spark SQL and DataFrames

This course is designed for developers and data engineers who want to understand how to process data in Hadoop. It introduces two data processing engines in Hadoop: MapReduce and Spark.


Do I need to know Java, Scala, or Python before the class?

A: Knowing these languages and having programming experience will help you learn faster. In class, however, we explain the code line by line, so it is a good time to start looking at a new data processing language such as Python.

All hands-on exercises run on a VM with the Hadoop ecosystem pre-installed, so you don’t need to worry about the environment. When you are ready to learn about installation, you are free to install Hadoop or Spark locally on your machine.

In this class, we will also cover Hive and Impala (Hadoop SQL components) and Spark SQL. Their syntax is much like SQL, and we will help you understand the magic behind the scenes when such queries run on MapReduce or Spark.
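
As a rough illustration of that idea, the sketch below mimics in plain Python how a query such as SELECT customer, SUM(amount) FROM orders GROUP BY customer could be broken into map, shuffle, and reduce steps. The table, the column names, and the helper functions are hypothetical and are not taken from the course material. Real engines such as Hive on MapReduce or Spark SQL plan and distribute this work across many machines, so treat this only as a mental model.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a table named `orders`.
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 20},
    {"customer": "alice", "amount": 50},
]


def map_phase(rows):
    """Emit one (key, value) pair per input row, as a mapper would."""
    for row in rows:
        yield row["customer"], row["amount"]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each key's values, like a reducer computing SUM(amount)."""
    return {key: sum(values) for key, values in groups.items()}


totals = reduce_phase(shuffle_phase(map_phase(orders)))
print(totals)  # {'alice': 80, 'bob': 20}
```

The GROUP BY column becomes the key emitted in the map step, and the aggregate function becomes the logic applied to each group in the reduce step.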