CDH205: Cloudera Hadoop Data Analysis

This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators.

 

Duration: 24 Hours

Instructor: Cloudera Certified Trainer & Certified Developer

Skills Gained

Through sessions and hands-on exercises, participants will navigate the Hadoop ecosystem and learn topics such as:

  • The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis
  • The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop
  • How Pig, Hive, and Impala improve productivity for typical analysis tasks
  • Joining diverse datasets to gain valuable business insight
  • Performing real-time, complex queries on datasets

 

 

Using Pig, Hive, and Impala

  • Hadoop Fundamentals
  • Introduction to Hadoop and Hive
  • Getting Data Into Hive
  • Manipulating Data with Hive
  • Partitioning and Bucketing
  • Advanced Hive
  • Hive Best Practices
  • Introduction to Pig
  • Pig Architecture
  • Reading and Writing Data
  • Pig Latin
  • Advanced Pig Latin
  • Debugging Pig
  • Pig Best Practices