Cloudera Data Analyst Training

  • Tuition: USD $3,195 (GSA: $2,736.27)
  • Reviews: 4.5 out of 5 stars (523 ratings)
  • Course Code: DATA-ANALYST
  • Duration: 4 days
  • Available Formats: Classroom, Virtual

Apache Hive makes transformation and analysis of complex, multi-structured data scalable in Hadoop. Apache Impala enables real-time interactive analysis of the data stored in Hadoop using a native SQL environment. Together, they make multi-structured data accessible to analysts, database administrators, and others without Java programming expertise.

Prerequisites

This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Some knowledge of SQL is assumed, as is basic Linux command-line familiarity. Prior knowledge of Apache Hadoop is not required.

Course Details

Introduction

Apache Hadoop Fundamentals

  • The Motivation for Hadoop
  • Hadoop Overview
  • Data Storage: HDFS
  • Distributed Data Processing: YARN, MapReduce, and Spark
  • Data Processing and Analysis: Hive and Impala
  • Database Integration: Sqoop
  • Other Hadoop Data Tools
  • Exercise Scenario Explanation

Introduction to Apache Hive and Impala

  • What Is Hive?
  • What Is Impala?
  • Why Use Hive and Impala?
  • Schema and Data Storage
  • Comparing Hive and Impala to Traditional Databases
  • Use Cases

Querying with Apache Hive and Impala

  • Databases and Tables
  • Basic Hive and Impala Query Language Syntax
  • Data Types
  • Using Hue to Execute Queries
  • Using Beeline (Hive's Shell)
  • Using the Impala Shell

Common Operators and Built-In Functions

  • Operators
  • Scalar Functions
  • Aggregate Functions

Data Management

  • Data Storage
  • Creating Databases and Tables
  • Loading Data
  • Altering Databases and Tables
  • Simplifying Queries with Views
  • Storing Query Results

Data Storage and Performance

  • Partitioning Tables
  • Loading Data into Partitioned Tables
  • When to Use Partitioning
  • Choosing a File Format
  • Using Avro and Parquet File Formats

Working with Multiple Datasets

  • UNION and Joins
  • Handling NULL Values in Joins
  • Advanced Joins

Analytic Functions and Windowing

  • Using Analytic Functions
  • Other Analytic Functions
  • Sliding Windows

Complex Data

  • Complex Data with Hive
  • Complex Data with Impala

Analyzing Text

  • Using Regular Expressions with Hive and Impala
  • Processing Text Data with SerDes in Hive
  • Sentiment Analysis and n-grams in Hive

Apache Hive Optimization

  • Understanding Query Performance
  • Cost-Based Optimization and Statistics
  • Bucketing
  • ORC File Optimizations

Apache Impala Optimization

  • How Impala Executes Queries
  • Improving Impala Performance

Extending Apache Hive and Impala

  • Custom SerDes and File Formats in Hive
  • Data Transformation with Custom Scripts in Hive
  • User-Defined Functions
  • Parameterized Queries

Choosing the Best Tool for the Job

  • Comparing Hive, Impala, and Relational Databases
  • Which to Choose?

Conclusion

Apache Kudu

  • What Is Kudu?
  • Kudu Tables
  • Using Impala with Kudu

When does class start/end?

Classes begin promptly at 9:00 am, and typically end at 5:00 pm.

Does the course schedule include a lunch break?

Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate, and juice are available all day in the kitchen. Fruit, muffins, and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbooks, which can be picked up in the lobby or requested from our ExitCertified staff.

How can someone reach me during class?

If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.

What languages are used to deliver training?

Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.

What does GTR stand for?

GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.

Does ExitCertified deliver group training?

Yes, we provide training for groups and individuals, as well as private on-site sessions. View our group training page for more information.

What does vendor-authorized training mean?

As a vendor-authorized training partner, we offer a curriculum that our partners have vetted. We use the same course materials and facilitate the same labs as our vendor-delivered training. These courses are considered the gold standard and, as such, are priced accordingly.

Is the training too basic, or will you go deep into technology?

It depends on your requirements, your role in your company, and your depth of knowledge. The good news is that many of our learning paths let you progress from the fundamentals to highly specialized training.

How up-to-date are your courses and support materials?

We continuously work with our vendors to evaluate and refresh course material to reflect the latest training courses and best practices.

Are your instructors seasoned trainers who have deep knowledge of the training topic?

ExitCertified instructors have an average of 27 years of practical IT experience. They have also served as consultants for an average of 15 years. To stay up to date, instructors spend at least 25 percent of their time learning new and emerging technologies and courses.

Do you provide hands-on training and exercises in an actual lab environment?

Lab access is dependent on the vendor and the type of training you sign up for. However, many of our top vendors will provide lab access to students to test and practice. The course description will specify lab access.

Will you customize the training for our company’s specific needs and goals?

We will work with you to identify training needs and areas of growth. We offer a variety of delivery methods, such as private group training, on-site training at a location of your choice, and virtual training. We provide courses and certifications that are aligned with your business goals.

How do I get started with certification?

Getting started on a certification pathway depends on your goals and the vendor you choose to get certified with. Many vendors offer certifications ranging from entry-level to advanced that can boost your career. To get access to certification vouchers and discounts, please contact Edu_customerexperience@techdata.com.

Will I get access to content after I complete a course?

You will get access to the PDF of course books and guides, but access to the recording and slides will depend on the vendor and type of training you receive.

Joel was great and handled our questions with great knowledge and professionalism.

Instructor was well-prepared and explained topics in much detail. Was also very responsive to questions asked by the audience.

Eric has been great at explaining and guiding the course. He is very knowledgeable and able to help tie the topics and concepts shared in the class back to our real life use cases.

Have been working in Cloudera for several months now, but haven't completely understood the animals and when to use as well as how things are working in the background. This class answered those questions.

Not every topic applied to my job but many topics applied. Joel spent a lot of time explaining how processing worked with the different tools which was also valuable!

4 options available

  • Nov 9, 2021 to Nov 12, 2021 (4 days)
    Location: Virtual
    Language: English
    Time: 9:00 am to 5:00 pm
  • Dec 7, 2021 to Dec 10, 2021 (4 days)
    Location: Virtual
    Language: English
    Time: 9:00 am to 5:00 pm
  • Dec 20, 2021 to Dec 23, 2021 (4 days)
    Location: iMVP
    Language: English
    Time: 9:00 am to 5:00 pm EST
  • Feb 8, 2022 to Feb 11, 2022 (4 days)
    Location: iMVP
    Language: English
    Time: 12:00 pm to 8:00 pm EST

Contact Us 1-800-803-3948