Cloudera Administrator Training for Apache Hadoop

  • Tuition USD $3,195 (GSA $2,736.27)
  • Reviews 4.5 out of 5 stars, 973 Ratings
  • Course Code HADOOP-ADMIN
  • Duration 4 days
  • Available Formats Classroom

Cloudera University’s four-day administrator training course for Apache Hadoop provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster using Cloudera Manager. From installation and configuration through load balancing and tuning, Cloudera’s training course is the best preparation for the real-world challenges faced by Hadoop administrators.

Skills Gained

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:

  • Cloudera Manager features that make managing your clusters easier, such as aggregated logging, configuration management, resource management, reports, alerts, and service management
  • Configuring and deploying production-scale clusters that provide key Hadoop-related services, including YARN, HDFS, Impala, Hive, Spark, Kudu, and Kafka
  • Determining the correct hardware and infrastructure for your cluster
  • Proper cluster configuration and deployment to integrate with the data center
  • Ingesting, storing, and accessing data in HDFS, Kudu, and cloud object stores such as Amazon S3
  • How to load file-based and streaming data into the cluster using Kafka and Flume
  • Configuring automatic resource management to ensure service-level agreements are met for multiple users of a cluster
  • Best practices for preparing, tuning, and maintaining a production cluster
  • Troubleshooting, diagnosing, and solving cluster issues

Who Can Benefit

This course is best suited to systems administrators and IT managers who have basic Linux experience. Prior knowledge of Apache Hadoop is not required.

Course Details

The Cloudera Enterprise Data Hub

  • Cloudera Enterprise Data Hub
  • CDH Overview
  • Cloudera Manager Overview
  • Hadoop Administrator Responsibilities

Installing Cloudera Manager and CDH

  • Cluster Installation Overview
  • Cloudera Manager Installation
  • CDH Installation
  • CDH Cluster Services

Configuring a Cloudera Cluster

  • Overview
  • Configuration Settings
  • Modifying Service Configurations
  • Configuration Files
  • Managing Role Instances
  • Adding New Services
  • Adding and Removing Hosts

Hadoop Distributed File System

  • Overview
  • HDFS Topology and Roles
  • Edit Logs and Checkpointing
  • HDFS Performance and Fault Tolerance
  • HDFS and Hadoop Security Overview
  • Web User Interfaces for HDFS
  • Using the HDFS Command Line Interface
  • Other Command Line Utilities

HDFS Data Ingest

  • Data Ingest Overview
  • File Formats
  • Ingesting Data using File Transfer or REST Interfaces
  • Importing Data from Relational Databases with Apache Sqoop
  • Ingesting Data From External Sources with Apache Flume
  • Best Practices for Importing Data

Hive and Impala

  • Apache Hive
  • Apache Impala

YARN and MapReduce

  • YARN Overview
  • Running Applications on YARN
  • Viewing YARN Applications
  • YARN Application Logs
  • MapReduce Applications
  • YARN Memory and CPU Settings

Apache Spark

  • Spark Overview
  • Spark Applications
  • How Spark Applications Run on YARN
  • Monitoring Spark Applications

Planning Your Cluster

  • General Planning Considerations
  • Choosing the Right Hardware
  • Network Considerations
  • Virtualization Options
  • Cloud Deployment Options
  • Configuring Nodes

Advanced Cluster Configuration

  • Configuring Service Ports
  • Tuning HDFS and MapReduce
  • Enabling HDFS High Availability

Managing Resources

  • Configuring cgroups with Static Service Pools
  • The Fair Scheduler
  • Configuring Dynamic Resource Pools
  • Impala Query Scheduling

Cluster Maintenance

  • Checking HDFS Status
  • Copying Data Between Clusters
  • Rebalancing Data in HDFS
  • HDFS Directory Snapshots
  • Upgrading a Cluster

Monitoring Clusters

  • Cloudera Manager Monitoring Features
  • Health Tests
  • Events and Alerts
  • Charts and Reports
  • Monitoring Recommendations

Cluster Troubleshooting

  • Overview
  • Troubleshooting Tools
  • Misconfiguration Examples
  • Essential Points

Installing and Managing Hue

  • Overview
  • Managing and Configuring Hue
  • Hue Authentication and Authorization

Security

  • Hadoop Security Concepts
  • Hadoop Authentication Using Kerberos
  • Hadoop Authorization
  • Hadoop Encryption
  • Securing a Hadoop Cluster

Apache Kudu

  • Kudu Overview
  • Architecture
  • Installation and Configuration
  • Monitoring and Management Tools

Apache Kafka

  • What Is Apache Kafka?
  • Apache Kafka Overview
  • Apache Kafka Cluster Architecture
  • Apache Kafka Command Line Tools
  • Using Kafka with Flume

Object Storage in the Cloud

  • Object Storage
  • Connecting Hadoop to Object Storage

When does class start/end?

Classes begin promptly at 9:00 am, and typically end at 5:00 pm.

Does the course schedule include a lunch break?

Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate and juice are available all day in the kitchen. Fruit, muffins and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbooks - these can be picked up in the lobby or requested from one of our ExitCertified staff.

How can someone reach me during class?

If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.

What languages are used to deliver training?

Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.

What does GTR stand for?

GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.

Does ExitCertified deliver group training?

Yes, we provide training for groups and individuals, as well as private on-site sessions. View our group training page for more information.

What does vendor-authorized training mean?

As a vendor-authorized training partner, we offer a curriculum that our partners have vetted. We use the same course materials and facilitate the same labs as our vendor-delivered training. These courses are considered the gold standard and, as such, are priced accordingly.

Is the training too basic, or will you go deep into technology?

It depends on your requirements, your role in your company, and your depth of knowledge. The good news is that many of our learning paths let you progress from the fundamentals to highly specialized training.

How up-to-date are your courses and support materials?

We continuously work with our vendors to evaluate and refresh course material to reflect the latest training courses and best practices.

Are your instructors seasoned trainers who have deep knowledge of the training topic?

ExitCertified instructors have an average of 27 years of practical IT experience. They have also served as consultants for an average of 15 years. To stay up to date, instructors spend at least 25 percent of their time learning new and emerging technologies and courses.

Do you provide hands-on training and exercises in an actual lab environment?

Lab access is dependent on the vendor and the type of training you sign up for. However, many of our top vendors will provide lab access to students to test and practice. The course description will specify lab access.

Will you customize the training for our company’s specific needs and goals?

We will work with you to identify training needs and areas of growth. We offer a variety of training methods, such as private group training, on-site training at a location of your choice, and virtual delivery. We provide courses and certifications that are aligned with your business goals.

How do I get started with certification?

Getting started on a certification pathway depends on your goals and the vendor you choose to get certified with. Many vendors offer certifications ranging from entry-level to advanced that can boost your career. To get access to certification vouchers and discounts, please contact Edu_customerexperience@techdata.com

Will I get access to content after I complete a course?

You will get access to the PDF of course books and guides, but access to the recording and slides will depend on the vendor and type of training you receive.

The course is absolutely fantastic! The instructor is knowledgeable and provided great explanation for all the aspects I wanted to learn.

Troy is an excellent instructor. He's very knowledgeable and offers plenty of time to ask questions and for classmates to interact with each other.

Excellent teacher for an excellent course. I was already a little familiar with Hadoop before the class started and now I feel much more knowledgeable.

Thank you so much, the class was very informative and helpful. The instructor is so awesome and the material was very good too.

There was lots of information, but it was well taught and reinforced by labs and materials.

There are currently no scheduled dates for this course. If you are interested in this course, request a course date with the links above. We can also contact you when the course is scheduled in your area.

Contact Us 1-800-803-3948
FAQ: Get immediate answers to our most frequently asked questions. View FAQs