When does class start/end?
Classes begin promptly at 9:00 am, and typically end at 5:00 pm.
This Data Engineering with Python course teaches attendees how to use Apache Spark and AWS Glue to build scalable and reliable data pipelines.
This course is intended for developers, software engineers, data scientists, and IT architects.
Participants must have practical experience coding in Python or another modern programming language. Knowledge of the AWS Management Console is desirable but not required. Students are expected to learn new material quickly and to reinforce each topic by completing programming exercises (labs).
Chapter 1. Introduction to Apache Spark
Chapter 2. The Spark Shell
Chapter 3. Spark RDDs
Chapter 4. Introduction to Spark SQL
Chapter 5. Overview of Amazon Web Services (AWS)
Chapter 6. Introduction to AWS Glue
Chapter 7. Introduction to Apache Spark
Chapter 8. AWS Glue PySpark Extensions
Lab Exercises