Data Engineering Course Outline
ETL Processes
Extraction, transformation, and loading methods
Tools for data integration
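As an illustration of the extract, transform, and load stages named above, here is a minimal sketch using only the Python standard library. The sample data, the `payments` table, and all function names are hypothetical, chosen for this example; a real pipeline would usually read from files, APIs, or databases and might use a dedicated integration tool instead.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real pipeline would read from a file, API, or database.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,
3, Carol ,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim names, drop rows with no amount, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((int(row["id"]), row["name"].strip(), float(amount)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline():
    """Run all three stages against an in-memory database and return the rows."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute(
        "SELECT id, name, amount FROM payments ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage in its own function makes it possible to test the transformation logic without touching the source or the target system.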
Batch and Streaming Processing
Differences between batch and real-time data processing
Use cases for each approach
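The contrast between the two approaches can be sketched with a simple average. This is an illustrative example only: the function names and the `readings` data are assumptions made here, and production streaming systems add concerns such as windowing and fault tolerance that are not shown.

```python
def batch_average(values):
    """Batch: the whole dataset is available before computation starts."""
    values = list(values)
    return sum(values) / len(values)


def streaming_average(stream):
    """Streaming: yield an updated average after each arriving event."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings used for both modes.
readings = [10.0, 20.0, 30.0, 40.0]

if __name__ == "__main__":
    print(batch_average(readings))
    for running in streaming_average(readings):
        print(running)
```

The batch version returns one result once all data is in; the streaming version produces a result per event while keeping only a count and a running total in memory.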
Hands-on Projects
Building small-scale ETL pipelines using popular tools
Week 9-10: Cloud Computing for Data Engineers
Hadoop Ecosystem
Distributed data storage and processing (HDFS, MapReduce, YARN)
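To make the MapReduce model concrete, the following is a single-process sketch of a word count in plain Python. It does not use Hadoop itself; the function names are hypothetical and the shuffle step, which Hadoop performs across the cluster, is simulated with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big pipelines"]))
```

In a real cluster, the map and reduce functions run in parallel on different nodes, with input splits read from distributed storage.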
Apache Spark
Introduction to Spark for large-scale data processing
In-memory vs. disk-based processing
Hands-on Projects
Data processing using Hadoop and Spark
Beginner Projects
Building a simple web scraper and basic data cleaning
Intermediate Projects
Cloud-based data warehouse setup or recommendation engine
Advanced Projects
Machine learning pipelines and real-time analytics dashboards