Class Starts Sept 24, 2024 - Dec 19, 2024
Class Times Tuesdays and Thursdays
6:30 pm - 8:30 pm
The bootcamp is designed to equip you with the essential skills and experience to land a job as an Azure Databricks Data Engineer.
Hi, I’m Mezue, your instructor
I am a seasoned Azure Databricks Data Engineer with 10 years of experience. I have consulted for clients across different industries, and I am currently employed at Capgemini, a top IT consulting firm.
Weekly Topics
Week 1
Advanced SQL Techniques Review
We will review the core SQL concepts used in data engineering, such as window functions, CTEs, the SQL MERGE statement, and joins, and solve some complex coding challenges. We will also cover handling missing values and identifying and resolving duplicate values.
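As a taste of the Week 1 material, here is a minimal sketch of deduplication with a CTE and a window function, plus a simple missing-value default. It is an illustration only, not course material: the `customers` table, its columns, and the `latest_rows` helper are hypothetical, and it uses Python's built-in `sqlite3` module (which needs SQLite 3.25 or newer for window functions) rather than SQL Server or Databricks SQL.

```python
import sqlite3


def latest_rows(rows):
    """Keep the most recent row per id and default missing emails.

    `rows` is a list of (id, email, updated_at) tuples; the table and
    column names are made up for this example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, updated_at TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)

    query = """
    WITH ranked AS (
        SELECT id, email, updated_at,
               -- number the rows of each id, newest first
               ROW_NUMBER() OVER (
                   PARTITION BY id ORDER BY updated_at DESC
               ) AS rn
        FROM customers
    )
    SELECT id, COALESCE(email, 'unknown') AS email, updated_at
    FROM ranked
    WHERE rn = 1          -- keep only the newest row per id
    ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Calling `latest_rows` with two rows for the same `id` returns only the row with the later `updated_at`, and a `NULL` email comes back as `'unknown'`.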
Week 2
Advanced Python for Data Engineering
We will refresh on Python data structures and solve problems involving string manipulation and recursion. We will also cover Big O notation, optimizing code performance, and real-world Python interview problems.
Week 3
Azure Cloud Fundamentals
We will learn the basics of cloud computing, Azure administration, access control, and network security.
Week 4
Data Pipeline (ETL) Concepts - Azure Data Factory
We will cover the basics of Azure Data Factory: setting up linked services and datasets, and designing full and incremental loads using the Copy activity. We will also create configurable, metadata-driven pipelines from a SQL database.
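The idea behind a metadata-driven full or incremental load can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not Azure Data Factory code: `TableConfig`, `select_rows`, and the watermark fields are hypothetical names, and the watermark values are assumed to be ISO-formatted date strings so that string comparison matches date order.

```python
from dataclasses import dataclass


@dataclass
class TableConfig:
    """One row of a hypothetical control table that drives the pipeline."""
    name: str
    load_type: str              # "full" or "incremental"
    watermark_column: str = ""  # column used to detect new rows
    last_watermark: str = ""    # highest value loaded so far


def select_rows(config, source_rows):
    """Return (rows_to_copy, new_watermark) for one configured table."""
    if config.load_type == "full":
        # Full load: copy everything, watermark is not used.
        return list(source_rows), config.last_watermark

    # Incremental load: copy only rows newer than the stored watermark.
    col = config.watermark_column
    new_rows = [r for r in source_rows if r[col] > config.last_watermark]
    new_watermark = max((r[col] for r in new_rows), default=config.last_watermark)
    return new_rows, new_watermark
```

In a real pipeline the configuration would typically live in a SQL table and the new watermark would be written back after a successful copy; here both are kept in memory to show the logic only.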
Week 5
Azure Data Factory Project
Project: Ingesting Multiple Tables from SQL Server to Azure Data Lake Storage (ADLS)
Project: Reverse ETL - Design and implement a pipeline that handles the reverse data flow, moving data back from ADLS to a SQL Server.
Week 6
Spark Fundamentals
We will learn big data fundamentals and the basics of distributed architecture, along with big data file formats and Delta Lake. We will also cover Spark architecture.
Week 7
Introduction to PySpark on Databricks
PySpark Basics: Introduction to PySpark, setting up environments, basic transformations and actions.
Problem-Solving with PySpark: Identify a data problem and solve it using PySpark with integrated Python scripts.
Week 8
Databricks Architecture and Data Lake
We will cover some of the most important features of Databricks, such as the Delta Lake lakehouse architecture, Databricks Workflows, Unity Catalog, SQL warehouses, and magic commands. We will also look at the Databricks architecture and at connecting Databricks to your data lake.
Week 9
Spark Optimization on Databricks
Spark optimization and understanding the Spark UI. Debugging slow-running jobs. Best practices for tuning Delta tables on Databricks.
Week 10
Data Warehousing and Data Modeling Crash Course
We will discuss OLTP vs OLAP, star schema, snowflake schema, One Big Table (OBT), normalization and denormalization, first through third normal form (1NF-3NF), batch vs streaming, and primary and foreign keys. We will also implement SCD Type 1 and Type 2.
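To make SCD Type 2 concrete, here is a minimal sketch in plain Python. It is an illustration, not the implementation used in class: the `apply_scd2` function and the `start_date`, `end_date`, and `is_current` column names are assumptions chosen for the example. When a tracked attribute changes, the current row is closed out and a new current row is appended, so history is preserved. (SCD Type 1 would simply overwrite the attribute in place.)

```python
def apply_scd2(dimension, incoming, key, tracked, load_date):
    """Return a new dimension list with SCD Type 2 applied.

    dimension: list of dicts with `key`, the tracked columns, and the
               assumed housekeeping columns start_date/end_date/is_current
    incoming:  list of dicts with `key` and the tracked columns
    """
    # Copy the rows so the caller's dimension is not modified.
    result = [dict(row) for row in dimension]
    current = {row[key]: row for row in result if row["is_current"]}

    for rec in incoming:
        existing = current.get(rec[key])
        if existing is not None and all(existing[c] == rec[c] for c in tracked):
            continue  # nothing changed, keep the current version

        if existing is not None:
            # Close out the old version instead of overwriting it.
            existing["end_date"] = load_date
            existing["is_current"] = False

        new_row = {key: rec[key]}
        new_row.update({c: rec[c] for c in tracked})
        new_row.update({"start_date": load_date, "end_date": None, "is_current": True})
        result.append(new_row)
        current[rec[key]] = new_row

    return result
```

On Databricks the same logic is usually expressed as a MERGE against a Delta table; the plain-Python version above only shows the row-level rules.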
Week 11
Advanced Databricks Features
o Unity Catalog: Managing and securing data access across Databricks.
o Auto Loader Project: Building and implementing an auto loader in Databricks.
o Spark Streaming: Introduction to real-time data processing with Spark Streaming.
Change Data Feed
o Tasks: Set up and manage incremental data loading using Databricks and Delta Lake’s Change Data Feed.
Week 12
Capstone Project: The final week will be spent mentoring students as they work on the capstone project, an end-to-end data engineering project. The goal is to build a metadata-driven data pipeline using Azure Data Factory and Databricks that processes various files, performing full and incremental loads into the gold zone of the data lakehouse.
Building an SAP Pipeline
o Tasks: Implement a pipeline that integrates data from SAP systems.
o Auto Loader Project: Building and implementing an auto loader in Databricks.
Capstone project review, resume building, and interview prep tips.
Frequently Asked Questions
Can I get a refund if I'm unhappy with my purchase?
You can enroll in the class for one week free of charge, without any financial commitments. However, if you complete the program, you will not be eligible for a refund.
Will you provide session recordings for future reference since the training is live online? If yes, how long will I have access to the recordings?
Yes, all sessions will be recorded, and you'll have unlimited access to them for at least a year. If you need more time, feel free to discuss it with me.
Are there any reference notes available for these classes?
Yes, I have reference PowerPoint slides and notebooks that can help you. I also recommend some Udemy courses and YouTube videos for further study.
For the projects, will you provide the sample data files, or will I need to prepare my own?
Yes, I will provide the sample data files.
Since the training is for Azure Cloud, I'll need to sign up for an Azure account. After the free one-month trial, there will be charges. Based on your experience, if I complete all the projects and practice along with you, can you give me an estimate of the cost?
You'll need an Azure account, which typically costs between $5 and $20 per month, depending on your usage.
My bundle includes coaching. How do I schedule my appointment?
Upon purchasing a bundle that includes coaching, you'll receive further instructions on how to book a time for your appointment.