Number of questions: 45
Type of questions: Multiple choice
Duration: 90 minutes
Passing score: 70%
Where to register for the certification: https://www.webassessor.com/databricks
Expiration: 2 years
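As a quick sanity check on the figures above, the short calculation below works out how many correct answers the passing score implies and the average time available per question. It is only a sketch: it assumes every question carries equal weight and counts toward the score, which the figures above do not confirm.

```python
import math

NUM_QUESTIONS = 45     # from the exam facts above
DURATION_MIN = 90      # from the exam facts above
PASSING_PERCENT = 70   # from the exam facts above

# Assumption: all questions are weighted equally, so the pass mark is the
# smallest whole number of correct answers that reaches the passing percentage.
min_correct = math.ceil(NUM_QUESTIONS * PASSING_PERCENT / 100)

# Average time budget per question, in minutes.
minutes_per_question = DURATION_MIN / NUM_QUESTIONS

print(f"Correct answers needed: {min_correct} of {NUM_QUESTIONS}")
print(f"Average time per question: {minutes_per_question:.0f} minutes")
```

Under that assumption, 70% of 45 is 31.5, so at least 32 correct answers are needed, with about 2 minutes per question on average.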
Topics covered:
- Databricks Lakehouse Platform
- ELT with Spark SQL and Python
- Incremental Data Processing
- Production Pipelines
- Data Governance
Practice tests: Link
How to prepare for the certification:
- Complete the Data Engineering with Databricks course (Databricks Academy)
- Complete the Data Engineering notebooks (Link)
- Read the Databricks documentation (recommended)
Features you should know before taking the exam:
- Lakehouse
- Delta Lake (Time Travel, MERGE, optimization, CTAS, INSERT)
- Delta Live Tables (DLT + Auto Loader)
- Incremental processing (Auto Loader, COPY INTO)
- Data permissions in Unity Catalog
Additional resources:
Data Engineer Associate Slides
Minimally Qualified Candidate:
The minimally qualified candidate should be able to:
Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:
- Data Lakehouse (architecture, descriptions, benefits)
- Data Science and Engineering workspace (clusters, notebooks, data storage)
- Delta Lake (general concepts, table management and manipulation, optimizations)
Build ETL pipelines using Apache Spark SQL and Python, including:
- Relational entities (databases, tables, views)
- ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)
- Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)
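To illustrate what "facilitating Spark SQL with string manipulation and control flow" can look like, here is a minimal sketch in plain Python. The table names, the `bronze` and `silver` schema names, and the helper `build_dedup_statements` are hypothetical and chosen only for illustration. In a Databricks notebook each generated string would typically be passed to `spark.sql(...)`; that call is left as a comment so the sketch runs without a Spark session.

```python
from typing import List


def build_dedup_statements(tables: List[str],
                           source_schema: str = "bronze",
                           target_schema: str = "silver") -> List[str]:
    """Build one CTAS statement per table that copies distinct rows.

    All names are hypothetical; nothing is executed against Spark here.
    """
    statements = []
    for table in tables:
        name = table.strip()
        # Control flow: skip blank entries instead of emitting invalid SQL.
        if not name:
            continue
        # String manipulation: f-strings fill in the schema and table names.
        statements.append(
            f"CREATE OR REPLACE TABLE {target_schema}.{name} AS "
            f"SELECT DISTINCT * FROM {source_schema}.{name}"
        )
    return statements


statements = build_dedup_statements(["orders", " customers ", ""])
for stmt in statements:
    print(stmt)
    # In a notebook with an active session: spark.sql(stmt)
```

Interpolating names directly like this is reasonable for a trusted, hard-coded list; names coming from user input should be validated first. For passing data in the other direction, a PySpark DataFrame can be exposed to SQL with `createOrReplaceTempView`, and a table can be read back into a DataFrame with `spark.table`.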
Incrementally process data, including:
- Structured Streaming (general concepts, triggers, watermarks)
- Auto Loader (streaming reads)
- Multi-hop Architecture (bronze-silver-gold, streaming applications)
- Delta Live Tables (benefits and features)
Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:
- Jobs (scheduling, task orchestration, UI)
- Dashboards (endpoints, scheduling, alerting, refreshing)
Understand and follow best security practices, including:
- Unity Catalog (benefits and features)
- Entity Permissions (team-based permissions, user-based permissions)
Article written by Youssef Mrini