Associate Developer for Apache Spark Certification
Number of questions: 60
Type of questions: Multiple choice
Duration: 120 minutes
Passing score: 70%
Where to register for the certification: https://www.webassessor.com/databricks
Expiration: 2 years
Topics covered:
- Apache Spark architecture concepts
- Apache Spark architecture applications
- Spark DataFrame API applications
Practice tests: Link
How to prepare for the certification:
Complete the following Databricks Academy self-paced courses (if you don’t know how to register for them, check this article):
- Just Enough Python for Apache Spark (Repos link)
- Apache Spark Programming with Databricks (Repos link)
Learn Spark SQL: Link
Learn Spark: Link
Read the Spark documentation (recommended)
Features you should know before taking the exam:
- Partitioning
- Lazy evaluation
- Transformations vs. actions
- Broadcasting
- Coalescing
- Out-of-memory errors
- Subsetting DataFrames
- Column manipulation
- String manipulation
- Combining DataFrames
- Reading/writing DataFrames
Additional resources:
How to pass the Spark 3.0 accreditation
Minimally Qualified Candidate:
- Understand the basics of the Spark architecture, including Adaptive Query Execution
- Apply the Spark DataFrame API to complete individual data manipulation tasks, including:
  - selecting, renaming and manipulating columns
  - filtering, dropping, sorting, and aggregating rows
  - joining, reading, writing and partitioning DataFrames
  - working with UDFs and Spark SQL functions
While it will not be explicitly tested, the candidate must have a working knowledge of either Python or Scala. The exam is available in both languages.
Bonus:
You can practice Spark on the Databricks Community Edition: Link
Make sure you are familiar with the Spark documentation, because you can’t use the search bar during the exam.
Article written by Youssef Mrini