CI/CD with Databricks Asset Bundles (DAB)
Published 5/2025
Duration: 6h 40m | .MP4 1920x1080, 30 fps | AAC, 44100 Hz, 2ch | 3.67 GB
Genre: eLearning | Language: English
Build production-grade deployment pipelines with Databricks Asset Bundles. Package your project as code!
What you'll learn
- Package notebooks, jobs, and configurations as versioned code with Databricks Asset Bundles
- Create automated CI/CD pipelines that deploy reliably from development to production
- Build and distribute custom Python packages for use in your Databricks environment
- Implement unit testing and validation for Databricks code
- Set up GitHub Actions workflows for automated builds, tests, and deployments
- Apply DevOps best practices to Databricks
Requirements
- Experience with Databricks fundamentals (notebooks, clusters, jobs)
- Basic Python knowledge
- Understanding of YAML syntax
- Awareness of Git and GitHub
- Awareness of CI/CD
Description
Are you ready to put DevOps and CI/CD to work in your Databricks deployments?
In this course, you’ll become an expert in Databricks Asset Bundles—the official “workspace-as-code” framework that brings true DevOps to your analytics platform. You’ll learn to bundle notebooks, jobs, pipelines, cluster specs, infrastructure and workspace configurations into a single, versioned package—and then automate its validation, testing, and multi-stage deployment through CI/CD pipelines. No more one-off clicks or hidden drift—just repeatable, reliable releases.
High-Level Curriculum Overview
Introduction & Core Concepts
Get oriented with Databricks Asset Bundles and CI/CD concepts. Review the course goals, the “infinite delivery loop,” and where to find code samples for each hands-on module.
Environment & Setup
Provision your Azure Databricks workspaces, configure VS Code, install the Databricks CLI, and prepare Databricks Connect for IDE-driven development.
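Once Databricks Connect is configured, running code on a real cluster from your IDE looks like ordinary PySpark. A minimal sketch, assuming the databricks-connect package is installed and your connection details come from the DEFAULT CLI profile or DATABRICKS_* environment variables:

```python
# Sketch: Databricks Connect from a local IDE (assumes databricks-connect
# is installed and a profile/environment supplies host, token, cluster ID).
from databricks.connect import DatabricksSession

# Builds a Spark session whose queries execute on the remote Databricks
# cluster rather than a local JVM.
spark = DatabricksSession.builder.getOrCreate()

df = spark.range(10)   # evaluated on the remote cluster
print(df.count())      # 10
```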
Asset Bundles Fundamentals
Learn the core databricks bundle commands (init, validate, deploy, run, and destroy) and how to define, version, and manage your analytics project in databricks.yml.
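To give a flavor of the format, here is a minimal databricks.yml sketch; the bundle name, job, notebook path, and workspace hosts are illustrative, and cluster settings are omitted for brevity:

```yaml
# databricks.yml - minimal bundle definition (illustrative names)
bundle:
  name: my_etl_bundle

resources:
  jobs:
    nightly_etl:
      name: nightly-etl
      tasks:
        - task_key: run_notebook
          notebook_task:
            notebook_path: ./notebooks/etl.py

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://adb-1111111111111111.1.azuredatabricks.net
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.2.azuredatabricks.net
```

From the project root, databricks bundle validate checks this definition, and databricks bundle deploy -t dev pushes everything to the development target.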
Local Development and Unit Testing
Integrate PyTest for unit and integration tests, run tests via CI or Databricks Connect, and generate coverage reports to enforce quality gates.
Understand how to switch between local PySpark for rapid unit testing and Databricks Connect to execute and debug code on real clusters, ensuring parity between your IDE and the cloud.
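As a sketch of the local-testing side, a PyTest fixture can spin up a local SparkSession so unit tests run in CI without any cluster; the transform under test here (add_greeting) is illustrative:

```python
# test_transforms.py - unit-test sketch with a local SparkSession
# (assumes pyspark and pytest are installed; transform is illustrative)
import pytest
from pyspark.sql import SparkSession, functions as F

def add_greeting(df):
    # Example transform under test: append a constant column.
    return df.withColumn("greeting", F.lit("hello"))

@pytest.fixture(scope="session")
def spark():
    # Local Spark for fast unit tests; needs no Databricks workspace.
    session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
    yield session
    session.stop()

def test_add_greeting(spark):
    df = spark.createDataFrame([(1,), (2,)], ["id"])
    result = add_greeting(df)
    assert result.columns == ["id", "greeting"]
    assert result.filter(result.greeting == "hello").count() == 2
```

Swapping the fixture's SparkSession for a Databricks Connect session turns the same tests into integration tests against a real cluster.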
Hands-On Projects
Apply your knowledge in three practical hands-on projects (the first is sketched in code after this list):
- Notebook ETL pipelines (Bronze→Silver→Gold)
- Python script tasks and .whl-packaged jobs
- Delta Live Tables streaming pipelines
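For example, the Silver step of the first project boils down to a transformation like this (table names and columns are illustrative):

```python
# Sketch: Bronze -> Silver step of a medallion ETL pipeline
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw events landed as-is.
bronze = spark.read.table("bronze.events")

# Silver: de-duplicated, typed, and cleaned.
silver = (
    bronze
    .dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_id").isNotNull())
)

silver.write.mode("overwrite").saveAsTable("silver.events")
```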
Git Integration & CI/CD Pipelines
Onboard your project to Git, adopt branch-based workflows, and author GitHub Actions or Azure Pipelines to automate builds, tests, staging (with approval), and production rollouts.
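A GitHub Actions workflow for this can stay quite small. A minimal sketch, assuming the official databricks/setup-cli action and repository secrets named DATABRICKS_HOST and DATABRICKS_TOKEN (the secret names and the staging target are illustrative):

```yaml
# .github/workflows/deploy.yml - bundle deployment sketch
name: deploy-bundle
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main   # installs the Databricks CLI
      - name: Validate bundle
        run: databricks bundle validate -t staging
      - name: Deploy bundle
        run: databricks bundle deploy -t staging
```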
By the end of this course, you’ll have an automated, end-to-end CI/CD process for your entire Databricks environment.
Who this course is for:
- Data Engineers working in Databricks environments
- DevOps engineers supporting data teams
- Team leads wanting to implement deployment best practices in Databricks