dbt Version Control and CI/CD
Managing dbt Projects with Git and CI/CD
Overview
Version control for dbt projects enables collaboration, code review, and safe deployments. This guide covers Git workflows, branching strategies, and CI/CD integration for dbt projects.
Git Workflow
Repository Structure
dbt-project/
├── dbt_project.yml
├── profiles/
│   └── profiles.yml
├── models/
│   ├── staging/
│   │   ├── stg_customers.sql
│   │   ├── stg_orders.sql
│   │   └── staging.yml
│   ├── intermediate/
│   │   ├── int_customer_orders.sql
│   │   ├── int_order_items.sql
│   │   └── intermediate.yml
│   ├── marts/
│   │   ├── mart_sales_summary.sql
│   │   ├── mart_customer_360.sql
│   │   └── marts.yml
│   └── marts.yml
├── tests/
│   ├── generic/
│   ├── singular/
│   └── data/
├── seeds/
│   ├── reference_data.csv
│   └── raw_data.csv
├── macros/
│   ├── generate_schema_name.sql
│   └── generate_surrogate_key.sql
├── analyses/
│   ├── customer_lifecycle.sql
│   └── order_trends.sql
└── .github/
    └── workflows/
        └── dbt-ci.yml

Branching Strategy
Git Branch Model
Branch Types:
- main: Production-ready code, tagged releases
- develop: Integration branch for features
- feature/*: Feature branches (e.g., feature/add-customer-mart)
- bugfix/*: Bug fixes (e.g., bugfix/fix-aggregation)
- hotfix/*: Production hotfixes (branched from main and merged straight back into main, bypassing develop)
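The naming conventions above can be checked mechanically, for example in a Git hook or an early CI step. The following is a minimal sketch, assuming a POSIX shell; `is_valid_branch` is a hypothetical helper name, not part of Git or dbt.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a branch name follows the
# branch model above, fail (exit 1) otherwise.
is_valid_branch() {
  case "$1" in
    main|develop) return 0 ;;
    # "?*" requires at least one character after the prefix.
    feature/?*|bugfix/?*|hotfix/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check the currently checked-out branch (uncomment inside a hook).
# is_valid_branch "$(git rev-parse --abbrev-ref HEAD)" || {
#   echo "Branch name does not follow the branch model" >&2
#   exit 1
# }
```

A check like this is cheap to run and catches naming drift before a pull request is opened.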
dbt in CI/CD
GitHub Actions
name: dbt CI/CD

on:
  pull_request:
    branches: [main, develop]
    paths:
      - 'models/**'
      - 'tests/**'
      - 'macros/**'
      - 'dbt_project.yml'
  push:
    # develop is listed so that the dbt-run-dev job, which runs on pushes
    # to develop, can actually be triggered
    branches: [main, develop]
    paths:
      - 'models/**'
      - 'tests/**'
      - 'macros/**'
      - 'dbt_project.yml'

env:
  DBT_PROFILES_DIR: profiles
  DBT_CLOUD_PROJECT_ID: ${{ secrets.DBT_CLOUD_PROJECT_ID }}

jobs:
  dbt-compile:
    name: dbt compile
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres

      - name: Run dbt compile
        run: dbt compile

  dbt-parse:
    name: dbt parse
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres

      - name: Run dbt parse
        run: dbt parse

  dbt-test:
    name: dbt test
    runs-on: ubuntu-latest
    needs: [dbt-compile, dbt-parse]
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML }}" > profiles/profiles.yml

      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres

      - name: Run dbt test
        # The old --schema flag was removed in dbt 1.0. Plain "dbt test" runs
        # every test; add "--select test_type:generic" to run only generic tests.
        run: dbt test
        env:
          DBT_TEST_DATABASE_URL: ${{ secrets.DBT_TEST_DATABASE_URL }}

  dbt-run-dev:
    name: dbt run (dev)
    if: github.event_name == 'push' && github.ref == 'refs/heads/develop'
    runs-on: ubuntu-latest
    needs: [dbt-test]
    environment: dev
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML_DEV }}" > profiles/profiles.yml

      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres

      - name: Run dbt (incremental)
        # No --full-refresh here: that flag rebuilds incremental models from
        # scratch, which contradicts an incremental run.
        run: dbt run --profile dev
        env:
          DBT_DATABASE_URL: ${{ secrets.DBT_DEV_DATABASE_URL }}

  dbt-run-prod:
    name: dbt run (production)
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    needs: [dbt-test]
    environment: production
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML_PROD }}" > profiles/profiles.yml

      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres

      - name: Run dbt (incremental)
        run: dbt run --profile prod
        env:
          DBT_DATABASE_URL: ${{ secrets.DBT_PROD_DATABASE_URL }}

      - name: Run dbt docs generate
        run: dbt docs generate --profile prod

      - name: Upload docs to S3
        uses: jakejarvis/s3-sync-action@v0.5.1
        # This action reads its settings from environment variables and passes
        # "args" to "aws s3 sync", whose flag for removing stale files is --delete.
        with:
          args: --delete
        env:
          AWS_S3_BUCKET: my-company-dbt-docs
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_REGION: us-east-1
          SOURCE_DIR: target
          DEST_DIR: docs

Environment Management
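Each environment relies on its own connection settings and secrets, and a run that starts with one of them missing tends to fail late. One option is to check for them before invoking dbt. The following is a minimal sketch, assuming a POSIX shell; `require_env` is a hypothetical helper name, not a dbt feature, and the variable name in the usage comment is the one the workflow above passes to `dbt run`.

```shell
#!/bin/sh
# Hypothetical helper: report every variable named in the arguments that is
# unset or empty, and return non-zero if any are missing.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage before a production run:
# require_env DBT_DATABASE_URL && dbt run --profile prod
```

Because the helper uses `eval`, it should only ever be called with variable names you control, never with user input.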
Multi-Environment Setup
dev:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      pass: dbt_pass
      dbname: data_dev
      schema: analytics
      threads: 4

prod:
  target: prod
  outputs:
    prod:
      type: postgres
      host: prod-db.example.com
      port: 5432
      user: dbt_user
      pass: "{{ env_var('DBT_PROD_PASSWORD') }}"
      dbname: data_prod
      schema: analytics
      threads: 16

Environment Variables
# Development
DBT_DEV_DATABASE_URL=postgresql://dbt_user:dbt_pass@localhost:5432/data_dev

# Production
DBT_PROD_DATABASE_URL=postgresql://dbt_user:REDACTED@prod-db.example.com:5432/data_prod

# Secrets
DBT_CLOUD_API_KEY=REDACTED
DBT_CLOUD_PROJECT_ID=my-project

dbt Packages
Using dbt Packages
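packages:
  - package: dbt-labs/dbt_utils
    version: 1.0.0

  - package: calogica/dbt_expectations
    version: 0.9.0

  # The dbt-labs auditing package is published as audit_helper
  - package: dbt-labs/audit_helper
    version: 0.6.0

# Install packages
dbt deps

# Upgrade packages to the newest versions allowed by packages.yml (dbt 1.7+)
dbt deps --upgrade

# Remove the directories listed in clean-targets (installed packages and build artifacts)
dbt clean

Deployment Strategies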
Blue-Green Deployment
Blue-Green Process:
- Deploy new version to staging environment
- Run tests on staging
- If tests pass, switch production to point at the staging build
- If tests fail, keep production on the old version
- If issues appear after the switch, roll back by pointing production at the old version again
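The process above can be sketched as a shell function that takes the build, test, and switch steps as commands. This is an illustration under stated assumptions, not a definitive implementation: `blue_green_deploy` is a hypothetical name, and how the switch is performed (for example a schema swap) depends on your warehouse.

```shell
#!/bin/sh
# Hypothetical sketch of the blue-green flow: build and test the new version
# in the idle environment, and switch production only if both steps succeed.
# $1 = build command, $2 = test command, $3 = switch command.
blue_green_deploy() {
  build_cmd=$1
  test_cmd=$2
  switch_cmd=$3

  if ! sh -c "$build_cmd"; then
    echo "build failed: production unchanged"
    return 1
  fi
  if ! sh -c "$test_cmd"; then
    echo "tests failed: production unchanged"
    return 1
  fi
  # Only reached when the new version built and passed its tests.
  sh -c "$switch_cmd" || return 1
  echo "switched production to new version"
}
```

A call might look like `blue_green_deploy "dbt run --target staging" "dbt test --target staging" "./swap_schemas.sh"`, where the `staging` target and the `swap_schemas.sh` script are assumptions that this guide does not define.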
Rollback Strategy
dbt Rollback
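Before rolling anything back, it helps to confirm that the latest run actually failed and how badly. dbt records per-node results in target/run_results.json, each with a status field. The sketch below counts results whose status is error or fail. It is a rough check under stated assumptions: `count_failures` is a hypothetical helper, and it uses grep rather than a JSON parser, so it relies on dbt's usual file layout.

```shell
#!/bin/sh
# Hypothetical helper: count entries with status "error" or "fail" in a
# dbt run_results.json file. Pattern matching only, not real JSON parsing.
count_failures() {
  # -o prints each match on its own line so wc -l counts matches;
  # tr strips the padding some wc implementations add.
  grep -E -o '"status": *"(error|fail)"' "$1" | wc -l | tr -d ' '
}

# Example usage after a run:
# count_failures target/run_results.json
```

A result of 0 means no node reported an error or failure in the most recent run. The file is overwritten on every invocation, so it only describes the latest one.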
# 1. Identify the last good run
# Check target/run_results.json for the status of each model in the latest run
# (dbt writes run artifacts to target/ and overwrites them on each run)

# 2. Revert the most recent commit (creates a new commit that undoes it)
git revert HEAD

# 3. Force refresh models
dbt run --full-refresh --select <model_name>

# 4. Or use dbt run --exclude for a partial rollback
dbt run --exclude models/marts/new_feature

# 5. Or reload reference data from the seed CSV
dbt seed --select reference_data

Git Best Practices
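Some of the practices in this section can be checked automatically rather than left to code review. As one illustration, files under directories that dbt generates should never be committed. The following is a minimal sketch, assuming a POSIX shell; `find_generated_paths` is a hypothetical helper, and wiring it into a pre-commit hook is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: read file paths on stdin (one per line) and print
# those that live under directories dbt generates. Returns 0 when none
# were found and 1 otherwise.
find_generated_paths() {
  found=0
  while IFS= read -r path; do
    case "$path" in
      target/*|dbt_packages/*|logs/*)
        echo "$path"
        found=1
        ;;
    esac
  done
  return "$found"
}

# Example usage in .git/hooks/pre-commit:
# git diff --cached --name-only | find_generated_paths && exit 0
# echo "Generated files are staged; remove them before committing" >&2
# exit 1
```

The helper only inspects path prefixes, so it complements a .gitignore file rather than replacing one.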
DO
# 1. Use .gitignore for generated files
# dbt_packages/

# 2. Use feature branches
# feature/add-customer-mart

# 3. Write descriptive commit messages
# feat: Add customer 360 mart with lifetime value calculation

# 4. Use pull requests for code review
# All changes to main via PR

# 5. Tag releases
# git tag -a v1.0.0 -m "Release version 1.0.0"

DON'T
# 1. Don't commit target/ directory
# Generated files, not source code

# 2. Don't commit secrets
# Use environment variables or secret managers

# 3. Don't push directly to main
# Use PRs for code review

# 4. Don't ignore test failures
# All tests must pass before merging

# 5. Don't forget to pull before pushing
# Avoid merge conflicts

Key Takeaways
- Git workflow: Feature branches, PRs, code review
- CI/CD: Automated testing and deployment
- Environments: Separate dev, staging, prod
- Packages: Manage dependencies with dbt packages
- Deployment: Blue-green for zero-downtime
- Rollback: Revert commits or full-refresh models
- Secrets: Use environment variables or secret managers
- Documentation: Document changes in commit messages and PRs