dbt Version Control and CI/CD

Managing dbt Projects with Git and CI/CD


Overview

Version control for dbt projects enables collaboration, code review, and safe deployments. This guide covers Git workflows, branching strategies, and CI/CD integration for dbt projects.


Git Workflow

Repository Structure

dbt-project/
├── dbt_project.yml
├── profiles/
│   └── profiles.yml
├── models/
│   ├── staging/
│   │   ├── stg_customers.sql
│   │   ├── stg_orders.sql
│   │   └── staging.yml
│   ├── intermediate/
│   │   ├── int_customer_orders.sql
│   │   ├── int_order_items.sql
│   │   └── intermediate.yml
│   ├── marts/
│   │   ├── mart_sales_summary.sql
│   │   ├── mart_customer_360.sql
│   │   └── marts.yml
│   └── marts.yml
├── tests/
│   ├── generic/
│   ├── singular/
│   └── data/
├── seeds/
│   ├── reference_data.csv
│   └── raw_data.csv
├── macros/
│   ├── generate_schema_name.sql
│   └── generate_surrogate_key.sql
├── analyses/
│   ├── customer_lifecycle.sql
│   └── order_trends.sql
└── .github/
    └── workflows/
        └── dbt-ci.yml

Branching Strategy

Git Branch Model

Branch Types:

  • main: Production-ready code, tagged releases
  • develop: Integration branch for features
  • feature/*: Feature branches (e.g., feature/add-customer-mart)
  • bugfix/*: Bug fixes (e.g., bugfix/fix-aggregation)
  • hotfix/*: Production hotfixes (merge to main directly)
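
The branch model above can be enforced mechanically, for example in a pre-push hook or a CI step. The following is a minimal sketch, assuming lowercase, hyphen-separated branch suffixes; the regular expression and the function name `is_valid_branch` are illustrative assumptions, not part of any dbt or Git tooling.

```python
import re

# Branch names allowed by the model above. The suffix rule (lowercase letters,
# digits, hyphens) is an assumption made for this sketch.
BRANCH_PATTERN = re.compile(
    r"(main|develop|(feature|bugfix|hotfix)/[a-z0-9][a-z0-9-]*)"
)


def is_valid_branch(name: str) -> bool:
    """Return True if the branch name fits the branching model."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for branch in ("feature/add-customer-mart", "bugfix/fix-aggregation", "experiment/foo"):
        print(f"{branch}: {is_valid_branch(branch)}")
```

A check like this is cheap to run before a pull request is opened, so naming problems surface before review rather than during it.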

dbt in CI/CD

GitHub Actions

.github/workflows/dbt-ci.yml
name: dbt CI/CD

on:
  pull_request:
    branches: [main, develop]
    paths:
      - 'models/**'
      - 'tests/**'
      - 'macros/**'
      - 'dbt_project.yml'
  push:
    # develop is included so that the dbt-run-dev job below can trigger
    branches: [main, develop]
    paths:
      - 'models/**'
      - 'tests/**'
      - 'macros/**'
      - 'dbt_project.yml'

env:
  DBT_PROFILES_DIR: profiles
  DBT_CLOUD_PROJECT_ID: ${{ secrets.DBT_CLOUD_PROJECT_ID }}

jobs:
  dbt-compile:
    name: dbt compile
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres
      - name: Run dbt compile
        run: dbt compile

  dbt-parse:
    name: dbt parse
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres
      - name: Run dbt parse
        run: dbt parse

  dbt-test:
    name: dbt test
    runs-on: ubuntu-latest
    needs: [dbt-compile, dbt-parse]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML }}" > profiles/profiles.yml
      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres
      - name: Run dbt test
        # The --schema flag was removed in dbt 1.0; plain `dbt test` runs all tests.
        # The models under test must already exist in the test database.
        run: dbt test
        env:
          DBT_TEST_DATABASE_URL: ${{ secrets.DBT_TEST_DATABASE_URL }}

  dbt-run-dev:
    name: dbt run (dev)
    if: github.event_name == 'push' && github.ref == 'refs/heads/develop'
    runs-on: ubuntu-latest
    needs: [dbt-test]
    environment: dev
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML_DEV }}" > profiles/profiles.yml
      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres
      - name: Run dbt (full refresh)
        # --full-refresh rebuilds incremental models from scratch
        run: dbt run --profile dev --full-refresh
        env:
          DBT_DATABASE_URL: ${{ secrets.DBT_DEV_DATABASE_URL }}

  dbt-run-prod:
    name: dbt run (production)
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    needs: [dbt-test]
    environment: production
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Configure dbt
        run: |
          mkdir -p profiles
          echo "${{ secrets.DBT_PROFILES_YML_PROD }}" > profiles/profiles.yml
      - name: Install dbt
        run: |
          pip install dbt-core
          pip install dbt-postgres
      - name: Run dbt (incremental)
        run: dbt run --profile prod
        env:
          DBT_DATABASE_URL: ${{ secrets.DBT_PROD_DATABASE_URL }}
      - name: Run dbt docs generate
        run: dbt docs generate --profile prod
      - name: Upload docs to S3
        uses: jakejarvis/s3-sync-action@v0.5.1
        with:
          # Passed through to `aws s3 sync`; --delete removes files no longer in target/
          args: --delete
        env:
          # This action reads its settings from environment variables, not `with:` inputs
          AWS_S3_BUCKET: my-company-dbt-docs
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_REGION: us-east-1
          SOURCE_DIR: target/
          DEST_DIR: docs/
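
The `if:` conditions on the two deployment jobs encode a small routing rule: pushes to develop deploy to dev, pushes to main deploy to production, and pull requests deploy nowhere. The sketch below restates that rule in Python so it can be reasoned about or unit-tested outside of GitHub Actions. The function name `deploy_target` is hypothetical; the event names and refs mirror the values GitHub exposes as `github.event_name` and `github.ref`.

```python
from typing import Optional

# Mapping of Git refs to deployment environments, mirroring the workflow's `if:` conditions.
REF_TO_ENVIRONMENT = {
    "refs/heads/develop": "dev",
    "refs/heads/main": "production",
}


def deploy_target(event_name: str, ref: str) -> Optional[str]:
    """Return the environment a workflow run should deploy to, or None for no deployment."""
    if event_name != "push":
        # Pull requests only run compile, parse, and test.
        return None
    return REF_TO_ENVIRONMENT.get(ref)


if __name__ == "__main__":
    print(deploy_target("push", "refs/heads/main"))
    print(deploy_target("pull_request", "refs/heads/main"))
```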

Environment Management

Multi-Environment Setup

profiles/environments/profiles_dev.yml
dev:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      pass: dbt_pass
      dbname: data_dev
      schema: analytics
      threads: 4

profiles/environments/profiles_prod.yml
prod:
  target: prod
  outputs:
    prod:
      type: postgres
      host: prod-db.example.com
      port: 5432
      user: dbt_user
      pass: "{{ env_var('DBT_PROD_PASSWORD') }}"
      dbname: data_prod
      schema: analytics
      threads: 16
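
Because the production profile reads its password through `env_var('DBT_PROD_PASSWORD')`, dbt fails at startup if that variable is unset. A pre-flight check can give a clearer message before dbt is invoked. This is a minimal sketch; the helper name `missing_env_vars` is an assumption, and the list of required variables should be adapted to whatever your profiles actually reference.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    # DBT_PROD_PASSWORD is the variable referenced by the prod profile above.
    missing = missing_env_vars(["DBT_PROD_PASSWORD"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.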

Environment Variables

.env
# Development
DBT_DEV_DATABASE_URL=postgresql://dbt_user:dbt_pass@localhost:5432/data_dev
# Production
DBT_PROD_DATABASE_URL=postgresql://dbt_user:REDACTED@prod-db.example.com:5432/data_prod
# Secrets
DBT_CLOUD_API_KEY=REDACTED
DBT_CLOUD_PROJECT_ID=my-project

dbt Packages

Using dbt Packages

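packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: 1.0.0
  - package: calogica/dbt_expectations
    version: 0.9.0
  - package: dbt-labs/audit_helper
    version: 0.6.0

# Install the packages listed in packages.yml
dbt deps
# Upgrade packages to the newest versions allowed by packages.yml (dbt 1.7+)
dbt deps --upgrade
# Remove the directories listed under clean-targets (typically target/ and dbt_packages/)
dbt clean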

Deployment Strategies

Blue-Green Deployment

Blue-Green Process:

  1. Deploy new version to staging environment
  2. Run tests on staging
  3. If tests pass, switch production to the staging build
  4. If tests fail, leave production on the old version
  5. If issues surface after the switch, roll back by switching production to the old version
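
One way to implement the switch in step 3 on Postgres is to build the new version into a separate schema and then rename schemas inside a single transaction. The sketch below only generates the SQL statements; it does not connect to a database. The schema-swap approach itself, the function name `swap_schema_sql`, and the schema names in the usage example are all assumptions made for illustration, since the process above does not prescribe a mechanism.

```python
from typing import List


def swap_schema_sql(live: str, candidate: str, previous: str) -> List[str]:
    """Build Postgres statements that promote `candidate` to `live`.

    The old live schema is kept under the name `previous` so that a rollback
    is another rename rather than a rebuild.
    """
    for name in (live, candidate, previous):
        # Basic guard against injecting SQL through a schema name.
        if not name.isidentifier():
            raise ValueError(f"Unsafe schema name: {name!r}")
    return [
        "BEGIN;",
        f"DROP SCHEMA IF EXISTS {previous} CASCADE;",
        f"ALTER SCHEMA {live} RENAME TO {previous};",
        f"ALTER SCHEMA {candidate} RENAME TO {live};",
        "COMMIT;",
    ]


if __name__ == "__main__":
    # Hypothetical schema names for illustration.
    for statement in swap_schema_sql("analytics", "analytics_staging", "analytics_previous"):
        print(statement)
```

Because Postgres DDL is transactional, readers see either the old schema or the new one, never a half-renamed state.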

Rollback Strategy

dbt Rollback

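# 1. Identify the last good run
# target/run_results.json records the status of each node from the most recent invocation
# (it is overwritten on every run, so archive it in CI if you need history)
# 2. Revert the most recent commit (creates a new commit that undoes it)
git revert HEAD
# 3. Rebuild the affected models from scratch
dbt run --full-refresh --select <model_name>
# 4. Or skip the new models for a partial rollback
dbt run --exclude models/marts/new_feature
# 5. Or reload reference data from the seed CSVs
#    (seeds only; restore other tables from a database backup)
dbt seed --select reference_data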
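
To decide which models need the rollback steps above, it helps to list the nodes that failed in the most recent invocation. dbt writes per-node statuses to `target/run_results.json` under a top-level `results` list. The following is a minimal sketch that reads that structure; the helper names are hypothetical, and the sample data in the usage example is invented for illustration.

```python
import json
from typing import Any, Dict, List

# Statuses treated as failures: "error" for models and tests, "fail" for tests.
FAILURE_STATUSES = {"error", "fail"}


def failed_nodes(run_results: Dict[str, Any]) -> List[str]:
    """Return the unique_id of every node whose status indicates a failure."""
    return [
        result["unique_id"]
        for result in run_results.get("results", [])
        if result.get("status") in FAILURE_STATUSES
    ]


def load_run_results(path: str = "target/run_results.json") -> Dict[str, Any]:
    """Load the run results artifact written by the last dbt invocation."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Invented sample data; in practice call load_run_results() instead.
    sample = {
        "results": [
            {"unique_id": "model.my_project.stg_orders", "status": "success"},
            {"unique_id": "model.my_project.mart_customer_360", "status": "error"},
        ]
    }
    print(failed_nodes(sample))
```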

Git Best Practices

DO

# 1. Use .gitignore for generated files
# target/
# dbt_packages/
# 2. Use feature branches
# feature/add-customer-mart
# 3. Write descriptive commit messages
# feat: Add customer 360 mart with lifetime value calculation
# 4. Use pull requests for code review
# All changes to main via PR
# 5. Tag releases
# git tag -a v1.0.0 -m "Release version 1.0.0"
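
The first practice can be checked automatically. The sketch below verifies that a `.gitignore` file lists the generated directories named above (`target/` and `dbt_packages/`). It is a deliberately simple check that only matches entries written exactly that way, so a pattern such as `/target/` would be reported as missing; the function name `missing_ignores` is an assumption.

```python
from typing import Iterable, List

# Generated directories that should never be committed, per the list above.
REQUIRED_IGNORES = ("target/", "dbt_packages/")


def missing_ignores(
    gitignore_text: str,
    required: Iterable[str] = REQUIRED_IGNORES,
) -> List[str]:
    """Return required entries that do not appear as exact lines in .gitignore."""
    entries = set()
    for line in gitignore_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if stripped and not stripped.startswith("#"):
            entries.add(stripped)
    return [pattern for pattern in required if pattern not in entries]


if __name__ == "__main__":
    example = "# dbt artifacts\ntarget/\n"
    print(missing_ignores(example))
```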

DON'T

# 1. Don't commit target/ directory
# Generated files, not source code
# 2. Don't commit secrets
# Use environment variables or secret managers
# 3. Don't push directly to main
# Use PRs for code review
# 4. Don't ignore test failures
# All tests must pass before merging
# 5. Don't forget to pull before pushing
# Avoid merge conflicts

Key Takeaways

  1. Git workflow: Feature branches, PRs, code review
  2. CI/CD: Automated testing and deployment
  3. Environments: Separate dev, staging, prod
  4. Packages: Manage dependencies with dbt packages
  5. Deployment: Blue-green for zero-downtime
  6. Rollback: Revert commits or full-refresh models
  7. Secrets: Use environment variables or secret managers
  8. Documentation: Document changes in commit messages and PRs
