Workflow Orchestration for Data Platforms
Overview
Orchestration tools manage the dependencies, scheduling, and execution of data workflows. This section covers four options: Airflow (traditional, task-oriented), Dagster (data-aware), Prefect (modern, code-first), and Kubernetes (cloud-native).
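
To make "managing dependencies and execution" concrete, here is a minimal sketch in plain Python that runs tasks in dependency order using the standard library's `graphlib`. It is not how any of the four tools is implemented; the `TASKS` registry and `run_pipeline` helper are hypothetical names chosen for illustration, and scheduling, retries, and parallelism are left out.

```python
from graphlib import TopologicalSorter


def extract():
    return "data"


def transform(data):
    return data + " transformed"


def load(data):
    return f"loaded: {data}"


# Hypothetical task registry: name -> (callable, list of upstream task names).
TASKS = {
    "extract": (extract, []),
    "transform": (transform, ["extract"]),
    "load": (load, ["transform"]),
}


def run_pipeline(tasks):
    """Run every task once, after all of its upstream tasks have finished."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(upstream) for name, (_, upstream) in tasks.items()}
    results = {}
    order = []
    for name in TopologicalSorter(graph).static_order():
        func, upstream = tasks[name]
        # Pass each upstream result as a positional argument, in declared order.
        results[name] = func(*(results[u] for u in upstream))
        order.append(name)
    return order, results


order, results = run_pipeline(TASKS)
print(order)
print(results["load"])
```

A real orchestrator adds what this sketch omits: time-based triggers, retries, persisted state, and distributed execution.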
Tool Comparison
| Tool | Paradigm | Maturity | Best For |
|---|---|---|---|
| Airflow | Task-oriented | Very mature | Traditional ETL |
| Dagster | Data-oriented | Modern | ML, analytics |
| Prefect | Code-first | Modern | Resilient workflows |
| Kubernetes | Cloud-native | Mature | Scalable, containerized workloads |
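
The difference between the task-oriented and data-oriented paradigms can be sketched without any orchestrator installed. In the data-oriented style, each function is named after the data it produces, and its parameter names identify the upstream data it needs, so the dependency graph can be inferred instead of wired by hand. The toy below assumes this naming convention; `infer_dependencies` and `materialize_all` are hypothetical helpers, not part of Dagster's API, and there is no cycle detection.

```python
import inspect


def raw_data():
    return "data"


def transformed_data(raw_data):
    return raw_data + " transformed"


def final_data(transformed_data):
    return transformed_data.upper()


def infer_dependencies(functions):
    """Map each function name to the parameter names it declares."""
    return {f.__name__: list(inspect.signature(f).parameters) for f in functions}


def materialize_all(functions):
    """Compute each named value once, resolving upstream values first."""
    by_name = {f.__name__: f for f in functions}
    deps = infer_dependencies(functions)
    cache = {}

    def build(name):
        # Memoize so a value shared by several consumers is computed only once.
        if name not in cache:
            cache[name] = by_name[name](*(build(d) for d in deps[name]))
        return cache[name]

    for name in by_name:
        build(name)
    return deps, cache


deps, cache = materialize_all([raw_data, transformed_data, final_data])
print(deps)
print(cache["final_data"])
```

In the task-oriented style, by contrast, the unit of work is the task itself, and the order is stated explicitly by how task calls are chained.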
Guides
| Document | Description | Key Topics |
|---|---|---|
| Airflow Guide | Traditional orchestration | TaskFlow API, task groups, providers |
| Dagster Guide | Data-aware orchestration | Assets, IO managers, testing |
| Prefect Guide | Modern orchestration | Flows, tasks, state handling |
| Kubernetes Guide | Cloud-native orchestration | Operators, CronJobs, monitoring |
Selection Framework
Typical Workflow
Airflow Workflow
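
    from datetime import datetime

    from airflow.decorators import dag, task


    # `schedule` replaces `schedule_interval`, which is deprecated since Airflow 2.4.
    # catchup=False avoids backfilling a run for every day since start_date.
    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def my_dag():
        @task
        def extract():
            return "data"

        @task
        def transform(data):
            return data + " transformed"

        @task
        def load(data):
            print(data)

        # Passing return values between tasks defines the dependency order.
        data = extract()
        transformed = transform(data)
        load(transformed)


    # Calling the decorated function registers the DAG.
    my_dag()

Dagster Workflow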

    from dagster import asset


    @asset
    def raw_data():
        return "data"


    # Naming an upstream asset as a parameter both declares the dependency
    # and loads its value, so it should not also be listed in `deps=`.
    @asset
    def transformed_data(raw_data):
        return raw_data + " transformed"


    @asset
    def final_data(transformed_data):
        print(transformed_data)

Prefect Workflow

    from prefect import flow, task


    @task
    def extract():
        return "data"


    @task
    def transform(data):
        return data + " transformed"


    @task
    def load(data):
        print(data)


    @flow
    def my_flow():
        data = extract()
        transformed = transform(data)
        load(transformed)


    # Run the flow locally when the file is executed as a script.
    if __name__ == "__main__":
        my_flow()

Learning Path
- Start with: Kubernetes Guide (understand cloud-native patterns)
- Choose your tool:
  - Traditional ETL → Airflow Guide
  - ML/Analytics → Dagster Guide
  - Modern/Resilient → Prefect Guide
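
The learning path above can be written down as a small lookup, which is handy if the choice needs to live in onboarding scripts or documentation tooling. This is only a sketch of the mapping listed here; the `GUIDE_BY_WORKLOAD` table and `recommend_guides` function are hypothetical names, and the workload labels are taken verbatim from the list.

```python
# Workload label (lowercased) -> guide to read after the Kubernetes Guide.
GUIDE_BY_WORKLOAD = {
    "traditional etl": "Airflow Guide",
    "ml/analytics": "Dagster Guide",
    "modern/resilient": "Prefect Guide",
}


def recommend_guides(workload):
    """Return the reading order: Kubernetes Guide first, then the tool guide."""
    key = workload.strip().lower()
    if key not in GUIDE_BY_WORKLOAD:
        options = ", ".join(sorted(GUIDE_BY_WORKLOAD))
        raise ValueError(f"unknown workload {workload!r}; expected one of: {options}")
    return ["Kubernetes Guide", GUIDE_BY_WORKLOAD[key]]


print(recommend_guides("ML/Analytics"))
```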