Calabi Pipelines — Pipeline Orchestration
Professional+
Calabi Pipelines is the orchestration engine that schedules and runs your data pipelines. It uses the KubernetesExecutor: each task runs in its own Kubernetes pod, providing strong task isolation and horizontal scalability.
Accessing Calabi Pipelines
Navigate to calabi.{domain}/airflow. Your Calabi SSO credentials work automatically.
Core Concepts
| Concept | Description |
|---|---|
| DAG | Directed Acyclic Graph — a pipeline defined as Python code |
| Task | One unit of work inside a DAG (run a dbt model, call an API, run a SQL query) |
| DAG Run | One execution of a DAG at a specific point in time |
| Task Instance | One execution of a task within a specific DAG run |
| Schedule | A cron expression or preset (e.g., @daily, 0 6 * * *) |
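The concepts above map directly onto pipeline code. The sketch below is a minimal, hypothetical DAG (the dag_id, task_ids, and commands are illustrative, not one of the production DAGs) assuming the standard Airflow 2.x DAG API:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A DAG is a pipeline defined as Python code; each operator instance is a task.
with DAG(
    dag_id="example_daily_pipeline",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="0 6 * * *",              # cron expression: daily at 06:00 UTC
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")

    # The >> operator declares the edge: transform runs after extract succeeds.
    extract >> transform
```

Each scheduled execution of this DAG is a DAG run, and each execution of `extract` or `transform` within that run is a task instance.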
Production DAGs
| DAG | Schedule | Purpose |
|---|---|---|
| prod_data_unification_dag | Daily 06:00 UTC | Runs all dbt models in the data_unification project |
| dynamo_to_warehouse_incremental | Hourly | DynamoDB → data warehouse incremental sync |
| calabi_data_ingestion | Daily | Raw data preparation |
| psychometric_reporting_dag | Weekly (Sunday) | Psychometric scoring models |
| recon_asgard_production_validation | Daily | Data reconciliation validation |
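These DAGs can also be run outside their schedules. Assuming CLI access to the Calabi Pipelines environment, the standard Airflow CLI works (the web UI's trigger button is equivalent):

```shell
# Trigger an ad-hoc run of the daily dbt DAG, outside its 06:00 UTC schedule
airflow dags trigger prod_data_unification_dag

# List recent runs of that DAG to check their state
airflow dags list-runs -d prod_data_unification_dag
```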
KubernetesExecutor
Calabi Pipelines runs with the KubernetesExecutor.
Benefits:
- Strong isolation — one pod per task
- No shared workers — tasks can't interfere with each other
- Automatic resource cleanup after task completion
- Scale to zero when no tasks are running
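Because every task gets its own pod, resources can be tuned per task rather than per worker. A minimal sketch, assuming Airflow's `executor_config` / `pod_override` mechanism and the `kubernetes` Python client (the DAG name and resource values are illustrative assumptions):

```python
from datetime import datetime

from kubernetes.client import models as k8s

from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="example_resource_override",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    heavy_transform = PythonOperator(
        task_id="heavy_transform",
        python_callable=lambda: None,  # placeholder callable
        executor_config={
            # KubernetesExecutor merges this pod spec into the task's pod.
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(
                            name="base",  # "base" targets the task container
                            resources=k8s.V1ResourceRequirements(
                                # Illustrative values -- tune per workload
                                requests={"cpu": "1", "memory": "2Gi"},
                                limits={"cpu": "2", "memory": "4Gi"},
                            ),
                        )
                    ]
                )
            )
        },
    )
```

Lighter tasks simply omit `executor_config` and fall back to the executor's default pod template.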