Calabi Platform Architecture
Calabi is a unified data platform deployed entirely within your own infrastructure. All Calabi services run inside your Kubernetes cluster — on AWS EKS, on-prem, or any cloud — connecting to your data sources, warehouses, and compute resources within your own network boundary. No data is routed through Calabi-operated infrastructure. You retain complete ownership, security, and compliance control from day one.
Architecture Diagram
The diagram below illustrates how your users interact with the Calabi control plane, and how the control plane interfaces with your data plane resources.
Control Plane vs. Data Plane
Calabi follows a strict separation between the control plane (where platform logic runs) and the data plane (where your data lives).
Control Plane
The control plane is the set of Calabi platform services — ingestion schedulers, transformation runners, catalogue indexers, BI servers, AI agents, and so on. These services run as pods in your EKS cluster inside the master-prod-de namespace.
Key properties:
- Deployed and managed by Calabi via Helm
- Runs inside your AWS account — not in a Calabi-operated cloud
- Receives user requests via the Istio gateway
- Orchestrates operations against your data plane resources
Data Plane
The data plane is where your actual data lives: S3 buckets, Redshift clusters, RDS databases, and any other AWS-managed data stores.
Key properties:
- Always stays in your AWS account and region
- Never routed through Calabi-operated servers
- Access controlled via IAM roles scoped to each workload
- Your team retains full read/write/delete access independently of Calabi
Calabi platform services access your data plane via IAM roles and VPC-private endpoints. Credentials are stored in AWS Secrets Manager within your account. Your data never transits through any Calabi-operated infrastructure.
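To make "IAM roles scoped to each workload" concrete, here is a minimal sketch of a read-only policy document limited to one S3 prefix. The bucket name, the prefix, and the helper `workload_s3_policy` are hypothetical; the policies Calabi actually ships are not described on this page and may differ.

```python
import json


def workload_s3_policy(bucket: str, prefix: str) -> dict:
    """Build a read-only IAM policy limited to a single bucket prefix.

    Illustrative only: bucket and prefix names are assumptions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, but only under the prefix.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads are granted only on keys under the prefix.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


policy = workload_s3_policy("example-data-lake", "bronze/sales")
print(json.dumps(policy, indent=2))
```

A policy like this would be attached to the role assumed by a single workload, so a compromised or misconfigured service cannot read outside its own prefix.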
Deployment Model
Calabi is packaged as per-service Docker images published to public.ecr.aws/calabi/. A single Helm chart with tier-specific values files controls which services deploy. A license key (a JWT signed with RS256 and validated offline) controls what the UI unlocks.
| Layer | Technology | Role |
|---|---|---|
| Image registry | public.ecr.aws/calabi/ | All Calabi images — same across all tiers |
| CE deployment | Docker Compose | 7 containers, quick start |
| Paid deployment | Helm | 12–22+ containers, Starter / Professional / Enterprise |
| License control | JWT RS256 license key | Tier, module access, user/asset limits |
| Secret management | AWS Secrets Manager + External Secrets Operator | Credential injection into pods |
| Ingress | Calabi Gateway + Istio (optional) | Path-based routing to all services |
All tiers use the same Docker images. Helm decides which containers deploy; the license key decides what the UI unlocks. Upgrading from one tier to the next is a single helm upgrade — no data migration required.
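Because the license key is a JWT, its claims can be inspected locally. The sketch below decodes the header and payload using only the standard library. It does not verify the RS256 signature, so it is suitable for inspection only, never for enforcement. The claim names (`tier`, `max_users`) and the demo token are assumptions; the real claim schema is not documented here.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def read_license_claims(token: str) -> dict:
    """Return the JWT payload WITHOUT verifying the signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    header = json.loads(b64url_decode(parts[0]))
    if header.get("alg") != "RS256":
        raise ValueError(f"unexpected algorithm: {header.get('alg')!r}")
    return json.loads(b64url_decode(parts[1]))


# Hypothetical token built for the demo; the signature is a placeholder.
demo_token = ".".join([
    b64url_encode({"alg": "RS256", "typ": "JWT"}),
    b64url_encode({"tier": "professional", "max_users": 50}),
    "signature-placeholder",
])

claims = read_license_claims(demo_token)
print(claims["tier"])
```

Real offline validation would additionally check the signature against the vendor's RSA public key, which requires a cryptography library outside the standard library.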
Helm Install by Tier
helm repo add calabi https://helm.calabi.ai
# CE (no license key)
helm install calabi calabi/calabi
# Starter / Professional / Enterprise
helm install calabi calabi/calabi \
  -f values-{tier}.yaml \
  --set global.licenseKey="$(cat license.key)"
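A tier change reuses the same chart and flags with `helm upgrade` in place of `helm install`. The sketch below assembles that command as an argument list, for example for use in a deployment script. The lowercase tier names in the values file names and the helper `helm_upgrade_args` are assumptions; only the `values-{tier}.yaml` pattern and the `global.licenseKey` value come from the commands above.

```python
import shlex
from typing import List

# Assumed spelling of the paid tiers as used in values file names.
TIERS = ("starter", "professional", "enterprise")


def helm_upgrade_args(tier: str, license_key: str) -> List[str]:
    """Build the argv for upgrading the 'calabi' release to a given tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return [
        "helm", "upgrade", "calabi", "calabi/calabi",
        "-f", f"values-{tier}.yaml",
        # strip() mirrors $(cat license.key), which drops trailing newlines.
        "--set", f"global.licenseKey={license_key.strip()}",
    ]


cmd = helm_upgrade_args("professional", "LICENSE-KEY-PLACEHOLDER\n")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it without shell quoting issues; that step is left out here so the sketch runs without Helm installed.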
Networking & Security
All user traffic enters the cluster through a single Istio Gateway that terminates TLS. No service is exposed directly to the internet.
Internet → AWS ALB → Istio IngressGateway → OAuth2 Proxy → Calabi Service
Authentication & Authorization
| Layer | Mechanism |
|---|---|
| Identity provider | SSO via SAML 2.0 / OIDC (your corporate IdP) |
| Token issuance | OAuth2 Authorization Code flow |
| Session validation | OAuth2 Proxy sidecar on every service |
| API access | Short-lived JWT bearer tokens |
| Service-to-service | Istio mTLS (mutual TLS inside the mesh) |
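As an illustration of the Authorization Code flow named in the table, the sketch below builds the initial redirect to the identity provider using the standard OAuth 2.0 request parameters. In a Calabi deployment this step would be handled by OAuth2 Proxy rather than by your own code; the endpoint, client ID, and redirect URI shown are hypothetical.

```python
import secrets
from typing import Tuple
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    scope: str = "openid",
) -> Tuple[str, str]:
    """Return the IdP redirect URL and the state value to check on callback."""
    # A random state value ties the callback to this login attempt (CSRF guard).
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",  # selects the Authorization Code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


url, state = build_authorization_url(
    "https://idp.example.com/oauth2/authorize",  # hypothetical IdP endpoint
    "calabi-demo-client",                        # hypothetical client ID
    "https://calabi.example.com/oauth2/callback",
)
params = parse_qs(urlparse(url).query)
print(params["response_type"], params["client_id"])
```

After the user authenticates, the identity provider redirects back with a `code` and the same `state`; the code is then exchanged for tokens server-side.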
Network Isolation
- All inter-service traffic stays within the VPC via private DNS
- No public endpoints for any data plane resources
- AWS Security Groups restrict inbound access to the ALB only
- Istio AuthorizationPolicy enforces per-service allow-lists
- Egress to external SaaS connectors (for Calabi Connect) routes through a NAT gateway with a static IP that you can allowlist at source systems
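A per-service allow-list can be expressed as an Istio `AuthorizationPolicy` that admits only named service accounts. The sketch below generates such a manifest as JSON, which `kubectl apply -f` accepts alongside YAML. The policy name, the `app` label, and the service account names are hypothetical; only the `master-prod-de` namespace comes from this page.

```python
import json
from typing import List


def allow_list_policy(
    name: str,
    namespace: str,
    app_label: str,
    allowed_service_accounts: List[str],
) -> dict:
    """Build an ALLOW policy restricting callers to specific service accounts."""
    # With mTLS enabled, Istio identifies callers by SPIFFE-style principals.
    principals = [
        f"cluster.local/ns/{namespace}/sa/{sa}"
        for sa in allowed_service_accounts
    ]
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The policy applies to pods carrying this label.
            "selector": {"matchLabels": {"app": app_label}},
            "action": "ALLOW",
            "rules": [{"from": [{"source": {"principals": principals}}]}],
        },
    }


manifest = allow_list_policy(
    name="catalogue-allow-list",        # hypothetical policy name
    namespace="master-prod-de",
    app_label="calabi-catalogue",       # hypothetical workload label
    allowed_service_accounts=["calabi-gateway", "calabi-iq"],  # hypothetical
)
print(json.dumps(manifest, indent=2))
```

Once an ALLOW policy selects a workload, requests that match none of its rules are denied, which is what makes this an allow-list.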
Component Overview
| Component | Purpose | Default Path | Tier |
|---|---|---|---|
| Calabi Catalogue | Data discovery, lineage, glossary & governance | / | CE+ |
| CalabiIQ | BI dashboards, SQL Lab, scheduled reports | /bianalyst | CE+ |
| Data Governance | Tags, classification, owners, access policies | / | CE+ |
| Calabi Connect | ELT data ingestion from 90+ connectors | /dataingestion | Starter+ |
| Calabi Pipelines | DAG authoring, scheduling & monitoring | /airflow | Starter+ |
| Calabi Notebooks | Interactive data science notebooks | /notebooks | Starter+ |
| Calabi IDE | Browser-based VS Code for pipeline development | /ide | Starter+ |
| Calabi ML | Experiment tracking, model registry & artefacts | /mlflow | Professional+ |
| Calabi AI Agent | Natural language queries over your data stack | Built-in | Professional+ |
| Calabi AI Builder | Custom AI chatflows and LLM pipeline builder | /aibuilder | Professional+ |
| AI Chat | Private local LLM inference (air-gapped) | /openwebui | Professional+ |
| Calabi Automate | No-code workflow automation and alerting | /automate | Professional+ |
| Cloud Operations | Cloud inventory, compliance, cost analytics | /cloud | Enterprise |
| Monitoring & Logs | Metrics, dashboards, alerts, log aggregation | /monitoring | Enterprise |
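The default paths in the table imply prefix-based routing at the gateway. The sketch below shows one way such a lookup could work, using longest-prefix matching on path segment boundaries. The matching rules of the actual gateway are not described on this page, so treat the `resolve` function as an assumption. Calabi Catalogue and Data Governance share the root path, so the root entry stands for both.

```python
# Default paths taken from the component table; "/" is the fallback.
ROUTES = {
    "/": "Calabi Catalogue",
    "/bianalyst": "CalabiIQ",
    "/dataingestion": "Calabi Connect",
    "/airflow": "Calabi Pipelines",
    "/notebooks": "Calabi Notebooks",
    "/ide": "Calabi IDE",
    "/mlflow": "Calabi ML",
    "/aibuilder": "Calabi AI Builder",
    "/openwebui": "AI Chat",
    "/automate": "Calabi Automate",
    "/cloud": "Cloud Operations",
    "/monitoring": "Monitoring & Logs",
}


def resolve(path: str) -> str:
    """Return the component serving a request path (longest prefix wins)."""
    best = "/"
    for prefix in ROUTES:
        if prefix == "/":
            continue
        # Match whole path segments only, so "/idea" does not hit "/ide".
        if path == prefix or path.startswith(prefix + "/"):
            if len(prefix) > len(best):
                best = prefix
    return ROUTES[best]


print(resolve("/airflow/home"))
print(resolve("/idea"))
```

Matching on segment boundaries avoids accidental collisions between a short prefix such as `/ide` and unrelated paths that merely begin with the same characters.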
What's Next
- Deployment Architecture — Kubernetes service layout, Helm chart structure, secret management
- Medallion Architecture — How to structure Bronze, Silver, and Gold data layers in Calabi
- Governance Overview — RBAC, SSO, audit logs, and compliance
- Data Management Overview — The full ingest → transform → govern lifecycle