
Calabi Platform Architecture

All Tiers

Calabi is a unified data platform deployed entirely within your own infrastructure. All Calabi services run inside your Kubernetes cluster — on AWS EKS, on-prem, or any cloud — connecting to your data sources, warehouses, and compute resources within your own network boundary. No data is routed through Calabi-operated infrastructure. You retain complete ownership, security, and compliance control from day one.


Architecture Diagram

The diagram below illustrates how your users interact with the Calabi control plane, and how the control plane interfaces with your data plane resources.

Your Users: Browser · IDE · API Clients

  • Data Engineers: Pipelines · Connect · dbt
  • Analysts: CalabiIQ · SQL Lab · Dashboards
  • ML Engineers: Calabi ML · AI Builder · Local Models
  • Admins: User mgmt · SSO · Monitoring

↓ HTTPS / WebSocket

Calabi Control Plane: Kubernetes (EKS), your cloud account

  • Calabi Catalogue: Discovery · Lineage · Governance
  • CalabiIQ: Dashboards · SQL Lab · Reports
  • Calabi Connect: Data Ingestion · 90+ Connectors
  • Calabi Pipelines: DAG Orchestration · Scheduling
  • Calabi ML: Experiment Tracking · Model Registry
  • Calabi AI Agent: Natural Language · Tool Use
  • Calabi AI Builder: Visual Pipeline · RAG Agents
  • Calabi Automate: Workflow Automation · Integrations

↓ Query · Ingest · Orchestrate · Store

Your Data Plane: Cloud Infrastructure, all within your account boundary

  • S3 / Data Lake: Bronze · Silver · Gold layers
  • Data Warehouse: Redshift · Snowflake · BigQuery
  • Databases: Postgres · MySQL · RDS · Aurora
  • SaaS & APIs: Salesforce · HubSpot · Stripe · custom
  • Compute: EKS Cluster · Lambda · Glue
  • Secrets & Config: AWS Secrets Manager · Parameter Store

Calabi runs entirely inside your cloud account; no data leaves your infrastructure.

Control Plane vs. Data Plane

Calabi follows a strict separation between the control plane (where platform logic runs) and the data plane (where your data lives).

Control Plane

The control plane is the set of Calabi platform services — ingestion schedulers, transformation runners, catalogue indexers, BI servers, AI agents, and so on. These services run as pods in your EKS cluster inside the master-prod-de namespace.

Key properties:

  • Deployed and managed by Calabi via Helm
  • Runs inside your AWS account — not in a Calabi-operated cloud
  • Receives user requests via the Istio gateway
  • Orchestrates operations against your data plane resources

Data Plane

The data plane is where your actual data lives: S3 buckets, Redshift clusters, RDS databases, and any other AWS-managed data stores.

Key properties:

  • Always stays in your AWS account and region
  • Never routed through Calabi-operated servers
  • Access controlled via IAM roles scoped to each workload
  • Your team retains full read/write/delete access independently of Calabi

Data Sovereignty Guarantee

Calabi platform services access your data plane via IAM roles and VPC-private endpoints. Credentials are stored in AWS Secrets Manager within your account. Your data never transits through any Calabi-operated infrastructure.
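As an illustration of this credential flow, the External Secrets Operator listed under Deployment Model can sync a Secrets Manager entry into a Kubernetes secret. The manifest below is a minimal sketch, not Calabi's shipped configuration: the store name `aws-secrets-manager`, the remote key `calabi/warehouse`, and the secret name `calabi-warehouse-credentials` are hypothetical, and the `apiVersion` may differ with your operator version. Only the `master-prod-de` namespace comes from the text above.

```shell
# Write an illustrative ExternalSecret manifest.
# All resource names below are hypothetical placeholders.
cat > calabi-external-secret.yaml <<'EOF'
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: calabi-warehouse-credentials
  namespace: master-prod-de
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager      # hypothetical store backed by AWS Secrets Manager
    kind: ClusterSecretStore
  target:
    name: calabi-warehouse-credentials   # Kubernetes Secret created in the namespace
  data:
    - secretKey: password
      remoteRef:
        key: calabi/warehouse      # hypothetical Secrets Manager entry
        property: password
EOF

# Review the file, then apply it with: kubectl apply -f calabi-external-secret.yaml
```

The credential value itself never appears in the manifest; the operator reads it from Secrets Manager inside your account at sync time.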


Deployment Model

Calabi is packaged as per-service Docker images published to public.ecr.aws/calabi/. A single Helm chart with tier-specific values files controls which services deploy. A license key (a JWT signed with RS256 and validated offline) controls what the UI unlocks.
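Because the license key is a JWT, its claims can be read locally with standard tools. The helper below is a rough sketch that only decodes the payload segment; it does not verify the RS256 signature, and the claim names you will see depend on how Calabi issues the key, which this page does not specify. The function name `jwt_payload` is made up for this example.

```shell
# Print the JSON payload (second dot-separated segment) of a JWT.
# This only decodes the claims; it does NOT verify the RS256 signature.
jwt_payload() {
  # Convert base64url characters to standard base64.
  segment=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#segment} % 4 )) in
    2) segment="${segment}==" ;;
    3) segment="${segment}=" ;;
  esac
  printf '%s' "$segment" | base64 -d
}

# Usage (assuming the key is stored in license.key):
#   jwt_payload "$(cat license.key)"
```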

| Layer | Technology | Role |
|---|---|---|
| Image registry | public.ecr.aws/calabi/ | All Calabi images, same across all tiers |
| CE deployment | Docker Compose | 7 containers, quick start |
| Paid deployment | Helm | 12–22+ containers, Starter / Professional / Enterprise |
| License control | JWT RS256 license key | Tier, module access, user/asset limits |
| Secret management | AWS Secrets Manager + External Secrets Operator | Credential injection into pods |
| Ingress | Calabi Gateway + Istio (optional) | Path-based routing to all services |

All tiers use the same Docker images. Helm decides which containers deploy; the license key decides what the UI unlocks. Upgrading from one tier to the next is a single helm upgrade — no data migration required.
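A tier upgrade could therefore look like the sketch below, which prints the `helm upgrade` command for review instead of running it. It reuses the release name `calabi` and chart `calabi/calabi` from the install commands in this section. The lowercase file names (`values-starter.yaml` and so on) are an assumption based on the `values-{tier}.yaml` pattern, and the function name is hypothetical.

```shell
# Print (not run) the helm upgrade command for a target tier.
# Assumption: values files are named values-starter.yaml,
# values-professional.yaml, and values-enterprise.yaml.
calabi_upgrade_cmd() {
  case "$1" in
    starter|professional|enterprise) ;;
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
  # The single-quoted format keeps $(cat license.key) literal in the output.
  printf 'helm upgrade calabi calabi/calabi -f values-%s.yaml --set global.licenseKey="$(cat license.key)"\n' "$1"
}

calabi_upgrade_cmd professional
```

Printing the command first makes it easy to check the values file and license key before touching the running release.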

Helm Install by Tier

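# Add the Calabi chart repository and refresh the local index
helm repo add calabi https://helm.calabi.ai
helm repo update

# CE (no license key)
helm install calabi calabi/calabi

# Starter / Professional / Enterprise
# Replace {tier} with the tier whose values file you are deploying
helm install calabi calabi/calabi \
  -f values-{tier}.yaml \
  --set global.licenseKey="$(cat license.key)"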

Networking & Security

All user traffic enters the cluster through a single Istio Gateway that terminates TLS. No service is exposed directly to the internet.

Internet → AWS ALB → Istio IngressGateway → OAuth2 Proxy → Calabi Service

Authentication & Authorization

| Layer | Mechanism |
|---|---|
| Identity provider | SSO via SAML 2.0 / OIDC (your corporate IdP) |
| Token issuance | OAuth2 Authorization Code flow |
| Session validation | OAuth2 Proxy sidecar on every service |
| API access | Short-lived JWT bearer tokens |
| Service-to-service | Istio mTLS (mutual TLS inside the mesh) |

Network Isolation

  • All inter-service traffic stays within the VPC via private DNS
  • No public endpoints for any data plane resources
  • AWS Security Groups restrict inbound access to ALB only
  • Istio AuthorizationPolicy enforces per-service allow-lists
  • Egress to external SaaS connectors (for Calabi Connect) routes through a NAT gateway with a static IP that you can allowlist at source systems
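To make the per-service allow-list concrete, here is a sketch of an Istio AuthorizationPolicy that admits traffic to one service only from the ingress gateway's identity. It is an assumption about what such a policy could look like, not the policy Calabi ships: the workload label `app: calabiiq`, the policy name, and the gateway service account are illustrative, and the `apiVersion` may differ with your Istio version.

```shell
# Write an illustrative AuthorizationPolicy manifest.
# Labels, names, and the principal are hypothetical placeholders.
cat > calabiiq-authz.yaml <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: calabiiq-allow-ingress
  namespace: master-prod-de
spec:
  selector:
    matchLabels:
      app: calabiiq                # hypothetical workload label
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account
EOF

# Review the file, then apply it with: kubectl apply -f calabiiq-authz.yaml
```

Once an ALLOW policy selects a workload, Istio denies requests to that workload that match none of its rules, which is what turns the policy into an allow-list. Matching on `principals` relies on the mesh mTLS described under Authentication & Authorization.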

Component Overview

| Component | Purpose | Default Path | Tier |
|---|---|---|---|
| Calabi Catalogue | Data discovery, lineage, glossary & governance | / | CE+ |
| CalabiIQ | BI dashboards, SQL Lab, scheduled reports | /bianalyst | CE+ |
| Data Governance | Tags, classification, owners, access policies | / | CE+ |
| Calabi Connect | ELT data ingestion from 90+ connectors | /dataingestion | Starter+ |
| Calabi Pipelines | DAG authoring, scheduling & monitoring | /airflow | Starter+ |
| Calabi Notebooks | Interactive data science notebooks | /notebooks | Starter+ |
| Calabi IDE | Browser-based VS Code for pipeline development | /ide | Starter+ |
| Calabi ML | Experiment tracking, model registry & artefacts | /mlflow | Professional+ |
| Calabi AI Agent | Natural language queries over your data stack | Built-in | Professional+ |
| Calabi AI Builder | Custom AI chatflows and LLM pipeline builder | /aibuilder | Professional+ |
| AI Chat | Private local LLM inference (air-gapped) | /openwebui | Professional+ |
| Calabi Automate | No-code workflow automation and alerting | /automate | Professional+ |
| Cloud Operations | Cloud inventory, compliance, cost analytics | /cloud | Enterprise |
| Monitoring & Logs | Metrics, dashboards, alerts, log aggregation | /monitoring | Enterprise |
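The Default Path column can be read as a prefix lookup from request path to component. The function below is only an illustration of that mapping; the gateway's actual routing rules are not shown in this document. It assumes prefix matching and that any unmatched path falls through to the root path, and the function name is hypothetical.

```shell
# Map a request path to the component that serves it, per the Default Path column.
# Illustrative only: assumes prefix matching with a fall-through to "/".
calabi_component_for_path() {
  case "$1" in
    /bianalyst|/bianalyst/*)         echo "CalabiIQ" ;;
    /dataingestion|/dataingestion/*) echo "Calabi Connect" ;;
    /airflow|/airflow/*)             echo "Calabi Pipelines" ;;
    /notebooks|/notebooks/*)         echo "Calabi Notebooks" ;;
    /ide|/ide/*)                     echo "Calabi IDE" ;;
    /mlflow|/mlflow/*)               echo "Calabi ML" ;;
    /aibuilder|/aibuilder/*)         echo "Calabi AI Builder" ;;
    /openwebui|/openwebui/*)         echo "AI Chat" ;;
    /automate|/automate/*)           echo "Calabi Automate" ;;
    /cloud|/cloud/*)                 echo "Cloud Operations" ;;
    /monitoring|/monitoring/*)       echo "Monitoring & Logs" ;;
    *)                               echo "Calabi Catalogue / Data Governance" ;;
  esac
}

calabi_component_for_path /airflow/home   # prints: Calabi Pipelines
```

Calabi AI Agent has no entry because the table lists it as built-in, with no path of its own.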

What's Next