Governance Overview
Governance in Calabi spans every layer of the platform — from how users authenticate, to who can run which queries, to how sensitive data is classified and audited. This section describes the governance model and links to detailed configuration guides for each area.
Calabi's governance philosophy is default-secure: no user has access to anything until access is explicitly granted, all actions are logged, and data classification is a first-class citizen alongside schema definitions.
Governance Architecture
Every request passes through all five layers in sequence and is allowed only when every layer's checks are satisfied.
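The layered evaluation can be pictured as an ordered chain of checks that stops at the first failure. The sketch below is illustrative only: the `Request` fields, the layer names, and the `evaluate` function are hypothetical and are not part of any Calabi API. The fifth layer (audit) is modelled as the decision trail that is recorded for every request, whether it is allowed or denied.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Request:
    # Hypothetical stand-ins for what each layer would inspect.
    sso_verified: bool
    token_valid: bool
    permissions: Set[str] = field(default_factory=set)  # empty by default (default-secure)
    required_permission: str = "tables:view"
    policy_allows: bool = True


# Each layer is a (name, check) pair; the check returns True when it passes.
LAYERS: List[Tuple[str, Callable[[Request], bool]]] = [
    ("identity", lambda r: r.sso_verified),
    ("authentication", lambda r: r.token_valid),
    ("authorization", lambda r: r.required_permission in r.permissions),
    ("data_access_policy", lambda r: r.policy_allows),
]


def evaluate(request: Request) -> Tuple[bool, Optional[str], List[str]]:
    """Run the layers in order and stop at the first failing one.

    Returns (allowed, name_of_failed_layer, audit_trail).
    """
    audit_trail: List[str] = []
    for name, check in LAYERS:
        if not check(request):
            audit_trail.append(f"deny:{name}")
            return False, name, audit_trail
        audit_trail.append(f"pass:{name}")
    return True, None, audit_trail
```

A request with no granted permissions is denied at the authorization layer, and the trail still records the layers it passed before the denial.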
Layer 1 — Identity (SSO)
Calabi does not maintain its own user password database. All user identity is delegated to your corporate Identity Provider (IdP) via SAML 2.0 or OIDC.
Supported providers include:
- Okta
- Microsoft Entra ID (formerly Azure Active Directory)
- Google Workspace
- Any SAML 2.0 or OIDC-compliant provider
When a user visits calabi.{company}.com, they are immediately redirected to your IdP login page. After authentication, the IdP issues a signed assertion that Calabi validates. No credentials are stored in Calabi.
Group synchronisation: IdP groups can be mapped directly to Calabi teams and roles, so that onboarding and offboarding happen automatically as users join or leave groups in your IdP.
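As a rough illustration of group synchronisation, the mapping can be thought of as a lookup table from IdP group names to team and role pairs. The group names, the `GROUP_MAP` table, and `resolve_memberships` below are all hypothetical; the actual mapping is configured in Calabi and its format is not described here.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical mapping: IdP group name -> (Calabi team, Calabi role).
GROUP_MAP: Dict[str, Tuple[str, str]] = {
    "okta-finance-analysts": ("Finance Analytics", "Analyst"),
    "okta-data-platform": ("ML Platform", "Data Engineer"),
}


def resolve_memberships(idp_groups: Iterable[str]) -> Set[Tuple[str, str]]:
    """Return the (team, role) pairs implied by a user's IdP groups.

    Groups without a mapping grant nothing, so removing a user from a
    mapped group in the IdP removes the corresponding access.
    """
    return {GROUP_MAP[group] for group in idp_groups if group in GROUP_MAP}
```

Under this model, offboarding requires no action in Calabi itself: once the IdP stops reporting a group, the derived team and role disappear on the next sync.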
Layer 2 — Authentication (OAuth2)
After SSO verification, the OAuth2 Proxy issues a short-lived JWT access token and a refresh token stored in an httpOnly cookie. Every subsequent request from the browser or API client carries this token.
| Token type | Lifetime | Purpose |
|---|---|---|
| Access token (JWT) | 1 hour | Authorises individual API requests |
| Refresh token | 8 hours (session) | Renews access tokens without re-login |
| API key | Until revoked | Used by programmatic clients and integrations |
The JWT payload contains the user's email, team memberships, and assigned roles. Because the token is signed, these claims can be validated on every request without a database lookup.
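To show why no database lookup is needed, here is a minimal sketch of stateless token verification using only the Python standard library. It assumes HS256 (shared-secret) signing purely for illustration; the signing algorithm Calabi actually uses is not stated here, and a production deployment would more typically rely on asymmetric keys and a maintained JWT library. The function names and claim keys (`email`, `teams`, `roles`) are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (illustration only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry are valid, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token expired")
    return claims
```

Everything the authorization layer needs (email, teams, roles) comes out of the verified payload, which is why a one-hour access-token lifetime matters: role changes only take effect once a new token is issued.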
Layer 3 — Authorization (RBAC)
Calabi uses a role-based access control model. Every user belongs to one or more Teams, and each team is assigned one or more Roles.
Built-in Roles
| Role | Scope | Capabilities |
|---|---|---|
| Admin | Platform-wide | Full access: user management, system configuration, all data assets |
| Data Engineer | Platform-wide | Manage ingestion, transformation, pipelines, and warehouse schemas |
| Data Steward | Domain-scoped | Manage metadata, classify data, approve data product requests |
| Analyst | Workspace-scoped | Query governed data assets, create dashboards, export data |
| Viewer | Asset-scoped | Read-only access to dashboards and data assets |
| AI User | Platform-wide | Access Calabi AI Agent and Calabi AI Builder for query and exploration |
Custom Roles
Admins can create custom roles with granular permission sets. Permissions are grouped by resource type (tables, dashboards, pipelines, models) and action (view, edit, delete, run, export).
Teams
Teams are the organisational unit for access control. A user's effective permissions are the union of the permissions granted by every role assigned to any of their teams. Teams map directly to business domains (e.g., Commerce Data Team, Finance Analytics, ML Platform).
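The union rule can be sketched as a set operation over roles and teams. The permission strings below follow a hypothetical `resource:action` convention built from the resource types and actions listed under Custom Roles; the specific role contents, the "Pipeline Operator" custom role, and the team assignments are invented for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical permission sets, written as "resource:action".
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "Analyst": frozenset(
        {"tables:view", "tables:export", "dashboards:view", "dashboards:edit"}
    ),
    "Viewer": frozenset({"tables:view", "dashboards:view"}),
    # Example of a custom role with a narrow permission set.
    "Pipeline Operator": frozenset({"pipelines:view", "pipelines:run"}),
}

# Hypothetical team-to-role assignments.
TEAM_ROLES: Dict[str, List[str]] = {
    "Finance Analytics": ["Analyst"],
    "ML Platform": ["Viewer", "Pipeline Operator"],
}


def effective_permissions(teams: Iterable[str]) -> Set[str]:
    """Union of the permissions of every role on every team the user is in."""
    permissions: Set[str] = set()
    for team in teams:
        for role in TEAM_ROLES.get(team, []):
            permissions |= ROLE_PERMISSIONS.get(role, frozenset())
    return permissions


def is_allowed(teams: Iterable[str], resource: str, action: str) -> bool:
    # Default-secure: anything not explicitly granted is denied.
    return f"{resource}:{action}" in effective_permissions(teams)
```

Because permissions only accumulate, adding a user to a team can never reduce their access; restricting access means removing a team membership or narrowing a role.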
Layer 4 — Data Access Policies
Beyond RBAC, Calabi supports fine-grained data access policies that apply at the data layer — independent of which tool is querying.
Tag-Based Policies
Tables, columns, and data products can be tagged with classification labels (e.g., PII, Confidential, Internal). Policies can then be attached to these tags:
- PII tag → mask column values for all users except Data Stewards
- Confidential tag → restrict row access to members of the Finance team
- Internal tag → allow all authenticated users to query
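The PII masking policy above could be modelled roughly as follows. This is a sketch under assumptions: the column names, the `COLUMN_TAGS` table, the mask value, and `apply_pii_mask` are hypothetical, and Calabi's actual policy engine and syntax are not shown here.

```python
from typing import Any, Dict, Set

# Hypothetical column-to-tag assignments.
COLUMN_TAGS: Dict[str, Set[str]] = {
    "email": {"PII"},
    "revenue": {"Confidential"},
    "sku": {"Internal"},
}

MASK = "****"


def apply_pii_mask(row: Dict[str, Any], roles: Set[str]) -> Dict[str, Any]:
    """Mask PII-tagged columns unless the caller holds the Data Steward role."""
    if "Data Steward" in roles:
        return dict(row)
    return {
        column: MASK if "PII" in COLUMN_TAGS.get(column, set()) else value
        for column, value in row.items()
    }
```

Attaching the policy to the tag rather than to individual columns means that tagging a new column as PII is enough to have it masked, with no policy change required.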
Row-Level Security
Policies can restrict which rows a given user or team sees when querying a table. For example, a regional sales team might only see rows where region = 'APAC', enforced transparently at query time.
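A row-level policy behaves like a predicate that is applied to every row before results are returned. The sketch below filters in application code only to make the idea concrete; the team name "APAC Sales", the `ROW_POLICIES` table, and `visible_rows` are hypothetical, and a real engine would typically push the predicate into the query itself. It also assumes that a table with row policies returns nothing to users who match no policy.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]

# Hypothetical policies: team name -> predicate deciding row visibility.
ROW_POLICIES: Dict[str, Callable[[Row], bool]] = {
    "APAC Sales": lambda row: row.get("region") == "APAC",
}


def visible_rows(rows: Iterable[Row], teams: Iterable[str]) -> List[Row]:
    """Return the rows that at least one of the user's team policies permits."""
    predicates = [ROW_POLICIES[team] for team in teams if team in ROW_POLICIES]
    # With no matching policy, any() over an empty list is False: no rows.
    return [row for row in rows if any(p(row) for p in predicates)]
```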
Data Product Access Grants
Data Products in Calabi Catalogue are curated, governed collections of data assets. Access to a Data Product can be requested through a self-service workflow and approved by the owning Data Steward — without any involvement from engineering or IT.
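The self-service workflow amounts to a small state machine in which only the owning Data Steward can move a request out of the pending state. The class and field names below are hypothetical and exist only to illustrate that rule; they do not reflect Calabi Catalogue's internal model.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class AccessRequest:
    requester: str
    data_product: str
    owning_steward: str
    status: Status = Status.PENDING

    def decide(self, actor: str, approve: bool) -> None:
        """Record a decision; only the owning steward may decide, and only once."""
        if actor != self.owning_steward:
            raise PermissionError("only the owning Data Steward can decide")
        if self.status is not Status.PENDING:
            raise ValueError("request has already been decided")
        self.status = Status.APPROVED if approve else Status.REJECTED
```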
Layer 5 — Audit and Compliance
All user actions and system events in Calabi are written to an immutable audit log. The audit log captures:
| Event category | Examples |
|---|---|
| Authentication | Login, logout, failed login, token refresh |
| Data access | Table query, dashboard view, data export |
| Metadata changes | Tag applied, owner changed, description edited |
| Pipeline operations | DAG triggered, connection created, sync started |
| Administration | User added, role changed, policy created/deleted |
Audit logs are:
- Stored in your AWS account (S3 + CloudWatch)
- Append-only — Calabi cannot modify or delete them
- Queryable via CalabiIQ for compliance reporting
- Exportable in JSON format for SIEM integration
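For SIEM integration, one common shape for exported audit data is one JSON object per line. The sketch below shows what such a record might look like; the field names (`timestamp`, `category`, `action`, `actor`, `details`) are assumptions, not Calabi's documented export schema.

```python
import json
from datetime import datetime, timezone


def audit_event(
    category: str, action: str, actor: str, timestamp: datetime, **details: object
) -> str:
    """Serialise one audit event as a single JSON line.

    `timestamp` is expected to be timezone-aware; it is normalised to UTC.
    """
    record = {
        "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
        "category": category,
        "action": action,
        "actor": actor,
        "details": details,
    }
    # sort_keys gives a stable field order, which makes lines easy to diff.
    return json.dumps(record, sort_keys=True)
```

A data-access event for a table query would then be written as `audit_event("data_access", "table_query", "a@example.com", when, table="sales.orders")`, where `when` is the event time.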
Data Retention
Default retention policies per tier:
| Tier | Audit log retention | Query history retention |
|---|---|---|
| Starter | 90 days | 30 days |
| Professional | 1 year | 90 days |
| Enterprise | 7 years (configurable) | 1 year |
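The retention table translates directly into a lookup that decides whether a record has aged out. This sketch assumes one year equals 365 days and uses the default values from the table; the `RETENTION_DAYS` structure and `is_expired` function are hypothetical, and the Enterprise audit log value is configurable in practice.

```python
from datetime import datetime, timedelta, timezone

# Defaults from the table above, in days (assumption: 1 year = 365 days).
RETENTION_DAYS = {
    "Starter": {"audit_log": 90, "query_history": 30},
    "Professional": {"audit_log": 365, "query_history": 90},
    "Enterprise": {"audit_log": 7 * 365, "query_history": 365},
}


def is_expired(tier: str, kind: str, created_at: datetime, now: datetime) -> bool:
    """True when a record is older than the tier's retention window."""
    return now - created_at > timedelta(days=RETENTION_DAYS[tier][kind])
```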
Governance Sections
Administration
User accounts, team management, workspace settings, and platform configuration. Covers how to add users, configure SSO, manage service accounts, and set organisation-level defaults.
Security and Access Control
Step-by-step guides for configuring RBAC roles, setting up SSO with your IdP, creating and managing API keys, and configuring IP allowlists.
Compliance
Audit log access, data retention configuration, regulatory framework guidance (SOC 2, GDPR, HIPAA), and how to run compliance reports in CalabiIQ.
Multi-Tenancy
How tenant isolation works in Calabi, per-tenant configuration overrides, and how to onboard a new tenant without affecting existing deployments.
What's Next
- Platform Architecture — Understand the infrastructure security model (VPC, Istio mTLS, IRSA)
- Data Management Overview — How governance integrates with the data lifecycle
- Calabi Catalogue — The governance control surface for metadata, tags, and lineage