Data Tests
Calabi Transform's data testing framework lets you assert expectations about your data and get alerted when those expectations are violated. Tests run against the warehouse after models are built, catching data quality issues before they reach dashboards or downstream consumers. Test failures surface in the run log, in Calabi Catalogue's data quality panel, and in any configured alert channels.
Test Types Overview
| Type | Where It Lives | What It Does |
|---|---|---|
| Generic | .yml files alongside models/sources | Reusable parameterised test applied to columns |
| Singular | tests/ directory as .sql files | One-off SQL assertion custom to your data |
| Custom generic | tests/generic/ as Jinja macros | Build your own reusable generic test |
Generic Tests
Calabi Transform ships with four built-in generic tests that cover the most common data quality requirements.
not_null
Asserts that a column contains no NULL values.
models:
- name: fct_orders
columns:
- name: order_id
data_tests:
- not_null
- name: customer_id
data_tests:
- not_null
unique
Asserts that all values in a column are distinct.
models:
- name: fct_orders
columns:
- name: order_id
data_tests:
- unique
accepted_values
Asserts that a column contains only values from a predefined list.
models:
- name: stg_orders
columns:
- name: status
data_tests:
- accepted_values:
values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
Set quote: false if comparing against numeric or boolean values:
- accepted_values:
values: [1, 2, 3]
quote: false
relationships
Asserts referential integrity — every value in the column exists in a related table.
models:
- name: fct_orders
columns:
- name: customer_id
data_tests:
- relationships:
to: ref('dim_customers')
field: customer_id
To test against a source table:
- relationships:
to: source('jaffle_shop', 'customers')
field: id
Applying Multiple Tests to One Column
You can stack as many tests as needed on a single column:
models:
- name: fct_orders
columns:
- name: order_id
data_tests:
- not_null
- unique
- name: status
data_tests:
- not_null
- accepted_values:
values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
- name: customer_id
data_tests:
- not_null
- relationships:
to: ref('dim_customers')
field: customer_id
- name: amount_dollars
data_tests:
- not_null
Singular Tests
Singular tests are custom SQL files in the tests/ directory. They are flexible enough to express any assertion your business logic requires. A singular test passes if it returns zero rows — returning rows means the assertion was violated.
Example: No negative revenue
-- tests/assert_orders_have_positive_revenue.sql
select
order_id,
amount_dollars
from {{ ref('fct_orders') }}
where amount_dollars < 0
Example: Payment amounts must not exceed order amounts
-- tests/assert_payments_not_greater_than_orders.sql
select
o.order_id,
o.amount_dollars as order_amount,
p.total_paid
from {{ ref('fct_orders') }} o
join {{ ref('int_order_payments') }} p using (order_id)
where p.total_paid > o.amount_dollars * 1.01 -- allow 1% rounding tolerance
Example: Every order must have a corresponding customer
-- tests/assert_all_orders_have_customers.sql
select o.order_id
from {{ ref('stg_orders') }} o
left join {{ ref('stg_customers') }} c on o.customer_id = c.customer_id
where c.customer_id is null
Test Severity
By default, a test failure causes the test run to exit with an error code (severity: error). You can configure individual tests to emit a warning instead — useful for tests that catch data quality drift without blocking the pipeline.
Setting severity in YAML
models:
- name: fct_orders
columns:
- name: status
data_tests:
- accepted_values:
values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
config:
severity: warn # Warn but don't fail the run
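Severity can also be set from inside a singular test's SQL file. As a sketch, assuming Calabi Transform follows dbt's config block syntax for singular tests, the revenue test from earlier could be downgraded to a warning:
-- tests/assert_orders_have_positive_revenue.sql
{{ config(severity='warn') }}

select
    order_id,
    amount_dollars
from {{ ref('fct_orders') }}
where amount_dollars < 0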
Threshold-based severity (error_if / warn_if)
Instead of failing on any violation, set thresholds:
- not_null:
config:
error_if: ">100" # Error if more than 100 rows fail
warn_if: ">10" # Warn if more than 10 rows fail
This is useful when a small number of null values is expected (e.g., optional fields on legacy records) but a large number indicates a systemic problem.
Storing Test Failures
Use the --store-failures flag to write failing test rows to the warehouse. This makes debugging much faster — instead of re-running the model and test, you can query the failure table directly.
dbt test --store-failures
Failure tables are written to a schema named dbt_test__audit (configurable) and named after the test. For example, the unique test on fct_orders.order_id writes failures to:
dbt_test__audit.unique_fct_orders_order_id
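You can then query that table directly to inspect the offending rows, for example (adjust the schema name if you have configured a different audit schema):
-- Inspect the duplicated keys captured by the failing unique test
select *
from dbt_test__audit.unique_fct_orders_order_id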
Persisting failures per model
models:
- name: fct_orders
config:
store_failures: true
columns:
- name: order_id
data_tests:
- unique
- not_null
Custom Generic Tests
If you need a reusable test that doesn't exist in Calabi Transform's built-ins, write a custom generic test using Jinja macros.
Example: test that a numeric column is within a range
Create the file at tests/generic/test_between.sql:
-- tests/generic/test_between.sql
{% test between(model, column_name, min_value, max_value) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} < {{ min_value }}
or {{ column_name }} > {{ max_value }}
{% endtest %}
Apply it in YAML like any built-in test:
models:
- name: fct_orders
columns:
- name: discount_pct
data_tests:
- between:
min_value: 0
max_value: 100
Example: test that a date column is not in the future
-- tests/generic/test_not_in_future.sql
{% test not_in_future(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} > current_timestamp
{% endtest %}
Apply it to a date column in the same way:
- name: created_at
data_tests:
- not_in_future
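Custom generic test arguments can also carry defaults, so the YAML only needs to override them when necessary. A sketch, assuming standard Jinja default-argument syntax; the non_negative test name here is illustrative, not a built-in:
-- tests/generic/test_non_negative.sql
{% test non_negative(model, column_name, tolerance=0) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} < -{{ tolerance }}
{% endtest %}
Applied without arguments, it uses the default tolerance of 0:
- name: amount_dollars
  data_tests:
    - non_negative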
Running Tests
# Run all tests in the project
dbt test
# Test a specific model
dbt test --select fct_orders
# Test a model and everything downstream of it
dbt test --select fct_orders+
# Test only sources
dbt test --select source:*
# Test a specific source
dbt test --select source:jaffle_shop
# Run tests with a tag
dbt test --select tag:critical
# Run tests and store failures
dbt test --store-failures
# Run tests with more parallelism (overrides the thread count from profiles.yml)
dbt test --threads 8
Test Results in Calabi Catalogue
Every test run result is published to Calabi Catalogue, where you can:
- View current test pass/fail status per model column
- See historical test results and trend charts (is data quality improving or degrading?)
- Filter tables by data quality score
- Set up notifications when a test transitions from passing to failing
Navigate to Calabi Catalogue → [Table Name] → Data Quality to see results for a specific model.
Apply a critical tag to tests that should block downstream use of a model. This allows you to filter by criticality in Calabi Catalogue and configure stricter alerting.
- not_null:
config:
tags: ['critical']
Testing Best Practices
| Practice | Recommendation |
|---|---|
| Primary keys | Apply unique + not_null to every primary key column |
| Foreign keys | Apply relationships to every foreign key column |
| Enumerated fields | Apply accepted_values to status, type, and category columns |
| Amounts | Apply a custom between test or singular test for reasonable ranges |
| Critical models | Tag key tests as critical and configure alerts in Calabi Catalogue |
| Source tests | Apply not_null and unique to source primary keys — catch issues at the raw layer |
| Severity | Use warn for soft rules, error for hard data contracts |
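Putting the source-testing recommendation into practice, a minimal sketch of a sources .yml entry (the jaffle_shop names mirror the earlier examples; adapt them to your own raw layer):
sources:
  - name: jaffle_shop
    tables:
      - name: customers
        columns:
          - name: id
            data_tests:
              - unique
              - not_null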
Next Steps
- Running Transformations — Execute your models and tests via Calabi Pipelines or CLI
- Sources & Raw Data — Apply tests to raw source tables
- Models — Understand model structure and materialisation