
Data Quality Tests

Available on the Professional tier and above.

Data Quality Tests in Calabi Catalogue give you automated, continuously run assertions over your data. Tests run on a schedule, results are stored historically, and failures trigger alerts, so data problems are caught before they reach dashboards and reports.


Test Architecture


Test Types

| Test type | What it checks | Example |
| --- | --- | --- |
| Not Null | Column has no NULL values | candidate_id IS NOT NULL |
| Unique | All values in a column are distinct | attempt_id is unique |
| Not In Set | Values do not appear in a forbidden list | status NOT IN ('voided', 'test') |
| In Set | Values are restricted to an allowed list | region IN ('EMEA', 'AMER', 'APAC') |
| Minimum | All values meet a lower bound | score >= 0 |
| Maximum | All values meet an upper bound | score <= 100 |
| Between (Range) | Values fall within a range | pass_threshold BETWEEN 50 AND 90 |
| Min Row Count | Table has at least N rows | At least 1,000 rows in scores |
| Max Row Count | Table has at most N rows | At most 100M rows in raw_events |
| Row Count Between | Row count falls in a range | scores between 500K and 50M rows |
| Column Values to Be of Type | Column data type matches expectation | exam_date is a timestamp |
| Custom SQL | Arbitrary SQL assertion | See examples below |
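Under the hood, the column-level tests all reduce to the same pattern: count the rows that violate the assertion and fail when that count is non-zero. A minimal sketch of that pattern using SQLite (the SQL Calabi Catalogue actually generates may differ):

```python
import sqlite3

# In-memory sample table standing in for psychometric_reporting.scores.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (candidate_id INTEGER, score REAL)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 55.0), (2, 101.0), (None, 70.0)],  # one NULL id, one out-of-range score
)

# Not Null: count rows where the column is NULL.
not_null_violations = conn.execute(
    "SELECT COUNT(*) FROM scores WHERE candidate_id IS NULL"
).fetchone()[0]

# Between (Range): count rows outside [0, 100].
range_violations = conn.execute(
    "SELECT COUNT(*) FROM scores WHERE score NOT BETWEEN 0 AND 100"
).fetchone()[0]

print(not_null_violations, range_violations)  # each test passes only when its count is 0
```

Here both counts come back as 1, so both tests would fail against this sample data.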

Creating a Test Suite

A Test Suite is a named collection of tests for one or more tables. Tests must belong to a suite before they can be scheduled.

Step 1 — Create the suite

  1. Navigate to Data Quality → Test Suites → + New Test Suite.
  2. Enter a Name (e.g., Exam Scores — Core Quality Checks).
  3. Add a Description and Owners.
  4. Click Save.

Step 2 — Add tables to the suite

  1. Open the suite.
  2. Click + Add table.
  3. Search for and select the tables you want to test.

Step 3 — Add test cases

  1. Inside the suite, select a table.
  2. Click + Add test case.
  3. Configure the test (see below).
  4. Click Save.

Configuring Individual Tests

Not Null test

| Field | Value |
| --- | --- |
| Test name | candidate_id_not_null |
| Column | candidate_id |
| Test type | Column Values To Not Be Null |
| Fail threshold | 0 (zero nulls allowed) |

Unique test

| Field | Value |
| --- | --- |
| Test name | attempt_id_unique |
| Column | attempt_id |
| Test type | Column Values To Be Unique |

Range test

| Field | Value |
| --- | --- |
| Test name | score_range_0_to_100 |
| Column | score |
| Test type | Column Values To Be Between |
| Min value | 0 |
| Max value | 100 |

Row count test

| Field | Value |
| --- | --- |
| Test name | scores_min_row_count |
| Test type | Table Row Count To Be Between |
| Min value | 500000 |
| Max value | — (leave blank for no upper limit) |

Custom SQL Tests

Custom SQL tests let you write any assertion that evaluates to a single numeric value. The test passes when the result equals zero (no failures found).

```sql
-- Test: no future exam dates
-- Pass condition: COUNT = 0 (no rows with exam_date in the future)
SELECT COUNT(*)
FROM psychometric_reporting.scores
WHERE exam_date > CURRENT_TIMESTAMP;

-- Test: pass rate is not implausibly high (> 99%)
-- Pass condition: result = 0 (rate is within plausible range)
SELECT CASE
         WHEN SUM(CASE WHEN passed THEN 1 ELSE 0 END) * 1.0 / COUNT(*) > 0.99
         THEN 1
         ELSE 0
       END
FROM psychometric_reporting.scores
WHERE exam_date >= CURRENT_DATE - INTERVAL '1 day';

-- Test: no orphaned candidate_id references
SELECT COUNT(*)
FROM psychometric_reporting.scores s
WHERE NOT EXISTS (
  SELECT 1 FROM dim.candidates c WHERE c.candidate_id = s.candidate_id
);

-- Test: no duplicate (candidate_id, exam_id, exam_date) combinations
SELECT COUNT(*) - COUNT(DISTINCT candidate_id || '|' || exam_id || '|' || exam_date::text)
FROM psychometric_reporting.scores;
```
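The pass/fail decision for a custom SQL test is mechanical: execute the SQL, take the single numeric result, and compare it to the configured pass value. A sketch of that evaluation loop using SQLite (the function name and the exception-to-status mapping here are illustrative, not the Catalogue's internals):

```python
import sqlite3

def run_custom_sql_test(conn, sql, pass_value=0):
    """Execute a custom SQL assertion and map its single numeric result to a status."""
    try:
        result = conn.execute(sql).fetchone()[0]
    except sqlite3.Error:
        return "Aborted"  # illustrative: how an erroring test might surface
    return "Success" if result == pass_value else "Failed"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (exam_date TEXT)")
conn.execute("INSERT INTO scores VALUES ('2100-01-01')")  # a future exam date

status = run_custom_sql_test(
    conn,
    "SELECT COUNT(*) FROM scores WHERE exam_date > DATE('now')",
)
print(status)  # Failed — one row violates the assertion
```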

Custom SQL test — configuration

| Field | Value |
| --- | --- |
| Test name | no_future_exam_dates |
| Test type | Custom SQL |
| SQL expression | Paste your SQL (must return a single number) |
| Pass condition | EqualTo |
| Pass value | 0 |

Scheduling Tests

  1. Open the test suite.
  2. Click Schedule Pipeline → + Add ingestion.
  3. Configure the schedule:
| Field | Description | Example |
| --- | --- | --- |
| Name | Pipeline name | scores-quality-daily |
| Schedule | Cron expression | 0 6 * * * (daily at 6 AM) |
| Timezone | Cron evaluation timezone | UTC |
| Retry count | Retries on failure | 2 |
  4. Click Deploy.

The test pipeline appears in Settings → Ingestion Pipelines and can be monitored, paused, or run manually from there.

| Data freshness | Recommended test schedule |
| --- | --- |
| Real-time / near-real-time | Every 15–60 minutes |
| Daily ETL | 30–60 minutes after the ETL completes |
| Weekly batch | Weekly, 2 hours after the batch window |
| Archived / reference data | Monthly |

Running Tests Manually

  1. Navigate to the asset page → Quality tab.
  2. Click Run Tests → Run all tests (or select specific tests).
  3. Results appear within seconds to minutes depending on table size.

Or from the test suite:

  1. Data Quality → Test Suites → open the suite.
  2. Click Run Now (top-right).

Viewing Test Results

On the asset page

  1. Open any table's asset page.
  2. Click the Quality tab.

The Quality tab shows:

  • A summary badge: Pass / Fail / Warning count.
  • A trend graph of historical pass rates.
  • Per-test results with status, last run time, and failure details.

Test result status codes

| Status | Meaning |
| --- | --- |
| Success | The assertion passed with zero failures |
| Failed | The assertion returned a non-zero count |
| Aborted | The test run was cancelled or timed out |
| Queued | The test is scheduled but not yet started |
| Running | Currently executing |

Viewing failure details

Click on a Failed test to see:

  • The number of rows that violated the assertion.
  • A sample of up to 10 failing values (for column-level tests).
  • The SQL generated for the test (expandable).
  • The full error message if the test threw an exception.

Alerting on Test Failures

Configure alerts so the right team is notified immediately when quality degrades.

  1. Navigate to Settings → Alerts & Reports (Admin) or Governance → Quality Alerts.
  2. Click + New Alert.
  3. Configure:
| Field | Description | Example |
| --- | --- | --- |
| Name | Alert name | Scores Table — Critical Failure Alert |
| Entity | Table or test suite to watch | psychometric_reporting.scores |
| Trigger | Condition to alert on | Any test failure |
| Recipients | Email addresses | data-engineering@yourcompany.com |
| Slack channel | Optional Slack destination | #data-alerts |
| Webhook URL | Optional HTTP endpoint | PagerDuty / OpsGenie integration URL |
  4. Click Save.

Alert channels

| Channel | Setup |
| --- | --- |
| Email | Configure SMTP under Settings → Email; add recipient email addresses |
| Slack | Configure the Slack integration; specify the channel name with # |
| Webhook | POST a JSON payload to any HTTP endpoint (PagerDuty, OpsGenie, Teams) |
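On the receiving side, a webhook consumer just parses the POSTed JSON and decides how to route it. The payload shape below is an assumption for illustration — check your Catalogue instance for the actual schema:

```python
import json

# Hypothetical payload — the real Calabi Catalogue webhook schema may differ.
payload = json.loads("""{
    "entity": "psychometric_reporting.scores",
    "test": "no_future_exam_dates",
    "status": "Failed",
    "failingRows": 42
}""")

def should_page(alert):
    # Escalate hard failures; Queued/Running/Success notifications are informational.
    return alert["status"] == "Failed"

print(should_page(payload))  # True
```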

dbt Test Integration

If your Calabi Transform (dbt) project includes tests, Calabi Catalogue automatically ingests dbt test results as part of the dbt metadata ingestion pipeline.

dbt test results appear on the same Quality tab as native Catalogue tests, clearly labelled with their source:

| Test | Source |
| --- | --- |
| not_null:candidate_id | dbt |
| unique:attempt_id | dbt |
| accepted_values:region | dbt |
| no_future_exam_dates | Catalogue Custom SQL |

This gives you a single quality dashboard per table regardless of whether tests were defined in dbt or Calabi Catalogue.


Test Result History

The Catalogue retains full result history for every test. Historical results power:

  • Trend charts on the Quality tab — see pass rates rise or fall over time.
  • SLO tracking — compute uptime and reliability metrics for critical tables.
  • Incident investigation — identify exactly when a failure first appeared.

History is retained for 90 days on Professional tier. Contact your administrator to configure custom retention.
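As a sketch of the SLO arithmetic this history enables — pass rate over a window, plus the first failure timestamp for incident investigation — assuming result rows of (timestamp, status); the data below is illustrative, not a Catalogue API:

```python
# Illustrative result history for one test: (run timestamp, status).
history = [
    ("2024-06-01T06:00", "Success"),
    ("2024-06-02T06:00", "Success"),
    ("2024-06-03T06:00", "Failed"),
    ("2024-06-04T06:00", "Success"),
]

# Pass rate: fraction of runs in the window that succeeded.
pass_rate = sum(status == "Success" for _, status in history) / len(history)

# Incident investigation: the first run at which the failure appeared.
first_failure = next((ts for ts, status in history if status == "Failed"), None)

print(f"{pass_rate:.0%}", first_failure)  # 75% 2024-06-03T06:00
```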


Quality Score

Each table in Calabi Catalogue is automatically assigned a Quality Score (0–100%) based on the ratio of passing tests to total tests in the most recent run.

| Score | Badge |
| --- | --- |
| 90–100% | Green |
| 70–89% | Yellow |
| < 70% | Red |

Quality Scores appear on asset pages, in search results, and can be used as a filter in the Explore page.
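The score and badge mapping above can be sketched directly (thresholds from the table; function names are illustrative):

```python
def quality_score(passed, total):
    """Ratio of passing tests to total tests in the most recent run, as a percentage."""
    return 100.0 * passed / total

def badge(score):
    """Map a 0-100 quality score to its badge colour."""
    if score >= 90:
        return "Green"
    if score >= 70:
        return "Yellow"
    return "Red"

print(badge(quality_score(9, 10)))  # Green  (90%)
print(badge(quality_score(8, 10)))  # Yellow (80%)
print(badge(quality_score(6, 10)))  # Red    (60%)
```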


Best Practices

| Practice | Why |
| --- | --- |
| Test every column that appears in a certified metric | Ensures metric inputs are clean |
| Add a row count test to every critical table | Catches missing data loads before dashboards show wrong numbers |
| Use custom SQL tests for business-rule assertions that schema tests can't express | Extends coverage to logic beyond schema checks |
| Set failure alerts for Gold-tier tables | Immediate notification when production data degrades |
| Run tests after every ETL job, not just on a time schedule | Aligns test results to data freshness |
| Review and update test thresholds as data grows | Prevents false failures from legitimate volume growth |