Skip to main content

Datasets

Starter+

A dataset in CalabiIQ is the layer between your raw database tables and your charts. Every chart is built on exactly one dataset. Datasets centralise metric definitions, column descriptions, and access controls so that all charts sharing the same dataset stay consistent.


Physical vs Virtual Datasets

CalabiIQ supports two kinds of datasets:

TypeDefinitionWhen to use
PhysicalPoints directly to a single table or view in a connected databaseThe data is already well-shaped in the warehouse
VirtualDefined by a custom SQL query (a sub-select)You need to join tables, filter rows, or derive columns before charting

Creating a Physical Dataset

  1. Navigate to DataDatasets in the top navigation.
  2. Click + Dataset.
  3. Select the Database from the dropdown.
  4. Select the Schema.
  5. Select the Table from the list of available tables.
  6. Click AddAdd dataset and create chart (to immediately build a chart) or Add dataset (to save without charting).

The new dataset appears in the Datasets list, ready to be explored.


Creating a Virtual Dataset

Use virtual datasets when you need a calculated or joined view that does not exist as a table in the database.

  1. Navigate to SQLSQL Lab.
  2. Write a query that produces the shape you want to chart:
SELECT
DATE_TRUNC('month', exam_date) AS exam_month,
country_name,
COUNT(*) AS total_attempts,
SUM(CASE WHEN passed THEN 1 ELSE 0 END) AS passes,
ROUND(100.0 * SUM(CASE WHEN passed THEN 1 ELSE 0 END) / COUNT(*), 2) AS pass_rate
FROM psychometric_reporting.scores s
JOIN dim.candidates c ON c.candidate_id = s.candidate_id
WHERE exam_date >= '2025-01-01'
GROUP BY 1, 2
  1. Run the query to verify results.
  2. Click Explore in the Results toolbar.
  3. In the Chart Editor, click ···Save as dataset.
  4. Enter a Dataset name (e.g., Monthly Pass Rates by Country).
  5. Click Save & Explore.
Keeping virtual datasets maintainable

Use CTEs (WITH clauses) inside virtual dataset SQL to make the logic easy to read and modify later. The full SQL is stored and can be edited from the Dataset editor.


Editing a Dataset

  1. Navigate to DataDatasets.
  2. Find the dataset and click the Edit (pencil) icon in the Actions column.
  3. The dataset editor opens with four tabs:
TabWhat you configure
SourceDatabase, schema, table (physical) or SQL (virtual)
MetricsSaved aggregation expressions
ColumnsColumn visibility, data types, descriptions, and temporal settings
SettingsDataset name, description, cache timeout, certification status
  1. Make your changes and click Save.

Columns Tab

The Columns tab is where you control how each column appears and behaves in the Chart Editor.

SettingPurpose
Column nameThe raw column name from the database
Verbose nameA human-friendly label shown in chart dropdowns (e.g., exam_dateExam Date)
TypeData type override: STRING, NUMERIC, DATETIME, BOOLEAN
Is temporalMark as a date/time column so it appears in the Time filter
Is filterableAllow this column to appear in dashboard filter dropdowns
Is dimensionShow in the Dimensions (Group By) selector
Default endpointURL to open when a user clicks on a value in a table chart
DescriptionPlain-text or Markdown description shown as a tooltip in chart editor

Making a column a time column

If your table has a timestamp column not automatically detected:

  1. Open the dataset in edit mode.
  2. Go to the Columns tab.
  3. Find the column, click Edit.
  4. Set Type to DATETIME and check Is temporal.
  5. Save.

Metrics Tab

Saved metrics let you define reusable aggregations once and use them across many charts.

FieldExample
Metric namepass_rate
Verbose namePass Rate (%)
ExpressionROUND(100.0 * SUM(CASE WHEN passed THEN 1 ELSE 0 END) / COUNT(*), 2)
DescriptionPercentage of exam attempts that resulted in a pass.

See Metrics for the full guide to creating and certifying metrics.


Certifying a Dataset

Certification signals to other users that a dataset has been reviewed and is safe to use for reporting.

  1. Open the dataset editor → Settings tab.
  2. Click Mark as Certified.
  3. Enter a Certifier name and optional Certification details (e.g., Verified by Data Engineering on 2026-01-15).
  4. Click Save.

Certified datasets display a gold checkmark badge in the Datasets list and in the Chart Editor's dataset selector.


Dataset Access Control

Dataset access is managed through CalabiIQ roles. A user must have access to a dataset to build charts from it or view dashboards that use it.

RoleAccess level
AdminFull access to all datasets
AlphaCan create and edit datasets in their permitted databases
GammaRead-only; can only view charts and dashboards using datasets they have been granted access to

To grant a user access to a specific dataset:

  1. Open the dataset editor → Settings tab.
  2. Under Owners, add the user's name.
  3. Alternatively, configure database-level access in SettingsDatabase Connections.

Dataset Cache

CalabiIQ caches query results to reduce database load. By default, the cache TTL is set globally by your administrator.

To set a dataset-specific cache timeout:

  1. Open the dataset editor → Settings tab.
  2. Set Cache Timeout (in seconds). Examples: 300 (5 min), 3600 (1 hr), 0 (no cache).
  3. Click Save.
note

Virtual datasets (SQL-based) are cached per unique query. If your SQL uses NOW() or similar dynamic expressions, set the cache timeout to 0 or a very short value.


Deleting a Dataset

Before deleting, check which charts depend on the dataset:

  1. Open DataDatasets.
  2. Find the dataset and click its name to view its details.
  3. Note the Charts count shown in the dataset header.
  4. Click ···Delete.
  5. Confirm the deletion in the dialog.
warning

Deleting a dataset will break all charts and dashboards that use it. There is no automatic recovery — export affected dashboards as JSON before deleting.