Datasets
A dataset in CalabiIQ is the layer between your raw database tables and your charts. Every chart is built on exactly one dataset. Datasets centralise metric definitions, column descriptions, and access controls so that all charts sharing the same dataset stay consistent.
Physical vs Virtual Datasets
CalabiIQ supports two kinds of datasets:
| Type | Definition | When to use |
|---|---|---|
| Physical | Points directly to a single table or view in a connected database | The data is already well-shaped in the warehouse |
| Virtual | Defined by a custom SQL query (a sub-select) | You need to join tables, filter rows, or derive columns before charting |
Creating a Physical Dataset
- Navigate to Data → Datasets in the top navigation.
- Click + Dataset.
- Select the Database from the dropdown.
- Select the Schema.
- Select the Table from the list of available tables.
- Click Add → Add dataset and create chart (to immediately build a chart) or Add dataset (to save without charting).
The new dataset appears in the Datasets list, ready to be explored.
Creating a Virtual Dataset
Use virtual datasets when you need a calculated or joined view that does not exist as a table in the database.
- Navigate to SQL → SQL Lab.
- Write a query that produces the shape you want to chart:
SELECT
DATE_TRUNC('month', exam_date) AS exam_month,
country_name,
COUNT(*) AS total_attempts,
SUM(CASE WHEN passed THEN 1 ELSE 0 END) AS passes,
ROUND(100.0 * SUM(CASE WHEN passed THEN 1 ELSE 0 END) / COUNT(*), 2) AS pass_rate
FROM psychometric_reporting.scores s
JOIN dim.candidates c ON c.candidate_id = s.candidate_id
WHERE exam_date >= '2025-01-01'
GROUP BY 1, 2
- Run the query to verify results.
- Click Explore in the Results toolbar.
- In the Chart Editor, click ··· → Save as dataset.
- Enter a Dataset name (e.g.,
Monthly Pass Rates by Country). - Click Save & Explore.
Use CTEs (WITH clauses) inside virtual dataset SQL to make the logic easy to read and modify later. The full SQL is stored and can be edited from the Dataset editor.
Editing a Dataset
- Navigate to Data → Datasets.
- Find the dataset and click the Edit (pencil) icon in the Actions column.
- The dataset editor opens with four tabs:
| Tab | What you configure |
|---|---|
| Source | Database, schema, table (physical) or SQL (virtual) |
| Metrics | Saved aggregation expressions |
| Columns | Column visibility, data types, descriptions, and temporal settings |
| Settings | Dataset name, description, cache timeout, certification status |
- Make your changes and click Save.
Columns Tab
The Columns tab is where you control how each column appears and behaves in the Chart Editor.
| Setting | Purpose |
|---|---|
| Column name | The raw column name from the database |
| Verbose name | A human-friendly label shown in chart dropdowns (e.g., exam_date → Exam Date) |
| Type | Data type override: STRING, NUMERIC, DATETIME, BOOLEAN |
| Is temporal | Mark as a date/time column so it appears in the Time filter |
| Is filterable | Allow this column to appear in dashboard filter dropdowns |
| Is dimension | Show in the Dimensions (Group By) selector |
| Default endpoint | URL to open when a user clicks on a value in a table chart |
| Description | Plain-text or Markdown description shown as a tooltip in chart editor |
Making a column a time column
If your table has a timestamp column not automatically detected:
- Open the dataset in edit mode.
- Go to the Columns tab.
- Find the column, click Edit.
- Set Type to
DATETIMEand check Is temporal. - Save.
Metrics Tab
Saved metrics let you define reusable aggregations once and use them across many charts.
| Field | Example |
|---|---|
| Metric name | pass_rate |
| Verbose name | Pass Rate (%) |
| Expression | ROUND(100.0 * SUM(CASE WHEN passed THEN 1 ELSE 0 END) / COUNT(*), 2) |
| Description | Percentage of exam attempts that resulted in a pass. |
See Metrics for the full guide to creating and certifying metrics.
Certifying a Dataset
Certification signals to other users that a dataset has been reviewed and is safe to use for reporting.
- Open the dataset editor → Settings tab.
- Click Mark as Certified.
- Enter a Certifier name and optional Certification details (e.g.,
Verified by Data Engineering on 2026-01-15). - Click Save.
Certified datasets display a gold checkmark badge in the Datasets list and in the Chart Editor's dataset selector.
Dataset Access Control
Dataset access is managed through CalabiIQ roles. A user must have access to a dataset to build charts from it or view dashboards that use it.
| Role | Access level |
|---|---|
| Admin | Full access to all datasets |
| Alpha | Can create and edit datasets in their permitted databases |
| Gamma | Read-only; can only view charts and dashboards using datasets they have been granted access to |
To grant a user access to a specific dataset:
- Open the dataset editor → Settings tab.
- Under Owners, add the user's name.
- Alternatively, configure database-level access in Settings → Database Connections.
Dataset Cache
CalabiIQ caches query results to reduce database load. By default, the cache TTL is set globally by your administrator.
To set a dataset-specific cache timeout:
- Open the dataset editor → Settings tab.
- Set Cache Timeout (in seconds). Examples:
300(5 min),3600(1 hr),0(no cache). - Click Save.
Virtual datasets (SQL-based) are cached per unique query. If your SQL uses NOW() or similar dynamic expressions, set the cache timeout to 0 or a very short value.
Deleting a Dataset
Before deleting, check which charts depend on the dataset:
- Open Data → Datasets.
- Find the dataset and click its name to view its details.
- Note the Charts count shown in the dataset header.
- Click ··· → Delete.
- Confirm the deletion in the dialog.
Deleting a dataset will break all charts and dashboards that use it. There is no automatic recovery — export affected dashboards as JSON before deleting.