# Experiments in Calabi ML
An experiment in Calabi ML is a named container that groups related training runs. Every time you train a model, tune hyperparameters, or evaluate a new algorithm, that execution is recorded as a run inside an experiment. Calabi ML gives you a structured, searchable history of every run — including the parameters used, the metrics achieved, and the artifacts produced.
## Core Concepts
| Concept | Description |
|---|---|
| Experiment | A named project-level container (e.g., churn-prediction-v2) |
| Run | One execution of a training script — the atomic unit of tracking |
| Parameter | A configuration value set before training begins (e.g., learning_rate: 0.01) |
| Metric | A measured outcome logged during or after training (e.g., val_auc: 0.87) |
| Tag | A free-form key-value label on a run (e.g., env: production, dataset: v3) |
| Artifact | A file saved during a run (model weights, plots, feature importance CSVs) |
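To make these concrete, here is a minimal sketch of how each concept maps onto an MLflow call. The run values and the `feature_importance.csv` file are illustrative, not part of Calabi ML's API:

```python
import mlflow

mlflow.set_tracking_uri("https://calabi.<your-domain>/mlflow")
mlflow.set_experiment("churn-prediction-v2")       # Experiment

with mlflow.start_run() as run:                    # Run
    mlflow.log_param("learning_rate", 0.01)        # Parameter
    mlflow.log_metric("val_auc", 0.87)             # Metric
    mlflow.set_tag("dataset", "v3")                # Tag
    mlflow.log_artifact("feature_importance.csv")  # Artifact (hypothetical local file)
```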
## Experiment Structure

*Diagram: an experiment contains runs; each run records parameters, metrics, tags, and artifacts.*

## Accessing Calabi ML
Calabi ML is accessible at https://calabi.<your-domain>/mlflow. Your existing Calabi SSO credentials grant access — no separate login is required.
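A quick way to confirm the SDK can reach the server is to point it at the tracking URI and list what's visible. This sketch assumes your environment is already authorised to reach the endpoint; how credentials are supplied to the SDK may vary by deployment:

```python
import mlflow

mlflow.set_tracking_uri("https://calabi.<your-domain>/mlflow")

# If the connection works, this prints the experiments you can see
for exp in mlflow.search_experiments():
    print(f"{exp.experiment_id}: {exp.name}")
```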
## Creating an Experiment

### Via the Python SDK
```python
import mlflow

# Connect to your Calabi ML instance
mlflow.set_tracking_uri("https://calabi.<your-domain>/mlflow")

# Create a new experiment. Note: this raises an MlflowException if the
# name is already taken; see the get-or-create sketch below.
experiment_id = mlflow.create_experiment(
    name="churn-prediction-v2",
    tags={
        "team": "data-science",
        "project": "retention",
        "model_type": "binary_classification",
    },
)

print(f"Created experiment with ID: {experiment_id}")
```
### Via the Calabi ML UI
- Navigate to Calabi ML from the Calabi sidebar.
- Click + Create Experiment (top-right of the experiments list).
- Enter a name (e.g., `churn-prediction-v2`).
- Optionally set an artifact location (defaults to the configured S3 bucket).
- Click Create.
## Setting the Active Experiment
Once an experiment exists, set it as the target for all subsequent runs in your Python session:
```python
mlflow.set_experiment("churn-prediction-v2")
```
Or combine creation and activation:
```python
experiment = mlflow.set_experiment("churn-prediction-v2")
print(f"Active experiment ID: {experiment.experiment_id}")
```
## Naming Conventions
Well-structured experiment names help the team navigate experiments over time. Recommended conventions:
| Pattern | Example | When to Use |
|---|---|---|
| `project-version` | `churn-prediction-v2` | Iterating on a model project |
| `project/feature` | `recommender/collaborative-filtering` | Sub-experiments within a project |
| `team-project-date` | `ds-nlp-sentiment-2026Q1` | Quarterly exploration experiments |
| `model-dataset-target` | `xgboost-orders-ltv` | Comparing model types on a target |
Use experiment tags to encode metadata that doesn't fit neatly in the name: `team`, `project`, `framework`, `data_version`, `jira_ticket`. Tags are searchable in the Calabi ML UI.
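For example, a sketch of setting that metadata at creation time (all values below are illustrative):

```python
mlflow.create_experiment(
    name="ds-nlp-sentiment-2026Q1",
    tags={
        "team": "data-science",
        "project": "sentiment",     # illustrative
        "framework": "pytorch",     # illustrative
        "data_version": "v3",       # illustrative
        "jira_ticket": "DS-1234",   # hypothetical ticket
    },
)
```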
## Organising Experiments by Project
For large teams with many experiments, use a consistent folder-like naming scheme and experiment tags:
```python
# All experiments for the churn project
mlflow.set_experiment("churn/logistic_baseline")
mlflow.set_experiment("churn/gradient_boosting")
mlflow.set_experiment("churn/neural_network")
mlflow.set_experiment("churn/ensemble")
```
The Calabi ML UI groups experiments alphabetically, so a prefix-based hierarchy keeps related experiments together.
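The same prefix works programmatically; a sketch using MLflow's experiment search filter:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="https://calabi.<your-domain>/mlflow")

# Retrieve every experiment under the churn/ prefix
for exp in client.search_experiments(filter_string="name LIKE 'churn/%'"):
    print(exp.name)
```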
## Experiment Metadata Reference
| Field | Description |
|---|---|
| `experiment_id` | Auto-generated unique identifier |
| `name` | Human-readable name (unique per Calabi ML instance) |
| `artifact_location` | S3 URI where artifacts for this experiment are stored |
| `lifecycle_stage` | `active` or `deleted` |
| `tags` | Key-value pairs set at creation or later via `set_experiment_tag()` |
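All of these fields are available on the Experiment object returned by the SDK; a quick sketch:

```python
exp = mlflow.get_experiment_by_name("churn-prediction-v2")

print(exp.experiment_id)
print(exp.name)
print(exp.artifact_location)
print(exp.lifecycle_stage)
print(exp.tags)
```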
## Updating Experiment Tags

The fluent API applies tags to the currently active experiment:

```python
mlflow.set_experiment("churn-prediction-v2")
mlflow.set_experiment_tag("status", "production_candidate")
mlflow.set_experiment_tag("owner", "alice@example.com")
```
## Searching Experiments
Via the Python SDK:
```python
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="https://calabi.<your-domain>/mlflow")

# List all active experiments
experiments = client.search_experiments()
for exp in experiments:
    print(f"{exp.experiment_id}: {exp.name}")

# Search by tag
experiments = client.search_experiments(
    filter_string="tags.team = 'data-science'"
)
```
Via the UI:
- Use the search bar at the top of the Experiments page.
- Filter by name fragment or tag value.
## Deleting Experiments
Deleting an experiment moves it to a deleted lifecycle state. Deleted experiments and their runs are retained in storage but hidden from the default UI view.
```python
client.delete_experiment(experiment_id="3")
```
To restore a deleted experiment:
```python
client.restore_experiment(experiment_id="3")
```
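If you don't know the deleted experiment's ID, you can list experiments in the deleted lifecycle stage first; a sketch:

```python
from mlflow.entities import ViewType

# Only experiments in the deleted lifecycle stage are returned
for exp in client.search_experiments(view_type=ViewType.DELETED_ONLY):
    print(f"{exp.experiment_id}: {exp.name}")
```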
Deleting an experiment in Calabi ML does not immediately purge data from S3 or the metadata database. Contact your platform administrator to permanently remove experiment data.
## Backend Storage
| Component | Storage |
|---|---|
| Experiment metadata, run metadata, metrics, params, tags | Database (`calabi_mlflow_<tenant>`) |
| Artifacts (model files, plots, data samples) | Amazon S3 (`<tenant>-mlflow-artifacts` bucket) |
Calabi ML is configured per-tenant — your experiment data is isolated from other organisations on the platform.
## Next Steps
- Logging Runs — Record parameters, metrics, and artifacts inside runs
- Comparing Runs — Find the best run across an experiment
- Model Registry — Promote the best run's model to staging or production