RAG Agents in Calabi AI Builder
Retrieval-Augmented Generation (RAG) is the technique of supplying a language model with relevant excerpts from your own documents at query time, so it can answer questions grounded in your organization's actual knowledge — not just its training data. Calabi AI Builder provides a visual, no-code environment for building production RAG agents.
What Is RAG?
A plain LLM has no knowledge of your internal documents, policies, data dictionaries, or proprietary research. RAG solves this by:
- Indexing — breaking your documents into small chunks, converting each chunk into a numerical vector (embedding), and storing those vectors in a vector database.
- Retrieval — when a user asks a question, converting the question into the same vector space and finding the most semantically similar document chunks.
- Augmentation — injecting those relevant chunks into the LLM's prompt as context.
- Generation — the LLM generates an answer that is grounded in the retrieved content.
This dramatically reduces hallucination, keeps responses up-to-date with your latest documents, and provides attributable source citations.
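The four stages can be sketched end to end in a few lines. This is a toy illustration: a bag-of-words vector stands in for a real embedding model, an in-memory list stands in for the vector store, and the document chunks are invented for the example.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words vector. Real pipelines use a model
    such as text-embedding-3-small instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Indexing: embed each chunk and store the vectors.
chunks = [
    "Employees accrue 20 vacation days per year.",
    "Expense reports are due by the 5th of each month.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Retrieval: embed the question, rank chunks by similarity.
question = "How many vacation days do employees get?"
q_vec = embed(question)
ranked = sorted(index, key=lambda pair: cosine(q_vec, pair[1]), reverse=True)

# 3. Augmentation: inject the top chunk into the prompt as context.
context = ranked[0][0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# 4. Generation: the prompt would now be sent to the LLM.
print(context)
```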
RAG Pipeline Architecture
Document Loaders
Document loaders bring your content into the AI Builder pipeline. Choose the loader that matches your document source.
PDF Loader
Supported extensions: .pdf
Settings:
- Usage: "One document per file" or "One document per page"
- Split pages: If enabled, each page becomes a separate document
- OCR: Enable for scanned PDFs (adds processing time)
CSV Loader
Supported extensions: .csv
Settings:
- Column: Which column(s) to include in the text chunk
- Separator: Comma (default), tab, semicolon
- Include metadata columns: Select columns to attach as metadata (not embedded, but returnable)
Web Scraper Loader
Settings:
- URL: The page or sitemap URL to scrape
- Scrape Type: "Single page", "Entire site (sitemap)", "Crawl links"
- Max depth: How many link levels to follow (for crawl mode)
- Include selectors: CSS selectors to include (e.g., "article.content")
- Exclude selectors: CSS selectors to exclude (e.g., "nav, footer")
S3 Loader
Settings:
- Bucket: S3 bucket name
- Prefix: Folder prefix (e.g., "documents/hr/")
- File types: PDF, TXT, DOCX, CSV
- AWS Region: Region where the bucket is hosted
- Credentials: AWS credential (configured in AI Builder secrets)
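Together, Prefix and File types narrow which objects get loaded. A sketch of the equivalent filtering logic, with a hardcoded listing standing in for the actual S3 bucket contents:

```python
def select_keys(keys, prefix, extensions):
    """Keep only objects under the folder prefix with an allowed file type."""
    exts = tuple(e.lower() for e in extensions)
    return [k for k in keys if k.startswith(prefix) and k.lower().endswith(exts)]

# Stand-in for the object keys returned by listing the bucket.
listing = [
    "documents/hr/employee_handbook.pdf",
    "documents/hr/notes.tmp",
    "documents/finance/budget_guidelines.docx",
]

selected = select_keys(
    listing,
    prefix="documents/hr/",
    extensions=[".pdf", ".txt", ".docx", ".csv"],
)
print(selected)  # only the HR handbook matches both filters
```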
Supported Document Formats
| Format | Loader | Notes |
|---|---|---|
| PDF (.pdf) | PDF Loader | Supports OCR for scanned documents |
| Word (.docx) | Docx Loader | Preserves heading structure |
| Plain text (.txt) | Text Loader | UTF-8 encoding |
| CSV | CSV Loader | Each row can become a document |
| Excel (.xlsx) | Excel Loader | One sheet per document |
| Markdown (.md) | Markdown Loader | Structure-aware splitting |
| HTML | Web Loader / HTML Loader | Strips tags, preserves text |
| JSON | JSON Loader | Configurable field extraction |
| Confluence | Confluence Loader | Authenticated API pull |
| Notion | Notion Loader | Authenticated API pull |
Text Splitting Strategies
Text splitters determine how documents are chunked before embedding. The chunking strategy significantly impacts retrieval quality.
| Splitter | Strategy | Best For |
|---|---|---|
| Recursive Character | Splits on `"\n\n"`, `"\n"`, `" "`, then `""` in order until chunks are small enough | General-purpose; works well for most documents |
| Character | Splits on a single separator (default: `"\n\n"`) | Documents with consistent paragraph breaks |
| Token | Splits at token boundaries (respects LLM context window) | When you need exact token counts |
| Markdown | Splits at markdown heading boundaries | Technical documentation, README files |
| Code | Language-aware splitting for code files | Source code indexing |
| HTML | Splits at HTML tag boundaries | Web-scraped content |
Recommended settings for most knowledge base use cases:
| Parameter | Recommended Value | Rationale |
|---|---|---|
| `chunk_size` | 1,000 characters | Fits enough context in one chunk without losing focus |
| `chunk_overlap` | 200 characters | Prevents information loss at chunk boundaries |
| Splitter | Recursive Character | Handles mixed document styles gracefully |
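With those settings, chunking works like a sliding window: each chunk is 1,000 characters, and each new chunk starts 800 characters after the previous one, so adjacent chunks share 200 characters. A minimal sketch (the real Recursive Character splitter additionally prefers to break at paragraph and sentence boundaries):

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Slide a window of chunk_size characters, stepping forward by
    chunk_size - chunk_overlap so adjacent chunks share boundary text."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "".join(str(i % 10) for i in range(2500))  # a 2,500-character stand-in document
chunks = split_text(doc)
print([len(c) for c in chunks])  # [1000, 1000, 900]
```

The 200-character overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk.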
Vector Stores
A vector store is a specialized database optimized for fast similarity search over high-dimensional embedding vectors.
| Vector Store | Backend | When to Use |
|---|---|---|
| Postgres (pgvector) | Calabi metadata DB | Default for all Calabi deployments. No external service required. |
| Pinecone | Pinecone SaaS | Extremely large knowledge bases (>10M chunks) needing sub-100ms retrieval. |
| Weaviate | Self-hosted / SaaS | When you need hybrid (keyword + vector) search. |
| Qdrant | Self-hosted | High-performance on-premise deployments. |
| Chroma | In-process | Development and testing only; not for production. |
Calabi's default is Postgres pgvector. It is provisioned automatically with every Calabi Enterprise deployment and requires no additional configuration for knowledge bases under ~1M document chunks.
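For reference, a pgvector similarity search boils down to a single SQL query. This sketch assumes the table has `content` and `embedding` columns (the actual schema AI Builder creates may differ); `<=>` is pgvector's cosine-distance operator, so `1 - distance` gives the similarity score:

```python
# Sketch of the SQL a retriever issues against a pgvector table.
# Table and column names here are assumptions for illustration.
top_k = 4
query = """
SELECT content, 1 - (embedding <=> %(query_vec)s) AS similarity
FROM hr_policy_store
ORDER BY embedding <=> %(query_vec)s
LIMIT %(top_k)s;
"""
print(query.strip())
```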
Creating a Vector Store in AI Builder
- Open a chatflow in AI Builder.
- Drag a Postgres Vector Store node onto the canvas.
- In the configuration drawer:
  - Table name: Give the store a unique name (e.g., `hr_policy_store`).
  - Embedding model: Select the embedding node connected upstream.
  - Operation: `Upsert` (index new documents) or `Similarity Search` (query mode).
- Connect a Document Loader → Text Splitter → Embedding → Vector Store for indexing.
- Connect a Vector Store → Retriever for query time.
Embeddings
Embeddings are numerical representations of text that capture semantic meaning. Similar texts produce similar vectors, enabling semantic search.
| Embedding Model | Provider | Dimensions | Cost Profile | Best For |
|---|---|---|---|---|
| `text-embedding-3-small` | OpenAI | 1,536 | Low | General-purpose; excellent quality-to-cost ratio |
| `text-embedding-3-large` | OpenAI | 3,072 | Higher | Maximum accuracy for complex domain knowledge |
| `text-embedding-ada-002` | OpenAI | 1,536 | Low | Legacy; use 3-small for new projects |
| `nomic-embed-text` | Calabi Local Models | 768 | Free (local compute) | Air-gapped environments, sensitive data |
| `mxbai-embed-large` | Calabi Local Models | 1,024 | Free (local compute) | On-premise deployments needing good quality |
| `amazon.titan-embed-text-v2` | AWS Bedrock | 1,024 | Pay-per-token | Customers standardized on AWS |
The embedding model used during indexing and the one used during query-time retrieval must be identical. Switching models requires re-indexing all documents.
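The Dimensions column explains why: vectors from different models live in spaces of different sizes and geometries, so comparing them is undefined. A toy cosine-similarity function makes the failure explicit (the dimension counts are the examples from the table above):

```python
def cosine_similarity(a, b):
    """Cosine similarity of two vectors; only defined within one embedding space."""
    if len(a) != len(b):
        raise ValueError("vectors are from different embedding spaces")
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot / (norm(a) * norm(b))

small_vec = [0.1] * 1536   # e.g. a text-embedding-3-small vector
large_vec = [0.1] * 3072   # e.g. a text-embedding-3-large vector

try:
    cosine_similarity(small_vec, large_vec)
except ValueError as err:
    print(err)  # mixing models at index vs. query time fails like this
```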
Similarity Search Configuration
The Vector Store Retriever node controls how documents are retrieved.
| Parameter | Default | Description |
|---|---|---|
| Top K | 4 | Number of chunks to retrieve per query. Increase for broader context; decrease for precision. |
| Similarity Threshold | 0.7 | Minimum cosine similarity score (0–1). Chunks below this threshold are excluded. |
| Search Type | similarity | similarity (pure vector), mmr (Maximum Marginal Relevance — reduces duplicate chunks) |
| Fetch K (MMR only) | 20 | Number of candidates to fetch before MMR re-ranking |
| Lambda (MMR only) | 0.5 | Balance between relevance (1.0) and diversity (0.0) |
Tuning guidance:
- If answers are incomplete → increase Top K to 6–8.
- If unrelated chunks are included → increase Similarity Threshold to 0.75–0.85.
- If multiple chunks repeat the same information → switch to MMR search type.
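Conceptually, MMR picks chunks one at a time, each pick maximizing `lambda * relevance - (1 - lambda) * redundancy`, where redundancy is the chunk's highest similarity to anything already selected. A sketch over precomputed similarity scores (the numbers are invented to show two near-duplicate chunks):

```python
def mmr(query_sims, doc_sims, k, lam=0.5):
    """Select k document indices by Maximum Marginal Relevance.

    query_sims[i]  : similarity of doc i to the query
    doc_sims[i][j] : similarity between docs i and j
    """
    selected = []
    candidates = list(range(len(query_sims)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; doc 2 is less relevant but distinct.
query_sims = [0.9, 0.88, 0.6]
doc_sims = [[1.0, 0.95, 0.1],
            [0.95, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
print(mmr(query_sims, doc_sims, k=2))  # picks 0, then 2 over the duplicate 1
```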
Building a Company Knowledge Base Agent
Step 1: Prepare Your Documents
Organize your documents by domain. For a company knowledge base, a typical structure:
documents/
├── hr/
│ ├── employee_handbook.pdf
│ ├── leave_policy.pdf
│ └── code_of_conduct.pdf
├── finance/
│ ├── expense_policy.pdf
│ └── budget_guidelines.pdf
└── engineering/
├── architecture_overview.pdf
└── on_call_runbook.pdf
Upload all files to an S3 bucket or use the AI Builder document upload interface.
Step 2: Create the Indexing Flow
- In AI Builder, click + New Chatflow → Template → Document Q&A.
- Configure the Document Loader for your source (S3 or upload).
- Set Text Splitter: Recursive Character, chunk 1000, overlap 200.
- Connect to your embedding model (e.g., `text-embedding-3-small`).
- Connect to a Postgres Vector Store, name it `company_kb`.
- Click Upsert to index all documents. Monitor progress in the logs panel.
Step 3: Create the Query Flow
- Add a Chat Prompt Template with this system message:

  You are a helpful assistant for Acme Corp employees.
  Answer questions using only the information in the context below.
  If the answer is not in the context, say "I don't have that information
  in the knowledge base — please contact HR directly."

  Context:
  {context}

- Connect the Postgres Vector Store (in Similarity Search mode) → Retriever → Chat Prompt Template.
- Add your LLM (ChatOpenAI or Chat (Local Models)).
- Add Redis Memory for session persistence.
- Save and test.
Step 4: Deploy and Share
- Click API Endpoint to get the chatflow URL.
- See Embedding AI Builder Chatflows for integration options (iframe, Slack bot, Teams bot, widget).
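A minimal client call might look like the following. The URL shape, the `question` field, and the `overrideConfig.sessionId` key are assumptions modeled on common chatflow prediction APIs, so verify them against what the API Endpoint dialog shows:

```python
import json
from urllib import request

# Assumed endpoint shape; copy the real URL from the API Endpoint dialog.
CHATFLOW_URL = "https://calabi.example.com/api/v1/prediction/<chatflow-id>"

def build_payload(question, session_id="demo"):
    """Request body for the chatflow; sessionId keys the Redis Memory session.
    Both field names are assumptions for this sketch."""
    return {"question": question, "overrideConfig": {"sessionId": session_id}}

def ask_kb(question, session_id="demo"):
    """POST a question to the deployed knowledge-base chatflow."""
    req = request.Request(
        CHATFLOW_URL,
        data=json.dumps(build_payload(question, session_id)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# ask_kb("How many vacation days do I get per year?")
```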
Re-Indexing Documents
When source documents are updated, re-index to keep the knowledge base current:
- Open the indexing version of the chatflow.
- Click Upsert — by default, existing vectors for the same source file are replaced (upsert semantics).
- Alternatively, use a Calabi Automate trigger to schedule automatic re-indexing:
  - Trigger: S3 file upload event
  - Action: HTTP Request → AI Builder Upsert API
  - Schedule: Nightly at 02:00 UTC
Related Pages
- Building Chatflows — Canvas overview and all node types
- Local Models — Use local embedding and LLM models for privacy
- Embedding AI Builder Chatflows — Deployment options