RAG Agents in Calabi AI Builder

Enterprise

Retrieval-Augmented Generation (RAG) is the technique of supplying a language model with relevant excerpts from your own documents at query time, so it can answer questions grounded in your organization's actual knowledge — not just its training data. Calabi AI Builder provides a visual, no-code environment for building production RAG agents.


What Is RAG?

A plain LLM has no knowledge of your internal documents, policies, data dictionaries, or proprietary research. RAG solves this by:

  1. Indexing — breaking your documents into small chunks, converting each chunk into a numerical vector (embedding), and storing those vectors in a vector database.
  2. Retrieval — when a user asks a question, converting the question into the same vector space and finding the most semantically similar document chunks.
  3. Augmentation — injecting those relevant chunks into the LLM's prompt as context.
  4. Generation — the LLM generates an answer that is grounded in the retrieved content.

This dramatically reduces hallucination, keeps responses up-to-date with your latest documents, and provides attributable source citations.
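
In code form, the four steps reduce to a short loop. The sketch below is a minimal illustration, not Calabi's implementation: the toy embed function (a character-frequency vector) stands in for a real embedding model, and the document chunks are hard-coded.

```python
# Minimal sketch of the RAG loop. `embed` is a toy stand-in for a real
# embedding model such as text-embedding-3-small.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy character-frequency "embedding" -- illustration only.
    v = np.zeros(256)
    for ch in text.lower():
        v[ord(ch) % 256] += 1
    return v

# 1. Indexing: chunk documents, embed each chunk, store the pairs.
chunks = [
    "Employees accrue 1.5 vacation days per month.",
    "Expense reports are due within 30 days of purchase.",
]
index = [(embed(c), c) for c in chunks]

# 2. Retrieval: embed the question, rank chunks by cosine similarity.
def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    scored = sorted(
        ((np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)), c)
         for v, c in index),
        reverse=True,
    )
    return [c for _, c in scored[:k]]

# 3. Augmentation: inject the retrieved chunks into the prompt.
question = "How many vacation days do I get?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# 4. Generation: send `prompt` to the LLM.
print(prompt)
```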


RAG Pipeline Architecture


Document Loaders

Document loaders bring your content into the AI Builder pipeline. Choose the loader that matches your document source.

PDF Loader

Supported extensions: .pdf
Settings:
- Usage: "One document per file" or "One document per page"
- Split pages: If enabled, each page becomes a separate document
- OCR: Enable for scanned PDFs (adds processing time)

CSV Loader

Supported extensions: .csv
Settings:
- Column: Which column(s) to include in the text chunk
- Separator: Comma (default), tab, semicolon
- Include metadata columns: Select columns to attach as metadata (not embedded, but returnable)

Web Scraper Loader

Settings:
- URL: The page or sitemap URL to scrape
- Scrape Type: "Single page", "Entire site (sitemap)", "Crawl links"
- Max depth: How many link levels to follow (for crawl mode)
- Include selectors: CSS selectors to include (e.g., "article.content")
- Exclude selectors: CSS selectors to exclude (e.g., "nav, footer")
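
Before committing selectors to the loader, it can help to test them locally. A quick sketch with requests and BeautifulSoup (neither is part of AI Builder; the URL and selectors below are placeholders):

```python
# Preview what the include/exclude selectors will capture.
# Requires: pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

html = requests.get("https://example.com/docs/page", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Exclude selectors: remove navigation and footer nodes entirely.
for node in soup.select("nav, footer"):
    node.decompose()

# Include selectors: keep only the main article content.
for node in soup.select("article.content"):
    print(node.get_text(" ", strip=True)[:500])
```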

S3 Loader

Settings:
- Bucket: S3 bucket name
- Prefix: Folder prefix (e.g., "documents/hr/")
- File types: PDF, TXT, DOCX, CSV
- AWS Region: Region where the bucket is hosted
- Credentials: AWS credential (configured in AI Builder secrets)

Supported Document Formats

| Format | Loader | Notes |
|---|---|---|
| PDF | PDF Loader | Supports OCR for scanned documents |
| Word (.docx) | Docx Loader | Preserves heading structure |
| Plain text (.txt) | Text Loader | UTF-8 encoding |
| CSV | CSV Loader | Each row can become a document |
| Excel (.xlsx) | Excel Loader | One sheet per document |
| Markdown (.md) | Markdown Loader | Structure-aware splitting |
| HTML | Web Loader / HTML Loader | Strips tags, preserves text |
| JSON | JSON Loader | Configurable field extraction |
| Confluence | Confluence Loader | Authenticated API pull |
| Notion | Notion Loader | Authenticated API pull |

Text Splitting Strategies

Text splitters determine how documents are chunked before embedding. The chunking strategy significantly impacts retrieval quality.

| Splitter | Strategy | Best For |
|---|---|---|
| Recursive Character | Splits on "\n\n", "\n", " ", then "" in order until chunks are small enough | General-purpose; works well for most documents |
| Character | Splits on a single separator (default: "\n\n") | Documents with consistent paragraph breaks |
| Token | Splits at token boundaries (respects LLM context window) | When you need exact token counts |
| Markdown | Splits at markdown heading boundaries | Technical documentation, README files |
| Code | Language-aware splitting for code files | Source code indexing |
| HTML | Splits at HTML tag boundaries | Web-scraped content |

Recommended settings for most knowledge base use cases:

| Parameter | Recommended Value | Rationale |
|---|---|---|
| chunk_size | 1,000 characters | Fits enough context in one chunk without losing focus |
| chunk_overlap | 200 characters | Prevents information loss at chunk boundaries |
| Splitter | Recursive Character | Handles mixed document styles gracefully |
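
To see how these settings chunk a real document before indexing it, the same strategy is available in open-source libraries; the Text Splitter node applies equivalent logic internally. A sketch using LangChain's RecursiveCharacterTextSplitter (the input file is a placeholder):

```python
# Reproduce the recommended chunking settings locally.
# Requires: pip install langchain-text-splitters
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,                      # ~1,000 characters per chunk
    chunk_overlap=200,                    # 200-character overlap between neighbors
    separators=["\n\n", "\n", " ", ""],   # tried in order
)

text = open("employee_handbook.txt", encoding="utf-8").read()
chunks = splitter.split_text(text)
print(f"{len(chunks)} chunks; first 80 chars: {chunks[0][:80]!r}")
```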

Vector Stores

A vector store is a specialized database optimized for fast similarity search over high-dimensional embedding vectors.

| Vector Store | Backend | When to Use |
|---|---|---|
| Postgres (pgvector) | Calabi metadata DB | Default for all Calabi deployments. No external service required. |
| Pinecone | Pinecone SaaS | Extremely large knowledge bases (>10M chunks) needing sub-100ms retrieval. |
| Weaviate | Self-hosted / SaaS | When you need hybrid (keyword + vector) search. |
| Qdrant | Self-hosted | High-performance on-premise deployments. |
| Chroma | In-process | Development and testing only; not for production. |

Calabi's default is Postgres pgvector. It is provisioned automatically with every Calabi Enterprise deployment and requires no additional configuration for knowledge bases under ~1M document chunks.
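
Under the hood, a pgvector similarity search is an ordinary SQL query. A rough sketch via psycopg (psycopg 3), assuming a table shaped like (id, content, embedding vector(1536)); the table name, columns, connection string, and query vector below are illustrative assumptions, not Calabi's actual schema:

```python
# Illustrative Top-K similarity query against a pgvector table.
# Requires: pip install psycopg
import psycopg

# Stand-in for the embedded user question, as a pgvector literal.
query_vec = "[" + ",".join(["0.1"] * 1536) + "]"

with psycopg.connect("postgresql://calabi:***@localhost/calabi") as conn:
    rows = conn.execute(
        """
        SELECT content,
               1 - (embedding <=> %s::vector) AS cosine_similarity
        FROM hr_policy_store
        ORDER BY embedding <=> %s::vector  -- <=> is pgvector's cosine distance
        LIMIT 4                            -- Top K
        """,
        (query_vec, query_vec),
    ).fetchall()
```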

Creating a Vector Store in AI Builder

  1. Open a chatflow in AI Builder.
  2. Drag a Postgres Vector Store node onto the canvas.
  3. In the configuration drawer:
    • Table name: Give the store a unique name (e.g., hr_policy_store).
    • Embedding model: Select the embedding node connected upstream.
    • Operation: Upsert (index new documents) or Similarity Search (query mode).
  4. Connect a Document Loader → Text Splitter → Embedding → Vector Store for indexing.
  5. Connect a Vector Store → Retriever for query time.

Embeddings

Embeddings are numerical representations of text that capture semantic meaning. Similar texts produce similar vectors, enabling semantic search.

| Embedding Model | Provider | Dimensions | Cost Profile | Best For |
|---|---|---|---|---|
| text-embedding-3-small | OpenAI | 1,536 | Low | General-purpose; excellent quality-to-cost ratio |
| text-embedding-3-large | OpenAI | 3,072 | Higher | Maximum accuracy for complex domain knowledge |
| text-embedding-ada-002 | OpenAI | 1,536 | Low | Legacy; use 3-small for new projects |
| nomic-embed-text | Calabi Local Models | 768 | Free (local compute) | Air-gapped environments, sensitive data |
| mxbai-embed-large | Calabi Local Models | 1,024 | Free (local compute) | On-premise deployments needing good quality |
| amazon.titan-embed-text-v2 | AWS Bedrock | 1,024 | Pay-per-token | Customers standardized on AWS |

Consistency Requirement

The embedding model used during indexing and the one used during query-time retrieval must be identical. Switching models requires re-indexing all documents.
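
A quick demonstration of why mixing models fails, using the OpenAI embeddings API (assumes OPENAI_API_KEY is set; the example strings are arbitrary):

```python
# Vectors from different models live in different spaces -- often with
# different dimensions -- so cross-model similarity is meaningless.
# Requires: pip install openai numpy
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str, model: str = "text-embedding-3-small") -> np.ndarray:
    resp = client.embeddings.create(model=model, input=text)
    return np.array(resp.data[0].embedding)

a = embed("parental leave policy")                             # 1,536 dims
b = embed("maternity and paternity leave")                     # 1,536 dims
c = embed("parental leave policy", "text-embedding-3-large")   # 3,072 dims

print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # meaningful score
# np.dot(a, c) fails outright: shapes (1536,) and (3072,) don't align.
```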


Similarity Search Configuration

The Vector Store Retriever node controls how documents are retrieved.

| Parameter | Default | Description |
|---|---|---|
| Top K | 4 | Number of chunks to retrieve per query. Increase for broader context; decrease for precision. |
| Similarity Threshold | 0.7 | Minimum cosine similarity score (0–1). Chunks below this threshold are excluded. |
| Search Type | similarity | similarity (pure vector) or mmr (Maximum Marginal Relevance, which reduces duplicate chunks) |
| Fetch K (MMR only) | 20 | Number of candidates to fetch before MMR re-ranking |
| Lambda (MMR only) | 0.5 | Balance between relevance (1.0) and diversity (0.0) |

Tuning guidance:

  • If answers are incomplete → increase Top K to 6–8.
  • If unrelated chunks are included → increase Similarity Threshold to 0.75–0.85.
  • If multiple chunks repeat the same information → switch to MMR search type (see the sketch below).
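
For intuition, MMR can be sketched in a few lines: among the Fetch K candidates from the initial similarity search, it greedily selects chunks that score high on query relevance but low on similarity to chunks already selected. This is an illustrative implementation, not the retriever's exact code:

```python
# Sketch of Maximum Marginal Relevance (MMR) re-ranking.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr(query_vec, candidates, k=4, lam=0.5):
    """candidates: the top Fetch-K (vector, chunk_text) pairs.
    Returns up to k chunk texts, re-ranked for relevance and diversity."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def mmr_score(i: int) -> float:
            relevance = cosine(query_vec, candidates[i][0])
            redundancy = max(
                (cosine(candidates[i][0], candidates[j][0]) for j in selected),
                default=0.0,
            )
            # lam = 1.0 -> pure relevance; lam = 0.0 -> pure diversity.
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        remaining.remove(best)
        selected.append(best)
    return [candidates[i][1] for i in selected]
```

With lam = 1.0 this degenerates to plain similarity ranking; lowering it trades a little relevance for less repetition in the retrieved context.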

Building a Company Knowledge Base Agent

Step 1: Prepare Your Documents

Organize your documents by domain. For a company knowledge base, a typical structure:

documents/
├── hr/
│   ├── employee_handbook.pdf
│   ├── leave_policy.pdf
│   └── code_of_conduct.pdf
├── finance/
│   ├── expense_policy.pdf
│   └── budget_guidelines.pdf
└── engineering/
    ├── architecture_overview.pdf
    └── on_call_runbook.pdf

Upload all files to an S3 bucket or use the AI Builder document upload interface.
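
If you go the S3 route, a short boto3 sketch for uploading the tree above (the bucket name is a placeholder; credentials come from your AWS configuration):

```python
# Upload the documents/ tree to S3, preserving folders as key prefixes.
# Requires: pip install boto3
from pathlib import Path
import boto3

s3 = boto3.client("s3")
bucket = "acme-knowledge-base"  # placeholder bucket name

for path in Path("documents").rglob("*.pdf"):
    key = path.as_posix()  # e.g. "documents/hr/leave_policy.pdf"
    s3.upload_file(str(path), bucket, key)
    print("uploaded", key)
```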

Step 2: Create the Indexing Flow

  1. In AI Builder, click + New Chatflow → Template → Document Q&A.
  2. Configure the Document Loader for your source (S3 or upload).
  3. Set Text Splitter: Recursive Character, chunk 1000, overlap 200.
  4. Connect to your embedding model (e.g., text-embedding-3-small).
  5. Connect to a Postgres Vector Store, name it company_kb.
  6. Click Upsert to index all documents. Monitor progress in the logs panel.

Step 3: Create the Query Flow

  1. Add a Chat Prompt Template with this system message:
    You are a helpful assistant for Acme Corp employees.
    Answer questions using only the information in the context below.
    If the answer is not in the context, say "I don't have that information
    in the knowledge base — please contact HR directly."

    Context:
    {context}
  2. Connect the Postgres Vector Store (in Similarity Search mode) → Retriever → Chat Prompt Template.
  3. Add your LLM (ChatOpenAI or Chat (Local Models)).
  4. Add Redis Memory for session persistence.
  5. Save and test.

Step 4: Deploy and Share

  1. Click API Endpoint to get the chatflow URL.
  2. See Embedding AI Builder Chatflows for integration options (iframe, Slack bot, Teams bot, widget).
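
A hypothetical call to the deployed endpoint might look like the following; copy the real URL, auth header, and payload shape from the API Endpoint dialog, since every value below is a placeholder:

```python
# Placeholder example of querying a deployed chatflow over HTTP.
import requests

resp = requests.post(
    "https://calabi.example.com/api/v1/prediction/<chatflow-id>",  # placeholder URL
    headers={"Authorization": "Bearer <api-key>"},                 # placeholder key
    json={"question": "How many vacation days do I accrue per month?"},
    timeout=30,
)
print(resp.json())
```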

Re-Indexing Documents

When source documents are updated, re-index to keep the knowledge base current:

  1. Open the indexing flow you created in Step 2.
  2. Click Upsert — by default, existing vectors for the same source file are replaced (upsert semantics).
  3. Alternatively, use a Calabi Automate trigger to schedule automatic re-indexing:
    Trigger: S3 file upload event
    Action: HTTP Request → AI Builder Upsert API
    Schedule: Nightly at 02:00 UTC
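
The HTTP Request action in that automation is a single authenticated POST against the indexing flow. A sketch with placeholder values (take the real upsert URL and key from the indexing chatflow's API Endpoint dialog):

```python
# Placeholder example of the scheduled re-indexing call.
import requests

requests.post(
    "https://calabi.example.com/api/v1/vector/upsert/<indexing-chatflow-id>",  # placeholder
    headers={"Authorization": "Bearer <api-key>"},                             # placeholder
    timeout=120,
)
```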