# Setting up the knowledge base

> **Context** — Assumes [Knowledge base & retrieval](/agents/concepts/knowledge-base.md) (the two implementations and how a retrieval runs). This guide is the provisioning walkthrough for each, plus how to verify grounding actually works.
>
> YAML examples follow **manifest schema 6.1.1**. Manifest and content shapes are schema-versioned and differ across runtime versions — see [Versioning & compatibility](/agents/operations/versioning.md).

## Choosing an implementation

Decide once per agent — `agent_config.search` holds exactly one:

* **`pgvector`** if you own the corpus and can index it yourself, and a Postgres instance is acceptable infrastructure. The agent runs rewrite → embed → search itself; you only maintain the table.
* **`external`** if search already exists in your stack (Elasticsearch, a RAG service, a vendor API) or you need custom ranking. The agent sends conversation context; you return snippets.

## Option A: pgvector

### 1. Provision Postgres with pgvector

Any Postgres 15+ with the `vector` extension works. The HNSW index below needs pgvector 0.5.0 or later:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE support_articles (
    id        text PRIMARY KEY,
    vec       vector(1536),       -- must equal embedding_dimensions
    metadata  jsonb NOT NULL
);

CREATE INDEX ON support_articles
    USING hnsw (vec vector_cosine_ops);
```

The agent searches by **cosine similarity**, so index with `vector_cosine_ops`.
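
For intuition about what that index ranks by, here is a minimal plain-Python sketch of cosine similarity and of cosine distance (1 minus similarity), which is what pgvector's `<=>` operator computes. The function names are illustrative only; neither the agent nor pgvector exposes them.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Distance form of the same measure: 0 means same direction."""
    return 1.0 - cosine_similarity(a, b)
```

Because only direction matters, scaling a vector does not change its ranking, which is why query and corpus embeddings need to come from the same model rather than merely share a length.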

### 2. Index your documents

Your pipeline (not the agent) writes rows. The contract per row:

* `vec` — the embedding of the chunk, produced by **the same model** you will declare as `embedding_model`, with matching dimensions.
* `metadata` — JSONB carrying at minimum the document body under the key you'll declare as `content_key` (default `content`), plus any fields you want to filter on.

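An example of the insert your indexer would perform (the vector is truncated to three values for readability; a real row must carry all 1536):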

```sql
INSERT INTO support_articles (id, vec, metadata) VALUES (
  'cancellation-policy-01',
  '[0.0021, -0.0134, 0.0456]'::vector,   -- 1536 dims in practice
  '{
     "content": "Free cancellation up to 24 hours before pickup. Within 24 hours, one day''s rental rate is forfeited.",
     "kb_location": "article",
     "source_url": "https://help.example.com/cancellations"
   }'::jsonb
);
```
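
On the indexer side, the same row could be assembled roughly as follows. This is a sketch under assumptions: `vector_literal`, `build_row` and `INSERT_SQL` are hypothetical names, the embedding is whatever your embedding client returned, and the `%s` placeholders assume a DB-API driver such as psycopg.

```python
import json


def vector_literal(embedding: list[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. '[0.5,-1.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_row(doc_id, embedding, content, expected_dims=1536,
              content_key="content", **extra_metadata):
    """Return (id, vec, metadata) parameters for one row.

    Raises ValueError when the embedding length does not match the
    column's declared dimensions, so bad rows never reach the table.
    """
    if len(embedding) != expected_dims:
        raise ValueError(
            f"embedding has {len(embedding)} dims, expected {expected_dims}"
        )
    # The document body goes under content_key; everything else is filterable.
    metadata = {content_key: content, **extra_metadata}
    return (doc_id, vector_literal(embedding), json.dumps(metadata))


# Parameterised statement for a DB-API cursor (placeholder style is an assumption).
INSERT_SQL = (
    "INSERT INTO support_articles (id, vec, metadata) "
    "VALUES (%s, %s::vector, %s::jsonb)"
)
```

With a live connection you would then call something like `cursor.execute(INSERT_SQL, build_row(...))` once per chunk.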

Chunking guidance: retrieved chunks are pasted into the agent's context verbatim — aim for self-contained passages (one rule, one answer, one section) rather than whole documents or single sentences.
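
One way to get passages of that shape from Markdown sources is to split on headings and keep each heading with its text. The sketch below is only an illustration: `chunk_by_heading` and the `max_chars` threshold are hypothetical, and it does not account for `#` lines inside fenced code blocks.

```python
import re


def chunk_by_heading(markdown: str, max_chars: int = 1200) -> list[str]:
    """Split a Markdown document into one chunk per heading section.

    Sections longer than max_chars are split again on blank lines, and the
    heading is repeated on each piece so every chunk stays self-contained.
    """
    chunks = []
    # Split immediately before each line that starts with 1-6 '#' and a space.
    for section in re.split(r"(?m)^(?=#{1,6} )", markdown):
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        if section.startswith("#"):
            heading, _, body = section.partition("\n")
        else:
            heading, body = "", section  # preamble before the first heading
        for paragraph in re.split(r"\n\s*\n", body):
            paragraph = paragraph.strip()
            if paragraph:
                chunks.append(f"{heading}\n{paragraph}".strip())
    return chunks
```

Each returned string would then be embedded and written as its own row.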

### 3. Declare it in the manifest

```yaml
agent_config:
  search:
    type: pgvector
    hostname: kb-postgres.internal.example.com
    port: 5432
    user: postgres
    password: ${KB_PG_PASSWORD}
    dbname: postgres
    sslmode: require
    collection: support_articles
    content_key: content
    metadata_filter:
      kb_location: article
    embedding_model: openai/text-embedding-3-small
    embedding_dimensions: 1536
    top_k: 5
```

* `KB_PG_PASSWORD` must be present in the agent's environment (declare it in the manifest's `secrets` for platform deploys — see [Deploying](/agents/guides/deploying.md)).
* `embedding_dimensions` is validated against the table at boot — a mismatch fails startup immediately rather than returning garbage similarity scores forever.
* `metadata_filter` restricts every search (`kb_location: article` here keeps non-article rows out). Omit it for single-purpose collections.
* If your indexing library stores the body under a different key (the `vecs` library writes `text`), set `content_key` accordingly.
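
Two of the points above can be checked locally before a deploy. The following is a rough pre-flight sketch, not the agent's own validation: `missing_env_vars` and `dimension_mismatch` are hypothetical helpers, the `${NAME}` pattern is inferred from the manifest example, and only top-level string values of the `search` block are scanned.

```python
import os
import re

# Matches ${NAME} references as they appear in the manifest example.
ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(search_config: dict, environ=os.environ) -> list[str]:
    """Names referenced as ${NAME} in string values but absent from environ."""
    missing = []
    for value in search_config.values():
        if not isinstance(value, str):
            continue
        for name in ENV_REF.findall(value):
            if name not in environ and name not in missing:
                missing.append(name)
    return missing


def dimension_mismatch(search_config: dict, sample_embedding: list[float]):
    """Return an error string if a sample embedding's length differs from
    embedding_dimensions, otherwise None."""
    expected = search_config["embedding_dimensions"]
    actual = len(sample_embedding)
    if actual == expected:
        return None
    return f"embedding_dimensions is {expected} but the model returned {actual}"
```

Running both against the parsed `agent_config.search` block and one freshly computed embedding catches the most common causes of a failed boot before the agent ever starts.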

### 4. Optional: tune the query rewrite

The conversation is condensed into one search query by a model call before embedding. Domain-tune it by publishing a prompt and referencing it:

```yaml
    prompt:
      id: kb-rewrite-instructions
      version: 1
```

with content like:

```yaml
text: |
  Queries concern car rental. Expand abbreviations: "CDW" means
  collision damage waiver, "OW" means one-way rental. Always include
  the product category (economy/compact/SUV/van/luxury) in the query
  when the customer has mentioned one.
```

## Option B: external search endpoint

### 1. Implement the endpoint

One POST route; request and response contracts are fixed (full details in [Knowledge base & retrieval](/agents/concepts/knowledge-base.md#type-external-bring-your-own-search)):

```python
import os

from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Shared secret, read from the environment rather than hard-coded.
SEARCH_API_KEY = os.environ["SEARCH_API_KEY"]


class Message(BaseModel):
    role: str       # "customer" | "agent" | "tool"
    content: str


class SearchRequest(BaseModel):
    session_id: str
    agent_id: str
    top_k: int
    messages: list[Message]


@app.post("/agent-search")
async def agent_search(
    body: SearchRequest,
    authorization: str | None = Header(default=None),
) -> list[str]:
    if authorization != f"Bearer {SEARCH_API_KEY}":
        raise HTTPException(status_code=401)

    query = build_query(body.messages)          # your rewriting
    hits = run_search(query, limit=body.top_k)  # your engine
    return [hit.snippet for hit in hits]        # bare array of strings
```

Hard requirements: respond within the configured timeout (default 5s); return a **bare JSON array of strings**; treat `messages` as most-recent-last. Anything else — envelope objects, non-200s, slow responses — makes the retrieval soft-fail (the turn proceeds ungrounded).
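
The response side of those requirements can be exercised in a test. Below is a minimal sketch of a checker for a captured status code and body; `check_search_response` is a hypothetical helper, not something the platform ships.

```python
import json


def check_search_response(status_code: int, body: str) -> list[str]:
    """Return contract violations; an empty list means the response looks valid."""
    if status_code != 200:
        return [f"expected HTTP 200, got {status_code}"]
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ["body is not valid JSON"]
    if not isinstance(payload, list):
        # Envelope objects such as {"results": [...]} land here.
        return [f"expected a bare JSON array, got {type(payload).__name__}"]
    if not all(isinstance(item, str) for item in payload):
        return ["every array element must be a string"]
    return []
```

An empty array passes the check: it is a valid response that simply carries no snippets.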

### 2. Declare it

```yaml
agent_config:
  search:
    type: external
    hostname: https://search.internal.example.com
    port: 443
    path: /agent-search
    api_key: ${SEARCH_API_KEY}
    top_k: 5
    max_messages: 20
    timeout_seconds: 5.0
```
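
To smoke-test the endpoint with a request shaped like the one the agent sends, something along these lines may help. It is a sketch under assumptions: `build_probe_request` is hypothetical, the session and agent IDs are placeholders, and trimming to the last `max_messages` entries is a guess at what that setting does.

```python
import json
import urllib.request


def build_probe_request(base_url, path, api_key, messages,
                        top_k=5, max_messages=20):
    """Build a POST request matching the SearchRequest fields shown earlier."""
    payload = {
        "session_id": "probe-session",   # placeholder value
        "agent_id": "probe-agent",       # placeholder value
        "top_k": top_k,
        # Assumption: max_messages caps how many recent messages are sent.
        # Messages stay ordered most-recent-last.
        "messages": messages[-max_messages:],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request, timeout=5.0)` mirrors the `timeout_seconds` budget from the manifest.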

## Verifying grounding (both options)

1. **Boot check** — with pgvector, a dimension or collection problem fails startup; the error message names the mismatch.
2. **Ask a question only the corpus can answer** ("what exactly happens if I cancel 12 hours before pickup?"). A grounded reply cites specifics from your documents; an ungrounded one generalises.
3. **Inspect the trace** — retrieval appears in the turn's trace with the rewritten query and returned snippets; see [Observability](/agents/guides/observability.md).
4. **Test the soft-fail** — take the KB down and confirm the agent still answers (from prompt + history) while logs show retrieval warnings. That is the designed behaviour, so alert on the retrieval-warning rate rather than on turn failures, which never occur in this case. See [Troubleshooting](/agents/operations/troubleshooting.md).
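
The soft-fail pattern in step 4 can be pictured as a wrapper that converts any retrieval error into an empty result plus a warning. This illustrates the behaviour described, not the agent's actual code; `retrieve_or_empty` and the logger name are made up for the example.

```python
import logging

logger = logging.getLogger("kb.retrieval")


def retrieve_or_empty(search, query: str, top_k: int = 5) -> list[str]:
    """Call search(query, top_k); on any failure, warn and return []."""
    try:
        snippets = search(query, top_k)
    except Exception as exc:
        # Soft-fail: retrieval problems must never abort the turn.
        logger.warning("retrieval failed, continuing ungrounded: %s", exc)
        return []
    return list(snippets)
```

The caller cannot tell an outage from an empty result, so the warning log is the only signal to monitor.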

## Operational notes

* **Re-indexing on model changes:** changing `embedding_model` requires re-embedding the entire collection and, if the new model's output size differs, updating `embedding_dimensions` and the `vector(N)` column to match. Query and corpus must live in the same embedding space.
* **Postgres separation:** the KB database and the [session store](/agents/concepts/memory-and-state.md#storage-backends) are configured independently; sharing one server is fine, sharing concerns is not.
* **Corpus hygiene beats `top_k` tuning:** wrong-answer regressions are usually stale or contradictory documents. Raise `top_k` only when answers visibly miss available context — every extra snippet costs prompt space on every retrieval.

