# Deploying

> **Context** — Assumes [Architecture](/agents/concepts/architecture.md) (components and boot sequence). The InteractiveAI platform hosts and runs the agent — you don't build images, manage containers, or operate Kubernetes. This guide covers the deploy lifecycle from your side: what you prepare, what the platform does, and how to verify.
>
> YAML examples follow **manifest schema 6.1.1**. Manifest and content shapes are schema-versioned and differ across runtime versions — the runtime version the platform runs determines which schema your manifest must satisfy; see [Versioning & compatibility](/agents/operations/versioning.md).

## What you provide vs. what the platform does

| You provide                                                                                  | The platform does                                                                         |
| -------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| Versioned **content** in the catalog (system prompt, routines, policies, glossaries, macros) | Fetches the pinned content at boot                                                        |
| A **manifest** declaring the agent                                                           | Validates it on upload; rejects structural errors with every violation listed             |
| A **secret bundle** of the `${VAR}` values the manifest references                           | Injects them as environment variables before boot                                         |
| Reachable **MCP tool servers** and (optionally) a **knowledge base** / **database**          | Connects to them at the addresses the manifest declares                                   |
| —                                                                                            | Runs the agent, applies config, scales it, and (when `endpoint: true`) provisions its URL |

There is no image to pull, no container to run, and no Kubernetes on your side — those are platform-managed. Your deploy artifact is the **manifest plus the content it pins**.

## The manifest

The manifest is the agent's definition. A production example:

```yaml
name: DriveAway Production
id: driveaway-prod
version: "12"
endpoint: true
secrets:
  - secret_name: driveaway-agent-secrets
agent_config:
  runtime:
    api_key: ${AGENT_API_KEY}
  interactive_platform:
    public_key: ${INTERACTIVEAI_PUBLIC_KEY}
    secret_key: ${INTERACTIVEAI_SECRET_KEY}
  llms:
    default: anthropic/claude-haiku-4.5
    api_key: ${ROUTER_API_KEY}
  database:
    hostname: agent-postgres.internal.example.com
    password: ${DB_PASSWORD}
  context:
    system_prompt:
      id: driveaway-system-prompt
      version: 7
    language: match_user
    routines:
      - id: car-search
        version: 4
      - id: book-a-car
        version: 9
    policies:
      - id: stay-on-topic
        version: 2
  mcps:
    - id: cars
      hostname: https://cars-mcp.example.com
      port: 443
      transport: streamable-http
      api_key: ${CARS_MCP_KEY}
```

* `endpoint: true` asks the platform to provision a public-facing URL for the agent; leave it `false` for an agent reached only through the platform's internal API.
* `version` is your free-form revision label — surfaced in logs and traces so you can correlate behaviour with a config release.
* The complete field reference is in [Manifest & content schemas](/agents/reference/manifest.md).
* **Deploying with the `iai` CLI splits this object:** the `--file` holds the `agent_config` block, while the agent name, type (`--id`), runtime `--version`, secrets (`--secret`), and `--endpoint` are passed as flags. See the [Quickstart](/agents/guides/quickstart.md#4-deploy-the-agent) for the exact `iai agents create` / `update` commands.

## Secrets

Every credential is a `${VAR_NAME}` env-ref in the manifest — never a literal. You declare a **secret bundle** in the manifest's top-level `secrets:` list (by its name in Interactive Secrets); the platform injects that bundle's key/value pairs as environment variables before the agent boots, which is exactly how the `${VAR}` refs resolve.
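The "never a literal" rule can be checked before upload. The sketch below is a minimal line-based lint, not a platform tool: the set of credential key names is an assumption taken from the example manifest above, and `find_literal_credentials` is a hypothetical helper. It does not parse YAML properly (inline comments and multi-line values are ignored), so treat it as a first pass only.

```python
import re

# Assumption: these key names mark credentials; taken from the example manifest.
CREDENTIAL_KEYS = {"api_key", "public_key", "secret_key", "password"}

# Matches "key: value" lines, optionally starting a list item ("- key: value").
LINE = re.compile(r"^\s*(?:-\s*)?([A-Za-z_][A-Za-z0-9_]*):\s*(.*?)\s*$")
ENV_REF = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def find_literal_credentials(manifest_text: str) -> list[tuple[int, str]]:
    """Return (line_number, key) for credential fields not written as ${VAR}."""
    findings = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        match = LINE.match(line)
        if not match:
            continue
        key, value = match.groups()
        value = value.strip("\"'")  # tolerate quoted values
        if key in CREDENTIAL_KEYS and value and not ENV_REF.match(value):
            findings.append((number, key))
    return findings


sample = """\
agent_config:
  runtime:
    api_key: ${AGENT_API_KEY}
  database:
    hostname: agent-postgres.internal.example.com
    password: hunter2
"""
print(find_literal_credentials(sample))  # [(6, 'password')]
```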

The bundle must cover every `${VAR}` the manifest references. At minimum that means `AGENT_API_KEY`, `ROUTER_API_KEY`, `INTERACTIVEAI_PUBLIC_KEY`, and `INTERACTIVEAI_SECRET_KEY`, plus whichever of `DB_PASSWORD`, `KB_PG_PASSWORD`, per-MCP keys, traces key, and webhook secrets your manifest declares. A missing required variable aborts boot, and the error names the variable, so a staging deploy surfaces gaps immediately. Operator tuning knobs (autonomous timeout bounds, router token ceiling, evaluation parallelism) are optional platform settings; see [Environment variables](/agents/reference/environment.md).
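Because boot stops at a missing variable, it can save a deploy cycle to compare the manifest's `${VAR}` references against the bundle's keys locally. This is a minimal sketch under stated assumptions: `missing_secrets` is a hypothetical helper, the manifest is read as plain text, and the bundle's key names are supplied by you (how you list them from Interactive Secrets is not covered here).

```python
import re

ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_secrets(manifest_text: str, bundle_keys: set[str]) -> list[str]:
    """List ${VAR} names the manifest references that the bundle lacks, in first-seen order."""
    seen = []
    for name in ENV_REF.findall(manifest_text):
        if name not in seen:
            seen.append(name)
    return [name for name in seen if name not in bundle_keys]


manifest = """\
runtime:
  api_key: ${AGENT_API_KEY}
llms:
  api_key: ${ROUTER_API_KEY}
database:
  password: ${DB_PASSWORD}
"""
bundle = {"AGENT_API_KEY", "ROUTER_API_KEY"}  # illustrative bundle contents
print(missing_secrets(manifest, bundle))  # ['DB_PASSWORD']
```

Unlike the boot log, which stops at the first missing variable, this reports every gap at once.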

## Readiness and first-deploy evaluation

Routine evaluation runs at the end of boot, and the agent **only starts serving — port bound, health checks passing — once it finishes** (see the [boot sequence](/agents/concepts/architecture.md#boot-sequence)). On a warm cache that's immediate; on a cold cache (a fresh content version) the agent is unreachable for the minutes evaluation takes, which can stall a deploy. The platform pre-warms the evaluation cache so cold deploys come up fast — see [Startup evaluation](/agents/concepts/startup-evaluation.md#caching-cold-vs-warm-boots).
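Since the agent is unreachable until evaluation finishes, a deploy script that checks readiness should poll with a generous timeout rather than fail on the first refused connection. The sketch below shows one way to structure that loop. The `probe` callable is hypothetical: this page does not specify a health-check URL or SDK call, so plug in whatever readiness check your setup actually exposes.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll probe() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in probe for illustration: reports ready on the third attempt.
attempts = []


def fake_probe() -> bool:
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until_ready(fake_probe, timeout_s=60, interval_s=0, sleep=lambda s: None))  # True
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in a deploy script the defaults are enough.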

## Sizing & scaling

Scaling is platform-managed; what you should know about it:

* The agent is I/O-bound (model calls dominate), so it scales on concurrent-session count rather than CPU.
* The platform runs multiple replicas of an agent. With **Postgres** session storage, replicas share state and scale horizontally. With in-memory storage each replica holds its own sessions, and a session's traffic isn't guaranteed to land on the same replica, so a follow-up message can reach a replica with no record of the conversation. Use Postgres for any conversational agent whose sessions must persist across replicas. See [Sessions, memory & state](/agents/concepts/memory-and-state.md#storage-backends).
* The LLM router applies per-key rate limits — sustained scale-out multiplies model-call volume; watch for router-side throttling.
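The difference between the two storage modes is easy to see in a toy model. The simulation below is illustrative only: `Replica` is a hypothetical class, a shared dict stands in for Postgres, and round-robin routing is an assumption (the platform's actual balancing policy is not described on this page).

```python
from itertools import cycle


class Replica:
    def __init__(self, shared_store=None):
        # With shared_store, every replica reads and writes the same dict
        # (standing in for Postgres); without it, state is replica-local.
        self.store = shared_store if shared_store is not None else {}

    def handle(self, session_id: str, message: str) -> int:
        history = self.store.setdefault(session_id, [])
        history.append(message)
        return len(history)  # number of turns this replica can see


def run(replicas, messages):
    # Assumption: round-robin routing, chosen to make the effect deterministic.
    router = cycle(replicas)
    return [next(router).handle("session-1", m) for m in messages]


messages = ["hi", "find a car", "book it"]
in_memory = run([Replica(), Replica()], messages)
shared = {}
postgres_like = run([Replica(shared), Replica(shared)], messages)
print(in_memory)      # [1, 1, 2]
print(postgres_like)  # [1, 2, 3]
```

With replica-local state the second message arrives at a replica that sees it as the first turn; with shared state every replica sees the full history.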

## Updating an agent

Two independent release axes:

1. **Content changes** (routines, policies, prompts): publish new content versions to the catalog, bump the pins in the manifest, redeploy the manifest. The runtime is unchanged.
2. **Runtime upgrades**: the platform runs a newer runtime version — check the compatibility matrix first, because a new runtime may require a new manifest-schema version; see [Versioning & compatibility](/agents/operations/versioning.md).

Both are rolling updates; with Postgres storage, in-flight sessions survive. Rolling back content is exact: redeploy the previous manifest (catalog versions are immutable).
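Because a content release or rollback is just a change of pins, it can help to diff the pins of two manifests before redeploying. The sketch below assumes you have already parsed each manifest's `context` block into a Python dict with your YAML parser of choice; `collect_pins` and `diff_pins` are hypothetical helpers, and version 5 of `car-search` is made up for the example.

```python
def collect_pins(context: dict) -> dict[str, int]:
    """Flatten a parsed `context` block into {"kind/id": version}."""
    pins = {}
    for kind, value in context.items():
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            # Non-pin settings such as `language: match_user` are skipped.
            if isinstance(entry, dict) and "id" in entry and "version" in entry:
                pins[f"{kind}/{entry['id']}"] = entry["version"]
    return pins


def diff_pins(old: dict, new: dict) -> dict[str, tuple]:
    """Map each changed pin to (old_version, new_version); None marks absent."""
    before, after = collect_pins(old), collect_pins(new)
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(before.keys() | after.keys())
        if before.get(key) != after.get(key)
    }


previous = {
    "system_prompt": {"id": "driveaway-system-prompt", "version": 7},
    "language": "match_user",
    "routines": [{"id": "car-search", "version": 4}, {"id": "book-a-car", "version": 9}],
}
candidate = {
    "system_prompt": {"id": "driveaway-system-prompt", "version": 7},
    "language": "match_user",
    "routines": [{"id": "car-search", "version": 5}, {"id": "book-a-car", "version": 9}],
}
print(diff_pins(previous, candidate))  # {'routines/car-search': (4, 5)}
```

An empty diff against the previous manifest is a quick confirmation that a rollback restores exactly the earlier content.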

## Verifying a deploy

1. **Manifest validates** — the platform reports structural errors on upload, listing every violation at once.
2. **Boot succeeds** — a missing secret or unresolved content/MCP reference fails boot with a specific message; read it in the platform's logs.
3. **A staging conversation works** — send a message through the SDK and confirm the expected routine/policy behaviour.
4. **Traces appear** — confirm the conversation shows up in the platform's traces view; see [Observability](/agents/guides/observability.md).

## Pre-flight checklist

* [ ] Manifest validates on upload (structural errors list every violation)
* [ ] Secret bundle covers every `${VAR}` the manifest references — a staging deploy names the first gap in its boot log
* [ ] `database:` block present (unless ephemeral sessions are intentional)
* [ ] MCP servers reachable from the platform; their `api_key`s in the bundle
* [ ] Runtime/schema version checked against the [compatibility matrix](/agents/operations/versioning.md)
* [ ] Evaluation cache warmed for the content versions being deployed (the platform warms it on deploy — see [Startup evaluation](/agents/concepts/startup-evaluation.md#caching-cold-vs-warm-boots))
* [ ] Traces visible after a staging conversation ([guide](/agents/guides/observability.md))
* [ ] Bearer key distributed: integrations hold the same `AGENT_API_KEY` as the secret bundle ([security](/agents/operations/security.md))

