# Overview

The InteractiveAI Router provides a unified API for accessing language models across multiple providers. Request and response schemas follow the OpenAI Chat API specification with targeted enhancements, allowing operators to work with a single interface regardless of the underlying model or provider.

### Requests

#### Basic Request Example

```typescript
fetch('https://app.interactive.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <LLMROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'anthropic/claude-3-sonnet',
    messages: [
      {
        role: 'user',
        content: 'Summarize the key risks in this quarterly financial report.',
      },
    ],
  }),
});
```

#### Request Format

Submit completion requests as a `POST` request to the `/api/v1/chat/completions` endpoint. The request body accepts the following schema:

{% code title="Request Schema" expandable="true" %}

```typescript
// Definitions of subtypes are below
type Request = {
  // Either "messages" or "prompt" is required
  messages?: Message[];
  prompt?: string;

  // If "model" is unspecified, uses the user's default
  model?: string; // See "Supported Models" section

  // Forces the model to produce a specific output format.
  // See the models page and the note below this schema for which models support it.
  response_format?: { type: 'json_object' };

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See LLM Parameters (app.interactive.ai/docs/api/reference/parameters)
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Will be passed down as-is for providers implementing OpenAI's interface.
  // For providers with custom interfaces, we transform and map the properties.
  // Otherwise, we transform the tools into a YAML template. The model responds with an assistant message.
  // See models supporting tool calling: app.interactive.ai/models?supported_parameters=tools
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Integer only
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) Not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number; // Integer only
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing the model with a predicted output
  // https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
  prediction?: { type: 'content'; content: string };

  // InteractiveAI-only parameters
  // See "Prompt Transforms" section: app.interactive.ai/docs/guides/features/message-transforms
  transforms?: string[];
  // See "Model Routing" section: app.interactive.ai/docs/guides/features/model-routing
  models?: string[];
  route?: 'fallback';
  // See "Provider Routing" section: app.interactive.ai/docs/guides/routing/provider-selection
  provider?: ProviderPreferences;
  user?: string; // A stable identifier for your end-users. Used to help detect and prevent abuse.
  
  // Debug options (streaming only)
  debug?: {
    echo_upstream_body?: boolean; // If true, returns the transformed request body sent to the provider
  };
};

// Subtypes:

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };
```

{% endcode %}

{% hint style="info" %}
The `response_format` parameter enforces structured JSON output. This is supported by OpenAI models, Nitro models, and select others. Verify provider support on the models page and set `require_parameters` to `true` in your Provider Preferences.
{% endhint %}

For the complete parameter reference, see [Parameters](/llm-router/api-guides/parameters.md).
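To make the `response_format` note above concrete, here is a minimal sketch of a request body that asks for JSON output and sets `require_parameters`. The helper `buildJsonRequest` and the trimmed `JsonRequestBody` type are hypothetical names introduced for illustration, and the sketch assumes `require_parameters` sits directly inside the `provider` object.

```typescript
// Minimal local types for this sketch; only the fields used here are declared.
type ChatMessage = { role: 'user' | 'assistant' | 'system'; content: string };

type JsonRequestBody = {
  model: string;
  messages: ChatMessage[];
  response_format: { type: 'json_object' };
  // Assumption: `require_parameters` is a field of the `provider` preferences.
  provider: { require_parameters: boolean };
};

// Hypothetical helper: builds a request body that asks for JSON output.
function buildJsonRequest(model: string, userPrompt: string): JsonRequestBody {
  return {
    model,
    messages: [{ role: 'user', content: userPrompt }],
    response_format: { type: 'json_object' },
    provider: { require_parameters: true },
  };
}

const jsonBody = buildJsonRequest(
  'openai/gpt-3.5-turbo',
  'Return the ticket category as a JSON object with a "category" field.',
);
```

The resulting object can be passed to `JSON.stringify` and sent as the `body` of the `POST` request shown in the basic example.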

#### Model Selection

When the `model` parameter is omitted, the Router uses the default configured for your account. When specified, include the provider prefix (e.g., `anthropic/claude-3-sonnet`, `mistral/mistral-large`).

The Router automatically selects optimal infrastructure for each request and falls back to alternative providers when the primary returns a 5xx error or rate limits the request.
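The model selection rules above can be sketched on the client side as follows. This is only an illustration: `hasProviderPrefix` and `buildChatBody` are hypothetical helpers, and the prefix check assumes that a valid model ID is any string with a non-empty part on each side of the first `/`.

```typescript
type ChatBody = {
  model?: string;
  messages: { role: 'user'; content: string }[];
};

// Hypothetical check: a model ID should look like "provider/model".
function hasProviderPrefix(modelId: string): boolean {
  const slash = modelId.indexOf('/');
  return slash > 0 && slash < modelId.length - 1;
}

// Hypothetical helper: omits `model` entirely when none is given,
// so the account default applies.
function buildChatBody(prompt: string, model?: string): ChatBody {
  if (model !== undefined && !hasProviderPrefix(model)) {
    throw new Error(`Model ID is missing a provider prefix: ${model}`);
  }
  const body: ChatBody = { messages: [{ role: 'user', content: prompt }] };
  if (model !== undefined) {
    body.model = model;
  }
  return body;
}

const defaultModelBody = buildChatBody('Hello');
const explicitModelBody = buildChatBody('Hello', 'mistral/mistral-large');
```

Leaving the `model` key out, rather than sending an empty string, keeps the request consistent with the "omitted" case described above.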

#### Streaming

Server-Sent Events (SSE) are supported across all models. Set `stream: true` in your request body to receive incremental responses. See Streaming for implementation details.
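As a rough illustration of consuming a streamed response, the sketch below pulls the text out of raw SSE data. It rests on two assumptions that this page does not confirm: each event arrives as a single `data: <json>` line, and the stream ends with a `data: [DONE]` sentinel as in OpenAI's API. `extractStreamText` and `StreamChunk` are hypothetical names.

```typescript
// Minimal shape of a streamed chunk; only the fields read below are declared.
type StreamChunk = {
  choices: { delta?: { content?: string | null } }[];
};

// Concatenates delta content from the SSE lines in `raw`.
// Assumes one `data: <json>` line per event and a `data: [DONE]` terminator.
function extractStreamText(raw: string): { text: string; done: boolean } {
  let text = '';
  let done = false;
  for (const line of raw.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank and non-data lines
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as StreamChunk;
    for (const choice of chunk.choices) {
      text += choice.delta?.content ?? '';
    }
  }
  return { text, done };
}

const sampleStream = [
  'data: {"choices":[{"delta":{"role":"assistant","content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: {"choices":[],"usage":{"prompt_tokens":1,"completion_tokens":2,"total_tokens":3}}',
  '',
  'data: [DONE]',
].join('\n');

const streamed = extractStreamText(sampleStream);
```

A production client would also need to buffer partial lines between network reads, which this sketch leaves out.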

#### Parameter Handling

If a model does not support a specific parameter (such as `logit_bias` for non-OpenAI models or `top_k` for OpenAI), the Router ignores that parameter and forwards the rest to the underlying provider.
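The behavior described above can be modeled as a simple filter. The sketch below is a hypothetical illustration, not the Router's implementation: `dropUnsupported` is an invented helper, and the support list is made up for the example rather than taken from the models page.

```typescript
type Params = Record<string, unknown>;

// Hypothetical model of the behavior: keep only parameters the model supports.
function dropUnsupported(params: Params, supported: ReadonlySet<string>): Params {
  const kept: Params = {};
  for (const key of Object.keys(params)) {
    if (supported.has(key)) {
      kept[key] = params[key];
    }
  }
  return kept;
}

// Illustrative support list only; check the models page for real values.
const exampleSupport = new Set(['temperature', 'max_tokens', 'logit_bias']);

const forwarded = dropUnsupported(
  { temperature: 0.2, top_k: 40, max_tokens: 256 },
  exampleSupport,
);
```

Here `top_k` is dropped without an error while `temperature` and `max_tokens` pass through unchanged, which is why a request that relies on a particular parameter should confirm support for it first.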

#### Assistant Prefill

Guide model responses by including a partial assistant message at the end of your messages array. The model will continue from where the prefill ends.

```typescript
fetch('https://app.interactive.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <LLMROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'anthropic/claude-3-sonnet',
    messages: [
      { role: 'user', content: 'Classify this support ticket as billing, technical, or general' },
      { role: 'assistant', content: "I'm not sure" },
    ],
  }),
});
```

### Responses

#### Completions Response Format

All responses follow a normalized schema regardless of the underlying model or provider. The `choices` array is always present, even for single-completion responses.

Here is the response schema as a TypeScript type:

```typescript
// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};
```

```typescript
// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.

type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
};
```

```typescript
// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

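type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

// Assumed to follow OpenAI's tool-call shape
type FunctionCall = {
  name: string;
  arguments: string; // JSON-encoded arguments
};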
```

Here's an example:

```json
{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Normalized finish_reason
      "native_finish_reason": "stop", // The raw finish_reason from the provider
      "message": {
        // will be "delta" if streaming
        "role": "assistant",
        "content": "The quarterly report identifies three primary risk factors: supply chain volatility, regulatory compliance in emerging markets, and currency exchange exposure."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 4,
    "total_tokens": 4
  },
  "model": "anthropic/claude-3-sonnet" // Could also be "openai/gpt-3.5-turbo", etc, depending on the "model" that ends up being used
}
```
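Because `choices` can hold three different shapes, client code usually needs to tell them apart before reading the text. The sketch below shows one way to do that with TypeScript's `in` narrowing. The types are trimmed copies of the choice types above, and `choiceText` is a hypothetical helper.

```typescript
// Trimmed copies of the choice types above; only the fields used here are declared.
type NonChatChoice = { text: string };
type NonStreamingChoice = { message: { content: string | null } };
type StreamingChoice = { delta: { content: string | null } };
type Choice = NonChatChoice | NonStreamingChoice | StreamingChoice;

// Returns the text carried by a choice, whichever shape it has.
// A null content (for example, a tool-call-only message) becomes an empty string.
function choiceText(choice: Choice): string {
  if ('message' in choice) return choice.message.content ?? '';
  if ('delta' in choice) return choice.delta.content ?? '';
  return choice.text;
}

const fromChat = choiceText({ message: { content: 'Three primary risk factors.' } });
const fromStream = choiceText({ delta: { content: 'Three' } });
const fromPrompt = choiceText({ text: 'Three primary risk factors.' });
```

Checking for the `message` and `delta` keys avoids relying on the top-level `object` field, which does not distinguish chat responses from `prompt`-based ones.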

#### Finish Reason

The Router normalizes `finish_reason` across all models to one of five values:

| Value            | Description                  |
| ---------------- | ---------------------------- |
| `stop`           | Natural completion           |
| `length`         | Token limit reached          |
| `tool_calls`     | Model invoked a tool         |
| `content_filter` | Content filtered by provider |
| `error`          | Error during generation      |

Access the provider's original finish reason through `native_finish_reason` when debugging provider-specific behavior.
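One way to act on the normalized values is a single `switch`, sketched below. The `NextStep` labels and the `nextStep` function are hypothetical and only illustrate the kind of branching a client might do. The sketch treats `null` and any unrecognized value as "unknown".

```typescript
// Hypothetical follow-up actions; the labels are illustrative only.
type NextStep =
  | 'done'
  | 'raise_max_tokens_or_continue'
  | 'run_tools'
  | 'report_filtered'
  | 'inspect_error'
  | 'unknown';

// Maps a normalized finish_reason to a follow-up action.
function nextStep(finishReason: string | null): NextStep {
  switch (finishReason) {
    case 'stop':
      return 'done';
    case 'length':
      return 'raise_max_tokens_or_continue';
    case 'tool_calls':
      return 'run_tools';
    case 'content_filter':
      return 'report_filtered';
    case 'error':
      return 'inspect_error';
    default:
      // Covers null and any value outside the five listed above.
      return 'unknown';
  }
}

const stepAfterStop = nextStep('stop');
const stepAfterLength = nextStep('length');
```

Branching on `finish_reason` rather than `native_finish_reason` keeps this logic independent of the provider that served the request.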

