# Overview

The InteractiveAI Router provides a unified API for accessing language models across multiple providers. Request and response schemas follow the OpenAI Chat API specification with targeted enhancements, allowing operators to work with a single interface regardless of the underlying model or provider.

### Requests

#### Basic Request Example

```typescript
fetch('https://app.interactive.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <LLMROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'anthropic/claude-3-sonnet',
    messages: [
      {
        role: 'user',
        content: 'Summarize the key risks in this quarterly financial report.',
      },
    ],
  }),
});
```
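The call above returns a JSON body whose shape is documented under Responses below. As a minimal sketch (the `extractContent` helper is ours, not part of the API), the assistant's reply can be pulled out like this:

```typescript
// Sketch: pull the assistant's text out of a non-streaming
// completion response; returns null if no content came back.
type CompletionLike = {
  choices?: { message?: { content: string | null } }[];
};

function extractContent(response: CompletionLike): string | null {
  return response.choices?.[0]?.message?.content ?? null;
}

// Usage with the request above:
// const res = await fetch('https://app.interactive.ai/api/v1/chat/completions', ...);
// console.log(extractContent(await res.json()));
```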

#### Request Format

Submit completion requests as a `POST` request to the `/api/v1/chat/completions` endpoint. The request body accepts the following schema:

{% code title="Request Schema" expandable="true" %}

```typescript
// Definitions of subtypes are below
type Request = {
  // Either "messages" or "prompt" is required
  messages?: Message[];
  prompt?: string;

  // If "model" is unspecified, uses the user's default
  model?: string; // See "Supported Models" section

  // Forces the model to produce a specific output format.
  // See models page and note on this docs page for which models support it.
  response_format?: { type: 'json_object' };

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See LLM Parameters (app.interactive.ai/docs/api/reference/parameters)
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Will be passed down as-is for providers implementing OpenAI's interface.
  // For providers with custom interfaces, we transform and map the properties.
  // Otherwise, we transform the tools into a YAML template. The model responds with an assistant message.
  // See models supporting tool calling: app.interactive.ai/models?supported_parameters=tools
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Integer only
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity); not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number; // Integer only
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing the model with a predicted output
  // https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
  prediction?: { type: 'content'; content: string };

  // InteractiveAI-only parameters
  // See "Prompt Transforms" section: app.interactive.ai/docs/guides/features/message-transforms
  transforms?: string[];
  // See "Model Routing" section: app.interactive.ai/docs/guides/features/model-routing
  models?: string[];
  route?: 'fallback';
  // See "Provider Routing" section: app.interactive.ai/docs/guides/routing/provider-selection
  provider?: ProviderPreferences;
  user?: string; // A stable identifier for your end-users. Used to help detect and prevent abuse.
  
  // Debug options (streaming only)
  debug?: {
    echo_upstream_body?: boolean; // If true, returns the transformed request body sent to the provider
  };
};

// Subtypes:

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };
```

{% endcode %}

{% hint style="info" %}
The `response_format` parameter enforces structured JSON output. This is supported by OpenAI models, Nitro models, and select others. Verify provider support on the models page and set `require_parameters` to `true` in your Provider Preferences.
{% endhint %}

For the complete parameter reference, see [Parameters](https://docs.interactive.ai/llm-router/api-guides/parameters).
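Putting the hint above into practice, a request that asks for structured JSON and restricts routing to providers that support it might look like this (a sketch; confirm the exact `require_parameters` field shape in the Provider Routing docs):

```typescript
// Sketch: request JSON output and, via provider preferences, route
// only to providers that support response_format.
const jsonBody = {
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'Return the top three risks as a JSON array.' },
  ],
  response_format: { type: 'json_object' },
  provider: { require_parameters: true },
};

// Send it exactly like the basic request example:
// fetch('https://app.interactive.ai/api/v1/chat/completions', {
//   method: 'POST', headers: { ... }, body: JSON.stringify(jsonBody),
// });
```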

#### Model Selection

When the `model` parameter is omitted, the Router uses the default configured for your account. When specified, include the provider prefix (e.g., `anthropic/claude-3-sonnet`, `mistral/mistral-large`).

The Router automatically selects optimal infrastructure for each request and falls back to alternative providers when the primary returns a 5xx error or rate limits the request.
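The `models` and `route` request parameters let you declare that fallback order explicitly; a sketch (the model list is purely illustrative):

```typescript
// Sketch: try the first model, then fall back to the next in order
// when a provider errors or rate-limits the request.
const fallbackBody = {
  models: ['anthropic/claude-3-sonnet', 'mistral/mistral-large'],
  route: 'fallback',
  messages: [{ role: 'user', content: 'Summarize the key risks.' }],
};
```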

#### Streaming

Server-Sent Events (SSE) are supported across all models. Set `stream: true` in your request body to receive incremental responses. See Streaming for implementation details.
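A sketch of consuming the stream with `fetch` and a `ReadableStream` reader, assuming the standard `data: <json>` SSE framing terminated by a `data: [DONE]` sentinel (the helper names are ours, not part of the API):

```typescript
// Sketch: parse one SSE line into a chunk's delta text; returns
// null for blank lines, comments, and the terminal [DONE] marker.
function parseSseLine(line: string): string | null {
  if (!line.startsWith('data: ')) return null;
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return null;
  const chunk = JSON.parse(payload);
  return chunk.choices?.[0]?.delta?.content ?? null;
}

// Sketch: accumulate the full completion text from a streamed response.
async function streamCompletion(apiKey: string, body: object): Promise<string> {
  const res = await fetch('https://app.interactive.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ ...body, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let text = '';
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any partial line for the next read
    for (const line of lines) {
      const piece = parseSseLine(line.trim());
      if (piece !== null) text += piece;
    }
  }
  return text;
}
```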

#### Parameter Handling

If a model does not support a specific parameter (such as `logit_bias` for non-OpenAI models or `top_k` for OpenAI), the Router ignores that parameter and forwards the rest to the underlying provider.

#### Assistant Prefill

Guide model responses by including a partial assistant message at the end of your messages array. The model will continue from where the prefill ends.

```typescript
fetch('https://app.interactive.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <LLMROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'anthropic/claude-3-sonnet',
    messages: [
      { role: 'user', content: 'Classify this support ticket as billing, technical, or general.' },
      { role: 'assistant', content: "I'm not sure" },
    ],
  }),
});
```

### Responses

#### CompletionsResponse Format

All responses follow a normalized schema regardless of the underlying model or provider. The `choices` array is always present, even for single-completion responses.

Here is the response schema as a TypeScript type:

```typescript
// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};
```
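Per the note in the schema above, when streaming, the usage object arrives in one final chunk alongside an empty `choices` array; a hypothetical check for that chunk:

```typescript
// Sketch: detect the final usage-bearing chunk of a stream.
type ChunkLike = { choices: unknown[]; usage?: { total_tokens: number } };

function isFinalUsageChunk(chunk: ChunkLike): boolean {
  return chunk.choices.length === 0 && chunk.usage !== undefined;
}
```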

```typescript
// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.

type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
};
```

```typescript
// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

type FunctionCall = {
  name: string;
  arguments: string; // JSON-encoded function arguments
};
```
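Because `choices` can hold both streaming and non-streaming shapes, narrowing on the `delta` property is a convenient way to read content uniformly; a sketch with local, simplified copies of the types above:

```typescript
// Minimal local copies of the choice shapes from the schema above;
// only the streaming shape carries a "delta" property, so checking
// for it narrows the union.
type StreamingChoiceLike = { delta: { content: string | null } };
type NonStreamingChoiceLike = { message: { content: string | null } };

function contentOf(
  choice: StreamingChoiceLike | NonStreamingChoiceLike,
): string | null {
  return 'delta' in choice ? choice.delta.content : choice.message.content;
}
```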

Here's an example:

```json
{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Normalized finish_reason
      "native_finish_reason": "stop", // The raw finish_reason from the provider
      "message": {
        // will be "delta" if streaming
        "role": "assistant",
        "content": "The quarterly report identifies three primary risk factors: supply chain volatility, regulatory compliance in emerging markets, and currency exchange exposure."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 4,
    "total_tokens": 4
  },
  "model": "anthropic/claude-3-sonnet" // Could also be "openai/gpt-3.5-turbo", etc, depending on the "model" that ends up being used
}
```

#### Finish Reason

The Router normalizes `finish_reason` across all models to one of five values:

| Value            | Description                  |
| ---------------- | ---------------------------- |
| `stop`           | Natural completion           |
| `length`         | Token limit reached          |
| `tool_calls`     | Model invoked a tool         |
| `content_filter` | Content filtered by provider |
| `error`          | Error during generation      |

Access the provider's original finish reason through `native_finish_reason` when debugging provider-specific behavior.
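A sketch of branching on the normalized values (the handler text is ours; the five values are the ones in the table above):

```typescript
// Sketch: dispatch on the normalized finish_reason.
type FinishReason = 'stop' | 'length' | 'tool_calls' | 'content_filter' | 'error';

function describeFinish(reason: FinishReason): string {
  switch (reason) {
    case 'stop':
      return 'completed normally';
    case 'length':
      return 'truncated at the token limit; consider raising max_tokens';
    case 'tool_calls':
      return 'model requested a tool call; execute it and send a tool message';
    case 'content_filter':
      return 'provider filtered the content';
    case 'error':
      return 'generation failed; check choice.error';
  }
}
```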
