# Datasets

Datasets are collections of test cases used to evaluate your LLM application systematically. Each dataset contains **items**, each with an input, ideally an expected output, and metadata. You run your AI system against these items to **measure quality**, **detect regressions**, and **compare configurations**.

### Why Datasets Matter

LLM applications need structured evaluation beyond ad-hoc testing. Datasets let you:

* **Test your system** against consistent, repeatable inputs
* **Compare performance** across prompt versions, models, or configurations
* **Detect regressions** before they reach production
* Build a library of **edge cases** and known failure modes
* **Establish baselines** for quality measurement

***

### Creating Datasets

You can create datasets through the UI or programmatically via the InteractiveAI SDK.

{% tabs %}
{% tab title="Via InteractiveAI Platform" %}

1. Navigate to **Improve > Datasets** in the sidebar
2. Click **+ New Dataset**
3. Enter a name, optional description, and optional metadata
4. Click **Create Dataset**
   {% endtab %}

{% tab title="Via InteractiveAI SDK" %}

```python
interactiveai.create_dataset(
    name="test-dataset",
    description="Question-answering test cases for model evaluation",
    metadata={
        "author": "evaluation-team",
        "purpose": "regression-testing",
        "version": "1.0"
    }
)
```

You can optionally provide JSON Schemas to validate dataset items on creation:

```python
interactiveai.create_dataset(
    name="validated-dataset",
    description="Dataset with schema validation",
    metadata={
        "author": "evaluation-team",
        "purpose": "regression-testing",
        "version": "1.0"
    },
    input_schema={
        "type": "object",
        "properties": {
            "prompt": {"type": "string"},
            "context": {"type": "string"}
        },
        "required": ["prompt"]
    },
    expected_output_schema={
        "type": "object",
        "properties": {
            "answer": {"type": "string"}
        },
        "required": ["answer"]
    }
)
```

When `input_schema` or `expected_output_schema` is set, every new item is validated against the corresponding schema before it is added.
{% endtab %}
{% endtabs %}
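To illustrate what schema validation implies, here is a minimal stdlib-only sketch of the `required` and `type` checks expressed by the `input_schema` above. This is not the SDK's actual validator (the SDK applies full JSON Schema validation server-side); it only shows how the two keywords in the example schema behave:

```python
# Minimal sketch of JSON Schema "required"/"type" checks (stdlib only).
# Illustrative only -- the platform performs full JSON Schema validation.

TYPE_MAP = {"object": dict, "string": str, "number": (int, float)}

def check_item(item, schema):
    """Return a list of problems; an empty list means the item passes."""
    if not isinstance(item, TYPE_MAP[schema.get("type", "object")]):
        return [f"expected {schema['type']}, got {type(item).__name__}"]
    problems = []
    # Every field listed under "required" must be present.
    for field in schema.get("required", []):
        if field not in item:
            problems.append(f"missing required field: {field}")
    # Fields that are present must match their declared type.
    for field, rules in schema.get("properties", {}).items():
        if field in item and not isinstance(item[field], TYPE_MAP[rules["type"]]):
            problems.append(f"{field}: expected {rules['type']}")
    return problems

input_schema = {
    "type": "object",
    "properties": {
        "prompt": {"type": "string"},
        "context": {"type": "string"},
    },
    "required": ["prompt"],
}

print(check_item({"prompt": "What is the capital of Portugal?"}, input_schema))  # []
print(check_item({"context": "geography"}, input_schema))  # ['missing required field: prompt']
```

An item that omits `prompt` fails the check, while extra validation rules (formats, enums, nested objects) are handled by full JSON Schema implementations.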

For the full `create_dataset()` API reference including upsert behavior and status options, see the [SDK Documentation](https://app.gitbook.com/s/jHEEbkpMbUW2x51XS8Ez/datasets#create_dataset-source).

***

### Adding Items to a Dataset

Each item represents a single test case with an input, expected output, and metadata. You can populate datasets through the UI or SDK.

{% tabs %}
{% tab title="Via InteractiveAI Platform" %}
There are three ways to add items through the UI:

1. **From the Items tab:**

   1. Open a dataset from the Datasets list
   2. Click the **Items** tab
   3. Click **+ New Item**
   4. Enter the input, expected output, and metadata
   5. Save the item

   <div data-with-frame="true"><figure><img src="https://708770081-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1ICwJbq7EJdn5kBgXnQu%2Fuploads%2FysaaDVXPo6DLtET0W7K1%2Fimage.png?alt=media&#x26;token=74ee4f55-6cdc-4e47-aba2-763e511a6871" alt=""><figcaption></figcaption></figure></div>

2. **From a production trace:**

   1. Open any trace in the Govern > Traces view
   2. Click **Add to Dataset** in the trace detail view
   3. Select the target dataset
   4. Optionally edit the input, expected output, and metadata
   5. Confirm to add the item

   <div data-with-frame="true"><figure><img src="https://708770081-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1ICwJbq7EJdn5kBgXnQu%2Fuploads%2FJyrIfKP6NcFdXDJEh8ua%2FClipboard-20260311-140824-168.gif?alt=media&#x26;token=7bfcaab0-8090-4bcc-8a03-69fa1cd7ea01" alt=""><figcaption></figcaption></figure></div>

{% hint style="info" %}
This method is useful for building test cases from real user interactions.
{% endhint %}

3. **Via CSV upload:**

   1. When creating a new dataset, **drag and drop a CSV file** or click to upload
   2. Map CSV columns to input, expected output, and metadata fields
   3. The items will be created automatically from your CSV rows

   <div data-with-frame="true"><figure><img src="https://708770081-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1ICwJbq7EJdn5kBgXnQu%2Fuploads%2Fvyg9DzhqgkjAkseCSxhN%2FClipboard-20260311-142124-023.gif?alt=media&#x26;token=72df8e99-959a-455d-b9f9-7ce3ef33bdc2" alt=""><figcaption></figcaption></figure></div>

{% endtab %}

{% tab title="Via InteractiveAI SDK" %}
There are three ways to add items through the SDK:

1. You can add a single item:

```python
interactiveai.create_dataset_item(
    dataset_name="test-dataset",
    input={"prompt": "What is the capital of Portugal?"},
    expected_output={"answer": "Lisbon", "confidence": "high"},
    metadata={"category": "geography", "difficulty": "easy"}
)
```

2. You can add multiple items at once:

```python
dataset_items = [
    {
        "input": {"prompt": "Explain quantum computing in simple terms"},
        "expected_output": {"answer": "Quantum computing uses quantum mechanics..."},
        "metadata": {"category": "science", "difficulty": "medium"}
    },
    {
        "input": {"prompt": "What is 15 + 27?"},
        "expected_output": {"answer": "42"},
        "metadata": {"category": "math", "difficulty": "easy"}
    },
    {
        "input": {"prompt": "Translate 'Hello' to Spanish"},
        "expected_output": {"answer": "Hola"},
        "metadata": {"category": "translation", "difficulty": "easy"}
    }
]

for item in dataset_items:
    interactiveai.create_dataset_item(
        dataset_name="test-dataset",
        input=item["input"],
        expected_output=item["expected_output"],
        metadata=item["metadata"]
    )
```

3. You can create dataset items linked to production traces:

```python
interactiveai.create_dataset_item(
    dataset_name="test-dataset",
    input={"prompt": "User's actual question from production"},
    expected_output={"answer": "Validated correct response"},
    source_trace_id="your-trace-id",
    source_observation_id="your-observation-id",
    metadata={"source": "production", "date": "2026-01-19"}
)
```
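The CSV upload described in the Platform tab can also be reproduced in code: parse the rows yourself and pass each one to `create_dataset_item()`. A sketch, assuming hypothetical column names `prompt`, `answer`, and `category` (map these to whatever columns your CSV actually uses):

```python
import csv
import io

# Hypothetical inline CSV; in practice, use open("items.csv") instead.
raw = io.StringIO(
    "prompt,answer,category\n"
    "What is the capital of Portugal?,Lisbon,geography\n"
    "What is 15 + 27?,42,math\n"
)

# Map each CSV row onto the item structure used throughout this page.
items = [
    {
        "input": {"prompt": row["prompt"]},
        "expected_output": {"answer": row["answer"]},
        "metadata": {"category": row["category"]},
    }
    for row in csv.DictReader(raw)
]

# Each dict maps directly onto the create_dataset_item() arguments:
# for item in items:
#     interactiveai.create_dataset_item(dataset_name="test-dataset", **item)
```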

{% endtab %}
{% endtabs %}

For the full `create_dataset_item()` API reference including upsert behavior and status options, see the [SDK Documentation](https://app.gitbook.com/s/jHEEbkpMbUW2x51XS8Ez/datasets#create_dataset_item-source).

***

### Translating Content

Click the translate icon in the dataset item header to translate the Input, Expected Output, and Metadata sections into your preferred language. The Runs section is not included in the translation.

<div data-with-frame="true"><figure><img src="https://708770081-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1ICwJbq7EJdn5kBgXnQu%2Fuploads%2FrhzwCr1WCatX0bWae0KC%2Fimage.png?alt=media&#x26;token=5b673e0c-acc2-46f9-a051-cfda420c6f47" alt=""><figcaption></figcaption></figure></div>
