# Databases: Vector Storage for Scout Agents

> Scout Databases store structured records with automatic embeddings for semantic search. Use them for RAG apps, knowledge bases, and agent data retrieval.

Scout Databases are the primary way to store, search, and retrieve information for your AI agents. Every piece of text you add to a Database is automatically embedded and indexed, so agents can find the most relevant content by meaning — not just by matching exact words. Whether you're building a customer support chatbot, a documentation assistant, or an internal knowledge base, Databases give your agents the data retrieval layer they need.

## What Are Databases?

A **Database** is a container made up of one or more **Tables**. Each Table holds **Documents**: structured records that combine metadata fields (such as a title, category, or timestamp) with a text body that is automatically embedded for semantic search.

```text theme={null}
Database
├── Table 1
│   ├── Document 1 (metadata + text)
│   ├── Document 2 (metadata + text)
│   └── ...
├── Table 2
│   └── ...
└── Sources (sync integrations)
```

For example, a customer support team might have a **Help Center** database with two tables: `FAQs` and `Troubleshooting Guides`. Each document in `FAQs` could have a `category` column, a `last_updated` timestamp, and a text body containing the answer. Agents can search across both tables at once or scope their query to a single table.
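
To make the hierarchy concrete, here is a minimal in-memory sketch of the Help Center example. The class names, the `documents()` helper, and the sample records are hypothetical illustrations of the data model, not part of the Scout SDK or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    id: str
    text: str                                      # body that would be embedded
    metadata: dict = field(default_factory=dict)   # e.g. category, last_updated


@dataclass
class Table:
    name: str
    documents: list[Document] = field(default_factory=list)


@dataclass
class Database:
    name: str
    tables: dict[str, Table] = field(default_factory=dict)

    def documents(self, table: Optional[str] = None) -> list[Document]:
        """Return documents from one table, or from every table when table is None."""
        if table is not None:
            return list(self.tables[table].documents)
        return [doc for t in self.tables.values() for doc in t.documents]


# Hypothetical sample data mirroring the Help Center example.
help_center = Database(
    name="Help Center",
    tables={
        "FAQs": Table("FAQs", [
            Document("faq_1", "Reset your password from the account settings page.",
                     {"category": "account", "last_updated": "2024-01-15"}),
        ]),
        "Troubleshooting Guides": Table("Troubleshooting Guides", [
            Document("guide_1", "If login fails, clear cached credentials and retry.",
                     {"category": "login", "last_updated": "2024-02-03"}),
        ]),
    },
)

all_docs = help_center.documents()          # search scope: both tables
faq_docs = help_center.documents("FAQs")    # search scope: a single table
```

Scoping a query to one table corresponds to passing a table name here; searching across the whole database corresponds to leaving it out.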

## Search Modes

Databases support three distinct search modes. Choose one based on the kind of queries you expect.

| Mode                  | How It Works                                                                 | Best For                                             |
| --------------------- | ---------------------------------------------------------------------------- | ---------------------------------------------------- |
| **Semantic (Vector)** | Converts your query to an embedding and finds documents with similar vectors | Natural language questions, finding related concepts |
| **Keyword (BM25)**    | Matches exact keywords using traditional full-text search                    | Product codes, SKUs, technical identifiers           |
| **Hybrid**            | Fuses semantic and keyword results using Reciprocal Rank Fusion (RRF)        | General-purpose production search                    |

### Semantic Search

Semantic search finds relevant content even when the user's words don't appear verbatim in the document. A query for `"how do I reset my password"` can surface documents about `"account recovery"` or `"login troubleshooting"` because the embeddings capture meaning, not just terms.
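
The idea can be sketched with cosine similarity over toy vectors. This is an illustration only: the three-dimensional vectors below are hand-written stand-ins for real embeddings, and both the similarity metric and the `min_similarity` cutoff are assumptions rather than a description of Scout's internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, doc_vectors, min_similarity=0.5):
    """Score every document, drop weak matches, and sort best-first."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in doc_vectors.items()]
    kept = [(name, score) for name, score in scored if score >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hand-written toy vectors; real embeddings come from a model and are much larger.
doc_vectors = {
    "account recovery": [0.9, 0.1, 0.0],
    "login troubleshooting": [0.7, 0.3, 0.1],
    "shipping rates": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.0]  # stands in for "how do I reset my password"

results = rank_by_similarity(query_vector, doc_vectors)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

None of the document names share a word with the query, yet the two account-related documents score far above the unrelated one because their vectors point in a similar direction.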

### Keyword Search

Keyword search uses BM25 to find documents containing exact keyword matches. Use it when precision matters — for example, when users search for a specific error code like `ERR_CERT_AUTHORITY_INVALID`.
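
For intuition, the following is a compact sketch of Okapi BM25 scoring. The tokenizer, the `k1` and `b` values, and the IDF variant are common textbook choices assumed for illustration; Scout's actual tokenization and parameters are not documented here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter, digit, or underscore."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue  # BM25 only rewards terms that literally occur
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores.append(score)
    return scores


docs = [
    "Fix ERR_CERT_AUTHORITY_INVALID by installing the missing root certificate.",
    "Certificate problems usually come from an expired or untrusted authority.",
    "Reset your password from the account settings page.",
]
scores = bm25_scores("ERR_CERT_AUTHORITY_INVALID", docs)
```

Only the first document receives a non-zero score. The second discusses the same topic but never contains the exact token, which is the behavior you want when searching for identifiers.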

### Hybrid Search

Hybrid search combines both approaches using Reciprocal Rank Fusion. Results from the semantic pass and the keyword pass are merged and re-ranked, so you get the precision of keyword matching alongside the conceptual coverage of vector search. **For most production applications, hybrid search is the recommended default.**
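
Reciprocal Rank Fusion can be sketched in a few lines: each document earns `1 / (k + rank)` from every result list it appears in, and the sums decide the final order. The constant `k=60` is the value commonly used in the RRF literature and is an assumption here, as are the document IDs; this is not Scout's implementation.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of document IDs into one list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from the two passes, best match first.
semantic_results = ["doc_recovery", "doc_login", "doc_billing"]
keyword_results = ["doc_login", "doc_error_code", "doc_recovery"]

fused = reciprocal_rank_fusion([semantic_results, keyword_results])
print(fused)
```

Documents that appear near the top of both lists (`doc_login`, `doc_recovery`) rise above documents found by only one pass. Because RRF uses ranks rather than raw scores, it does not need the semantic and keyword scores to be on a comparable scale.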

## Databases vs. Drive

Scout offers two storage systems. Use this table to pick the right one for your use case.

| Feature       | Databases & Tables                  | Drive                                  |
| ------------- | ----------------------------------- | -------------------------------------- |
| **Purpose**   | Structured data with vector search  | Raw file storage (PDFs, images, docs)  |
| **Search**    | Semantic, keyword, or hybrid        | By path or filename                    |
| **Use Case**  | RAG, knowledge bases, CRM data      | Assets, attachments, generated outputs |
| **AI Access** | Agents search by meaning            | Agents read and write files directly   |
| **Sync**      | Notion, Google Sheets, web scraping | Manual upload or agent writes          |

### When to Use Databases

Choose Databases when your agents need to find information by meaning:

* Building a chatbot that answers questions from your internal docs (RAG)
* Creating a searchable knowledge repository for support or onboarding
* Storing and querying customer or CRM records
* Surfacing content by concept rather than exact path

### When to Use Drive

Choose Drive when your agents need raw file access:

* Storing PDFs, images, and other binary assets
* Saving generated reports and workflow outputs
* Passing files between workflow steps
* Reading files by exact path without semantic querying

## Quick Start

Get up and running with Databases in four steps.

<Steps>
  <Step title="Create a Database">
    1. Navigate to **Databases** in the Scout dashboard.
    2. Click **+ New** at the top of the page.
    3. Enter a name and optional description.
    4. Click **Create**.

    Scout provisions and indexes your Database automatically. Setup takes about 30 seconds, and the UI shows the current status while it runs.
  </Step>

  <Step title="Create and Configure a Table">
    Every new Database comes with an `Untitled` table. Rename it and add columns to match your data:

    1. Click the **+** button in the table header row.
    2. Enter a column name and select a type: `Single Line Text`, `Multi Line Text`, `Number`, `Checkbox`, or `URL`.
    3. Repeat for each field your documents need.

    Store your main searchable text in a column named `content` — Scout automatically chunks and embeds this field for semantic search.
  </Step>

  <Step title="Add Documents">
    Add documents via the REST API or Python SDK.

    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://api.scoutos.com/v2/collections/{collection_id}/tables/{table_id}/documents \
        -H "Authorization: Bearer YOUR_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "documents": [{
            "id": "doc_1",
            "text": "Your searchable content here...",
            "title": "Document Title",
            "category": "documentation"
          }]
        }'
      ```

      ```python Python theme={null}
      from scoutos import Scout

      client = Scout(api_key="YOUR_API_KEY")

      client.documents.create(
          collection_id="col_abc123",
          table_id="tab_xyz789",
          documents=[{
              "id": "doc_1",
              "text": "Your searchable content here...",
              "title": "Document Title",
              "category": "documentation"
          }]
      )
      ```
    </CodeGroup>
  </Step>

  <Step title="Query Your Data">
    Search your database from a workflow or directly via the API.

    **In a Workflow**, add a **Query Database Table** block and configure it:

    ```yaml theme={null}
    Search Term: "{{inputs.user_question}}"
    Minimum Similarity: 0.5
    Hybrid Search: true
    Limit: 10
    ```

    **Via the API:**

    ```bash theme={null}
    curl -X POST https://api.scoutos.com/v2/collections/{collection_id}/tables/{table_id}/query \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "search_term": "customer support",
        "min_similarity": 0.5,
        "limit": 10
      }'
    ```
  </Step>
</Steps>
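
If you prefer Python to cURL for the query step, the same request can be assembled with the standard library. This sketch mirrors the endpoint, headers, and body fields of the cURL query example above; the helper name `build_query_request` is hypothetical, and the response shape is not specified on this page, so the example stops short of parsing it.

```python
import json
import urllib.request

API_BASE = "https://api.scoutos.com/v2"


def build_query_request(api_key, collection_id, table_id, search_term,
                        min_similarity=0.5, limit=10):
    """Build (but do not send) a POST request for the table query endpoint."""
    url = f"{API_BASE}/collections/{collection_id}/tables/{table_id}/query"
    payload = {
        "search_term": search_term,
        "min_similarity": min_similarity,
        "limit": limit,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder IDs and key, as in the examples above.
request = build_query_request("YOUR_API_KEY", "col_abc123", "tab_xyz789",
                              "customer support")

# To send it with a real key:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Building the request separately from sending it makes the URL and payload easy to inspect or log before any network call is made.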

## Using Databases with Agents

Give your agents the ability to read from and write to Databases by following these two steps.

### 1. Enable Databases Tools

Open your agent in Scout, go to the **Tools** tab, and enable the Databases tools. This grants the agent permission to query tables and create or update documents.

### 2. Add an Instruction Snippet

Add the following to your agent's system prompt to guide its retrieval behavior:

```markdown theme={null}
When a task depends on organizational knowledge or structured records:

1. Query Databases first.
2. Prefer hybrid search for broad user questions.
3. Use metadata filters when the user specifies a category, date, or status.
4. If information is missing and the user provided new facts, write the new record
   to the correct table.
5. In your reply, clearly distinguish between retrieved data and newly added data.
```

### Prompt Examples

* "Search our support knowledge base for account recovery steps and summarize the answer."
* "Find onboarding docs updated in the last 30 days and return only security-related items."
* "Add this meeting note to the `customer_feedback` table with category `enterprise`."

### Expected Agent Behavior

When configured correctly, your agent will:

* Query the correct table before formulating an answer
* Apply metadata filters when the user's request includes constraints like category or date
* Write records only when explicitly asked or when your instructions allow it
* Cite which data came from Databases in its final response
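
To picture what applying metadata filters means for a request like "onboarding docs updated in the last 30 days, security-related only", here is a client-side sketch over already-retrieved records. The record shape, the field names `category` and `last_updated`, and the `filter_records` helper are assumptions for illustration; this is not Scout's filter syntax.

```python
from datetime import date, timedelta


def filter_records(records, category=None, updated_within_days=None, today=None):
    """Keep records matching the category and updated within the given window."""
    today = today or date.today()
    kept = []
    for record in records:
        if category is not None and record.get("category") != category:
            continue
        if updated_within_days is not None:
            cutoff = today - timedelta(days=updated_within_days)
            if date.fromisoformat(record["last_updated"]) < cutoff:
                continue
        kept.append(record)
    return kept


# Hypothetical retrieved records.
records = [
    {"id": "doc_1", "category": "security", "last_updated": "2024-05-20"},
    {"id": "doc_2", "category": "security", "last_updated": "2024-01-02"},
    {"id": "doc_3", "category": "billing", "last_updated": "2024-05-25"},
]

recent_security = filter_records(
    records,
    category="security",
    updated_within_days=30,
    today=date(2024, 6, 1),  # fixed date so the example is reproducible
)
```

Here only `doc_1` satisfies both constraints: `doc_2` is too old and `doc_3` is in the wrong category.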

## Next Steps

<CardGroup cols={3}>
  <Card title="Creating Databases" icon="table" href="/databases/creating-databases">
    Create databases, configure table schemas, and populate data via the UI or API.
  </Card>

  <Card title="Sources" icon="rotate" href="/databases/sources">
    Sync data automatically from Notion, Google Sheets, websites, and more.
  </Card>

  <Card title="Querying Data" icon="magnifying-glass" href="/databases/querying-data">
    Master semantic search, hybrid search, and advanced metadata filtering.
  </Card>
</CardGroup>