Collections & Tables
If you're building a RAG app, a knowledge base, or any workflow that needs to find the right information at the right time, Collections are where your data lives.
Collections let you store structured records, automatically embed the text content for semantic search, and query everything by meaning or keyword from a workflow or agent.
What are Collections?
A Collection is a group of Tables, and each Table holds Documents: structured records with metadata fields and a text body. The text is automatically embedded and indexed so you can search it semantically.
Collection
├── Table 1
│   ├── Document 1 (metadata + text)
│   ├── Document 2 (metadata + text)
│   └── ...
├── Table 2
│   └── ...
└── Sources (sync integrations)
For example, a customer support team might have a "Help Center" collection with two tables: "FAQs" and "Troubleshooting Guides." Each document in "FAQs" could have a category column, a last_updated date, and a text body with the answer. Agents can then search across both tables or filter to just one.
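To make the hierarchy concrete, here is a minimal in-memory sketch of the Collection, Table, and Document relationship using the Help Center example. The class names, fields, and the `all_documents` helper are illustrative assumptions, not Scout's actual data model or SDK types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    # One record: an ID, a text body (the part that would be embedded), and metadata.
    id: str
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Table:
    name: str
    documents: list = field(default_factory=list)


@dataclass
class Collection:
    name: str
    tables: dict = field(default_factory=dict)

    def add_table(self, name: str) -> Table:
        table = Table(name)
        self.tables[name] = table
        return table

    def all_documents(self, table_name: Optional[str] = None) -> list:
        # Scope: a single table if one is named, otherwise every table in the collection.
        if table_name is not None:
            tables = [self.tables[table_name]]
        else:
            tables = list(self.tables.values())
        return [doc for table in tables for doc in table.documents]


help_center = Collection("Help Center")
faqs = help_center.add_table("FAQs")
guides = help_center.add_table("Troubleshooting Guides")
faqs.documents.append(Document(
    "faq_1",
    "Use the reset link on the sign-in page.",
    {"category": "account", "last_updated": "2024-05-01"},
))
guides.documents.append(Document(
    "guide_1",
    "Restart the sync if it stalls.",
    {"category": "sync"},
))

print(len(help_center.all_documents()))                      # both tables
print([d.id for d in help_center.all_documents("FAQs")])     # one table only
```

Scoping a lookup to one table or to the whole collection mirrors the choice an agent has when it searches across both tables or filters to just one.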
Key Features
- Vector Search: Text content is automatically embedded and indexed for semantic search
- Hybrid Search: Combine vector (semantic) search with keyword (BM25) search for better coverage
- Structured Metadata: Filter and sort results using typed columns
- Source Syncs: Automatically sync data from Notion, Google Sheets, web scrapes and more
- Workflow Integration: Query and save data directly from your workflows
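This page does not say how Scout merges vector and keyword results for hybrid search. As an illustration of the general idea only, the sketch below uses reciprocal rank fusion (RRF), one common way to combine two ranked lists. The function name, the constant `k=60`, and the sample document IDs are assumptions made for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into a single ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks near the top.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_2", "doc_1", "doc_3"]    # hypothetical semantic ranking
keyword_hits = ["doc_1", "doc_4", "doc_2"]   # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_1', 'doc_2', 'doc_4', 'doc_3']
```

Documents that appear in both lists rise to the top, while documents found by only one method still make it into the results, which is the "better coverage" that hybrid search aims for.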
Collections vs. Drive
| Feature | Collections & Tables | Drive |
|---|---|---|
| Purpose | Structured data & vector search | File storage (PDFs, images) |
| Search | Semantic/vector search | By path/name |
| Use Case | RAG, knowledge bases, CRMs | Assets, attachments, media |
| AI Access | Agents can search semantically | Agents can read files |
When to Use Collections
Use Collections when you need to find information by meaning:
- Building a chatbot that answers questions from your data (RAG)
- Creating a searchable documentation or knowledge repository
- Storing and querying customer or CRM data
- Surfacing content by concept, not just exact keywords
When to Use Drive
Use Drive when you need raw file access:
- Storing PDFs, images and documents
- Managing media assets for workflows
- Reading files by path without complex querying
Quick Start
Create a Collection
- Navigate to Collections in the Scout dashboard
- Click + New
- Enter a name and description
- Click Create
Scout provisions the underlying vector database automatically (takes about 30 seconds).
Create a Table
When you create a Collection, it automatically includes an "Untitled" table. Customize it:
- Click the + button in the table header to add columns
- Choose column types (Single Line Text, Multi Line Text, Number, Checkbox, URL)
- Add data manually or connect a Source
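Before loading data into a table, it can help to check rows against the column types you chose. The sketch below is a local, hypothetical validator: the mapping from column types to Python checks and the `validate_row` helper are assumptions for illustration and are not part of the Scout SDK.

```python
from urllib.parse import urlparse


def _is_url(value):
    # Accept only absolute http(s) URLs in this sketch.
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical checks for the column types listed above.
COLUMN_CHECKS = {
    "Single Line Text": lambda v: isinstance(v, str) and "\n" not in v,
    "Multi Line Text": lambda v: isinstance(v, str),
    "Number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "Checkbox": lambda v: isinstance(v, bool),
    "URL": _is_url,
}


def validate_row(schema, row):
    """Return a list of error messages; an empty list means the row fits."""
    errors = []
    for column, column_type in schema.items():
        if column not in row:
            continue  # Missing values are treated as optional in this sketch.
        if not COLUMN_CHECKS[column_type](row[column]):
            errors.append(f"{column}: expected {column_type}")
    return errors


schema = {
    "title": "Single Line Text",
    "text": "Multi Line Text",
    "views": "Number",
    "published": "Checkbox",
    "link": "URL",
}
good_row = {"title": "FAQ", "text": "Line one\nLine two", "views": 12,
            "published": True, "link": "https://example.com/faq"}
bad_row = {"title": "FAQ", "views": "12", "link": "not a url"}

print(validate_row(schema, good_row))  # []
print(validate_row(schema, bad_row))   # ['views: expected Number', 'link: expected URL']
```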
Add Documents
Via the API:
curl -X POST https://api.scoutos.com/v2/collections/{collection_id}/tables/{table_id}/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [{
      "id": "doc_1",
      "text": "Your searchable content here...",
      "title": "Document Title",
      "category": "documentation"
    }]
  }'
Using Python:
from scoutos import Scout

client = Scout(api_key="YOUR_API_KEY")

# Replace the collection and table IDs with your own.
client.documents.create(
    collection_id="col_abc123",
    table_id="tab_xyz789",
    documents=[{
        "id": "doc_1",
        "text": "Your searchable content here...",
        "title": "Document Title"
    }]
)
Query Your Data
In a Workflow:
Use the Query Collection Table block to search your data:
Search Term: "{{inputs.user_question}}"
Minimum Similarity: 0.5
Hybrid Search: true
Limit: 10
Via the API:
curl -X POST https://api.scoutos.com/v2/collections/{collection_id}/tables/{table_id}/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "search_term": "customer support",
    "min_similarity": 0.5,
    "limit": 10
  }'
Use with Agents
1) Enable Tools
In the agent's Tools tab, enable the tools that let the agent read and write Collections data.
2) Add Instruction Snippet
Add this to your agent instructions:
When a task depends on organizational knowledge or structured records:
1. Query Collections first.
2. Prefer hybrid search for broad user questions.
3. Use metadata filters when the user specifies category, date or status.
4. If information is missing and the user provided new facts, write the new record to the correct table.
5. In your reply, distinguish clearly between retrieved data and newly added data.
3) Prompt Examples
- "Search our support knowledge base for account recovery steps and summarize the answer."
- "Find onboarding docs updated in the last 30 days and return only security-related items."
- "Add this meeting note to the customer_feedback table with category enterprise."
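The second prompt above combines two metadata constraints: a category and a recency window. Scout's own filter syntax is not shown on this page, so the sketch below only illustrates the logic on plain dictionaries. The `filter_documents` helper, its parameters, and the sample records are hypothetical; the `category` and `last_updated` field names follow the Help Center example.

```python
from datetime import date, timedelta


def filter_documents(documents, category=None, updated_within_days=None, today=None):
    """Keep documents that match the category and fall inside the recency window."""
    today = today or date.today()
    results = []
    for doc in documents:
        if category is not None and doc.get("category") != category:
            continue
        if updated_within_days is not None:
            # Assumes last_updated is stored as an ISO date string (YYYY-MM-DD).
            last_updated = date.fromisoformat(doc["last_updated"])
            if today - last_updated > timedelta(days=updated_within_days):
                continue
        results.append(doc)
    return results


docs = [
    {"id": "doc_1", "category": "security", "last_updated": "2024-05-20"},
    {"id": "doc_2", "category": "security", "last_updated": "2024-01-10"},
    {"id": "doc_3", "category": "billing", "last_updated": "2024-05-25"},
]
recent_security = filter_documents(
    docs, category="security", updated_within_days=30, today=date(2024, 6, 1)
)
print([d["id"] for d in recent_security])  # ['doc_1']
```

Passing `today` explicitly keeps the example deterministic; in real use it would default to the current date.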
4) Expected Behavior
- The agent queries the correct table before answering
- The agent applies filters when constraints are present
- The agent writes records only when asked or when instructions allow it
- The final response cites what came from Collections data
Next Steps
- Creating Collections: Learn how to create collections, configure tables and define schemas
- Sources: Set up syncs from web, Notion and Google Sheets
- Querying Data: Master semantic search, hybrid search and advanced filtering