Sources
Sources keep your tables up to date by syncing external data into Collections. Use Sources when you want repeatable ingestion instead of manual document entry.
What Sources Do
Each source runs a sync job that:
- Pulls data from an external system
- Maps that data to your table columns
- Creates or updates documents in the destination table
You can run sources manually or on a schedule.
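The three steps of a sync job can be modeled in a few lines of Python. This is an illustrative sketch only, not Scout's implementation: `run_sync`, the `pull` and `map_record` callables, the in-memory `table` dict, and the choice of `url` as the matching key are all assumptions made for the example.

```python
from typing import Callable, Iterable

Record = dict[str, str]


def run_sync(
    pull: Callable[[], Iterable[Record]],
    map_record: Callable[[Record], Record],
    table: dict[str, Record],
    key_column: str = "url",
) -> dict[str, int]:
    """Pull records, map them to columns, then create or update documents.

    Hypothetical model: `table` is a dict keyed by `key_column`,
    standing in for the destination table.
    """
    stats = {"created": 0, "updated": 0}
    for record in pull():                  # 1. pull from the external system
        document = map_record(record)      # 2. map fields to table columns
        key = document[key_column]
        if key in table:                   # 3. update an existing document...
            table[key].update(document)
            stats["updated"] += 1
        else:                              #    ...or create a new one
            table[key] = document
            stats["created"] += 1
    return stats


def pull_example() -> list[Record]:
    # Stand-in for an external system such as a website or spreadsheet.
    return [
        {"title": "Intro", "content": "Hello", "url": "https://example.com/intro"},
        {"title": "Setup", "content": "Steps", "url": "https://example.com/setup"},
    ]


table: dict[str, Record] = {}
first_run = run_sync(pull_example, dict, table)   # `dict` copies each record unchanged
second_run = run_sync(pull_example, dict, table)
```

In this model the first run creates both documents and the second run only updates them, which is the property that makes a sync safe to repeat manually or on a schedule.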
Source Types
Scout supports multiple source types, and a single table can use more than one:
| Source Type | What It Pulls | Best For |
|---|---|---|
| Web Scrape | Website pages via a single URL, a crawl, or a sitemap | Public docs, help centers, blogs |
| Notion | Notion pages and databases | Internal knowledge bases and team wikis |
| Google Sheets | Rows from spreadsheets | Operational data and structured lists |
Create a Source
- Open your Collection and select a table
- Click Sources
- Click Add Source
- Choose a source type
- Configure mapping and frequency
- Run the first sync
Source Mapping
Each source returns slightly different fields. Map those fields to columns in your table.
Typical mappings:
- title -> title
- content -> content
- url -> url
- updated_at -> updated_at
If you are syncing long-form text for retrieval, map the main body into your content column.
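As a rough illustration of what a mapping does, the sketch below renames fields from a source record to table columns. The source field names (`page_title`, `body`, `link`, `last_edited`, `author`) and the `apply_mapping` helper are hypothetical; real sources expose their own field names, and Scout configures mappings through the Sources panel rather than through code like this.

```python
def apply_mapping(record: dict[str, str], mapping: dict[str, str]) -> dict[str, str]:
    """Rename source fields to table columns; fields without a mapping are dropped."""
    return {
        column: record[field]
        for field, column in mapping.items()
        if field in record
    }


# Hypothetical source record whose field names differ from the table's columns.
source_record = {
    "page_title": "Getting Started",
    "body": "Long-form text used for retrieval.",
    "link": "https://example.com/getting-started",
    "last_edited": "2024-01-15",
    "author": "not mapped",
}

# source field -> table column
mapping = {
    "page_title": "title",
    "body": "content",  # the main body goes into the content column
    "link": "url",
    "last_edited": "updated_at",
}

document = apply_mapping(source_record, mapping)
```

Here `document` ends up with exactly the four table columns, and the unmapped `author` field is left out.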
Sync Frequency
Frequency is optional. You can:
- Run once manually
- Enable a schedule for automatic refresh
Use schedules for content that changes frequently, such as docs portals or active spreadsheets.
Monitoring and Re-runs
From the Sources panel, you can:
- View run status and sync history
- Inspect errors and logs
- Edit source configuration
- Re-run failed or completed jobs
Web Scraping in Sources
Web Scraping is a source type. Scout supports:
- Single Page for one URL
- Website Crawl for linked pages on a site
- Sitemap for controlled URL discovery from sitemap.xml
For full configuration details, see Web Scraping.
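To show why a sitemap gives controlled URL discovery, here is a minimal sketch that reads the `<loc>` entries from a sitemap.xml document using Python's standard library. It assumes the standard sitemaps.org namespace; the `urls_from_sitemap` helper, the prefix filter, and the sample URLs are illustrative and say nothing about how Scout processes sitemaps internally.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str, prefix: str = "") -> list[str]:
    """Return the <loc> URLs in a sitemap, optionally keeping only a given prefix."""
    root = ET.fromstring(xml_text)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if url.startswith(prefix)]


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/docs/setup</loc></url>
  <url><loc>https://example.com/blog/news</loc></url>
</urlset>"""

docs_urls = urls_from_sitemap(sample, prefix="https://example.com/docs/")
```

Unlike a crawl, which follows whatever links it finds, the set of pages here is fixed by the sitemap and can be narrowed further before anything is fetched.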
Notion in Sources
Notion is a source type for syncing workspace pages and databases into your table.
For setup and mapping guidance, see Notion.
Google Sheets in Sources
Google Sheets is a source type for syncing spreadsheet rows into a table.
For setup and mapping guidance, see Google Sheets.
Best Practices
- Start with a small test sync before large runs
- Keep column mappings explicit and stable
- Use schedules only where freshness matters
- Review failed runs regularly and fix mapping drift quickly
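One way to read "mapping drift" is that a source's fields have changed since the mapping was set up, for example after a spreadsheet column is renamed. The sketch below compares a mapping against a sample record to surface that. The `find_mapping_drift` helper and the field names are hypothetical, and this check is something you would run yourself, not a Scout feature.

```python
def find_mapping_drift(
    mapping: dict[str, str], sample_record: dict[str, str]
) -> dict[str, list[str]]:
    """Compare mapped source fields with the fields a source actually returns."""
    return {
        # Fields the mapping expects but the source no longer provides.
        "missing_in_source": sorted(f for f in mapping if f not in sample_record),
        # Fields the source provides that no column receives.
        "unmapped_in_source": sorted(f for f in sample_record if f not in mapping),
    }


mapping = {"title": "title", "content": "content", "url": "url"}

# Hypothetical record after the source renamed "content" to "body".
latest_record = {
    "title": "Intro",
    "body": "Hello",
    "url": "https://example.com/intro",
}

drift = find_mapping_drift(mapping, latest_record)
```

A non-empty `missing_in_source` list is the signal to update the mapping before the next scheduled run.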
Next Steps
- Web Scraping: Configure crawl settings and extraction options
- Notion: Connect and sync Notion pages and databases
- Google Sheets: Sync spreadsheet rows into table documents
- Creating Collections: Design tables for source data
- Querying Data: Search synced content with semantic and hybrid search