Web Search Block | Scout

The Web Search block enables users to perform web searches and extract content from search engine results. It provides options for filtering results by time, including or excluding specific domains, and processing the extracted text into manageable chunks. This block is useful for gathering data from the web and integrating it into Scout workflows.

Configuration (Required)

Search Engine Query

string (Required)

Enter what you’re looking for. This is the main query used to search the web.

Search Results To Scrape

integer

The maximum number of search results to process. The default value is 1.

Time Filter

string

Filter search results by time range. The default value is Any time. Options include:

Any time
Past hour
Past 24 hours
Past week
Past month
Past year

Include Domains

list

List of domains to include in the search results. The default is an empty list, which will include results from all domains.

Exclude Domains

list

List of domains to exclude from the search results. The default is an empty list, which excludes no domains.
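The way Include Domains and Exclude Domains combine can be pictured with a minimal Python sketch. This is an illustration only, not Scout's implementation: the function name `domain_allowed`, the subdomain matching, and the rule that an exclude match takes precedence over an include match are all assumptions.

```python
from urllib.parse import urlparse


def domain_allowed(url, include_domains=(), exclude_domains=()):
    """Return True if the URL passes the include/exclude lists (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()

    def matches(domain):
        domain = domain.lower()
        # Assumption: a listed domain also covers its subdomains.
        return host == domain or host.endswith("." + domain)

    # Assumption: an exclude match always wins.
    if any(matches(d) for d in exclude_domains):
        return False
    # An empty include list places no restriction on domains.
    if include_domains and not any(matches(d) for d in include_domains):
        return False
    return True
```

With both lists empty, every URL passes, which matches the documented defaults.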

Split Page Text

boolean

Toggle whether or not the extracted text is chunked into smaller sections. The default value is true.

Splitter Strategy

string

The strategy to use for splitting text when Split Page Text is enabled. The default value is Smart Splitter.
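To give a rough idea of what chunking extracted text means, here is a minimal Python sketch of a paragraph-based splitter. It is not the Smart Splitter, whose behavior is not described here; the function name `split_text` and the `max_chars` limit are hypothetical and chosen only for illustration.

```python
def split_text(text, max_chars=200):
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip blank paragraphs
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # In this sketch, a paragraph longer than max_chars is kept whole.
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be scored and returned as a separate result, rather than treating a whole page as a single block of text.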

Max Results to Return

integer

The maximum number of results to return after processing the scraped content. The default value is 0. If this is not set or is set to 0, it defaults to the number of search results.

Content Capture Mode

string

How to capture web pages: Thorough (processes everything including JavaScript, more complete but slower) or Quick (basic HTML only, faster). The default value is Quick.

Minimum Similarity Score

float

The minimum similarity score for a result to be considered relevant. Set to 0.0 to include all results. The default value is 0.0.
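The interaction between Minimum Similarity Score and Max Results to Return can be sketched as follows. This is a hedged illustration, not Scout's code: the function name `select_results`, the `score` key, and the choice to sort by descending score before truncating are assumptions.

```python
def select_results(results, min_score=0.0, max_results=0):
    """Keep results at or above min_score, then optionally cap the count."""
    kept = [r for r in results if r["score"] >= min_score]
    # Assumption: the highest-scoring results are preferred when truncating.
    kept.sort(key=lambda r: r["score"], reverse=True)
    # In this sketch, 0 means no additional cap.
    return kept[:max_results] if max_results > 0 else kept
```

For example, with scores of 0.2, 0.9, and 0.5, a minimum score of 0.4 keeps only the 0.9 and 0.5 results, and a cap of 1 then leaves only the 0.9 result.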

Page Search Term

string

The term to search for within the top search results. The default value is an empty string, in which case the Search Engine Query is used.

Text Extractor

string

The method to use for extracting text from web pages. The default value is readability.
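The defaults described above can be summarized in a small Python sketch. The values come from this page, but the key names, the `build_config` helper, and the dictionary layout are hypothetical and do not reflect Scout's actual configuration schema.

```python
import copy

# Hypothetical key names; the values are the defaults stated in this section.
WEB_SEARCH_DEFAULTS = {
    "search_results_to_scrape": 1,
    "time_filter": "Any time",
    "include_domains": [],
    "exclude_domains": [],
    "split_page_text": True,
    "splitter_strategy": "Smart Splitter",
    "max_results_to_return": 0,
    "content_capture_mode": "Quick",
    "minimum_similarity_score": 0.0,
    "page_search_term": "",
    "text_extractor": "readability",
}


def build_config(query, **overrides):
    """Merge overrides onto the defaults; the query is the only required field."""
    if not query or not query.strip():
        raise ValueError("Search Engine Query is required")
    unknown = set(overrides) - set(WEB_SEARCH_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown options: {sorted(unknown)}")
    config = {
        "search_engine_query": query,
        **copy.deepcopy(WEB_SEARCH_DEFAULTS),
        **overrides,
    }
    # Page Search Term falls back to the Search Engine Query when empty.
    if not config["page_search_term"]:
        config["page_search_term"] = query
    return config
```

Calling `build_config("solar panel efficiency", content_capture_mode="Thorough")` would change only the capture mode and leave every other option at its documented default.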

See Workflow Logic & State > State Management for details on using dynamic variables in this block.

Outputs

The block outputs a list of extracted web page results, each containing text, similarity score, canonical URL, and metadata.

Usage Context

Use this block to perform web searches and extract content from search engine results for integration into Scout workflows.

Best Practices

Ensure that the query is specific to obtain relevant search results.
Use the time filter to narrow down results to a specific time range if needed.
Specify include or exclude domains to refine the search scope.
Set an appropriate minimum similarity score to filter out less relevant results.
Consider the content capture mode based on the need for thoroughness versus speed.