Query Collection Table Block
Query and retrieve data from Scout collection tables
The Query Collection Table block enables users to perform advanced searches on a specified collection table within Scout. This block supports multiple search configurations, including vector and hybrid searches, and allows for dynamic query and filter construction using Jinja templating. It is ideal for retrieving data with fine-tuned control over ranking, relevance, and filtering criteria.
Configurations
Select from a dropdown of which collection to query.
Select from a dropdown the table within the selected collection to query.
The term or phrase to search for in the selected table. This input supports Jinja templating for dynamic query construction. For example, use {{inputs.user_message}}
to dynamically query based on user input.
Minimum similarity threshold (0 to 1). The default value is 0.35
. Use a lower value (e.g., 0.2
) when searching for highly specific matches, such as product names or technical terms. Use a higher value (e.g., 0.8
) when searching for broader results. Adjust this value to fine-tune the search results according to your needs.
Toggles between a hybrid search that combines vector search (semantic relevance) and BM25 (textual relevance). Default is false
, which only performs vector-based semantic search. Enable this when you need results that account for both semantic meaning and exact keyword matches.
The weighting factor for hybrid search. A value between 0 and 1, where 0 relies fully on BM25 (text-based ranking) and 1 relies fully on vector search (semantic ranking). Default is 0.5
.
A list of filters to narrow down search results. Filters are specified as JSON arrays, e.g., ["column_id", "operator", "value"]
. You can get the column ids from your table by copying it from the table UI, or using the Get a table by ID endpoint.
Here’s a detailed list of the available filtering parameters:
Maximum number of results to return. The default value is 10
.
Detailed Filter Examples
Filters are a powerful way to narrow down the results by applying specific conditions to attributes in your data, but can be a bit confusing to understand and apply correctly.
In order to use a filter, you will need to specify the column’s ID you want to apply the filter to. You can get the column ids from your table by copying it from the table UI, or using the Get a table by ID endpoint.
Below is a detailed list of examples, along with explanations for each use case:
Basic Filter Examples
Basic Filter Examples
-
Equality Filter
Input:["status_column_id", "Eq", "active"]
Explanation: Filters results to include only rows where thestatus
attribute is equal toactive
.
Use Case: Retrieve all active users, orders, or documents. -
Inequality Filter
Input:["rating_column_id", "NotEq", 5]
Explanation: Excludes results where therating
attribute equals5
.
Use Case: Exclude perfect ratings when analyzing feedback. -
Greater Than Filter
Input:["age_column_id", "Gt", 30]
Explanation: Filters results to include only rows where theage
attribute is greater than30
.
Use Case: Find users older than 30 for demographic analysis. -
Less Than Filter
Input:["price_column_id", "Lt", 50]
Explanation: Filters results to include only rows where theprice
attribute is less than50
.
Use Case: Retrieve all products priced below $50 for a sale or discount offer.
Range Filter Examples
Range Filter Examples
-
Between Filter
Input:["And", [["date_column_id", "Gte", 1672531200], ["date_column_id", "Lte", 1703980800]]]
(Timestamps for “2023-01-01” and “2023-12-31”) Explanation: Filters results to include only rows where thedate
attribute falls within the specified range.
Use Case: Retrieve all transactions or events within the year 2023. -
Not Between Filter Input:
["And", [["score_column_id", "Lte", 60], ["score_column_id", "Gte", 80]]]
Explanation: Excludes results where thescore
attribute is between 60 and 80.
Use Case: Exclude average scores to focus on high or low outliers.
Substring and Pattern Matching Filters
Substring and Pattern Matching Filters
-
Contains Filter
Input:["description_column_id", "IGlob", "%urgent%"]
Explanation: Filters results to include only rows where thedescription
attribute contains the word “urgent”.
Use Case: Find all tasks or tickets marked as urgent. -
Not Contains Filter
Input:["content_column_id", "NotIGlob", "%draft%"]
Explanation: Excludes results where thecontent
attribute contains the word “draft”.
Use Case: Retrieve only published or final content. -
Contains 1 of X Values Filter
Input:["Or", [["url_column_id", "IGlob", "%/blog/%"], ["url_column_id", "IGlob", "%/docs/%"]]]
Explanation: Include results where theurl
attribute contains “/blog/” or “/docs/”.
Use Case: Retrieve only relevant page scrape results from your blog or docs.
In and NotIn
In and NotIn Examples
-
In Filter
Input:["priority_column_id", "In", ["high", "critical"]]
Explanation: Filters results to include rows where thepriority
attribute is eitherhigh
orcritical
.
Use Case: Retrieve high-priority and critical tasks for immediate action. -
Not In Filter
Input:["category_column_id", "NotIn", ["archived", "deleted"]]
Explanation: Filters results to exclude rows where thecategory
attribute isarchived
ordeleted
.
Use Case: Retrieve only active and visible items.
Best Practices for Filters
- Combine Filters: Use logical combinations to create complex filtering rules.
- Dynamic Filters: Use Jinja templating to construct filters dynamically, e.g.,
["status_column_id", "Eq", "{{inputs.user_message}}"]
.- When building dynamic filters, remember to ensure string santization to ensure valid filter structure. For example, if the
{{inputs.user_message}}
may contain double quotes or new lines, try something like:["status_column_id", "Eq", "{{inputs.user_message|tojson}}"]
- When building dynamic filters, remember to ensure string santization to ensure valid filter structure. For example, if the
- Test Incrementally: Start with simple filters to validate results before adding more complexity.
- Optimize for Performance: Use specific filters (e.g., equality or range filters) to reduce the result set and improve search speed.
Outputs
The block outputs a list of results from the queried collection table, based on the specified search criteria.
Example Search Configurations
-
Search for Product Reviews:
- Search Term: “great quality”
- Minimum Similarity:
0.7
- Hybrid Search:
true
- Filters:
["rating_column_id", "Gt", 4]
- Limit:
5
Outcome: Returns up to 5 reviews that highly match “great quality” in their semantic meaning and have a rating greater than 4.
-
Retrieve Active Users:
- Search Term: “John Doe”
- Filters:
["status_column_id", "Eq", "active"]
- Limit:
10
Outcome: Lists up to 10 active users named “John Doe” or with a similar semantic match.
-
Find Overdue Tasks:
- Search Term: “overdue”
- Minimum Similarity:
0.8
- Filters:
["due_date_column_id", "Lt", 1704067200]
(Timestamp for “2025-01-01”) - Limit:
20
Outcome: Returns tasks that are semantically related to “overdue” and have a due date before January 1, 2025.
-
Search Customer Feedback for Complaints:
- Search Term: “bad service”
- Hybrid Search:
true
- Filters:
["category_column_id", "Eq", "complaints"]
- Limit:
15
Outcome: Retrieves up to 15 feedback entries categorized as complaints and semantically matching “bad service.”
-
Identify High-Priority Support Tickets:
- Search Term: “urgent issue”
- Minimum Similarity:
0.35
- Filters:
["priority_column_id", "Eq", "high"]
- Limit:
10
Outcome: Returns support tickets with high semantic relevance to “urgent issue” and marked as high priority.
-
Search for Products in a Price Range:
- Search Term: “wireless headphones”
- Filters:
["And", [["price_column_id", "Gte", 50], ["price_column_id", "Lte", 150]]]
- Limit:
10
Outcome: Finds up to 10 wireless headphones priced between 150.
-
Filter Blog Posts by Tags:
- Search Term: “AI trends”
- Hybrid Search:
true
- Filters:
["tags_column_id", "Glob", "AI"]
- Limit:
5
Outcome: Returns up to 5 blog posts tagged with “AI” and semantically related to “AI trends.”
Best Practices
-
Use Jinja Templating: Dynamically construct queries, filters, and ranking criteria based on user input or workflow variables for flexible and adaptive searches.
-
Fine-Tune Parameters: Experiment with
Minimum Similarity
,Hybrid Search
, andAlpha
to achieve the desired balance between precision and recall. -
Apply Filters Thoughtfully: Filters can greatly enhance search efficiency by narrowing the scope of results to the most relevant subset.
-
Leverage Hybrid Search: Use Hybrid Search for complex queries requiring a balance of semantic understanding and keyword-based ranking.
-
Limit Results: Set a reasonable
Limit
to prevent overwhelming outputs and ensure the relevance of returned data.