Query Collection Table Block
Query and retrieve data from Scout collection tables
The Query Collection Table block enables users to perform advanced searches on a specified collection table within Scout. This block supports multiple search configurations, including vector and hybrid searches, and allows for dynamic query and filter construction using Jinja templating. It is ideal for retrieving data with fine-tuned control over ranking, relevance, and filtering criteria.
Configuration (Required)
Select from a dropdown of which collection to query.
Select from a dropdown the table within the selected collection to query. This input supports Jinja templating for dynamic table selection.
The term or phrase to search for in the selected table. This input supports Jinja templating for dynamic query construction. For example, use {{inputs.user_message}}
to dynamically query based on user input.
Advanced Configuration and Usage
Vector Minimum Distance
Threshold for filtering search results based on vector distance, which determines how closely the result matches the search term semantically. Lower values mean stricter matches, and the range is from 0 to 2. Default is 1
.
- Example Input:
0.8
- Effect: Returns results that are highly semantically similar to the search term, excluding loosely related matches.
- Use Case: Use a lower value (e.g.,
0.5
) when searching for highly specific matches, such as product names or technical terms.
Hybrid Search
Toggles between a hybrid search that combines vector search (semantic relevance) and BM25 (textual relevance). Default is false
, which only performs vector-based search.
- Example Input:
true
- Effect: Combines vector-based relevance and BM25 scoring for a balanced result.
- Use Case: Enable this when you need results that account for both semantic meaning and exact keyword matches.
BM25 Rank By
A list of attributes and corresponding queries for BM25 ranking. This allows you to control how textual relevance is calculated.
- Example Input:
[{"attribute": "content", "query": "hello"}]
- Effect: Results are ranked higher if they contain the word “hello” in the “content” field.
- Use Case: Use this to prioritize documents containing specific terms in specific fields.
Alpha
The weighting factor for hybrid search. A value between 0 and 1, where 0 relies fully on BM25 (text-based ranking) and 1 relies fully on vector search (semantic ranking). Default is 0.5
.
- Example Input:
0.7
- Effect: Gives more weight to semantic matching while still considering textual relevance.
- Use Case: Use higher alpha values (e.g.,
0.8
) when searching for documents with nuanced, semantic meanings.
Filters
A list of filters to narrow down search results. Filters are specified as JSON arrays, e.g., ["attribute", "operator", "value"]
.
Filters are used to narrow down the results by applying specific conditions to attributes in your data. Below is a detailed list of examples, along with explanations for each use case:
Basic Filter Examples
Basic Filter Examples
-
Equality Filter
Input:["status", "Eq", "active"]
Explanation: Filters results to include only rows where thestatus
attribute is equal toactive
.
Use Case: Retrieve all active users, orders, or documents. -
Inequality Filter
Input:["rating", "Ne", "5"]
Explanation: Excludes results where therating
attribute equals5
.
Use Case: Exclude perfect ratings when analyzing feedback. -
Greater Than Filter
Input:["age", "Gt", "30"]
Explanation: Filters results to include only rows where theage
attribute is greater than30
.
Use Case: Find users older than 30 for demographic analysis. -
Less Than Filter
Input:["price", "Lt", "50"]
Explanation: Filters results to include only rows where theprice
attribute is less than50
.
Use Case: Retrieve all products priced below $50 for a sale or discount offer.
Range Filter Examples
Range Filter Examples
-
Between Filter
Input:["date", "Between", [1672531200, 1703980800]]
(Timestamps for “2023-01-01” and “2023-12-31”) Explanation: Filters results to include only rows where thedate
attribute falls within the specified range.
Use Case: Retrieve all transactions or events within the year 2023. -
Not Between Filter
Input:["score", "NotBetween", ["60", "80"]]
Explanation: Excludes results where thescore
attribute is between 60 and 80.
Use Case: Exclude average scores to focus on high or low outliers.
Substring and Pattern Matching Filters
Substring and Pattern Matching Filters
-
Contains Filter
Input:["description", "Contains", "urgent"]
Explanation: Filters results to include only rows where thedescription
attribute contains the word “urgent”.
Use Case: Find all tasks or tickets marked as urgent. -
Not Contains Filter
Input:["content", "NotContains", "draft"]
Explanation: Excludes results where thecontent
attribute contains the word “draft”.
Use Case: Retrieve only published or final content. -
Starts With Filter
Input:["name", "StartsWith", "John"]
Explanation: Filters results to include only rows where thename
attribute starts with “John”.
Use Case: Find all users or contacts with names beginning with “John”. -
Ends With Filter
Input:["email", "EndsWith", "example.com"]
Explanation: Filters results to include only rows where theemail
attribute ends with “example.com”.
Use Case: Retrieve all users with email addresses from a specific domain.
Null and Empty Filters
Null and Empty Filters
-
Is Null Filter
Input:["last_login", "IsNull", null]
Explanation: Filters results to include only rows where thelast_login
attribute is null (i.e., no value).
Use Case: Find all users who have never logged in. -
Is Not Null Filter
Input:["last_login", "IsNotNull", null]
Explanation: Filters results to include only rows where thelast_login
attribute is not null.
Use Case: Retrieve all users who have logged in at least once.
Advanced Examples
Advanced Examples
-
In Filter
Input:["priority", "In", ["high", "critical"]]
Explanation: Filters results to include rows where thepriority
attribute is eitherhigh
orcritical
.
Use Case: Retrieve high-priority and critical tasks for immediate action. -
Not In Filter
Input:["category", "NotIn", ["archived", "deleted"]]
Explanation: Filters results to exclude rows where thecategory
attribute isarchived
ordeleted
.
Use Case: Retrieve only active and visible items. -
Custom Attribute Filter
Input:["custom_field", "Eq", "specific_value"]
Explanation: Filters results to include rows where thecustom_field
attribute matchesspecific_value
.
Use Case: Retrieve data for custom-defined attributes in your collection.
Best Practices for Filters
- Combine Filters: Use logical combinations to create complex filtering rules.
- Dynamic Filters: Use Jinja templating to construct filters dynamically, e.g.,
["status", "Eq", "{{inputs.user_message}}"]
. - Test Incrementally: Start with simple filters to validate results before adding more complexity.
- Optimize for Performance: Use specific filters (e.g., equality or range filters) to reduce the result set and improve search speed.
Outputs
The block outputs a list of results from the queried collection table, based on the specified search criteria.
Example Search Configurations
-
Search for Product Reviews:
- Search Term: “great quality”
- Vector Minimum Distance:
0.7
- Hybrid Search:
true
- BM25 Rank By:
[{"attribute": "review", "query": "great quality"}]
- Filters:
["rating", "Gt", "4"]
- Limit:
5
Outcome: Returns up to 5 reviews that highly match “great quality” in their semantic meaning and have a rating greater than 4.
-
Retrieve Active Users:
- Search Term: “John Doe”
- Filters:
["status", "Eq", "active"]
- Limit:
10
Outcome: Lists up to 10 active users named “John Doe” or with a similar semantic match.
-
Find Overdue Tasks:
- Search Term: “overdue”
- Vector Minimum Distance:
0.8
- Filters:
["due_date", "Lt", 1704067200]
(Timestamp for “2025-01-01”) - Limit:
20
Outcome: Returns tasks that are semantically related to “overdue” and have a due date before January 1, 2025.
-
Search Customer Feedback for Complaints:
- Search Term: “bad service”
- Hybrid Search:
true
- BM25 Rank By:
[{"attribute": "feedback", "query": "bad service"}]
- Filters:
["category", "Eq", "complaints"]
- Limit:
15
Outcome: Retrieves up to 15 feedback entries categorized as complaints and semantically matching “bad service.”
-
Identify High-Priority Support Tickets:
- Search Term: “urgent issue”
- Vector Minimum Distance:
0.6
- Filters:
["priority", "Eq", "high"]
- Limit:
10
Outcome: Returns support tickets with high semantic relevance to “urgent issue” and marked as high priority.
-
Search for Products in a Price Range:
- Search Term: “wireless headphones”
- Filters:
[["price", "Gt", "50"], ["price", "Lt", "150"]]
- Limit:
10
Outcome: Finds up to 10 wireless headphones priced between 150.
-
Filter Blog Posts by Tags:
- Search Term: “AI trends”
- Hybrid Search:
true
- BM25 Rank By:
[{"attribute": "tags", "query": "AI trends"}]
- Filters:
["tags", "Contains", "AI"]
- Limit:
5
Outcome: Returns up to 5 blog posts tagged with “AI” and semantically related to “AI trends.”
-
Find Users by Registration Date:
- Search Term: “new users”
- Filters:
[["registration_date", "Gt", 1704067200], ["registration_date", "Lt", 1735689600]]
(Timestamps for “2025-01-01” and “2025-12-31”) - Limit:
50
Outcome: Retrieves up to 50 users who registered between January 1, 2025, and December 31, 2025.
-
Search for Missing Information:
- Search Term: “missing data”
- Vector Minimum Distance:
0.9
- Filters:
["email", "IsNull", null]
- Limit:
25
Outcome: Finds entries semantically related to “missing data” where the email field is null.
-
Search for Top Performing Sales Reps:
- Search Term: “top performer”
- Filters:
["sales", "Gt", "100000"]
- BM25 Rank By:
[{"attribute": "performance_review", "query": "top performer"}]
- Hybrid Search:
true
- Alpha:
0.8
- Limit:
5
Outcome: Retrieves the top 5 sales reps with sales exceeding $100,000 and high semantic relevance to “top performer” in their performance review.
Best Practices
-
Use Jinja Templating: Dynamically construct queries, filters, and ranking criteria based on user input or workflow variables for flexible and adaptive searches.
-
Fine-Tune Parameters: Experiment with
Vector Minimum Distance
,Alpha
, andBM25 Rank By
to achieve the desired balance between precision and recall. -
Apply Filters Thoughtfully: Filters can greatly enhance search efficiency by narrowing the scope of results to the most relevant subset.
-
Leverage Hybrid Search: Use Hybrid Search for complex queries requiring a balance of semantic understanding and keyword-based ranking.
-
Limit Results: Set a reasonable
Limit
to prevent overwhelming outputs and ensure the relevance of returned data.