Save Document to Table Block
The Save Document to Table block allows you to save data from your workflow into a table within a collection. This block is perfect for storing workflow results, logging actions, or building databases from automated processes.
Overview
This block saves a document (row) to a specified table in one of your collections. Each field in the document can pull values from previous blocks in your workflow using template variables, making it easy to build dynamic workflows that store structured data.
When to Use This Block
Use the Save Document to Table block when you want to:
- ✅ Store workflow results in a structured format
- ✅ Log workflow executions for audit trails
- ✅ Build databases from automated processes
- ✅ Save data extracted from previous blocks
- ✅ Create records that can be viewed and managed in your Collections interface
- ✅ Build searchable knowledge bases with semantic search capabilities
Configuration
Required Fields
- Collection - Select the collection where you want to save the document
  - You can create a new collection directly from this dropdown if needed
- Table - Select the table within the collection
  - The table must exist in the selected collection
  - You can create a new table directly from this dropdown if needed
- Values - Configure one or more fields to save:
  - Column: Select which column in the table to populate
  - Type: Choose the data type (string, number, boolean, or JSON)
  - Value: Enter the value using Jinja2 template syntax
Field Configuration
For each field you want to save, you need to specify:
- Column: The column ID from your table schema
- Type: One of four supported types:
  - string - Text values
  - number - Numeric values (integers or decimals)
  - boolean - True/false values
  - json - Structured JSON data
- Value: A Jinja2 template that can reference data from previous blocks
Template Variables
You can use Jinja2 template syntax in the Value field to reference data from previous blocks in your workflow. The template has access to all the state from previous blocks.
Accessing Block Output
To reference data from a previous block, use the block’s ID followed by the output path, in the form {{ block_id.output.field_name }}.
For example, if you have a block with ID extract_data whose output contains the name “John Doe”, the email “john@example.com”, and the score 95, you would reference these values like:
- {{ extract_data.output.name }} → “John Doe”
- {{ extract_data.output.email }} → “john@example.com”
- {{ extract_data.output.score }} → 95
Common Template Patterns
Simple text value:
Concatenated strings:
Formatted text:
Direct number:
Boolean from condition:
JSON object:
Or reference a JSON object from another block:
Conditional values:
Using date/time from previous blocks:
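The patterns listed above can be illustrated with a few hedged examples. The sketch below collects one plausible Jinja2 template per pattern in a Python dictionary. The block IDs (extract_data, check_tier, extract_email) and the output field names are hypothetical, and the expressions rely only on standard Jinja2 syntax (the `~` concatenation operator, inline `if`, and the built-in `tojson` filter). The small helper is also an assumption, included to show how the block IDs a template depends on could be listed.

```python
import re

# Illustrative Value templates, one per pattern listed above.
# Block IDs and field names are hypothetical placeholders.
TEMPLATE_PATTERNS = {
    "simple_text": "{{ extract_data.output.name }}",
    "concatenated": "{{ extract_data.output.first_name ~ ' ' ~ extract_data.output.last_name }}",
    "formatted_text": "Ticket from {{ extract_data.output.name }} ({{ extract_data.output.email }})",
    "direct_number": "{{ extract_data.output.score }}",
    "boolean_from_condition": "{{ extract_data.output.score >= 90 }}",
    "json_literal": '{"source": "workflow", "priority": 1}',
    "json_from_block": "{{ extract_data.output | tojson }}",
    "conditional": "{{ 'premium' if check_tier.output.premium else 'standard' }}",
    "date_from_block": "{{ extract_email.output.date }}",
}


def referenced_block_ids(template):
    """Return the block IDs a template reads through the `<block_id>.output` path."""
    return set(re.findall(r"\b([A-Za-z_]\w*)\.output\b", template))
```

Listing the referenced IDs this way is one option for checking that every block a template depends on runs before the save block.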
Type Casting
The block automatically converts values to match the selected type. This helps ensure data consistency:
- String: Converts any value to text
- Number: Converts strings like “123” to the number 123
- Boolean: Converts truthy/falsy values to true/false
- JSON: Parses JSON strings into structured objects
Example: Type Conversions
If you set the type to number and the template evaluates to "42", it will be saved as the number 42, not the string "42".
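The conversions described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the function name cast_value is hypothetical, and the accepted truthy spellings for booleans are a guess, since the block's exact casting rules are not documented here.

```python
import json


def cast_value(rendered, type_name):
    """Convert a rendered template string to the selected field type."""
    text = str(rendered).strip()
    if type_name == "string":
        return str(rendered)
    if type_name == "number":
        # Prefer an integer when the text parses as one, else fall back to float.
        try:
            return int(text)
        except ValueError:
            return float(text)
    if type_name == "boolean":
        # Assumed truthy spellings; the block's real rules may differ.
        return text.lower() in {"true", "1", "yes"}
    if type_name == "json":
        return json.loads(text)
    raise ValueError(f"unsupported type: {type_name}")
```

A value such as "abc" with type number raises ValueError in this sketch, which mirrors the kind of failure covered under Type Conversion Errors in Troubleshooting.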
Step-by-Step Example
Let’s walk through saving a customer record after extracting data from an email:
- Add the block to your workflow
- Select Collection: Choose “Customer Database”
- Select Table: Choose “Customers”
- Configure Fields:
  - Field 1:
    - Column: full_name
    - Type: string
    - Value: {{ extract_email.output.customer_name }}
  - Field 2:
    - Column: email_address
    - Type: string
    - Value: {{ extract_email.output.email }}
  - Field 3:
    - Column: signup_date
    - Type: string
    - Value: {{ extract_email.output.date }}
  - Field 4:
    - Column: is_premium
    - Type: boolean
    - Value: {{ check_tier.output.premium }}
- Run the workflow - The document will be saved to your table
Complete Example Workflow
Here’s a complete example workflow that processes a support ticket and saves it:
Collection: The collection to save the document to. Ensure that the collection ID is correct to avoid saving data to the wrong collection.
Table: The table to save to. Ensure that the table ID is correct and exists within the desired collection.
Values: A list of field mappings to save. Each field maps a column in your table to a value. The default is an empty list. This field supports Jinja2 template syntax, allowing for dynamic content generation.
Each item in the Values list contains:
- Column (string, required): The column ID in the target table where the value will be saved
- Type (string, required): The data type of the value. Must be one of: “string”, “number”, “boolean”, or “json”
- Value (string, required): The value to save, using Jinja2 template syntax. This field supports dynamic content generation by referencing data from previous blocks in your workflow.
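The shape of a Values item can be modeled as a small data class. This is a sketch for illustration, not the block's actual schema: the class name FieldMapping and the lowercase attribute names are assumptions, while the column IDs and templates are taken from the step-by-step example.

```python
from dataclasses import dataclass

SUPPORTED_TYPES = {"string", "number", "boolean", "json"}


@dataclass(frozen=True)
class FieldMapping:
    column: str  # column ID in the target table
    type: str    # one of SUPPORTED_TYPES
    value: str   # Jinja2 template rendered at run time

    def __post_init__(self):
        # All three parts are required, and the type must be a supported one.
        if not self.column:
            raise ValueError("column is required")
        if self.type not in SUPPORTED_TYPES:
            raise ValueError(f"type must be one of {sorted(SUPPORTED_TYPES)}")


# Two of the mappings from the customer example.
values = [
    FieldMapping("full_name", "string", "{{ extract_email.output.customer_name }}"),
    FieldMapping("is_premium", "boolean", "{{ check_tier.output.premium }}"),
]
```

Validating the type name up front catches typos such as "bool" before the workflow runs.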
Understanding Vector Databases and Search
When you save documents to a table, you’re creating a database that can be searched in different ways. Tables can be configured with vector indexes, enabling powerful semantic search capabilities alongside traditional keyword search.
What is a Vector Database?
A vector database stores data as embeddings - mathematical representations (vectors) that capture the semantic meaning of text. Each document’s content is converted into a high-dimensional vector (typically 768 dimensions) that represents its meaning in a way that computers can understand and compare.
Semantic Search vs Keyword Search
There are two primary ways to search your saved documents:
Keyword Search (Traditional)
How it works:
- Searches for exact word matches or phrases
- Uses techniques like BM25 (a ranking algorithm) or full-text search
- Looks for specific terms in the document text
- Fast and precise for exact matches
Best for:
- Finding documents containing specific terms
- Searching for exact phrases or names
- Cases where terminology is consistent
- Structured queries with specific keywords
Example:
- Query: "customer support ticket"
- Finds: Documents that contain the exact words “customer”, “support”, and “ticket”
Semantic Search (Vector Search)
How it works:
- Converts your search query into a vector (embedding)
- Compares the query vector against all document vectors
- Uses cosine similarity to find documents with similar meaning
- Understands context, synonyms, and related concepts
Best for:
- Finding documents with similar meaning, even without exact word matches
- Natural language queries
- Concept-based searches
- Handling synonyms and variations in terminology
Example:
- Query: "user having trouble accessing their account"
- Finds: Documents about “login issues”, “authentication problems”, “account access errors” - even if they don’t contain the exact words from your query
Hybrid Search
Many tables support hybrid search, which combines both approaches:
- Keyword search ensures you find exact matches and important terms
- Semantic search finds conceptually similar content
- Results are combined and ranked intelligently using Reciprocal Rank Fusion (RRF)
Benefits:
- More comprehensive results
- Balances precision (keyword) with recall (semantic)
- Better handles queries with both specific terms and conceptual intent
Example:
- Query: "customer complaint about billing"
- Keyword search finds: Documents with “billing”, “complaint”, “customer”
- Semantic search finds: Documents about “payment issues”, “invoice problems”, “account charges”
- Hybrid combines both for the best results
When to Use Hybrid Search vs. Semantic Search
Choosing between hybrid search and pure semantic search depends on your use case, query types, and data characteristics. Here’s a practical guide:
Use Hybrid Search When:
1. You Need Both Precision and Recall
- ✅ Users search with both specific terms AND natural language
- ✅ You want to catch exact matches while also finding conceptually similar content
- ✅ Your data contains both technical terms and descriptive content
- ✅ You need to balance finding exact keywords with understanding intent
Example Use Cases:
- Knowledge bases where users might search for “API documentation” (keyword) or “how to integrate with our service” (semantic)
- Support ticket systems where users search by ticket number (keyword) or describe their problem (semantic)
- Product catalogs with both SKU numbers (keyword) and product descriptions (semantic)
2. Your Queries Mix Specific Terms with Concepts
- Query: "React hooks useState tutorial"
- Hybrid search finds: Documents with “React”, “hooks”, “useState” (keyword) AND documents about “state management in React” (semantic)
3. You Want Maximum Coverage
- Hybrid search ensures you don’t miss results that might only appear in one search method
- Better for general-purpose search where query types vary
4. Your Data Has Technical Terminology
- Technical terms, product names, or codes need exact matching
- But you also want to find related concepts and explanations
Use Pure Semantic Search When:
1. Queries are Primarily Natural Language
- ✅ Users describe what they’re looking for in conversational language
- ✅ Exact keyword matching is less important than understanding intent
- ✅ You want to find conceptually related content even without exact word matches
Example Use Cases:
- Customer support chatbots where users describe problems
- Content discovery systems (“find articles about similar topics”)
- Research and knowledge exploration
- Recommendation systems
2. Your Data Has Synonym-Rich Content
- Same concepts expressed in many different ways
- Terminology varies across documents
- You want to find all variations of an idea
Example:
- Query: "feeling overwhelmed at work"
- Semantic search finds: Documents about “stress management”, “burnout prevention”, “work-life balance” - even without exact word matches
3. You Prioritize Conceptual Understanding
- Finding related ideas and concepts is more important than exact matches
- Users explore topics rather than search for specific items
4. Your Queries are Ambiguous or Context-Dependent
- Query meaning depends on context
- Semantic search better understands context and intent
Hybrid Search Weighting (Alpha Parameter)
The alpha parameter controls how much weight semantic (vector) search has compared to keyword (BM25) search in hybrid search results. Understanding alpha is crucial for tuning your search to match your specific needs.
Alpha Range:
- Alpha = 0.0: Pure keyword search (BM25 only, no vector search)
- Alpha = 0.5: Balanced hybrid (default, equal weighting)
- Alpha = 1.0: Pure semantic search (vector only, no keyword search)
How Alpha Works in Reciprocal Rank Fusion (RRF)
Hybrid search uses Reciprocal Rank Fusion (RRF) to combine results from both search methods. Alpha controls the vector search contribution in the fusion formula:
RRF_score = 1 / (k + BM25_rank) + alpha × 1 / (k + Vector_rank)
Where:
- k is a smoothing constant (typically 60) that dampens the advantage of the very top ranks
- BM25_rank is the document’s rank from keyword search (1st = rank 1, 2nd = rank 2, etc.)
- Vector_rank is the document’s rank from semantic search
- alpha multiplies the vector search contribution
What This Means:
- Lower alpha: Vector search has less influence on final rankings
- Higher alpha: Vector search has more influence on final rankings
- BM25 always contributes: Even with alpha = 0, BM25 is still part of the formula (but vector search is effectively ignored when alpha = 0)
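To make the weighting concrete, here is a minimal Python sketch that reads the definitions above as score = 1/(k + BM25_rank) + alpha × 1/(k + Vector_rank). The function names and the sample ranks are hypothetical, and the formula is reconstructed from the description rather than taken from the product's implementation, so treat the numbers as illustrative. Under this reading, alpha = 1.0 weights both rankings equally, so the exact behavior at the extremes may differ in the real system.

```python
def hybrid_score(bm25_rank, vector_rank, alpha=0.5, k=60):
    """Weighted RRF: 1/(k + BM25_rank) + alpha * 1/(k + Vector_rank)."""
    return 1.0 / (k + bm25_rank) + alpha * (1.0 / (k + vector_rank))


def rank_documents(ranks, alpha=0.5, k=60):
    """Sort document names by fused score, best first.

    `ranks` maps a name to (bm25_rank, vector_rank), both 1-based.
    """
    return sorted(
        ranks,
        key=lambda name: hybrid_score(*ranks[name], alpha=alpha, k=k),
        reverse=True,
    )


# Hypothetical ranks for two documents returned by both searches.
ranks = {
    "React Hooks Tutorial": (1, 5),    # best keyword match
    "State Management Guide": (3, 1),  # best semantic match
}
```

With these sample ranks, rank_documents(ranks, alpha=0.3) puts the keyword match first, while alpha=0.8 puts the semantic match first.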
Understanding Alpha Values
Alpha = 0.0 - Pure Keyword Search
- Only keyword (BM25) results are considered
- Vector search runs but doesn’t affect rankings
- Best when exact term matching is critical
- Use for: Technical documentation, code searches, product SKUs
Alpha = 0.1 - 0.3 - Strongly Favor Keyword
- Keyword search dominates, but semantic search provides some boost
- Documents that rank well in both methods get extra boost
- Use for: API documentation, technical specs, structured data search
Alpha = 0.4 - 0.6 - Balanced (Recommended)
- Both search methods contribute significantly
- Good balance between precision (keyword) and recall (semantic)
- Default value (0.5) is a good starting point
- Use for: General knowledge bases, customer support, most use cases
Alpha = 0.7 - 0.9 - Favor Semantic Search
- Semantic search has more influence on rankings
- Still benefits from keyword precision for exact matches
- Use for: Content discovery, research, natural language queries
Alpha = 1.0 - Pure Semantic Search
- Only semantic (vector) results are considered
- Keyword search runs but doesn’t affect rankings
- Use for: Conceptual exploration, finding related ideas
Visual Example: How Alpha Affects Rankings
Let’s say you search for “React hooks tutorial” and these documents are found:
Key Observations:
- With alpha = 0.3: Keyword ranking dominates, exact matches prioritized
- With alpha = 0.5: Balanced - “State Management Guide” gets equal boost from both
- With alpha = 0.8: Semantic search has more influence - conceptually related content ranks higher
Adjust Alpha Based On:
Lower Alpha (0.0 - 0.4) - Favor Keyword Search:
Use when:
- Exact term matching is critical
- Technical terminology is important
- Users search with specific keywords
- Product names, codes, or IDs need to be found
Examples:
- API documentation (alpha = 0.3)
- Code repositories (alpha = 0.2)
- Product catalogs with SKUs (alpha = 0.3)
- Technical specifications (alpha = 0.3-0.4)
Trade-offs:
- ✅ Excellent precision for exact matches
- ✅ Finds specific technical terms
- ❌ May miss conceptually related content
- ❌ Less effective for natural language queries
Medium Alpha (0.4 - 0.6) - Balanced:
Use when:
- You want the best of both worlds
- Queries mix specific terms and natural language
- General-purpose search across varied content
- Most knowledge bases and support systems
Examples:
- General knowledge bases (alpha = 0.5)
- Customer support systems (alpha = 0.5)
- Internal wikis (alpha = 0.5)
- Documentation with mixed content (alpha = 0.4-0.6)
Trade-offs:
- ✅ Balanced precision and recall
- ✅ Handles both keyword and semantic queries well
- ✅ Good default for most use cases
- ⚖️ May not be optimal for extreme use cases
Higher Alpha (0.6 - 1.0) - Favor Semantic Search:
Use when:
- Understanding intent is more important than exact matches
- Natural language queries are common
- Finding conceptually similar content
- Users explore topics rather than search for specific items
Examples:
- Content discovery (alpha = 0.8)
- Research and exploration (alpha = 0.7-0.9)
- Conversational interfaces (alpha = 0.7)
- Recommendation systems (alpha = 0.8-1.0)
Trade-offs:
- ✅ Excellent for finding related concepts
- ✅ Handles synonyms and variations well
- ✅ Better for natural language
- ❌ May miss exact keyword matches
- ❌ Less precise for specific technical terms
How to Choose the Right Alpha Value
Step 1: Start with Default
- Begin with alpha = 0.5 (balanced)
- This works well for most use cases
Step 2: Analyze Your Queries
- Are queries mostly keywords? → Lower alpha (0.3-0.4)
- Are queries natural language? → Higher alpha (0.7-0.8)
- Mixed queries? → Keep alpha = 0.5
Step 3: Test and Iterate
- Try different alpha values with real queries
- Compare result quality
- Adjust based on user feedback
Step 4: Monitor Results
- Track which results users find helpful
- Identify patterns in successful searches
- Fine-tune alpha based on data
Alpha vs. Minimum Similarity
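Important: Alpha and the minimum similarity threshold serve different purposes. Alpha controls how much weight semantic rankings get relative to keyword rankings when results are combined. The minimum similarity threshold filters out results whose similarity score is too low, regardless of how the remaining results are ranked.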
Common Alpha Patterns
Technical Documentation: alpha = 0.3
Customer Support: alpha = 0.5
Content Discovery: alpha = 0.8
Product Search: alpha = 0.3
Research/Exploration: alpha = 0.7-0.9
Advanced: Alpha and Result Quality
When Alpha is Too Low:
- Exact matches rank highly ✅
- But conceptually relevant content might be buried
- Users might miss helpful related information
When Alpha is Too High:
- Conceptually related content ranks well ✅
- But exact keyword matches might be ranked lower
- Users searching for specific terms might be frustrated
The Sweet Spot:
- Balance that matches your query patterns
- Usually between 0.4-0.6 for most use cases
- Adjust based on actual user behavior and feedback
Summary
- Alpha controls the weight of semantic search in hybrid rankings
- Range: 0.0 (keyword only) to 1.0 (semantic only), default 0.5 (balanced)
- Lower alpha (0.0-0.4): Favor keyword search for exact matches
- Medium alpha (0.4-0.6): Balanced, good for most use cases
- Higher alpha (0.6-1.0): Favor semantic search for conceptual matching
- Start with 0.5 and adjust based on your queries and results
- Alpha is separate from minimum similarity threshold (they control different things)
Decision Matrix
Performance Considerations
Hybrid Search:
- Slightly slower (runs two searches and combines results)
- More comprehensive results
- Better for varied query types
Semantic Search:
- Faster (single search operation)
- More focused on conceptual matching
- Better for natural language queries
Testing Your Choice
Start with hybrid search (alpha = 0.5) and adjust based on:
- Query Analysis: Review common queries - are they keyword-heavy or natural language?
- Result Quality: Check if results are too focused on keywords (raise alpha) or missing exact matches (lower alpha)
- User Feedback: Monitor which results users find most relevant
- A/B Testing: Try different alpha values and compare user engagement
Tip: For most use cases, hybrid search with alpha = 0.5 is a good starting point. Adjust based on your specific needs.
How Vector Search Works in Practice
When you save a document to a table with vector indexing enabled:
- Document Processing: The document’s content is automatically converted into a vector embedding using a machine learning model (typically Google’s text-embedding-004)
- Storage: The vector is stored alongside your document data in the table
- Search Time: When someone searches:
  - The search query is converted into a query vector
  - The system compares this vector against all document vectors
  - Documents are ranked by similarity (cosine distance)
  - Results are returned sorted by relevance
- Similarity Threshold: You can set a minimum similarity threshold (0-1) to filter out irrelevant results
Understanding Minimum Similarity Scores and Vector Distance
When working with vector search, understanding similarity scores and vector distance is crucial for getting the right results. These concepts control how relevant your search results are.
What is Vector Distance?
Vector distance measures how far apart two vectors are in high-dimensional space. In semantic search, we use cosine distance to compare query vectors with document vectors.
How Cosine Distance Works:
Because cosine distance is 1 minus the cosine of the angle between two vectors:
- Distance = 0.0: Vectors point in the exact same direction (identical meaning)
- Distance = 0.5: Vectors are 60° apart (moderately similar)
- Distance = 1.0: Vectors are orthogonal (unrelated meaning)
- Distance above 1.0, up to 2.0: Vectors point in opposing directions (uncommon for text embeddings)
Visual Analogy: Think of vectors as arrows in space. Cosine distance depends on the angle between the arrows:
- 0° angle (distance = 0): Arrows point the same direction → Very similar
- 60° angle (distance = 0.5): Arrows diverge moderately → Moderately similar
- 90° angle (distance = 1.0): Arrows are perpendicular → Unrelated
- 180° angle (distance = 2.0): Arrows point opposite directions → Opposite meaning
What is Similarity Score?
Similarity score is the complement of distance (similarity = 1 - distance), which makes it more intuitive to work with.
Similarity Score Range:
- Similarity = 1.0: Perfect match (distance = 0)
- Similarity = 0.5: Moderate similarity (distance = 0.5)
- Similarity = 0.0: No similarity (distance = 1.0)
Why Use Similarity Instead of Distance?
- Higher numbers = better matches (more intuitive)
- Easier to understand thresholds (“I want results with at least 0.7 similarity”)
- Standard practice in search systems
Minimum Similarity Threshold
The minimum similarity threshold (also called min_similarity) filters out results that aren’t similar enough to your query. Only documents with similarity scores at or above the threshold are returned.
How It Works:
- System calculates similarity for all documents
- Filters out documents below the threshold
- Returns only documents that meet or exceed the minimum similarity
Example:
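A minimal sketch of this filtering in Python follows. The function name filter_by_similarity and the sample scores are hypothetical; only the "meet or exceed" rule comes from the description above.

```python
def filter_by_similarity(results, min_similarity=0.5):
    """Keep (document, similarity) pairs at or above the threshold, best first."""
    kept = [(doc, sim) for doc, sim in results if sim >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for the query "how do I reset my password".
results = [
    ("Password reset instructions", 0.65),
    ("Billing FAQ", 0.31),
    ("Account recovery guide", 0.82),
]
```

With these scores, a threshold of 0.7 keeps only the account recovery guide, while 0.6 also keeps the password reset article.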
Choosing the Right Threshold
The threshold you choose depends on your use case and quality requirements:
Low Threshold (0.0 - 0.4):
- Use when: You want maximum recall (find everything potentially relevant)
- Trade-off: May include less relevant results
- Best for: Exploratory searches, research, content discovery
- Example: min_similarity = 0.3 for finding all related articles
Medium Threshold (0.4 - 0.7):
- Use when: You want balanced precision and recall
- Trade-off: Good balance between relevance and coverage
- Best for: General-purpose search, knowledge bases
- Example: min_similarity = 0.5 for customer support knowledge base
High Threshold (0.7 - 1.0):
- Use when: You need high precision (only very relevant results)
- Trade-off: May miss some relevant but less similar content
- Best for: Specific answers, exact matches, critical applications
- Example: min_similarity = 0.8 for finding exact technical documentation
Typical Threshold Values by Use Case
The Relationship Between Distance and Similarity
Remember: Distance and Similarity are complements of each other
Conversion Formula: Similarity = 1 - Distance, and Distance = 1 - Similarity
Why Thresholds Matter
Without a Threshold:
- All documents returned, even completely unrelated ones
- Low-quality results mixed with good ones
- Harder to find what you’re looking for
With Too Low a Threshold:
- Includes marginally relevant results
- More noise in results
- Harder to find the best matches
With Too High a Threshold:
- Only very similar results returned
- May miss relevant but differently worded content
- Fewer results overall
With the Right Threshold:
- Filters out noise while keeping relevant results
- Better user experience
- More focused, useful results
Practical Tips
1. Start with Defaults:
- Most systems default to min_similarity = 0.5 or 0.7
- Good starting point for most use cases
2. Adjust Based on Results:
- Too many irrelevant results? → Raise the threshold
- Missing relevant results? → Lower the threshold
- Too few results? → Lower the threshold
3. Test with Real Queries:
- Try different thresholds with actual user queries
- Monitor which results users find helpful
- Adjust based on feedback
4. Consider Your Content:
- Technical content (specific terminology): Higher threshold (0.6-0.8)
- General content (varied language): Lower threshold (0.4-0.6)
- Synonym-rich content: Lower threshold to catch variations
5. Monitor Search Quality:
- Track average similarity scores of returned results
- If consistently low, your content might need improvement
- If consistently high, threshold might be too restrictive
Example: Adjusting Thresholds
Scenario: Customer Support Knowledge Base
Initial Setup: min_similarity = 0.7
Problem: Users complain about missing relevant articles
Investigation:
- Review queries: “how do I reset my password”
- Found article: “password reset instructions” (similarity = 0.65)
- Article was filtered out because 0.65 < 0.7
Solution: Lower min_similarity below 0.65 so the article is no longer filtered out
Result: More relevant articles returned, users find what they need
Later Adjustment:
- Too many marginally relevant results appearing
- Raise to min_similarity = 0.65 for better balance
Advanced: Understanding Cosine Distance
Cosine distance measures the angle between vectors, not their magnitude:
Formula: cosine_distance = 1 - (A · B) / (||A|| × ||B||)
Where:
- A · B = dot product of vectors A and B
- ||A|| = magnitude (length) of vector A
- ||B|| = magnitude (length) of vector B
Key Insight: Cosine distance focuses on direction (meaning) rather than magnitude (length). This makes it perfect for semantic search because:
- Two documents with similar meaning point in similar directions
- Document length doesn’t affect similarity (important for comparing long vs. short documents)
- Focuses on semantic relationships, not word counts
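The calculation can be sketched in plain Python using the standard definition, cosine distance = 1 - (A · B) / (||A|| × ||B||). The function names are illustrative, and real systems typically compute this inside the vector index rather than in application code.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def similarity(a, b):
    """Similarity score as the complement of cosine distance."""
    return 1.0 - cosine_distance(a, b)
```

Because both magnitudes are divided out, scaling a vector (for example [1, 2] versus [2, 4]) leaves the distance unchanged, which is why length does not affect the score.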
Summary
- Vector Distance: Measures how different two vectors are (0 = identical direction, 1 = unrelated)
- Similarity Score: Complement of distance (1 = identical, 0 = unrelated)
- Minimum Similarity Threshold: Filters out results below a certain similarity
- Relationship: Similarity = 1 - Distance
- Best Practice: Start with 0.5-0.7, adjust based on your results and user feedback
When to Enable Vector Search
Enable vector indexing on your table when you want to:
- ✅ Search documents by meaning, not just keywords
- ✅ Find relevant content even with different wording
- ✅ Support natural language queries
- ✅ Build AI-powered search experiences
- ✅ Create knowledge bases that understand context
The Content Column: Special Handling for Vector Search
The content column is treated specially when your table has vector indexing enabled. Understanding this difference is crucial for building effective searchable databases.
What Makes the Content Column Different?
When a table has vector indexing, the content column receives special processing that other columns don’t:
- Automatic Text Chunking: The content column is automatically split into smaller chunks (typically 2,500 characters with 200 character overlap) using a RecursiveCharacter splitter. This allows:
  - Better handling of long documents
  - More precise search results (searching within relevant sections)
  - Improved embedding quality (smaller, focused chunks produce better embeddings)
- Vector Embedding Generation: Each chunk from the content column gets its own vector embedding, which enables semantic search
- Multiple Index Records: A single document with a long content field can create multiple index records (one per chunk), all linked back to the original document
- Vector Index Field: The content column is designated as the vector_index_field, meaning it’s the primary field used for vector similarity search
Other Columns: Standard Storage
Other columns in your table are stored differently:
- No Chunking: Other columns are stored as-is, without splitting
- Filterable Only: Other columns can be used for filtering and exact matching, but not for vector search
- Full-Text Search: String columns get basic full-text search capabilities (keyword search), but not semantic/vector search
- Metadata Storage: Other columns serve as metadata that can be filtered and displayed alongside search results
Practical Implications
Content Column:
- ✅ Use for the main searchable text (descriptions, articles, summaries, etc.)
- ✅ Automatically chunked for optimal search performance
- ✅ Enables semantic search (understanding meaning)
- ✅ Can be very long (will be chunked automatically)
- ✅ Best for natural language content
Other Columns:
- ✅ Use for structured data (titles, IDs, categories, dates, etc.)
- ✅ Stored as-is without chunking
- ✅ Used for filtering and exact matching
- ✅ Can be used for keyword search (if string type)
- ✅ Best for metadata, tags, and structured information
Example: Building a Knowledge Base
When saving a document to a knowledge base table:
How Chunking Works
When you save a document with a content column containing 10,000 characters:
- Original Document: One row in your table with all fields
- Index Records Created: Multiple chunks (e.g., 4-5 chunks of up to ~2,500 chars each, depending on where the splitter breaks and how much overlap it applies)
- Each Chunk Gets: Its own embedding vector
- Search Behavior: When someone searches, the system:
- Finds relevant chunks (not just whole documents)
- Returns the parent document with the matching chunk highlighted
- Maintains context through chunk overlap (200 characters)
Benefits:
- More precise search results (finds relevant sections, not just documents)
- Better handling of long documents
- Improved search relevance (smaller chunks = more focused embeddings)
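The effect of chunk size and overlap can be approximated with a fixed-size sliding window. This is a deliberately simplified stand-in, not the RecursiveCharacter splitter itself, which prefers to break on natural boundaries such as paragraphs and sentences. The function name chunk_text is hypothetical; the default sizes are the typical values quoted above.

```python
def chunk_text(text, chunk_size=2500, overlap=200):
    """Split text into fixed-size chunks whose neighbors share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks
```

In this simplified version each new chunk starts 2,300 characters after the previous one, so a 10,000-character text yields five chunks, with a shorter final chunk.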
Best Practices
For the Content Column:
- Save complete, meaningful text (not just keywords)
- Include context and full descriptions
- Write naturally (embeddings understand language, not just terms)
- Longer content is fine (it will be chunked automatically)
For Other Columns:
- Use structured, consistent values
- Keep them concise (they’re not chunked)
- Use for filtering, sorting, and display
- Consider what users will want to filter by
When Your Table Doesn’t Have a Content Column
If your table doesn’t have vector indexing enabled or doesn’t have a content column:
- All columns are treated equally (no special processing)
- No automatic chunking occurs
- Vector search is not available
- Only keyword/full-text search is available for string columns
Performance Considerations
- Vector search scales to large datasets when queries run against the table’s vector index
- Hybrid search provides the best balance of accuracy and coverage
- Similarity thresholds help filter irrelevant results and improve performance
- Vector indexes are optimized for fast similarity comparisons
Best Practices
1. Use Descriptive Field Names
Make sure your table columns have clear, descriptive names that match the data you’re saving.
2. Handle Missing Data
Use Jinja2 conditionals to handle cases where data might be missing, for example {{ extract_data.output.name if extract_data.output.name else "Unknown" }}.
3. Validate Data Types
Ensure the data type you select matches the actual data. For example:
- Use number for numeric calculations
- Use boolean for true/false values
- Use json for complex nested structures
4. Error Handling
The block will raise an error if:
- The API request fails (non-200 status)
- The table or column doesn’t exist
- The data type doesn’t match the column schema
Make sure to test your workflow with sample data before deploying.
5. Naming Conventions
Use consistent naming for your block IDs to make templates easier to write:
- extract_* for data extraction blocks
- process_* for data processing blocks
- save_* for save blocks
6. Optimizing for Vector Search
If your table has vector indexing enabled:
- Save comprehensive content: Store full, meaningful text in the content column for better semantic search
- Use descriptive text: Include context and details rather than just keywords
- Avoid abbreviations: Spell out terms to improve search quality
- Include synonyms: If possible, include alternative phrasings in your content
- Structure matters: Well-structured, complete sentences produce better embeddings than fragments
Output
After the block executes successfully, it returns:
- output: The response from the Collections API (typically includes the document ID and metadata)
- details: Execution metadata including elapsed_time_ms
You can reference this output in subsequent blocks using this block’s ID, for example {{ save_customer.output }} for a save block whose ID is save_customer.
Troubleshooting
“Failed to save document” Error
Possible causes:
- The table or column doesn’t exist
- The data type doesn’t match the column schema
- The template variable references a block that hasn’t run yet
- The template syntax is incorrect
Solutions:
- Verify the table and column exist in your collection
- Check that the data type matches the column’s expected type
- Ensure the referenced block runs before this block (check your workflow dependencies)
- Test your template syntax in a simple block first
Template Variable Not Found
If you see an error about a missing variable:
- Check that the block ID is correct (case-sensitive)
- Verify the block runs before this one in your workflow
- Check the output structure of the previous block to ensure the field path is correct
Type Conversion Errors
If type conversion fails:
- Check the actual value being generated by your template
- For JSON types, ensure the template produces valid JSON
- For numbers, ensure the template evaluates to a numeric value or numeric string
Column Not Found
If you see a column error:
- Refresh the column dropdown to ensure you have the latest schema
- Verify the column exists in the selected table
- Check that you’re using the correct column ID (not the display name)
Related Blocks
- Save Document to Collection (deprecated) - Older version that saves to collections without tables
- Other collection blocks that read or update documents
Tips
💡 Tip 1: Use the block’s output to chain multiple saves or create relationships between documents.
💡 Tip 2: Save workflow metadata (like execution time, status) alongside your data for better debugging.
💡 Tip 3: Use JSON type for complex nested data structures that don’t fit well into individual columns.
💡 Tip 4: Create test workflows with sample data to validate your field mappings before using real data.
💡 Tip 5: Use descriptive block display names in your workflow to make it easier to write templates that reference them.
💡 Tip 6: If you’re building a searchable knowledge base, enable vector indexing on your table and save full, descriptive content in the content column for better semantic search results.
See Also
- Getting Started with Workflows
- Jinja2 Template Documentation for advanced template features