Spawn Batch
How to Use the Spawn Batch Activity
Overview
The Spawn Batch activity splits, distributes, and manages batches in Grooper workflows. It creates one or more new batches from a subset of the folders in the current batch, selected by the filtering criteria you configure. Typical uses include:
- Quality assurance sampling: Extract a random or systematic sample for QA review
- Document type segregation: Split mixed batches by content type or MIME type
- Workflow routing: Route specific documents to different processing paths
- Batch size management: Split large batches into smaller, more manageable units
- Exception handling: Separate flagged or invalid items for special processing
Key Features
- Multiple selection methods: Filter, Every N, Random
- Flexible filtering criteria: Content type, MIME type, flag status, validity
- Copy or move operations
- Automatic batch naming with customizable patterns
- Maximum batch size controls
- Optional field value copying
- Source batch cleanup options
When to Use Spawn Batch
Ideal Scenarios
- Use Spawn Batch when you need to:
- Split a batch by document type (invoices vs. purchase orders)
- Extract a QA sample (10% random or every 10th document)
- Route flagged items to a review process
- Separate valid and invalid documents
- Create smaller batches from a large import
- Filter by file type (PDFs only, images only, etc.)
- Create parallel processing workflows
- Don’t use Spawn Batch when:
- You just need to process documents sequentially (use normal batch processing)
- You want to modify documents in place (use other activities)
- You need to merge batches (use different techniques)
Basic Configuration
Step 1: Add the Activity to Your Batch Process
- Open your Batch Process in Grooper Desktop
- Navigate to the step where you want to spawn batches
- In the Activities panel, find Spawn Batch under the Utilities category
- Drag and drop Spawn Batch into your process step
Step 2: Essential Properties
Processing Level
- What it does
Determines which folder hierarchy level to examine.
- Default
1 (top-level folders/documents)
- When to change
- Use 0 to process all folders at all levels
- Use 2+ to process subfolders (e.g., pages within documents)
- Example Batch Structure
- Batch Root (Level 0)
- Document 1 (Level 1) – Processing Level = 1
- Page 1 (Level 2) – Processing Level = 2
- Page 2 (Level 2)
- Document 2 (Level 1)
- Page 1 (Level 2)
- Page 2 (Level 2)
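To make the level numbering concrete, here is a minimal Python sketch of how folders at a given Processing Level could be collected from a batch tree. The `Folder` class and `folders_at_level` function are hypothetical illustrations, not Grooper types or APIs, and the sketch covers only levels 1 and deeper (the special meaning of 0 is not modeled).

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical stand-in for a batch folder; not a Grooper type."""
    name: str
    children: list["Folder"] = field(default_factory=list)


def folders_at_level(root: Folder, level: int) -> list[Folder]:
    """Return the folders exactly `level` steps below the batch root."""
    if level < 1:
        raise ValueError("this sketch only models levels 1 and deeper")
    current = [root]
    for _ in range(level):
        # Descend one level: replace each folder with its children.
        current = [child for folder in current for child in folder.children]
    return current


# Mirrors the example structure above: two documents with two pages each.
batch = Folder("Batch Root", [
    Folder("Document 1", [Folder("Page 1"), Folder("Page 2")]),
    Folder("Document 2", [Folder("Page 1"), Folder("Page 2")]),
])

documents = folders_at_level(batch, 1)  # the two documents
pages = folders_at_level(batch, 2)      # the four pages
```

With Processing Level 1 the sketch yields the two documents; with 2 it yields the four pages.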
Spawn Methods
Filter Method
Use when: You need to select documents based on specific attributes
Configuration Steps
- Set Method = Filter
- Choose your FilterBy criteria:
AllItems
Includes all folders at the processing level.
- Use Case
Split entire batch into fixed-size batches
- Settings
- FilterBy: AllItems
- Action: Move
- Maximum Batch Size: 100
- Result
Creates multiple 100-item batches from source
MimeType
Filters by file attachment type.
- Use Case
Separate PDF and Excel files from images
- Settings
- FilterBy: MimeType
- Configure MimeTypeLexicon to include:
- application/pdf
- application/vnd.ms-excel
- Action: Move
- Result
Only documents with PDF or Excel attachments are spawned
- Mime Type Lexicon Configuration
- Expand Mime Type Lexicon property
- Click Add Entry
- Enter MIME types (one per line):
- application/pdf – PDF documents
- image/tiff – TIFF images
- image/jpeg – JPEG images
- application/vnd.openxmlformats-officedocument.wordprocessingml.document – Word .docx
ContentType
Filters by assigned content type.
- Use Case
Separate invoices from purchase orders
- Settings
- FilterBy: ContentType
- Included Content Types: select Invoice
- Action: Move
- Target Step: Invoice Processing / Import
- Result
All invoices moved to invoice processing workflow
FlaggedItems
Selects documents that have been flagged.
- Use Case
Route exceptions to review
- Settings
- FilterBy: FlaggedItems
- Action: Move
- Target Step: Exception Review / Review
- Batch Name Prefix: Review-
- Result
Flagged items moved to review batch
InvalidItems
Selects documents with invalid index data.
- Use Case
Separate invalid documents for correction
- Settings
- FilterBy: InvalidItems
- Action: Move
- Target Step: Data Correction / Manual Review
- Result
Documents with validation errors moved to correction workflow
Invert Filter
The Invert Filter checkbox reverses your filter logic.
- Example – Get Non-PDF Documents
- FilterBy: MimeType
- MimeTypeLexicon: application/pdf
- Invert Filter: checked
- Result
- All non-PDF documents are selected
- Example – Get Non-Flagged Items
- FilterBy: FlaggedItems
- Invert Filter: checked
- Result
- All unflagged documents are selected
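The filter-plus-invert logic can be sketched in a few lines of Python. This is an illustration of the selection rule only; the `Doc` class, its fields, and the `select` function are assumptions made for the example and do not correspond to Grooper's object model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Doc:
    """Hypothetical document record used only for this sketch."""
    name: str
    mime_type: str
    flagged: bool = False


def select(docs: list[Doc], predicate: Callable[[Doc], bool],
           invert: bool = False) -> list[Doc]:
    """Keep docs matching the predicate, or the non-matching ones if inverted."""
    return [d for d in docs if bool(predicate(d)) != invert]


docs = [
    Doc("a.pdf", "application/pdf"),
    Doc("b.tif", "image/tiff", flagged=True),
    Doc("c.jpg", "image/jpeg"),
]

# MimeType filter with a one-entry lexicon, inverted: everything except PDFs.
lexicon = {"application/pdf"}
non_pdf = select(docs, lambda d: d.mime_type in lexicon, invert=True)

# FlaggedItems filter, inverted: everything that is not flagged.
unflagged = select(docs, lambda d: d.flagged, invert=True)
```

Inverting does not change the filter criterion itself; it only flips which side of the match is selected.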
Every N Method
Use when: You need systematic sampling (every 10th, 25th, 100th document)
Configuration
- Set Method = EveryN
- Set Step Size: The interval for selection
- Example – 10% Systematic Sample
- Method: EveryN
- Step Size: 10
- Action: Copy
- Create Batch: checked
- Batch Name Prefix: QA-Sample-
- Result
- Every 10th document copied to individual QA batches
- Example – Quality Assurance Sample
- Scenario: 1000-document batch, need 50 documents for QA
- Method: EveryN
- Step Size: 20 (1000/50 = 20)
- Action: Copy (preserve originals)
- Create Batch: unchecked (group into single batch)
- Result
- 50 documents (every 20th) copied to one QA batch
Create Batch Property
Controls whether to create individual batches per item.
| Setting | Result |
|---|---|
| Checked | One batch per selected document |
| Unchecked | All selected documents in one batch |
- Example – Individual Batches
- Step Size: 5
- Create Batch: checked
- 15 documents in source
- Result
- 3 batches created (documents 5, 10, 15)
- Example – Grouped Batch
- Step Size: 5
- Create Batch: unchecked
- 15 documents in source
- Result
- 1 batch with 3 documents (5, 10, 15)
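The two examples above can be reproduced with a short Python sketch. It assumes selection starts at the Nth item (so Step Size 5 picks documents 5, 10, 15, as in the examples); `every_nth` and `group` are hypothetical helper names, not Grooper functions.

```python
def every_nth(items: list, step: int) -> list:
    """Pick the step-th, 2*step-th, ... items (1-based positions)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return items[step - 1::step]


def group(selected: list, create_batch: bool) -> list[list]:
    """Model the Create Batch checkbox: one batch per item, or one batch total."""
    if create_batch:
        return [[item] for item in selected]
    return [selected] if selected else []


docs = list(range(1, 16))                # documents numbered 1..15
picked = every_nth(docs, 5)              # documents 5, 10, 15

individual = group(picked, create_batch=True)    # three single-document batches
grouped = group(picked, create_batch=False)      # one batch of three documents
```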
Random Method
Use when: You need random sampling for QA or statistical analysis
Configuration
- Set Method = Random
- Set Spawn Percentage: Fraction of documents to select (0.0 to 1.0)
- Example – 10% Random Sample
- Method: Random
- Spawn Percentage: 0.10 (10%)
- Action: Copy
- Batch Name Prefix: Random-QA-
- Result
- Approximately 10% of documents randomly selected
- Example – Large Random Sample
- Scenario: 5000 documents, need 500 for review
- Method: Random
- Spawn Percentage: 0.10 (500/5000)
- Action: Move
- Maximum Batch Size: 100
- Result
- Approximately 500 random documents, split into batches of up to 100 each (about 5 batches)
Understanding Spawn Percentage
| Percentage | Decimal | Example (100 docs) |
|---|---|---|
| 5% | 0.05 | ~5 documents |
| 10% | 0.10 | ~10 documents |
| 25% | 0.25 | ~25 documents |
| 50% | 0.50 | ~50 documents |
Note: Random selection is probabilistic. Results may vary slightly from exact percentages.
Action Types
The Action property determines what happens to selected folders.
Copy
- Effect
Duplicates folders to new batch; originals remain in source batch.
- Use Case
QA Sampling (preserve originals)
- Settings
- Action: Copy
- Method: Random
- Spawn Percentage: 0.05
- Result
- 5% of documents copied to QA batch; 100% remain in production
- When to use Copy
- Quality assurance sampling
- Creating backup batches
- Parallel processing workflows
- Testing without modifying originals
Move
- Effect
Transfers folders to new batch; removes from source batch.
- Use Case
Route by Document Type
- Settings
- Action: Move
- FilterBy: ContentType
- Included Content Types: Invoice
- Target Step: Invoice Processing / Classification
- Result
- Invoices moved to invoice workflow; other docs remain
- When to use Move
- Document type routing
- Exception handling
- Workflow branching
- Batch size management
Delete
- Effect
Removes folders from source batch; no new batch created.
- Use Case
Remove Invalid Items
- Settings
- Action: Delete
- FilterBy: InvalidItems
- Result
- Invalid documents removed; no new batch created
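The difference between the three actions comes down to what ends up in the source batch and what ends up in the spawned batch. A minimal Python sketch, using plain lists as stand-ins for batches (the `apply_action` function is hypothetical):

```python
def apply_action(source: list, selected: list, action: str) -> tuple[list, list]:
    """Return (remaining source contents, spawned batch contents)."""
    if action == "Copy":
        # Originals stay; the spawned batch receives duplicates.
        return list(source), list(selected)
    if action == "Move":
        # Selected items leave the source and land in the spawned batch.
        return [d for d in source if d not in selected], list(selected)
    if action == "Delete":
        # Selected items leave the source; nothing is spawned.
        return [d for d in source if d not in selected], []
    raise ValueError(f"unknown action: {action}")


source = ["inv-1", "po-1", "inv-2"]
invoices = ["inv-1", "inv-2"]

after_copy = apply_action(source, invoices, "Copy")
after_move = apply_action(source, invoices, "Move")
after_delete = apply_action(source, invoices, "Delete")
```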
Batch Naming
Batch Name Prefix
A string prepended to all spawned batch names.
- Examples
- "QA-" → QA-Invoice Batch 2024
- "Review-" → Review-Invoice Batch 2024
- "Invoices-" → Invoices-0001
- "" (empty) → Uses default naming
Batch Name Suffix
Controls how the batch name is completed.
None
Uses default naming with conflict resolution.
- Settings
- Batch Name Prefix: QA-
- Batch Name Suffix: None
- Result
"QA-[generated name]" or "QA-[generated name] (1)" if conflict
SourceBatchName
Appends original batch name to prefix.
- Source Batch
"Invoice Batch 2024-01-15"
- Settings
- Batch Name Prefix: QA-
- Batch Name Suffix: SourceBatchName
- Result
"QA-Invoice Batch 2024-01-15"
- Use when
- Traceability is important
- You want to preserve batch lineage
- Auditing spawned batch origins
NumberedSuffix
Creates sequential numbered batches with zero-padding.
- Settings
- Batch Name Prefix: Invoice-
- Batch Name Suffix: NumberedSuffix
- Batch Name Suffix Zero Padding: 4
- Results
- First batch: Invoice-0001
- Second batch: Invoice-0002
- Third batch: Invoice-0003
- …
- 10th batch: Invoice-0010
- 100th batch: Invoice-0100
- Zero Padding Examples
| Zero Padding | Format | Example |
|---|---|---|
| 0 | No padding | Batch-1, Batch-2 |
| 2 | 2 digits | Batch-01, Batch-02 |
| 3 | 3 digits | Batch-001, Batch-002 |
| 4 | 4 digits | Batch-0001, Batch-0002 |
| 6 | 6 digits | Batch-000001, Batch-000002 |
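The padding rule in the table is ordinary left zero-padding of the sequence number. A small Python sketch, where `numbered_name` is a hypothetical helper and not part of Grooper:

```python
def numbered_name(prefix: str, index: int, zero_padding: int) -> str:
    """Build a batch name from a prefix and a zero-padded sequence number."""
    # zfill pads on the left with zeros up to the given width; width 0 adds nothing.
    return f"{prefix}{str(index).zfill(zero_padding)}"


names = [numbered_name("Invoice-", n, 4) for n in (1, 2, 10, 100)]
```

Note that padding sets a minimum width only: a sequence number with more digits than the padding is written out in full.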
Advanced Features
Maximum Batch Size
Limits the number of folders in each spawned batch. When the limit is reached, a new batch is created.
- Example – Split Large Import
- Scenario: 1000 documents imported, want batches of 100
- FilterBy: AllItems
- Action: Move
- Maximum Batch Size: 100
- Batch Name Prefix: Import-
- Batch Name Suffix: NumberedSuffix
- Batch Name Suffix Zero Padding: 4
- Result
- 10 batches created (Import-0001 through Import-0010)
- Example – Balanced QA Samples
- Method: Random
- Spawn Percentage: 0.20 (20%)
- Maximum Batch Size: 25
- Action: Copy
- Result
- Random 20% split into batches of max 25 documents each
Set to 0: No limit (all selected documents in one batch)
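The splitting behavior amounts to chunking the selected folders into groups of at most Maximum Batch Size. A minimal Python sketch of that arithmetic, with `split_into_batches` as a hypothetical name:

```python
def split_into_batches(items: list, max_size: int) -> list[list]:
    """Chunk items into batches of at most max_size; 0 means no limit."""
    if max_size < 0:
        raise ValueError("max_size cannot be negative")
    if max_size == 0:
        return [list(items)] if items else []
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


docs = list(range(1000))
batches = split_into_batches(docs, 100)       # 10 batches of 100
uneven = split_into_batches(docs[:250], 100)  # sizes 100, 100, 50
single = split_into_batches(docs, 0)          # one batch with all 1000
```

When the selection does not divide evenly, the last batch is simply smaller than the rest.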
Starting Step
Routes spawned batches to a specific batch process step.
- Example – Route to Different Process
- Target Step: Invoice Processing / Classification
- Start Paused: unchecked
- Result
- New batch begins processing at Classification step
- Example – Manual Review
- Target Step: Review Process / Manual Review
- Start Paused: checked
- Result
- New batch created in Paused state for manual intervention
Leave blank: Creates batch without launching into a process (test batch)
Start Paused
When checked, spawned batches are created in a paused state.
- Use cases
- Manual review before processing
- Configuration or setup required
- Quality checks on spawned content
- Scheduled batch processing
Copy Batch Fields
When checked, copies index field values from source batch to spawned batch.
- Example – Preserve Metadata
- Source Batch Fields
- Customer: Acme Corp
- Department: Accounting
- Period: Q1 2024
- Settings
- Copy Batch Fields: checked
- Result
- New batch inherits all field values
- Requirements
- Source batch must have a content type assigned
- Target batch process must have a content type assigned
- Both content types must match
- Common Error
"Copy Batch Fields is set to true, but the source and target batch process content types do not match."
- Solution
Ensure both batch processes use the same content type or leave Copy Batch Fields unchecked.
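The three requirements can be expressed as a single check. The Python sketch below is an illustration of that rule only; `can_copy_batch_fields` is a hypothetical function, and content types are represented as plain strings (or `None` when unassigned).

```python
from typing import Optional


def can_copy_batch_fields(source_type: Optional[str], target_type: Optional[str]) -> bool:
    """True only if both sides have a content type and the two match."""
    if source_type is None or target_type is None:
        return False
    return source_type == target_type


ok = can_copy_batch_fields("Invoice", "Invoice")
mismatch = can_copy_batch_fields("Invoice", "Purchase Order")
unassigned = can_copy_batch_fields(None, "Invoice")
```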
Source Batch Disposition
Controls what happens to the source batch after spawning.
None (Default)
- Behavior
Source batch remains unchanged.
- Use when
- Using Copy action (originals should remain)
- Source batch needed for other processing
- Auditing or archival purposes
DeleteIfEmpty
- Behavior
Deletes source batch only if all folders were moved/deleted.
- Use when
- Using Move action with intent to replace source
- All documents routed to new batches
- Source batch should only exist if documents remain
DeleteAlways
- Behavior
Always deletes source batch after spawning.
- Use when
- Source batch is temporary/intermediate
- All documents have been routed
- Clean up is required regardless of remaining items
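The three dispositions reduce to a small decision on how many folders remain in the source batch after spawning. A Python sketch under that reading follows; `should_delete_source` is a hypothetical name, and the `is_test_batch` parameter reflects the note in Troubleshooting that test batches are never deleted.

```python
def should_delete_source(disposition: str, remaining_folders: int,
                         is_test_batch: bool = False) -> bool:
    """Decide whether the source batch is removed after spawning."""
    if is_test_batch:
        return False  # test batches are kept regardless of disposition
    if disposition == "None":
        return False
    if disposition == "DeleteIfEmpty":
        return remaining_folders == 0
    if disposition == "DeleteAlways":
        return True
    raise ValueError(f"unknown disposition: {disposition}")


keeps_partial = should_delete_source("DeleteIfEmpty", remaining_folders=3)
drops_empty = should_delete_source("DeleteIfEmpty", remaining_folders=0)
drops_always = should_delete_source("DeleteAlways", remaining_folders=3)
```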
Common Use Cases
Use Case 1: Quality Assurance Sampling
Scenario: Extract 10% random sample from production batch for QA review
- Configuration
- Method: Random
- Spawn Percentage: 0.10
- Action: Copy
- Batch Name Prefix: QA-
- Batch Name Suffix: SourceBatchName
- Target Step: QA Review / Review
- Start Paused: unchecked
- Source Batch Disposition: None
- Result
- ~10% of documents copied to QA batch
- Original batch continues normal processing
- QA batch routed to review process
- Traceable batch naming
Use Case 2: Separate by Document Type
Scenario: Split mixed batch into invoices and purchase orders
- Step 1 – Extract Invoices
- Method: Filter
- FilterBy: ContentType
- Included Content Types: Invoice
- Action: Move
- Target Step: Invoice Process/Import
- Batch Name Prefix: INV-
- Source Batch Disposition: None
- Step 2 – Extract Purchase Orders
- Method: Filter
- FilterBy: ContentType
- Included Content Types: PO
- Action: Move
- Target Step: PO Process/Import
- Batch Name Prefix: PO-
- Source Batch Disposition: DeleteIfEmpty
- Result
- Invoices → Invoice Processing workflow
- Purchase Orders → PO Processing workflow
- Source batch deleted if empty
Use Case 3: Exception Handling
Scenario: Route flagged items to manual review
- Configuration
- Method: Filter
- FilterBy: FlaggedItems
- Action: Move
- Batch Name Prefix: Exceptions-
- Target Step: Exception Review / Manual Review
- Start Paused: checked
- Source Batch Disposition: None
- Result
- Flagged documents moved to exception batch
- Exception batch paused for manual intervention
- Remaining documents continue processing
Use Case 4: Split Large Batch
Scenario: Split 5000-document batch into 100-document batches
- Configuration
- Method: Filter
- FilterBy: AllItems
- Action: Move
- Maximum Batch Size: 100
- Batch Name Prefix: Batch-
- Batch Name Suffix: NumberedSuffix
- Batch Name Suffix Zero Padding: 4
- Target Step: [Same Process/Import]
- Source Batch Disposition: DeleteIfEmpty
- Result
- 50 batches created (Batch-0001 through Batch-0050)
- Each batch contains 100 documents
- Source batch deleted when empty
- All batches continue in same workflow
Use Case 5: Filter by File Type
Scenario: Process only PDF documents; optionally remove the others
- Configuration
- Method: Filter
- FilterBy: MimeType
- MimeTypeLexicon:
- application/pdf
- Action: Move
- Target Step: PDF Processing/Import
- Source Batch Disposition: None
- Follow-up (Optional – Remove non-PDFs)
- Method: Filter
- FilterBy: AllItems
- Action: Delete
- Result
- PDFs moved to PDF processing workflow
- Non-PDF documents deleted from source
Use Case 6: Systematic Validation Sample
Scenario: Every 25th document goes to validation for accuracy tracking
- Configuration
- Method: EveryN
- Step Size: 25
- Action: Copy
- Create Batch: unchecked
- Batch Name Prefix: Validation-
- Batch Name Suffix: SourceBatchName
- Target Step: Validation/Review
- Copy Batch Fields: checked
- Source Batch Disposition: None
- Result
- 4% systematic sample (1/25)
- All samples in single validation batch
- Batch metadata preserved
- Original batch continues processing
Troubleshooting
Problem: No Batches Created
- Symptoms
Activity completes but no new batches appear.
- Possible Causes & Solutions
- No matching folders
- Solution: Check filter criteria
- Verify FilterBy settings
- Check MimeTypeLexicon entries
- Confirm ContentTypes are correct
- Review Processing Level
- Action set to Delete
- Solution: Change Action to Copy or Move if you need new batches
- Spawn Percentage too low
- Method: Random; Spawn Percentage: 0.01; Batch with 10 documents ⇒ 0.1 documents (rounds to 0)
- Solution: Increase percentage or use different method
Problem: “No folders found to process”
- Log Message
"No folders found to process"
- Diagnosis Steps
- Check Processing Level:
- Processing Level: 1
- Batch Structure: All folders at level 2
- Solution: Change Processing Level to 2
- Verify filter criteria:
- FilterBy: ContentType
- Included Content Types: [empty]
- Solution: Add content types to list
- Check Invert Filter:
- FilterBy: FlaggedItems
- Invert Filter: checked
- No unflagged items exist
- Solution: Uncheck Invert Filter
Problem: Batch Process Not Published
- Error
"Batch Process 'ProcessName' is not published."
- Cause
Target Step references an unpublished batch process.
- Solution
- Open the target batch process
- Click Publish in the ribbon
- Confirm publication
- Re-run the Spawn Batch activity
Problem: Content Type Mismatch
- Error
"Copy Batch Fields is set to true, but the source and target batch process content types do not match."
- Cause
Attempting to copy fields between incompatible content types.
- Solutions
- Option 1: Match content types
- Source process content type: Invoice
- Target process content type: Change to Invoice
- Re-run activity
- Option 2: Disable field copying
- Copy Batch Fields: uncheck
Problem: Batch Naming Conflicts
- Symptom
Batches named "Batch-0001 (1)", "Batch-0001 (2)".
- Cause
Batch names already exist.
- Solutions
- Use SourceBatchName suffix:
- Batch Name Suffix: SourceBatchName
- Result: Unique names based on source
- Include a date in the prefix:
- Batch Name Prefix: Batch-2024-01-15-
- Result: Date-specific names
- Clean up existing batches:
- Delete or complete old batches with conflicting names
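The "(1)", "(2)" suffixes in the symptom are what a simple conflict-resolution rule would produce. The Python sketch below shows one such rule; it is an assumption about the general approach, not Grooper's actual implementation, and `resolve_name` is a hypothetical helper.

```python
def resolve_name(candidate: str, existing: set[str]) -> str:
    """Return candidate, or candidate with the first free ' (n)' suffix."""
    if candidate not in existing:
        return candidate
    n = 1
    while f"{candidate} ({n})" in existing:
        n += 1
    return f"{candidate} ({n})"


existing = {"Batch-0001"}
first = resolve_name("Batch-0001", existing)   # collides, gets " (1)"
existing.add(first)
second = resolve_name("Batch-0001", existing)  # collides again, gets " (2)"
unique = resolve_name("Batch-0002", existing)  # no collision, unchanged
```

Choosing a prefix or suffix that is already unique avoids the lookup-and-rename step entirely.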
Problem: Spawned Batches Not Processing
- Symptom
Batches created but remain in "Ready" status.
- Possible Causes & Solutions
- Start Paused is checked
- Solution: Manually resume batches or uncheck Start Paused
- No Target Step specified
- Solution: Select appropriate batch process step
- Service not running
- Solution: Check Grooper Service status; start if stopped
Problem: Source Batch Not Deleted
- Symptom
Source Batch Disposition is set to delete the source batch (DeleteAlways or DeleteIfEmpty), but the batch remains.
- Possible Causes & Solutions
- Test batch
- Test batches are never deleted by design
- Solution: This is expected behavior
- Folders still remain
- Disposition: DeleteIfEmpty
- Some folders don't match filter
- Solution: Use DeleteAlways or adjust filter
- Batch locked
- Solution: Unlock batch manually and re-run
Best Practices
1. Test with Test Batches First
Always test new Spawn Batch configurations with test batches
- Testing Workflow
- Create small test batch (5–10 documents)
- Configure Spawn Batch activity
- Run test batch through process
- Verify:
- Correct documents selected
- Batches named properly
- Target workflow correct
- Source batch handled correctly
- Adjust configuration as needed
- Move to production
2. Use Descriptive Batch Name Prefixes
Clear prefixes make batch management easier
- Good Examples
- "QA-Sample-" — clearly indicates QA batches
- "Invoice-" — document type identification
- "Exceptions-" — special handling indicator
- "Split-" — indicates split from larger batch
- Poor Examples
- "B-" — not descriptive
- "" (empty) — no identification
- "Test123-" — unclear purpose
3. Choose Appropriate Action Type
Match action to use case
| Use Case | Recommended Action |
|---|---|
| QA Sampling | Copy |
| Document Routing | Move |
| Cleanup | Delete |
| Parallel Processing | Copy |
| Exception Handling | Move |
4. Set Reasonable Maximum Batch Size
Balance performance and manageability
- Guidelines
- Small batches (50–100): Interactive review processes
- Medium batches (100–500): Standard processing
- Large batches (500–1000): Bulk processing
- No limit (0): Use cautiously; can create very large batches
- Example – Balanced Approach
- Large import: 5000 documents
- Maximum Batch Size: 250
- Result
- 20 manageable batches
5. Manage Source Batch Disposition Carefully
Choose disposition based on workflow needs
- Decision Tree
- Using Copy action?
- Yes → Disposition: None
- No (Move/Delete) →
- Need source batch for reference?
- Yes → Disposition: None
- No →
- Some documents might remain?
- Yes → Disposition: DeleteIfEmpty
- No (all documents processed) → Disposition: DeleteAlways
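The decision tree above can also be written as a short function, which may be handy when documenting or reviewing many Spawn Batch steps. This Python sketch is a direct transcription of the tree; `recommend_disposition` and its parameter names are hypothetical.

```python
def recommend_disposition(action: str, need_source_for_reference: bool,
                          documents_may_remain: bool) -> str:
    """Suggest a Source Batch Disposition following the decision tree."""
    if action == "Copy":
        return "None"
    # Move or Delete from here on.
    if need_source_for_reference:
        return "None"
    if documents_may_remain:
        return "DeleteIfEmpty"
    return "DeleteAlways"


qa_sampling = recommend_disposition("Copy", False, False)
partial_routing = recommend_disposition("Move", False, True)
full_split = recommend_disposition("Move", False, False)
```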
6. Use Copy Batch Fields Wisely
Only when content types match
- Checklist
- [ ] Source batch has content type
- [ ] Target process has content type
- [ ] Content types match exactly
- [ ] Fields are relevant to spawned batch
If any condition is false → Leave Copy Batch Fields unchecked.
7. Leverage Invert Filter
Powerful for exclusion scenarios
- Scenario 1 – Process Non-PDFs
- FilterBy: MimeType
- MimeTypeLexicon: application/pdf
- Invert Filter: checked
- Result
- All non-PDF documents selected
- Scenario 2 – Process Valid Items
- FilterBy: InvalidItems
- Invert Filter: checked
- Result
- All valid documents selected
8. Monitor Activity Statistics
Review stats after processing
- Available Statistics
- Folders Moved: Count of folders moved to spawned batches
- Folders Copied: Count of folders copied to spawned batches
- Use stats to
- Verify expected document counts
- Track QA sample sizes
- Audit document routing
- Troubleshoot issues
9. Plan for Batch Process Dependencies
Ensure target processes are ready
- Pre-flight Checklist
- [ ] Target batch process is published
- [ ] Target step exists in published version
- [ ] Content types are configured
- [ ] Downstream activities are configured
- [ ] Service has permissions to target folder
10. Document Your Configuration
Maintain configuration documentation
- Template
- Activity: Spawn Batch
- Purpose: [Describe why this spawn is happening]
- Method: [Filter/EveryN/Random]
- Filter Criteria: [Specific settings]
- Action: [Copy/Move/Delete]
- Target: [Destination process/step]
- Naming Pattern: [Prefix and suffix pattern]
- Special Notes: [Any caveats or special handling]
- Example
- Activity: Spawn Batch
- Purpose: QA Sampling for Invoice Processing
- Method: Random
- Filter Criteria: 10% random sample
- Action: Copy (preserve originals)
- Target: QA Review Process / Manual Review Step
- Naming Pattern: "QA-[SourceBatchName]"
- Special Notes: Runs weekly, samples archived after 30 days
Summary
The Spawn Batch activity is an essential tool for:
- Splitting batches by document type, status, or attributes
- Quality assurance through systematic or random sampling
- Workflow routing to specialized processing paths
- Batch management by controlling size and organization
- Exception handling for special cases and errors
Quick Reference
| Task | Method | Action | Key Settings |
|---|---|---|---|
| QA Sample | Random | Copy | Spawn Percentage: 0.10 |
| Route by Type | Filter | Move | FilterBy: ContentType |
| Every Nth | EveryN | Copy | Step Size: 10 |
| Split Large Batch | Filter | Move | Maximum Batch Size: 100 |
| Exception Handling | Filter | Move | FilterBy: FlaggedItems |
| Remove Invalid | Filter | Delete | FilterBy: InvalidItems |