Spawn Batch: Difference between revisions

From Grooper Wiki
<blockquote>{{#lst:Glossary|Spawn Batch}}</blockquote>
= How to Use the Spawn Batch Activity =
 
__TOC__
 
== Overview ==
 
The '''Spawn Batch''' activity is a utility for splitting, distributing, and managing batches in Grooper workflows. It creates one or more new batches from a subset of folders in the current batch, selected by a filter, a fixed interval, or random sampling. This activity is essential for:
* '''Quality assurance sampling''': Extract a random or systematic sample for QA review
* '''Document type segregation''': Split mixed batches by content type or MIME type
* '''Workflow routing''': Route specific documents to different processing paths
* '''Batch size management''': Split large batches into smaller, more manageable units
* '''Exception handling''': Separate flagged or invalid items for special processing
 
=== Key Features ===
* Multiple selection methods: Filter, Every N, Random
* Flexible filtering criteria: Content type, MIME type, flag status, validity
* Copy, move, or delete operations
* Automatic batch naming with customizable patterns
* Maximum batch size controls
* Optional field value copying
* Source batch cleanup options
 
== When to Use Spawn Batch ==
 
=== Ideal Scenarios ===
;Use Spawn Batch when you need to:
* Split a batch by document type (invoices vs. purchase orders)
* Extract a QA sample (10% random or every 10th document)
* Route flagged items to a review process
* Separate valid and invalid documents
* Create smaller batches from a large import
* Filter by file type (PDFs only, images only, etc.)
* Create parallel processing workflows
 
;Don’t use Spawn Batch when:
* You just need to process documents sequentially (use normal batch processing)
* You want to modify documents in place (use other activities)
* You need to merge batches (use different techniques)
 
== Basic Configuration ==
 
=== Step 1: Add the Activity to Your Batch Process ===
# Open your '''Batch Process''' in Grooper Desktop
# Navigate to the step where you want to spawn batches
# In the '''Activities''' panel, find '''Spawn Batch''' under the '''Utilities''' category
# Drag and drop '''Spawn Batch''' into your process step
 
=== Step 2: Essential Properties ===
 
==== Processing Level ====
;What it does
Determines which folder hierarchy level to examine.
;Default
1 (top-level folders/documents)
;When to change
* Use 0 to process all folders at all levels
* Use 2+ to process subfolders (e.g., pages within documents)
 
;Example Batch Structure
* Batch Root (Level 0)
** Document 1 (Level 1) – ''Processing Level = 1''
*** Page 1 (Level 2) – ''Processing Level = 2''
*** Page 2 (Level 2)
** Document 2 (Level 1)
*** Page 1 (Level 2)
*** Page 2 (Level 2)
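
The level numbering above can be sketched as a small tree walk. This is a minimal illustration in Python, not Grooper's implementation: the <code>Folder</code> class and <code>folders_at_level</code> function are hypothetical names, and treating level 0 as "every folder below the root" is an assumption drawn from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Folder:
    """Hypothetical stand-in for a batch folder; not a Grooper type."""
    name: str
    children: List["Folder"] = field(default_factory=list)


def folders_at_level(root: Folder, level: int) -> List[Folder]:
    """Return the folders at the given depth below the batch root.

    Assumption: level 0 means "all folders at every level", as the
    Processing Level description states.
    """
    selected: List[Folder] = []

    def walk(node: Folder, depth: int) -> None:
        for child in node.children:
            child_depth = depth + 1
            if level == 0 or child_depth == level:
                selected.append(child)
            walk(child, child_depth)

    walk(root, 0)
    return selected


# The example structure: two documents with two pages each.
batch = Folder("Batch Root", [
    Folder("Document 1", [Folder("D1 Page 1"), Folder("D1 Page 2")]),
    Folder("Document 2", [Folder("D2 Page 1"), Folder("D2 Page 2")]),
])
```

With this structure, level 1 yields the two documents and level 2 yields the four pages.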
 
== Spawn Methods ==
 
=== Filter Method ===
''Use when: You need to select documents based on specific attributes''
 
==== Configuration Steps ====
* Set '''Method''' = ''Filter''
* Choose your '''FilterBy''' criteria:
 
===== AllItems =====
Includes all folders at the processing level.
 
;Use Case
Split entire batch into fixed-size batches
 
;Settings
* FilterBy: '''AllItems'''
* Action: '''Move'''
* Maximum Batch Size: '''100'''
 
;Result
Creates multiple 100-item batches from source
 
===== MimeType =====
Filters by file attachment type.
 
;Use Case
Separate PDF and Excel files from other attachments
 
;Settings
# FilterBy: '''MimeType'''
# Configure '''MimeTypeLexicon''' to include:
#* ''application/pdf''
#* ''application/vnd.ms-excel''
# Action: '''Move'''
 
;Result
Only documents with PDF or Excel attachments are spawned
 
;Mime Type Lexicon Configuration
# Expand '''Mime Type Lexicon''' property
# Click '''Add Entry'''
# Enter MIME types (one per line):
#* application/pdf – PDF documents
#* image/tiff – TIFF images
#* image/jpeg – JPEG images
#* application/vnd.openxmlformats-officedocument.wordprocessingml.document – Word .docx
 
===== ContentType =====
Filters by assigned content type.
 
;Use Case
Separate invoices from purchase orders
 
;Settings
# FilterBy: '''ContentType'''
# Included Content Types: select '''Invoice'''
# Action: '''Move'''
# Target Step: '''Invoice Processing''' / '''Import'''
 
;Result
All invoices moved to invoice processing workflow
 
===== FlaggedItems =====
Selects documents that have been flagged.
 
;Use Case
Route exceptions to review
 
;Settings
* FilterBy: '''FlaggedItems'''
* Action: '''Move'''
* Target Step: '''Exception Review''' / '''Review'''
* Batch Name Prefix: '''Review-'''
 
;Result
Flagged items moved to review batch
 
===== InvalidItems =====
Selects documents with invalid index data.
 
;Use Case
Separate invalid documents for correction
 
;Settings
* FilterBy: '''InvalidItems'''
* Action: '''Move'''
* Target Step: '''Data Correction''' / '''Manual Review'''
 
;Result
Documents with validation errors moved to correction workflow
 
==== Invert Filter ====
The '''Invert Filter''' checkbox reverses your filter logic.
 
;Example – Get Non-PDF Documents
* FilterBy: '''MimeType'''
* MimeTypeLexicon: ''application/pdf''
* Invert Filter: '''checked'''
;Result: All non-PDF documents are selected
 
;Example – Get Non-Flagged Items
* FilterBy: '''FlaggedItems'''
* Invert Filter: '''checked'''
;Result: All unflagged documents are selected
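
The filter-and-invert logic can be pictured as a predicate whose result is flipped when '''Invert Filter''' is checked. The following Python sketch is illustrative only: <code>Doc</code>, <code>apply_filter</code>, and the field names are hypothetical and do not correspond to Grooper's object model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Doc:
    """Hypothetical document record; the field names are illustrative."""
    name: str
    mime_type: Optional[str] = None
    flagged: bool = False


def apply_filter(docs: List[Doc],
                 predicate: Callable[[Doc], bool],
                 invert: bool = False) -> List[Doc]:
    """Keep docs matching the predicate, or the complement if invert is set."""
    return [d for d in docs if predicate(d) != invert]


docs = [
    Doc("a.pdf", "application/pdf"),
    Doc("b.tif", "image/tiff", flagged=True),
    Doc("c.jpg", "image/jpeg"),
]

# A one-entry "lexicon" of MIME types, mirroring the first example above.
is_pdf = lambda d: d.mime_type in {"application/pdf"}
is_flagged = lambda d: d.flagged

non_pdfs = apply_filter(docs, is_pdf, invert=True)
unflagged = apply_filter(docs, is_flagged, invert=True)
```

Here <code>non_pdfs</code> holds the TIFF and JPEG documents, and <code>unflagged</code> holds every document except the flagged TIFF.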
 
=== Every N Method ===
''Use when: You need systematic sampling (every 10th, 25th, 100th document)''
 
==== Configuration ====
* Set '''Method''' = ''EveryN''
* Set '''Step Size''': The interval for selection
 
;Example – 10% Systematic Sample
* Method: '''EveryN'''
* Step Size: '''10'''
* Action: '''Copy'''
* Create Batch: '''checked'''
* Batch Name Prefix: '''QA-Sample-'''
;Result: Every 10th document copied to individual QA batches
 
;Example – Quality Assurance Sample
:'''Scenario:''' 1000-document batch, need 50 documents for QA
* Method: '''EveryN'''
* Step Size: '''20''' (1000/50 = 20)
* Action: '''Copy''' (preserve originals)
* Create Batch: '''unchecked''' (group into single batch)
;Result: 50 documents (every 20th) copied to one QA batch
 
==== Create Batch Property ====
Controls whether to create individual batches per item.
 
{| class="wikitable"
! Setting !! Result
|-
| Checked || One batch per selected document
|-
| Unchecked || All selected documents in one batch
|}
 
;Example – Individual Batches
* Step Size: 5
* Create Batch: '''checked'''
* 15 documents in source
;Result: 3 batches created (documents 5, 10, 15)
 
;Example – Grouped Batch
* Step Size: 5
* Create Batch: '''unchecked'''
* 15 documents in source
;Result: 1 batch with 3 documents (5, 10, 15)
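
The two examples above can be reproduced with a short Python sketch. It assumes 1-based counting, so a Step Size of 5 selects documents 5, 10, and 15, as in the examples. The function names are hypothetical and the code is not Grooper's implementation.

```python
from typing import List, Sequence


def every_nth(items: Sequence[str], step: int) -> List[str]:
    """Select the step-th, 2*step-th, ... items (1-based counting)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(items[step - 1::step])


def group_selection(selected: List[str], create_batch: bool) -> List[List[str]]:
    """Checked: one batch per item. Unchecked: one batch holding all items."""
    if create_batch:
        return [[item] for item in selected]
    return [selected] if selected else []


documents = [f"doc{i}" for i in range(1, 16)]  # 15 documents
picked = every_nth(documents, 5)
individual = group_selection(picked, create_batch=True)
grouped = group_selection(picked, create_batch=False)
```

The same arithmetic gives the Step Size for a target sample: a 1000-document batch with a Step Size of 20 yields 50 selected documents.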
 
=== Random Method ===
''Use when: You need random sampling for QA or statistical analysis''
 
==== Configuration ====
* Set '''Method''' = ''Random''
* Set '''Spawn Percentage''': Fraction of documents to select (0.0 to 1.0)
 
;Example – 10% Random Sample
* Method: '''Random'''
* Spawn Percentage: '''0.10''' (10%)
* Action: '''Copy'''
* Batch Name Prefix: '''Random-QA-'''
;Result: Approximately 10% of documents randomly selected
 
;Example – Large Random Sample
:'''Scenario:''' 5000 documents, need 500 for review
* Method: '''Random'''
* Spawn Percentage: '''0.10''' (500/5000)
* Action: '''Move'''
* Maximum Batch Size: '''100'''
;Result: Approximately 500 random documents, split into batches of at most 100 each (about 5 batches)
 
==== Understanding Spawn Percentage ====
{| class="wikitable"
! Percentage !! Decimal !! Example (100 docs)
|-
| 5% || 0.05 || ~5 documents
|-
| 10% || 0.10 || ~10 documents
|-
| 25% || 0.25 || ~25 documents
|-
| 50% || 0.50 || ~50 documents
|}
 
''Note'': Random selection is probabilistic. Results may vary slightly from exact percentages.
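
To see why the counts are only approximate, consider one plausible selection model in which each document is chosen independently with probability equal to the Spawn Percentage. This page does not state how Grooper draws its sample, so the Python sketch below is an assumption for illustration; <code>random_sample</code> is a hypothetical name.

```python
import random
from typing import List, Optional, Sequence


def random_sample(items: Sequence[str], fraction: float,
                  seed: Optional[int] = None) -> List[str]:
    """Select each item independently with probability `fraction`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return [item for item in items if rng.random() < fraction]


documents = [f"doc{i}" for i in range(1, 1001)]
sample = random_sample(documents, 0.10, seed=42)
print(len(sample))  # close to 100, but not guaranteed to be exactly 100
```

Under this model, small batches with low percentages can easily produce an empty selection, which matches the "Spawn Percentage too low" case in Troubleshooting.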
 
== Action Types ==
 
The '''Action''' property determines what happens to selected folders.
 
=== Copy ===
;Effect
Duplicates folders to new batch; originals remain in source batch.
 
;Use Case
QA Sampling (preserve originals)
 
;Settings
* Action: '''Copy'''
* Method: '''Random'''
* Spawn Percentage: '''0.05'''
;Result: Approximately 5% of documents copied to the QA batch; all originals remain in the source batch
 
;When to use Copy
* Quality assurance sampling
* Creating backup batches
* Parallel processing workflows
* Testing without modifying originals
 
=== Move ===
;Effect
Transfers folders to new batch; removes from source batch.
 
;Use Case
Route by Document Type
 
;Settings
* Action: '''Move'''
* FilterBy: '''ContentType'''
* Included Content Types: '''Invoice'''
* Target Step: '''Invoice Processing''' / '''Classification'''
;Result: Invoices moved to invoice workflow; other docs remain
 
;When to use Move
* Document type routing
* Exception handling
* Workflow branching
* Batch size management
 
=== Delete ===
;Effect
Removes folders from source batch; no new batch created.
 
;Use Case
Remove Invalid Items
 
;Settings
* Action: '''Delete'''
* FilterBy: '''InvalidItems'''
;Result: Invalid documents removed; no new batch created
 
{{Ambox
| type = warning
| text = '''Warning:''' Delete action is permanent! Use with caution in production.
}}
 
== Batch Naming ==
 
=== Batch Name Prefix ===
A string prepended to all spawned batch names.
 
;Examples
* "QA-" → ''QA-Invoice Batch 2024''
* "Review-" → ''Review-Invoice Batch 2024''
* "Invoices-" → ''Invoices-0001''
* "" (empty) → Uses default naming
 
=== Batch Name Suffix ===
Controls how the batch name is completed.
 
==== None ====
Uses default naming with conflict resolution.
 
;Settings
* Batch Name Prefix: '''QA-'''
* Batch Name Suffix: '''None'''
;Result
"QA-[generated name]" or "QA-[generated name] (1)" if conflict
 
==== SourceBatchName ====
Appends original batch name to prefix.
 
;Source Batch
"Invoice Batch 2024-01-15"
 
;Settings
* Batch Name Prefix: '''QA-'''
* Batch Name Suffix: '''SourceBatchName'''
;Result
"QA-Invoice Batch 2024-01-15"
 
;Use when
* Traceability is important
* You want to preserve batch lineage
* Auditing spawned batch origins
 
==== NumberedSuffix ====
Creates sequential numbered batches with zero-padding.
 
;Settings
* Batch Name Prefix: '''Invoice-'''
* Batch Name Suffix: '''NumberedSuffix'''
* Batch Name Suffix Zero Padding: '''4'''
 
;Results
* First batch: ''Invoice-0001''
* Second batch: ''Invoice-0002''
* Third batch: ''Invoice-0003''
* …
* 10th batch: ''Invoice-0010''
* 100th batch: ''Invoice-0100''
 
;Zero Padding Examples
{| class="wikitable"
! Zero Padding !! Format !! Example
|-
| 0 || No padding || Batch-1, Batch-2
|-
| 2 || 2 digits || Batch-01, Batch-02
|-
| 3 || 3 digits || Batch-001, Batch-002
|-
| 4 || 4 digits || Batch-0001, Batch-0002
|-
| 6 || 6 digits || Batch-000001, Batch-000002
|}
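
The naming patterns above can be expressed as a small helper. This Python sketch models only the two suffix modes whose pattern is spelled out on this page (SourceBatchName and NumberedSuffix). The function name and parameters are hypothetical, and name-conflict handling is not modelled.

```python
def spawned_batch_name(prefix: str, suffix_mode: str,
                       source_name: str = "", sequence: int = 1,
                       zero_padding: int = 0) -> str:
    """Build a batch name from a prefix and a suffix mode.

    Only the two suffix modes with a stated pattern are modelled here.
    """
    if suffix_mode == "SourceBatchName":
        return f"{prefix}{source_name}"
    if suffix_mode == "NumberedSuffix":
        # zfill pads with leading zeros up to the requested width.
        return f"{prefix}{str(sequence).zfill(zero_padding)}"
    raise ValueError(f"unsupported suffix mode: {suffix_mode}")


print(spawned_batch_name("QA-", "SourceBatchName",
                         source_name="Invoice Batch 2024-01-15"))
print(spawned_batch_name("Invoice-", "NumberedSuffix",
                         sequence=10, zero_padding=4))
```

A padding of 0 leaves the number unpadded, which corresponds to the first row of the table.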
 
== Advanced Features ==
 
=== Maximum Batch Size ===
Limits the number of folders in each spawned batch. When the limit is reached, a new batch is created.
 
;Example – Split Large Import
:'''Scenario:''' 1000 documents imported, want batches of 100
* FilterBy: '''AllItems'''
* Action: '''Move'''
* Maximum Batch Size: '''100'''
* Batch Name Suffix: '''NumberedSuffix'''
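;Result: 10 batches created (for example, ''Import-0001'' through ''Import-0010'', assuming Batch Name Prefix '''Import-''' and Batch Name Suffix Zero Padding '''4''')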
 
;Example – Balanced QA Samples
* Method: '''Random'''
* Spawn Percentage: '''0.20''' (20%)
* Maximum Batch Size: '''25'''
* Action: '''Copy'''
;Result: Random 20% split into batches of max 25 documents each
 
''Set to 0'': No limit (all selected documents in one batch)
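
The splitting behaviour amounts to chunking the selected folders. The Python sketch below is a hypothetical illustration, not Grooper code. Placing any remainder in a final, smaller batch is an assumption.

```python
from typing import List, Sequence


def split_into_batches(items: Sequence[str], max_size: int) -> List[List[str]]:
    """Chunk items into batches of at most max_size; 0 means no limit."""
    if max_size < 0:
        raise ValueError("max_size must not be negative")
    if not items:
        return []
    if max_size == 0:
        return [list(items)]
    return [list(items[i:i + max_size])
            for i in range(0, len(items), max_size)]


documents = [f"doc{i}" for i in range(1, 1001)]  # 1000 documents
batches = split_into_batches(documents, 100)
print(len(batches))  # 10
```

With 250 selected documents and a limit of 100, the same function gives batches of 100, 100, and 50.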
 
=== Starting Step ===
Routes spawned batches to a specific batch process step.
 
;Example – Route to Different Process
* Target Step: '''Invoice Processing''' / '''Classification'''
* Start Paused: '''unchecked'''
;Result: New batch begins processing at Classification step
 
;Example – Manual Review
* Target Step: '''Review Process''' / '''Manual Review'''
* Start Paused: '''checked'''
;Result: New batch created in Paused state for manual intervention
 
''Leave blank'': Creates batch without launching into a process (test batch)
 
=== Start Paused ===
When checked, spawned batches are created in a paused state.
 
;Use cases
* Manual review before processing
* Configuration or setup required
* Quality checks on spawned content
* Scheduled batch processing
 
=== Copy Batch Fields ===
When checked, copies index field values from source batch to spawned batch.
 
;Example – Preserve Metadata
;Source Batch Fields
* Customer: ''Acme Corp''
* Department: ''Accounting''
* Period: ''Q1 2024''
 
;Settings
* Copy Batch Fields: '''checked'''
;Result: New batch inherits all field values
 
;Requirements
* Source batch must have a content type assigned
* Target batch process must have a content type assigned
* Both content types must match
 
;Common Error
"Copy Batch Fields is set to true, but the source and target batch process content types do not match."
 
;Solution
Ensure both batch processes use the same content type or leave Copy Batch Fields unchecked.
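
The three requirements reduce to a single check that can be run mentally before enabling the option. This Python sketch uses a hypothetical function name and represents content types as plain strings, which is an assumption made for illustration.

```python
from typing import Optional


def can_copy_batch_fields(source_content_type: Optional[str],
                          target_content_type: Optional[str]) -> bool:
    """True only when both content types are assigned and identical."""
    if source_content_type is None or target_content_type is None:
        return False
    return source_content_type == target_content_type


print(can_copy_batch_fields("Invoice", "Invoice"))         # True
print(can_copy_batch_fields("Invoice", "Purchase Order"))  # False
print(can_copy_batch_fields("Invoice", None))              # False
```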
 
=== Source Batch Disposition ===
Controls what happens to the source batch after spawning.
 
==== None (Default) ====
;Behavior
Source batch remains unchanged.
 
;Use when
* Using Copy action (originals should remain)
* Source batch needed for other processing
* Auditing or archival purposes
 
==== DeleteIfEmpty ====
;Behavior
Deletes source batch only if all folders were moved/deleted.
 
;Use when
* Using Move action with intent to replace source
* All documents routed to new batches
* Source batch should only exist if documents remain
 
==== DeleteAlways ====
;Behavior
Always deletes source batch after spawning.
 
;Use when
* Source batch is temporary/intermediate
* All documents have been routed
* Clean up is required regardless of remaining items
 
{{Ambox
| type = notice
| text = '''Safety:''' Test batches are NEVER deleted, regardless of this setting.
}}
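
The three dispositions and the test-batch safeguard can be summarised as one decision function. This is a Python sketch of the rules as described in this section, with hypothetical names. It is not Grooper's implementation.

```python
def should_delete_source(disposition: str, remaining_folders: int,
                         is_test_batch: bool) -> bool:
    """Decide whether the source batch is deleted after spawning."""
    if is_test_batch:
        # Test batches are never deleted, whatever the setting.
        return False
    if disposition == "DeleteAlways":
        return True
    if disposition == "DeleteIfEmpty":
        return remaining_folders == 0
    if disposition == "None":
        return False
    raise ValueError(f"unknown disposition: {disposition}")


print(should_delete_source("DeleteIfEmpty", remaining_folders=3,
                           is_test_batch=False))  # False
print(should_delete_source("DeleteAlways", remaining_folders=3,
                           is_test_batch=True))   # False
```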
 
== Common Use Cases ==
 
=== Use Case 1: Quality Assurance Sampling ===
''Scenario'': Extract 10% random sample from production batch for QA review
 
;Configuration
* Method: '''Random'''
* Spawn Percentage: '''0.10'''
* Action: '''Copy'''
* Batch Name Prefix: '''QA-'''
* Batch Name Suffix: '''SourceBatchName'''
* Target Step: '''QA Review''' / '''Review'''
* Start Paused: '''unchecked'''
* Source Batch Disposition: '''None'''
 
;Result
* ~10% of documents copied to QA batch
* Original batch continues normal processing
* QA batch routed to review process
* Traceable batch naming
 
=== Use Case 2: Separate by Document Type ===
''Scenario'': Split mixed batch into invoices and purchase orders
 
;Step 1 – Extract Invoices
* Method: '''Filter'''
* FilterBy: '''ContentType'''
* Included Content Types: '''Invoice'''
* Action: '''Move'''
* Target Step: '''Invoice Process/Import'''
* Batch Name Prefix: '''INV-'''
* Source Batch Disposition: '''None'''
 
;Step 2 – Extract Purchase Orders
* Method: '''Filter'''
* FilterBy: '''ContentType'''
* Included Content Types: '''PO'''
* Action: '''Move'''
* Target Step: '''PO Process/Import'''
* Batch Name Prefix: '''PO-'''
* Source Batch Disposition: '''DeleteIfEmpty'''
 
;Result
* Invoices → Invoice Processing workflow
* Purchase Orders → PO Processing workflow
* Source batch deleted if empty
 
=== Use Case 3: Exception Handling ===
''Scenario'': Route flagged items to manual review
 
;Configuration
* Method: '''Filter'''
* FilterBy: '''FlaggedItems'''
* Action: '''Move'''
* Batch Name Prefix: '''Exceptions-'''
* Target Step: '''Exception Review''' / '''Manual Review'''
* Start Paused: '''checked'''
* Source Batch Disposition: '''None'''
 
;Result
* Flagged documents moved to exception batch
* Exception batch paused for manual intervention
* Remaining documents continue processing
 
=== Use Case 4: Split Large Batch ===
''Scenario'': Split 5000-document batch into 100-document batches
 
;Configuration
* Method: '''Filter'''
* FilterBy: '''AllItems'''
* Action: '''Move'''
* Maximum Batch Size: '''100'''
* Batch Name Prefix: '''Batch-'''
* Batch Name Suffix: '''NumberedSuffix'''
* Batch Name Suffix Zero Padding: '''4'''
* Target Step: '''[Same Process/Import]'''
* Source Batch Disposition: '''DeleteIfEmpty'''
 
;Result
* 50 batches created (Batch-0001 through Batch-0050)
* Each batch contains 100 documents
* Source batch deleted when empty
* All batches continue in same workflow
 
=== Use Case 5: Filter by File Type ===
''Scenario'': Process only PDF documents; optionally delete the rest
 
;Configuration
* Method: '''Filter'''
* FilterBy: '''MimeType'''
* MimeTypeLexicon:
** application/pdf
* Action: '''Move'''
* Target Step: '''PDF Processing/Import'''
* Source Batch Disposition: '''None'''
 
;Follow-up (Optional – Remove non-PDFs)
* Method: '''Filter'''
* FilterBy: '''AllItems'''
* Action: '''Delete'''
 
;Result
* PDFs moved to PDF processing workflow
* Non-PDF documents remain in the source batch, or are deleted from it if the optional follow-up runs
 
=== Use Case 6: Systematic Validation Sample ===
''Scenario'': Every 25th document goes to validation for accuracy tracking
 
;Configuration
* Method: '''EveryN'''
* Step Size: '''25'''
* Action: '''Copy'''
* Create Batch: '''unchecked'''
* Batch Name Prefix: '''Validation-'''
* Batch Name Suffix: '''SourceBatchName'''
* Target Step: '''Validation/Review'''
* Copy Batch Fields: '''checked'''
* Source Batch Disposition: '''None'''
 
;Result
* 4% systematic sample (1/25)
* All samples in single validation batch
* Batch metadata preserved
* Original batch continues processing
 
== Troubleshooting ==
 
=== Problem: No Batches Created ===
;Symptoms
Activity completes but no new batches appear.
 
;Possible Causes & Solutions
* '''No matching folders'''
** Solution: Check filter criteria
*** Verify FilterBy settings
*** Check MimeTypeLexicon entries
*** Confirm ContentTypes are correct
*** Review Processing Level
* '''Action set to Delete'''
** Solution: Change Action to Copy or Move if you need new batches
* '''Spawn Percentage too low'''
** Example: Method '''Random''', Spawn Percentage '''0.01''', and a batch of 10 documents gives an expected 0.1 documents, which rounds to 0
** Solution: Increase percentage or use different method
 
=== Problem: “No folders found to process” ===
;Log Message
"No folders found to process"
 
;Diagnosis Steps
* Check Processing Level:
** Processing Level: 1
** Batch Structure: All folders at level 2
** Solution: Change Processing Level to 2
* Verify filter criteria:
** FilterBy: '''ContentType'''
** Included Content Types: ''[empty]''
** Solution: Add content types to list
* Check Invert Filter:
** FilterBy: '''FlaggedItems'''
** Invert Filter: '''checked'''
** No unflagged items exist
** Solution: Uncheck Invert Filter
 
=== Problem: Batch Process Not Published ===
;Error
"Batch Process 'ProcessName' is not published."
 
;Cause
Target Step references an unpublished batch process.
 
;Solution
# Open the target batch process
# Click '''Publish''' in the ribbon
# Confirm publication
# Re-run the Spawn Batch activity
 
=== Problem: Content Type Mismatch ===
;Error
"Copy Batch Fields is set to true, but the source and target batch process content types do not match."
 
;Cause
Attempting to copy fields between incompatible content types.
 
;Solutions
* '''Option 1: Match content types'''
*# Source process content type: ''Invoice''
*# Target process content type: change to ''Invoice''
*# Re-run activity
* '''Option 2: Disable field copying'''
*# Copy Batch Fields: '''unchecked'''
 
=== Problem: Batch Naming Conflicts ===
;Symptom
Batches named "Batch-0001 (1)", "Batch-0001 (2)".
 
;Cause
Batch names already exist.
 
;Solutions
* Use '''SourceBatchName''' suffix:
** Batch Name Suffix: '''SourceBatchName'''
** Result: Unique names based on source
* Include a date in the prefix:
** Batch Name Prefix: '''Batch-2024-01-15-'''
** Result: Date-specific names
* Clean up existing batches:
** Delete or complete old batches with conflicting names
 
=== Problem: Spawned Batches Not Processing ===
;Symptom
Batches created but remain in "Ready" status.
 
;Possible Causes & Solutions
* '''Start Paused is checked'''
** Solution: Manually resume batches or uncheck Start Paused
* '''No Target Step specified'''
** Solution: Select appropriate batch process step
* '''Service not running'''
** Solution: Check Grooper Service status; start if stopped
 
=== Problem: Source Batch Not Deleted ===
;Symptom
Disposition set to DeleteIfEmpty or DeleteAlways but batch remains.
 
;Possible Causes & Solutions
* '''Test batch'''
** Test batches are never deleted by design
** Solution: This is expected behavior
* '''Folders still remain'''
** Disposition: '''DeleteIfEmpty'''
** Some folders don't match filter
** Solution: Use DeleteAlways or adjust filter
* '''Batch locked'''
** Solution: Unlock batch manually and re-run
 
== Best Practices ==
 
=== 1. Test with Test Batches First ===
'''Always test new Spawn Batch configurations with test batches'''
 
;Testing Workflow
# Create small test batch (5–10 documents)
# Configure Spawn Batch activity
# Run test batch through process
# Verify:
#* Correct documents selected
#* Batches named properly
#* Target workflow correct
#* Source batch handled correctly
# Adjust configuration as needed
# Move to production
 
=== 2. Use Descriptive Batch Name Prefixes ===
'''Clear prefixes make batch management easier'''
 
;Good Examples
* "QA-Sample-" — clearly indicates QA batches
* "Invoice-" — document type identification
* "Exceptions-" — special handling indicator
* "Split-" — indicates split from larger batch
 
;Poor Examples
* "B-" — not descriptive
* "" (empty) — no identification
* "Test123-" — unclear purpose
 
=== 3. Choose Appropriate Action Type ===
'''Match action to use case'''
 
{| class="wikitable"
! Use Case !! Recommended Action
|-
| QA Sampling || Copy
|-
| Document Routing || Move
|-
| Cleanup || Delete
|-
| Parallel Processing || Copy
|-
| Exception Handling || Move
|}
 
=== 4. Set Reasonable Maximum Batch Size ===
'''Balance performance and manageability'''
 
;Guidelines
* '''Small batches (50–100)''': Interactive review processes
* '''Medium batches (100–500)''': Standard processing
* '''Large batches (500–1000)''': Bulk processing
* '''No limit (0)''': Use cautiously; can create very large batches
 
;Example – Balanced Approach
* Large import: 5000 documents
* Maximum Batch Size: 250
;Result: 20 manageable batches
 
=== 5. Manage Source Batch Disposition Carefully ===
'''Choose disposition based on workflow needs'''
 
;Decision Tree
* Using '''Copy''' action?
** Yes → Disposition: '''None'''
** No (Move/Delete) →
*** Need source batch for reference?
**** Yes → Disposition: '''None'''
**** No →
***** Some documents might remain?
****** Yes → Disposition: '''DeleteIfEmpty'''
****** No (all documents processed) → Disposition: '''DeleteAlways'''
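
The decision tree can be written as a short function, which makes the branches easy to check against a planned configuration. This Python sketch uses hypothetical names and simply encodes the recommendations listed above.

```python
def recommend_disposition(action: str, need_source_for_reference: bool,
                          documents_may_remain: bool) -> str:
    """Follow the decision tree to a recommended Source Batch Disposition."""
    if action == "Copy":
        return "None"
    if action not in ("Move", "Delete"):
        raise ValueError(f"unknown action: {action}")
    if need_source_for_reference:
        return "None"
    return "DeleteIfEmpty" if documents_may_remain else "DeleteAlways"


print(recommend_disposition("Copy", False, False))  # None
print(recommend_disposition("Move", False, True))   # DeleteIfEmpty
print(recommend_disposition("Move", False, False))  # DeleteAlways
```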
 
=== 6. Use Copy Batch Fields Wisely ===
'''Only when content types match'''
 
;Checklist
* [ ] Source batch has content type
* [ ] Target process has content type
* [ ] Content types match exactly
* [ ] Fields are relevant to spawned batch
 
If any condition is false → Leave '''Copy Batch Fields''' unchecked.
 
=== 7. Leverage Invert Filter ===
'''Powerful for exclusion scenarios'''
 
;Scenario 1 – Process Non-PDFs
* FilterBy: '''MimeType'''
* MimeTypeLexicon: ''application/pdf''
* Invert Filter: '''checked'''
;Result: All non-PDF documents selected
 
;Scenario 2 – Process Valid Items
* FilterBy: '''InvalidItems'''
* Invert Filter: '''checked'''
;Result: All valid documents selected
 
=== 8. Monitor Activity Statistics ===
'''Review stats after processing'''
 
;Available Statistics
* '''Folders Moved''': Count of folders moved to spawned batches
* '''Folders Copied''': Count of folders copied to spawned batches
 
;Use stats to
* Verify expected document counts
* Track QA sample sizes
* Audit document routing
* Troubleshoot issues
 
=== 9. Plan for Batch Process Dependencies ===
'''Ensure target processes are ready'''
 
;Pre-flight Checklist
* [ ] Target batch process is published
* [ ] Target step exists in published version
* [ ] Content types are configured
* [ ] Downstream activities are configured
* [ ] Service has permissions to target folder
 
=== 10. Document Your Configuration ===
'''Maintain configuration documentation'''
 
;Template
* Activity: Spawn Batch
* Purpose: ''[Describe why this spawn is happening]''
* Method: ''[Filter/EveryN/Random]''
* Filter Criteria: ''[Specific settings]''
* Action: ''[Copy/Move/Delete]''
* Target: ''[Destination process/step]''
* Naming Pattern: ''[Prefix and suffix pattern]''
* Special Notes: ''[Any caveats or special handling]''
 
;Example
* Activity: Spawn Batch
* Purpose: QA Sampling for Invoice Processing
* Method: Random
* Filter Criteria: 10% random sample
* Action: Copy (preserve originals)
* Target: QA Review Process / Manual Review Step
* Naming Pattern: "QA-[SourceBatchName]"
* Special Notes: Runs weekly, samples archived after 30 days
 
== Summary ==
The Spawn Batch activity is an essential tool for:
* '''Splitting batches''' by document type, status, or attributes
* '''Quality assurance''' through systematic or random sampling
* '''Workflow routing''' to specialized processing paths
* '''Batch management''' by controlling size and organization
* '''Exception handling''' for special cases and errors
 
=== Quick Reference ===
{| class="wikitable"
! Task !! Method !! Action !! Key Settings
|-
| QA Sample || Random || Copy || Spawn Percentage: 0.10
|-
| Route by Type || Filter || Move || FilterBy: ContentType
|-
| Every Nth || EveryN || Copy || Step Size: 10
|-
| Split Large Batch || Filter || Move || Maximum Batch Size: 100
|-
| Exception Handling || Filter || Move || FilterBy: FlaggedItems
|-
| Remove Invalid || Filter || Delete || FilterBy: InvalidItems
|}

Revision as of 11:05, 30 January 2026

How to Use the Spawn Batch Activity

Overview

The Spawn Batch activity is a powerful utility for splitting, distributing, and managing batches in Grooper workflows. It allows you to create one or more new batches from a subset of folders in the current batch based on various filtering criteria. This activity is essential for:

  • Quality assurance sampling: Extract a random or systematic sample for QA review
  • Document type segregation: Split mixed batches by content type or MIME type
  • Workflow routing: Route specific documents to different processing paths
  • Batch size management: Split large batches into smaller, more manageable units
  • Exception handling: Separate flagged or invalid items for special processing

Key Features

  • Multiple selection methods: Filter, Every N, Random
  • Flexible filtering criteria: Content type, MIME type, flag status, validity
  • Copy or move operations
  • Automatic batch naming with customizable patterns
  • Maximum batch size controls
  • Optional field value copying
  • Source batch cleanup options

When to Use Spawn Batch

Ideal Scenarios

Use Spawn Batch when you need to
  • Split a batch by document type (invoices vs. purchase orders)
  • Extract a QA sample (10% random or every 10th document)
  • Route flagged items to a review process
  • Separate valid and invalid documents
  • Create smaller batches from a large import
  • Filter by file type (PDFs only, images only, etc.)
  • Create parallel processing workflows
Don’t use Spawn Batch when
  • You just need to process documents sequentially (use normal batch processing)
  • You want to modify documents in place (use other activities)
  • You need to merge batches (use different techniques)

Basic Configuration

Step 1: Add the Activity to Your Batch Process

  1. Open your Batch Process in Grooper Desktop
  2. Navigate to the step where you want to spawn batches
  3. In the Activities panel, find Spawn Batch under the Utilities category
  4. Drag and drop Spawn Batch into your process step

Step 2: Essential Properties

Processing Level

What it does

Determines which folder hierarchy level to examine.

Default

1 (top-level folders/documents)

When to change
  • Use 0 to process all folders at all levels
  • Use 2+ to process subfolders (e.g., pages within documents)
Example Batch Structure
  • Batch Root (Level 0)
    • Document 1 (Level 1) – Processing Level = 1
      • Page 1 (Level 2) – Processing Level = 2
      • Page 2 (Level 2)
    • Document 2 (Level 1)
      • Page 1 (Level 2)
      • Page 2 (Level 2)

Spawn Methods

Filter Method

Use when: You need to select documents based on specific attributes

Configuration Steps

  • Set Method = Filter
  • Choose your FilterBy criteria:
AllItems

Includes all folders at the processing level.

Use Case

Split entire batch into fixed-size batches

Settings
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
Result

Creates multiple 100-item batches from source

MimeType

Filters by file attachment type.

Use Case

Separate PDFs from images

Settings
  1. FilterBy: MimeType
  2. Configure MimeTypeLexicon to include:
    • application/pdf
    • application/vnd.ms-excel
  3. Action: Move
Result

Only documents with PDF or Excel attachments are spawned

Mime Type Lexicon Configuration
  1. Expand Mime Type Lexicon property
  2. Click Add Entry
  3. Enter MIME types (one per line):
    • application/pdf – PDF documents
    • image/tiff – TIFF images
    • image/jpeg – JPEG images
    • application/vnd.openxmlformats-officedocument.wordprocessingml.document – Word .docx
ContentType

Filters by assigned content type.

Use Case

Separate invoices from purchase orders

Settings
  1. FilterBy: ContentType
  2. Included Content Types: select Invoice
  3. Action: Move
  4. Target Step: Invoice Processing / Import
Result

All invoices moved to invoice processing workflow

FlaggedItems

Selects documents that have been flagged.

Use Case

Route exceptions to review

Settings
  • FilterBy: FlaggedItems
  • Action: Move
  • Target Step: Exception Review / Review
  • Batch Name Prefix: Review-
Result

Flagged items moved to review batch

InvalidItems

Selects documents with invalid index data.

Use Case

Separate invalid documents for correction

Settings
  • FilterBy: InvalidItems
  • Action: Move
  • Target Step: Data Correction / Manual Review
Result

Documents with validation errors moved to correction workflow

Invert Filter

The Invert Filter checkbox reverses your filter logic.

Example – Get Non-PDF Documents
  • FilterBy: MimeType
  • MimeTypeLexicon: application/pdf
  • Invert Filter: checked
Result
All non-PDF documents are selected
Example – Get Non-Flagged Items
  • FilterBy: FlaggedItems
  • Invert Filter: checked
Result
All unflagged documents are selected

Every N Method

Use when: You need systematic sampling (every 10th, 25th, 100th document)

Configuration

  • Set Method = EveryN
  • Set Step Size: The interval for selection
Example – 10% Systematic Sample
  • Method: EveryN
  • Step Size: 10
  • Action: Copy
  • Create Batch: checked
  • Batch Name Prefix: QA-Sample-
Result
Every 10th document copied to individual QA batches
Example – Quality Assurance Sample
Scenario: 1000-document batch, need 50 documents for QA
  • Method: EveryN
  • Step Size: 20 (1000/50 = 20)
  • Action: Copy (preserve originals)
  • Create Batch: unchecked (group into single batch)
Result
50 documents (every 20th) copied to one QA batch

Create Batch Property

Controls whether to create individual batches per item.

Setting Result
Checked One batch per selected document
Unchecked All selected documents in one batch
Example – Individual Batches
  • Step Size: 5
  • Create Batch: checked
  • 15 documents in source
Result
3 batches created (documents 5, 10, 15)
Example – Grouped Batch
  • Step Size: 5
  • Create Batch: unchecked
  • 15 documents in source
Result
1 batch with 3 documents (5, 10, 15)

Random Method

Use when: You need random sampling for QA or statistical analysis

Configuration

  • Set Method = Random
  • Set Spawn Percentage: Fraction of documents to select (0.0 to 1.0)
Example – 10% Random Sample
  • Method: Random
  • Spawn Percentage: 0.10 (10%)
  • Action: Copy
  • Batch Name Prefix: Random-QA-
Result
Approximately 10% of documents randomly selected
Example – Large Random Sample
Scenario: 5000 documents, need 500 for review
  • Method: Random
  • Spawn Percentage: 0.10 (500/5000)
  • Action: Move
  • Maximum Batch Size: 100
Result
500 random documents split into 5 batches of 100 each

Understanding Spawn Percentage

Percentage | Decimal | Example (100 docs)
5% | 0.05 | ~5 documents
10% | 0.10 | ~10 documents
25% | 0.25 | ~25 documents
50% | 0.50 | ~50 documents

Note: Random selection is probabilistic. Results may vary slightly from exact percentages.
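
One way to see why the counts vary is to model the selection as an independent coin flip per document. This is an assumption made for illustration (the activity's internal algorithm is not described here), and `random_sample` is a hypothetical helper.

```python
import random


def random_sample(docs, spawn_percentage, rng):
    """Keep each document independently with probability spawn_percentage."""
    return [d for d in docs if rng.random() < spawn_percentage]


rng = random.Random(42)       # seeded so the simulation is repeatable
docs = list(range(100))

# Repeat the 10% sample many times and look at how the sample size spreads.
counts = [len(random_sample(docs, 0.10, rng)) for _ in range(2000)]
print("smallest sample:", min(counts))
print("largest sample:", max(counts))
print("average sample:", sum(counts) / len(counts))
```

Under this model the average stays close to 10 documents, while individual runs land a few documents above or below it.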

Action Types

The Action property determines what happens to selected folders.

Copy

Effect

Duplicates folders to new batch; originals remain in source batch.

Use Case

QA Sampling (preserve originals)

Settings
  • Action: Copy
  • Method: Random
  • Spawn Percentage: 0.05
Result
Approximately 5% of documents copied to QA batch; all originals remain in production
When to use Copy
  • Quality assurance sampling
  • Creating backup batches
  • Parallel processing workflows
  • Testing without modifying originals

Move

Effect

Transfers folders to new batch; removes from source batch.

Use Case

Route by Document Type

Settings
  • Action: Move
  • FilterBy: ContentType
  • Included Content Types: Invoice
  • Target Step: Invoice Processing / Classification
Result
Invoices moved to invoice workflow; other docs remain
When to use Move
  • Document type routing
  • Exception handling
  • Workflow branching
  • Batch size management

Delete

Effect

Removes folders from source batch; no new batch created.

Use Case

Remove Invalid Items

Settings
  • Action: Delete
  • FilterBy: InvalidItems
Result
Invalid documents removed; no new batch created
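
The three actions differ in what ends up in the source batch and in the spawned batch. The sketch below models both as plain lists; `apply_action` is a hypothetical helper, not a Grooper API.

```python
import copy


def apply_action(source, selected, action):
    """Return (source_after, spawned) for 'Copy', 'Move', or 'Delete'."""
    if action == "Copy":
        # Originals stay put; the spawned batch receives duplicates.
        return list(source), copy.deepcopy(selected)
    if action in ("Move", "Delete"):
        remaining = [d for d in source if d not in selected]
        # Move hands the folders to a new batch; Delete creates no batch.
        spawned = list(selected) if action == "Move" else []
        return remaining, spawned
    raise ValueError(f"Unknown action: {action}")


source = ["inv-1", "po-1", "inv-2"]
selected = ["inv-1", "inv-2"]

for action in ("Copy", "Move", "Delete"):
    print(action, apply_action(source, selected, action))
```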

Batch Naming

Batch Name Prefix

A string prepended to all spawned batch names.

Examples
  • "QA-" → QA-Invoice Batch 2024
  • "Review-" → Review-Invoice Batch 2024
  • "Invoices-" → Invoices-0001
  • "" (empty) → Uses default naming

Batch Name Suffix

Controls how the batch name is completed.

None

Uses default naming with conflict resolution.

Settings
  • Batch Name Prefix: QA-
  • Batch Name Suffix: None
Result

"QA-[generated name]" or "QA-[generated name] (1)" if conflict

SourceBatchName

Appends original batch name to prefix.

Source Batch

"Invoice Batch 2024-01-15"

Settings
  • Batch Name Prefix: QA-
  • Batch Name Suffix: SourceBatchName
Result

"QA-Invoice Batch 2024-01-15"

Use when
  • Traceability is important
  • You want to preserve batch lineage
  • Auditing spawned batch origins

NumberedSuffix

Creates sequential numbered batches with zero-padding.

Settings
  • Batch Name Prefix: Invoice-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
Results
  • First batch: Invoice-0001
  • Second batch: Invoice-0002
  • Third batch: Invoice-0003
  • 10th batch: Invoice-0010
  • 100th batch: Invoice-0100
Zero Padding Examples
Zero Padding | Format | Example
0 | No padding | Batch-1, Batch-2
2 | 2 digits | Batch-01, Batch-02
3 | 3 digits | Batch-001, Batch-002
4 | 4 digits | Batch-0001, Batch-0002
6 | 6 digits | Batch-000001, Batch-000002
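
The naming patterns above can be reproduced with a few lines of string handling. This is a sketch under the assumption that the prefix is simply concatenated with the suffix value; `spawned_batch_name` is a hypothetical helper, and the None suffix (default generated names) is not modeled.

```python
def spawned_batch_name(prefix, suffix, source_name="", index=1, zero_padding=0):
    """Build a spawned batch name from a prefix and a suffix mode."""
    if suffix == "SourceBatchName":
        return prefix + source_name
    if suffix == "NumberedSuffix":
        # zfill left-pads with zeros; a width of 0 leaves the number as-is.
        return prefix + str(index).zfill(zero_padding)
    raise ValueError(f"Suffix mode not modeled in this sketch: {suffix}")


print(spawned_batch_name("QA-", "SourceBatchName",
                         source_name="Invoice Batch 2024-01-15"))
# QA-Invoice Batch 2024-01-15
print(spawned_batch_name("Invoice-", "NumberedSuffix", index=10, zero_padding=4))
# Invoice-0010
print(spawned_batch_name("Batch-", "NumberedSuffix", index=2, zero_padding=0))
# Batch-2
```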

Advanced Features

Maximum Batch Size

Limits the number of folders in each spawned batch. When the limit is reached, a new batch is created.

Example – Split Large Import
Scenario: 1000 documents imported, want batches of 100
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
  • Batch Name Prefix: Import-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
Result
10 batches created (Import-0001 through Import-0010)
Example – Balanced QA Samples
  • Method: Random
  • Spawn Percentage: 0.20 (20%)
  • Maximum Batch Size: 25
  • Action: Copy
Result
Random 20% split into batches of max 25 documents each

Set to 0: No limit (all selected documents in one batch)
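
The splitting behavior amounts to chunking the selected documents. A minimal sketch with a hypothetical `split_into_batches` helper, treating 0 as "no limit" as described above:

```python
def split_into_batches(docs, max_batch_size):
    """Chunk docs into lists of at most max_batch_size items (0 = no limit)."""
    if max_batch_size <= 0:
        return [list(docs)] if docs else []
    return [docs[i:i + max_batch_size]
            for i in range(0, len(docs), max_batch_size)]


imported = list(range(1000))
print(len(split_into_batches(imported, 100)))  # 10
print(len(split_into_batches(imported, 0)))    # 1

# A sample of 120 documents with Maximum Batch Size 25:
print([len(b) for b in split_into_batches(list(range(120)), 25)])
# [25, 25, 25, 25, 20]
```

Note that the last batch holds only the remainder, so batches are not always the same size.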

Starting Step

Routes spawned batches to a specific batch process step. The examples on this page list this setting as Target Step.

Example – Route to Different Process
  • Target Step: Invoice Processing / Classification
  • Start Paused: unchecked
Result
New batch begins processing at Classification step
Example – Manual Review
  • Target Step: Review Process / Manual Review
  • Start Paused: checked
Result
New batch created in Paused state for manual intervention

Leave blank: Creates batch without launching into a process (test batch)

Start Paused

When checked, spawned batches are created in a paused state.

Use cases
  • Manual review before processing
  • Configuration or setup required
  • Quality checks on spawned content
  • Scheduled batch processing

Copy Batch Fields

When checked, copies index field values from source batch to spawned batch.

Example – Preserve Metadata
Source Batch Fields
  • Customer: Acme Corp
  • Department: Accounting
  • Period: Q1 2024
Settings
  • Copy Batch Fields: checked
Result
New batch inherits all field values
Requirements
  • Source batch must have a content type assigned
  • Target batch process must have a content type assigned
  • Both content types must match
Common Error

"Copy Batch Fields is set to true, but the source and target batch process content types do not match."

Solution

Ensure both batch processes use the same content type or leave Copy Batch Fields unchecked.
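
The three requirements can be expressed as a single precondition. The sketch below is a hypothetical model (function names and the use of `None` for "no content type assigned" are assumptions); only the error text is taken from this page.

```python
from typing import Optional


def can_copy_batch_fields(source_ct: Optional[str], target_ct: Optional[str]) -> bool:
    """True only if both content types are assigned and identical."""
    return source_ct is not None and target_ct is not None and source_ct == target_ct


def copy_batch_fields(source_fields, source_ct, target_ct):
    """Return a copy of the source fields, or raise if the types do not match."""
    if not can_copy_batch_fields(source_ct, target_ct):
        raise ValueError(
            "Copy Batch Fields is set to true, but the source and target "
            "batch process content types do not match."
        )
    return dict(source_fields)


fields = {"Customer": "Acme Corp", "Department": "Accounting", "Period": "Q1 2024"}
print(copy_batch_fields(fields, "Invoice", "Invoice"))
print(can_copy_batch_fields("Invoice", "PO"))   # False
print(can_copy_batch_fields("Invoice", None))   # False
```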

Source Batch Disposition

Controls what happens to the source batch after spawning.

None (Default)

Behavior

Source batch remains unchanged.

Use when
  • Using Copy action (originals should remain)
  • Source batch needed for other processing
  • Auditing or archival purposes

DeleteIfEmpty

Behavior

Deletes source batch only if all folders were moved/deleted.

Use when
  • Using Move action with intent to replace source
  • All documents routed to new batches
  • Source batch should only exist if documents remain

DeleteAlways

Behavior

Always deletes source batch after spawning.

Use when
  • Source batch is temporary/intermediate
  • All documents have been routed
  • Clean up is required regardless of remaining items
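
The three dispositions reduce to a small decision function. This sketch uses a hypothetical `should_delete_source` helper and also folds in the rule from the Troubleshooting section that test batches are never deleted.

```python
def should_delete_source(disposition, remaining_folders, is_test_batch=False):
    """Decide whether the source batch is removed after spawning."""
    if is_test_batch:
        return False                      # test batches are kept by design
    if disposition == "DeleteAlways":
        return True
    if disposition == "DeleteIfEmpty":
        return remaining_folders == 0     # only when nothing is left behind
    if disposition == "None":
        return False
    raise ValueError(f"Unknown disposition: {disposition}")


print(should_delete_source("None", 0))            # False
print(should_delete_source("DeleteIfEmpty", 3))   # False
print(should_delete_source("DeleteIfEmpty", 0))   # True
print(should_delete_source("DeleteAlways", 3))    # True
print(should_delete_source("DeleteAlways", 0, is_test_batch=True))  # False
```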

Common Use Cases

Use Case 1: Quality Assurance Sampling

Scenario: Extract 10% random sample from production batch for QA review

Configuration
  • Method: Random
  • Spawn Percentage: 0.10
  • Action: Copy
  • Batch Name Prefix: QA-
  • Batch Name Suffix: SourceBatchName
  • Target Step: QA Review / Review
  • Start Paused: unchecked
  • Source Batch Disposition: None
Result
  • ~10% of documents copied to QA batch
  • Original batch continues normal processing
  • QA batch routed to review process
  • Traceable batch naming

Use Case 2: Separate by Document Type

Scenario: Split mixed batch into invoices and purchase orders

Step 1 – Extract Invoices
  • Method: Filter
  • FilterBy: ContentType
  • Included Content Types: Invoice
  • Action: Move
  • Target Step: Invoice Process/Import
  • Batch Name Prefix: INV-
  • Source Batch Disposition: None
Step 2 – Extract Purchase Orders
  • Method: Filter
  • FilterBy: ContentType
  • Included Content Types: PO
  • Action: Move
  • Target Step: PO Process/Import
  • Batch Name Prefix: PO-
  • Source Batch Disposition: DeleteIfEmpty
Result
  • Invoices → Invoice Processing workflow
  • Purchase Orders → PO Processing workflow
  • Source batch deleted if empty
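
The two-step routing can be traced with a small simulation. Documents are modeled as dictionaries with a hypothetical `type` key, and `move_by_type` is an illustrative helper rather than a Grooper API.

```python
def move_by_type(source, content_type):
    """Split source into (remaining, moved) by content type."""
    moved = [d for d in source if d["type"] == content_type]
    remaining = [d for d in source if d["type"] != content_type]
    return remaining, moved


source = [
    {"name": "doc1", "type": "Invoice"},
    {"name": "doc2", "type": "PO"},
    {"name": "doc3", "type": "Invoice"},
]

# Step 1: move invoices (Source Batch Disposition: None)
source, invoices = move_by_type(source, "Invoice")
# Step 2: move purchase orders (Source Batch Disposition: DeleteIfEmpty)
source, purchase_orders = move_by_type(source, "PO")
source_deleted = len(source) == 0

print(len(invoices), len(purchase_orders), source_deleted)  # 2 1 True
```

If the mixed batch also held a third document type, those folders would remain after step 2 and DeleteIfEmpty would keep the source batch.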

Use Case 3: Exception Handling

Scenario: Route flagged items to manual review

Configuration
  • Method: Filter
  • FilterBy: FlaggedItems
  • Action: Move
  • Batch Name Prefix: Exceptions-
  • Target Step: Exception Review / Manual Review
  • Start Paused: checked
  • Source Batch Disposition: None
Result
  • Flagged documents moved to exception batch
  • Exception batch paused for manual intervention
  • Remaining documents continue processing

Use Case 4: Split Large Batch

Scenario: Split 5000-document batch into 100-document batches

Configuration
  • Method: Filter
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
  • Batch Name Prefix: Batch-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
  • Target Step: [Same Process/Import]
  • Source Batch Disposition: DeleteIfEmpty
Result
  • 50 batches created (Batch-0001 through Batch-0050)
  • Each batch contains 100 documents
  • Source batch deleted when empty
  • All batches continue in same workflow

Use Case 5: Filter by File Type

Scenario: Route PDF documents to their own process and optionally delete everything else

Configuration
  • Method: Filter
  • FilterBy: MimeType
  • MimeTypeLexicon:
    • application/pdf
  • Action: Move
  • Target Step: PDF Processing/Import
  • Source Batch Disposition: None
Follow-up (Optional – Remove non-PDFs; run only after the move above, because AllItems deletes every folder still in the source)
  • Method: Filter
  • FilterBy: AllItems
  • Action: Delete
Result
  • PDFs moved to PDF processing workflow
  • Non-PDF documents deleted from source (only if the optional follow-up runs)

Use Case 6: Systematic Validation Sample

Scenario: Every 25th document goes to validation for accuracy tracking

Configuration
  • Method: EveryN
  • Step Size: 25
  • Action: Copy
  • Create Batch: unchecked
  • Batch Name Prefix: Validation-
  • Batch Name Suffix: SourceBatchName
  • Target Step: Validation/Review
  • Copy Batch Fields: checked
  • Source Batch Disposition: None
Result
  • 4% systematic sample (1/25)
  • All samples in single validation batch
  • Batch metadata preserved
  • Original batch continues processing

Troubleshooting

Problem: No Batches Created

Symptoms

Activity completes but no new batches appear.

Possible Causes & Solutions
  • No matching folders
    • Solution: Check filter criteria
      • Verify FilterBy settings
      • Check MimeTypeLexicon entries
      • Confirm ContentTypes are correct
      • Review Processing Level
  • Action set to Delete
    • Solution: Change Action to Copy or Move if you need new batches
  • Spawn Percentage too low
    • Example: Method: Random; Spawn Percentage: 0.01; batch with 10 documents ⇒ about 0.1 documents expected, which usually yields no selection
    • Solution: Increase percentage or use different method
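
To judge whether a percentage is too low for a given batch size, it can help to estimate how often a run would select nothing at all. The sketch below assumes each document is selected independently with probability equal to the Spawn Percentage; that model is an assumption, not a documented description of the activity.

```python
def expected_selected(n_docs: int, spawn_percentage: float) -> float:
    """Average number of documents selected per run."""
    return n_docs * spawn_percentage


def prob_none_selected(n_docs: int, spawn_percentage: float) -> float:
    """Chance that a run selects no documents at all."""
    return (1 - spawn_percentage) ** n_docs


print(round(expected_selected(10, 0.01), 3))     # 0.1
print(round(prob_none_selected(10, 0.01), 3))    # 0.904
print(round(prob_none_selected(1000, 0.01), 6))  # 4.3e-05
```

Under this model, a 1% sample of a 10-document batch comes back empty about 90% of the time, while the same percentage on 1000 documents almost never does.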

Problem: “No folders found to process”

Log Message

"No folders found to process"

Diagnosis Steps
  • Check Processing Level:
    • Processing Level: 1
    • Batch Structure: All folders at level 2
    • Solution: Change Processing Level to 2
  • Verify filter criteria:
    • FilterBy: ContentType
    • Included Content Types: [empty]
    • Solution: Add content types to list
  • Check Invert Filter:
    • FilterBy: FlaggedItems
    • Invert Filter: checked
    • No unflagged items exist
    • Solution: Uncheck Invert Filter

Problem: Batch Process Not Published

Error

"Batch Process 'ProcessName' is not published."

Cause

Target Step references an unpublished batch process.

Solution
  1. Open the target batch process
  2. Click Publish in the ribbon
  3. Confirm publication
  4. Re-run the Spawn Batch activity

Problem: Content Type Mismatch

Error

"Copy Batch Fields is set to true, but the source and target batch process content types do not match."

Cause

Attempting to copy fields between incompatible content types.

Solutions
  • Option 1: Match content types
  1. Source process content type: Invoice
  2. Target process content type: Change to Invoice
  3. Re-run activity
  • Option 2: Disable field copying
  1. Copy Batch Fields: uncheck

Problem: Batch Naming Conflicts

Symptom

Batches named "Batch-0001 (1)", "Batch-0001 (2)".

Cause

Batch names already exist.

Solutions
  • Use SourceBatchName suffix:
    • Batch Name Suffix: SourceBatchName
    • Result: Unique names based on source
  • Include a date in the prefix:
    • Batch Name Prefix: Batch-2024-01-15-
    • Result: Date-specific names
  • Clean up existing batches:
    • Delete or complete old batches with conflicting names
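
The "(1)", "(2)" names in the symptom come from conflict resolution against names that already exist. A minimal sketch of that pattern, using a hypothetical `resolve_name` helper and a set of existing names:

```python
def resolve_name(desired, existing):
    """Return desired, or desired with ' (n)' appended if it is already taken."""
    if desired not in existing:
        return desired
    n = 1
    while f"{desired} ({n})" in existing:
        n += 1
    return f"{desired} ({n})"


existing = {"Batch-0001"}
first = resolve_name("Batch-0001", existing)
existing.add(first)
second = resolve_name("Batch-0001", existing)

print(first)    # Batch-0001 (1)
print(second)   # Batch-0001 (2)
print(resolve_name("Batch-2024-01-15-0001", existing))  # Batch-2024-01-15-0001
```

The last call shows why a date-specific prefix avoids the problem: the desired name no longer collides with an older batch.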

Problem: Spawned Batches Not Processing

Symptom

Batches created but remain in "Ready" status.

Possible Causes & Solutions
  • Start Paused is checked
    • Solution: Manually resume batches or uncheck Start Paused
  • No Target Step specified
    • Solution: Select appropriate batch process step
  • Service not running
    • Solution: Check Grooper Service status; start if stopped

Problem: Source Batch Not Deleted

Symptom

Disposition set to DeleteIfEmpty or DeleteAlways but batch remains.

Possible Causes & Solutions
  • Test batch
    • Test batches are never deleted by design
    • Solution: This is expected behavior
  • Folders still remain
    • Disposition: DeleteIfEmpty
    • Some folders don't match filter
    • Solution: Use DeleteAlways or adjust filter
  • Batch locked
    • Solution: Unlock batch manually and re-run

Best Practices

1. Test with Test Batches First

Always test new Spawn Batch configurations with test batches

Testing Workflow
  1. Create small test batch (5–10 documents)
  2. Configure Spawn Batch activity
  3. Run test batch through process
  4. Verify:
    • Correct documents selected
    • Batches named properly
    • Target workflow correct
    • Source batch handled correctly
  5. Adjust configuration as needed
  6. Move to production

2. Use Descriptive Batch Name Prefixes

Clear prefixes make batch management easier

Good Examples
  • "QA-Sample-" — clearly indicates QA batches
  • "Invoice-" — document type identification
  • "Exceptions-" — special handling indicator
  • "Split-" — indicates split from larger batch
Poor Examples
  • "B-" — not descriptive
  • "" (empty) — no identification
  • "Test123-" — unclear purpose

3. Choose Appropriate Action Type

Match action to use case

Use Case | Recommended Action
QA Sampling | Copy
Document Routing | Move
Cleanup | Delete
Parallel Processing | Copy
Exception Handling | Move

4. Set Reasonable Maximum Batch Size

Balance performance and manageability

Guidelines
  • Small batches (50–100): Interactive review processes
  • Medium batches (100–500): Standard processing
  • Large batches (500–1000): Bulk processing
  • No limit (0): Use cautiously; can create very large batches
Example – Balanced Approach
  • Large import: 5000 documents
  • Maximum Batch Size: 250
Result
20 manageable batches

5. Manage Source Batch Disposition Carefully

Choose disposition based on workflow needs

Decision Tree
  • Using Copy action?
    • Yes → Disposition: None
    • No (Move/Delete) →
      • Need source batch for reference?
        • Yes → Disposition: None
        • No →
          • Some documents might remain?
            • Yes → Disposition: DeleteIfEmpty
            • No (all documents processed) → Disposition: DeleteAlways
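
The decision tree above translates directly into a small function, which can be handy as a checklist when reviewing a configuration. `recommend_disposition` is a hypothetical helper that encodes only the branches listed on this page.

```python
def recommend_disposition(action, need_source_for_reference, documents_may_remain):
    """Walk the decision tree and return the suggested disposition."""
    if action == "Copy":
        return "None"
    # Move or Delete from here on.
    if need_source_for_reference:
        return "None"
    return "DeleteIfEmpty" if documents_may_remain else "DeleteAlways"


print(recommend_disposition("Copy", False, False))   # None
print(recommend_disposition("Move", True, False))    # None
print(recommend_disposition("Move", False, True))    # DeleteIfEmpty
print(recommend_disposition("Move", False, False))   # DeleteAlways
```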

6. Use Copy Batch Fields Wisely

Only when content types match

Checklist
  • [ ] Source batch has content type
  • [ ] Target process has content type
  • [ ] Content types match exactly
  • [ ] Fields are relevant to spawned batch

If any condition is false → Leave Copy Batch Fields unchecked.

7. Leverage Invert Filter

Powerful for exclusion scenarios

Scenario 1 – Process Non-PDFs
  • FilterBy: MimeType
  • MimeTypeLexicon: application/pdf
  • Invert Filter: checked
Result
All non-PDF documents selected
Scenario 2 – Process Valid Items
  • FilterBy: InvalidItems
  • Invert Filter: checked
Result
All valid documents selected

8. Monitor Activity Statistics

Review stats after processing

Available Statistics
  • Folders Moved: Count of folders moved to spawned batches
  • Folders Copied: Count of folders copied to spawned batches
Use stats to
  • Verify expected document counts
  • Track QA sample sizes
  • Audit document routing
  • Troubleshoot issues

9. Plan for Batch Process Dependencies

Ensure target processes are ready

Pre-flight Checklist
  • [ ] Target batch process is published
  • [ ] Target step exists in published version
  • [ ] Content types are configured
  • [ ] Downstream activities are configured
  • [ ] Service has permissions to target folder

10. Document Your Configuration

Maintain configuration documentation

Template
  • Activity: Spawn Batch
  • Purpose: [Describe why this spawn is happening]
  • Method: [Filter/EveryN/Random]
  • Filter Criteria: [Specific settings]
  • Action: [Copy/Move/Delete]
  • Target: [Destination process/step]
  • Naming Pattern: [Prefix and suffix pattern]
  • Special Notes: [Any caveats or special handling]
Example
  • Activity: Spawn Batch
  • Purpose: QA Sampling for Invoice Processing
  • Method: Random
  • Filter Criteria: 10% random sample
  • Action: Copy (preserve originals)
  • Target: QA Review Process / Manual Review Step
  • Naming Pattern: "QA-[SourceBatchName]"
  • Special Notes: Runs weekly, samples archived after 30 days

Summary

The Spawn Batch activity is an essential tool for:

  • Splitting batches by document type, status, or attributes
  • Quality assurance through systematic or random sampling
  • Workflow routing to specialized processing paths
  • Batch management by controlling size and organization
  • Exception handling for special cases and errors

Quick Reference

Task | Method | Action | Key Settings
QA Sample | Random | Copy | Spawn Percentage: 0.10
Route by Type | Filter | Move | FilterBy: ContentType
Every Nth | EveryN | Copy | Step Size: 10
Split Large Batch | Filter | Move | Maximum Batch Size: 100
Exception Handling | Filter | Move | FilterBy: FlaggedItems
Remove Invalid | Filter | Delete | FilterBy: InvalidItems