Spawn Batch

From Grooper Wiki
Revision as of 11:30, 30 January 2026 by Dgreenwood (talk | contribs)

Overview

The Spawn Batch activity is a powerful utility for splitting, distributing, and managing Batches in Grooper workflows. It allows you to create one or more new Batches from a subset of folders in the current Batch based on various filtering criteria. This activity is essential for:

  • Quality assurance sampling: Extract a random or systematic sample for QA review
  • Document Type segregation: Split mixed Batches by Content Type or MIME type
  • Workflow routing: Route specific documents to different processing paths
  • Batch size management: Split large Batches into smaller, more manageable units
  • Exception handling: Separate flagged or invalid items for special processing

Key Features

  • Multiple selection methods: Filter, Every N, Random
  • Flexible filtering criteria: Content Type, MIME type, flag status, validity
  • Copy or move operations
  • Automatic Batch naming with customizable patterns
  • Maximum Batch size controls
  • Optional field value copying
  • Source Batch cleanup options

When to Use Spawn Batch

Ideal Scenarios

Use Spawn Batch when you need to:
  • Split a Batch by Document Type (invoices vs. purchase orders)
  • Extract a QA sample (10% random or every 10th document)
  • Route flagged items to a review process
  • Separate valid and invalid documents
  • Create smaller Batches from a large import
  • Filter by file type (PDFs only, images only, etc.)
  • Create parallel processing workflows
Don’t use Spawn Batch when:
  • You just need to process documents sequentially (use normal Batch processing)
  • You want to modify documents in place (use other activities)
  • You need to merge Batches (use different techniques)

Basic Configuration

Step 1: Add the Activity to Your Batch Process

  1. Open your Batch Process in Grooper Desktop
  2. Navigate to the step where you want to spawn Batches
  3. In the Activities panel, find Spawn Batch under the Utilities category
  4. Drag and drop Spawn Batch into your process step

Step 2: Essential Properties

Processing Level

What it does

Determines which folder hierarchy level to examine.

Default

1 (top-level folders/documents)

When to change
  • Use 0 to process all folders at all levels
  • Use 2+ to process subfolders (e.g., pages within documents)
Example Batch Structure
  • Batch Root (Level 0)
    • Document 1 (Level 1) – Processing Level = 1
      • Page 1 (Level 2) – Processing Level = 2
      • Page 2 (Level 2)
    • Document 2 (Level 1)
      • Page 1 (Level 2)
      • Page 2 (Level 2)
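
As a rough illustration of how Processing Level maps onto the folder hierarchy, the Python sketch below models the example structure above. It is not Grooper code: the `Folder` class and the `folders_at_level` function are hypothetical names introduced here, and only levels 1 and above are modeled (the special meaning of 0 is left out).

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch folder (not a Grooper type)."""
    name: str
    children: list["Folder"] = field(default_factory=list)


def folders_at_level(root: Folder, level: int) -> list[Folder]:
    """Return the folders exactly `level` steps below the Batch root."""
    if level < 1:
        raise ValueError("this sketch only models levels 1 and above")
    current = [root]
    for _ in range(level):
        # Descend one level: replace each folder with its children.
        current = [child for folder in current for child in folder.children]
    return current


batch = Folder("Batch Root", [
    Folder("Document 1", [Folder("Page 1"), Folder("Page 2")]),
    Folder("Document 2", [Folder("Page 1"), Folder("Page 2")]),
])

level_1 = [f.name for f in folders_at_level(batch, 1)]
level_2 = [f.name for f in folders_at_level(batch, 2)]
print(level_1)
print(level_2)
```

With this structure, level 1 yields the two documents and level 2 yields the four pages.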

Spawn Methods

Filter Method

Use when: You need to select documents based on specific attributes

Configuration Steps

  • Set Method = Filter
  • Choose your FilterBy criteria:
AllItems

Includes all folders at the processing level.

Use Case

Split entire Batch into fixed-size Batches

Settings
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
Result

Creates multiple 100-item Batches from source

MimeType

Filters by file attachment type.

Use Case

Separate PDF and Excel files from other attachment types

Settings
  1. FilterBy: MimeType
  2. Configure MimeTypeLexicon to include:
    • application/pdf
    • application/vnd.ms-excel
  3. Action: Move
Result

Only documents with PDF or Excel attachments are spawned

Mime Type Lexicon Configuration
  1. Expand Mime Type Lexicon property
  2. Click Add Entry
  3. Enter MIME types (one per line):
    • application/pdf – PDF documents
    • image/tiff – TIFF images
    • image/jpeg – JPEG images
    • application/vnd.openxmlformats-officedocument.wordprocessingml.document – Word .docx
ContentType

Filters by assigned Content Type.

Use Case

Separate invoices from purchase orders

Settings
  1. FilterBy: ContentType
  2. Included Content Types: select Invoice
  3. Action: Move
  4. Target Step: Invoice Processing / Import
Result

All invoices moved to invoice processing workflow

FlaggedItems

Selects documents that have been flagged.

Use Case

Route exceptions to review

Settings
  • FilterBy: FlaggedItems
  • Action: Move
  • Target Step: Exception Review / Review
  • Batch Name Prefix: Review-
Result

Flagged items moved to review Batch

InvalidItems

Selects documents with invalid index data.

Use Case

Separate invalid documents for correction

Settings
  • FilterBy: InvalidItems
  • Action: Move
  • Target Step: Data Correction / Manual Review
Result

Documents with validation errors moved to correction workflow

Invert Filter

The Invert Filter checkbox reverses your filter logic.

Example – Get Non-PDF Documents
  • FilterBy: MimeType
  • MimeTypeLexicon: application/pdf
  • Invert Filter: checked
Result
All non-PDF documents are selected
Example – Get Non-Flagged Items
  • FilterBy: FlaggedItems
  • Invert Filter: checked
Result
All unflagged documents are selected
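
The effect of Invert Filter can be pictured as negating the filter test for every document. The Python sketch below illustrates that logic only. It is not Grooper code, and `Doc`, `select`, and the sample documents are hypothetical names and data made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical document record (not a Grooper type)."""
    name: str
    mime_type: str
    flagged: bool = False


def select(docs, predicate, invert=False):
    """Keep documents that match the predicate, or the rest when inverted."""
    return [d for d in docs if predicate(d) != invert]


docs = [
    Doc("a.pdf", "application/pdf"),
    Doc("b.tif", "image/tiff", flagged=True),
    Doc("c.jpg", "image/jpeg"),
]

lexicon = {"application/pdf"}

# FilterBy: MimeType with Invert Filter checked -> non-PDF documents.
non_pdf = select(docs, lambda d: d.mime_type in lexicon, invert=True)

# FilterBy: FlaggedItems with Invert Filter checked -> unflagged documents.
unflagged = select(docs, lambda d: d.flagged, invert=True)

print([d.name for d in non_pdf])
print([d.name for d in unflagged])
```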

Every N Method

Use when: You need systematic sampling (every 10th, 25th, 100th document)

Configuration

  • Set Method = EveryN
  • Set Step Size: The interval for selection
Example – 10% Systematic Sample
  • Method: EveryN
  • Step Size: 10
  • Action: Copy
  • Create Batch: checked
  • Batch Name Prefix: QA-Sample-
Result
Every 10th document copied to individual QA Batches
Example – Quality Assurance Sample
Scenario: 1000-document Batch, need 50 documents for QA
  • Method: EveryN
  • Step Size: 20 (1000/50 = 20)
  • Action: Copy (preserve originals)
  • Create Batch: unchecked (group into single Batch)
Result
50 documents (every 20th) copied to one QA Batch

Create Batch Property

Controls whether to create individual Batches per item.

Setting | Result
Checked | One Batch per selected document
Unchecked | All selected documents in one Batch
Example – Individual Batches
  • Step Size: 5
  • Create Batch: checked
  • 15 documents in source
Result
3 Batches created (documents 5, 10, 15)
Example – Grouped Batch
  • Step Size: 5
  • Create Batch: unchecked
  • 15 documents in source
Result
1 Batch with 3 documents (5, 10, 15)
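
The Every N examples above can be sketched in a few lines of Python. This is an illustration of the selection arithmetic only, not Grooper code. The function names `every_n`, `group`, and `step_size_for` are hypothetical, and the sketch assumes documents are counted from 1, as in the "documents 5, 10, 15" example.

```python
def every_n(items, step_size):
    """Pick the Nth, 2Nth, 3Nth... item, counting from 1."""
    if step_size < 1:
        raise ValueError("step_size must be at least 1")
    return items[step_size - 1::step_size]


def group(selected, create_batch):
    """Model Create Batch: one Batch per item if checked, else one Batch."""
    if create_batch:
        return [[item] for item in selected]
    return [selected] if selected else []


def step_size_for(total_documents, sample_size):
    """Step Size needed to draw roughly sample_size documents."""
    return total_documents // sample_size


docs = list(range(1, 16))            # documents 1..15
selected = every_n(docs, 5)          # every 5th document
individual = group(selected, create_batch=True)
grouped = group(selected, create_batch=False)

qa_step = step_size_for(1000, 50)    # the 1000-document QA scenario
qa_sample = every_n(list(range(1, 1001)), qa_step)

print(selected, individual, grouped)
print(qa_step, len(qa_sample))
```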

Random Method

Use when: You need random sampling for QA or statistical analysis

Configuration

  • Set Method = Random
  • Set Spawn Percentage: Fraction of documents to select (0.0 to 1.0)
Example – 10% Random Sample
  • Method: Random
  • Spawn Percentage: 0.10 (10%)
  • Action: Copy
  • Batch Name Prefix: Random-QA-
Result
Approximately 10% of documents randomly selected
Example – Large Random Sample
Scenario: 5000 documents, need 500 for review
  • Method: Random
  • Spawn Percentage: 0.10 (500/5000)
  • Action: Move
  • Maximum Batch Size: 100
Result
Approximately 500 random documents, split into about 5 Batches of up to 100 each

Understanding Spawn Percentage

Percentage | Decimal | Example (100 docs)
5% | 0.05 | ~5 documents
10% | 0.10 | ~10 documents
25% | 0.25 | ~25 documents
50% | 0.50 | ~50 documents

Note: Random selection is probabilistic. Results may vary slightly from exact percentages.
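
To see why the count varies, the Python sketch below simulates one plausible reading of Spawn Percentage: each document is selected independently with that probability. This is an assumption made for illustration. The page does not state how Grooper draws its sample, and `random_select` is a hypothetical name, not a Grooper function.

```python
import random


def random_select(items, spawn_percentage, rng):
    """Select each item independently with probability spawn_percentage."""
    if not 0.0 <= spawn_percentage <= 1.0:
        raise ValueError("spawn_percentage must be between 0.0 and 1.0")
    # rng.random() returns a float in [0.0, 1.0).
    return [item for item in items if rng.random() < spawn_percentage]


rng = random.Random(42)              # seeded so repeated runs match
docs = list(range(1, 101))           # a 100-document Batch

# Repeat the 10% draw many times and look at the spread of sample sizes.
counts = [len(random_select(docs, 0.10, rng)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)

print("smallest sample:", min(counts))
print("average sample:", mean_count)
print("largest sample:", max(counts))
```

Under this assumption the average sample size sits close to 10, but individual runs can return noticeably fewer or more documents, which matters most for small Batches.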

Action Types

The Action property determines what happens to selected folders.

Copy

Effect

Duplicates folders to new Batch; originals remain in source Batch.

Use Case

QA Sampling (preserve originals)

Settings
  • Action: Copy
  • Method: Random
  • Spawn Percentage: 0.05
Result
About 5% of documents copied to QA Batch; all originals remain in the production Batch
When to use Copy
  • Quality assurance sampling
  • Creating backup Batches
  • Parallel processing workflows
  • Testing without modifying originals

Move

Effect

Transfers folders to new Batch; removes from source Batch.

Use Case

Route by Document Type

Settings
  • Action: Move
  • FilterBy: ContentType
  • Included Content Types: Invoice
  • Target Step: Invoice Processing / Classification
Result
Invoices moved to invoice workflow; other docs remain
When to use Move
  • Document Type routing
  • Exception handling
  • Workflow branching
  • Batch size management

Delete

Effect

Removes folders from source Batch; no new Batch created.

Use Case

Remove Invalid Items

Settings
  • Action: Delete
  • FilterBy: InvalidItems
Result
Invalid documents removed; no new Batch created


Batch Naming

Batch Name Prefix

A string prepended to all spawned Batch names.

Examples
  • "QA-" → QA-Invoice Batch 2024
  • "Review-" → Review-Invoice Batch 2024
  • "Invoices-" → Invoices-0001
  • "" (empty) → Uses default naming

Batch Name Suffix

Controls how the Batch name is completed.

None

Uses default naming with conflict resolution.

Settings
  • Batch Name Prefix: QA-
  • Batch Name Suffix: None
Result

"QA-[generated name]" or "QA-[generated name] (1)" if conflict

SourceBatchName

Appends original Batch name to prefix.

Source Batch

"Invoice Batch 2024-01-15"

Settings
  • Batch Name Prefix: QA-
  • Batch Name Suffix: SourceBatchName
Result

"QA-Invoice Batch 2024-01-15"

Use when
  • Traceability is important
  • You want to preserve Batch lineage
  • Auditing spawned Batch origins

NumberedSuffix

Creates sequential numbered Batches with zero-padding.

Settings
  • Batch Name Prefix: Invoice-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
Results
  • First Batch: Invoice-0001
  • Second Batch: Invoice-0002
  • Third Batch: Invoice-0003
  • 10th Batch: Invoice-0010
  • 100th Batch: Invoice-0100
Zero Padding Examples
Zero Padding | Format | Example
0 | No padding | Batch-1, Batch-2
2 | 2 digits | Batch-01, Batch-02
3 | 3 digits | Batch-001, Batch-002
4 | 4 digits | Batch-0001, Batch-0002
6 | 6 digits | Batch-000001, Batch-000002
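
The naming patterns above amount to simple string building. The Python sketch below reproduces the SourceBatchName and NumberedSuffix examples. It is an illustration only, and the function names `source_batch_name` and `numbered_name` are hypothetical, not Grooper properties or APIs.

```python
def source_batch_name(prefix, source_name):
    """Model the SourceBatchName suffix: prefix followed by the source name."""
    return f"{prefix}{source_name}"


def numbered_name(prefix, sequence, zero_padding):
    """Model the NumberedSuffix: prefix followed by a zero-padded counter."""
    # str.zfill pads with leading zeros up to the requested width;
    # a width of 0 leaves the number unchanged.
    return f"{prefix}{str(sequence).zfill(zero_padding)}"


traceable = source_batch_name("QA-", "Invoice Batch 2024-01-15")
first = numbered_name("Invoice-", 1, 4)
hundredth = numbered_name("Invoice-", 100, 4)
unpadded = numbered_name("Batch-", 2, 0)

print(traceable)
print(first, hundredth, unpadded)
```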

Advanced Features

Maximum Batch Size

Limits the number of folders in each spawned Batch. When the limit is reached, a new Batch is created.

Example – Split Large Import
Scenario: 1000 documents imported, want Batches of 100
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
  • Batch Name Prefix: Import-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
Result
10 Batches created (Import-0001 through Import-0010)
Example – Balanced QA Samples
  • Method: Random
  • Spawn Percentage: 0.20 (20%)
  • Maximum Batch Size: 25
  • Action: Copy
Result
Random 20% split into Batches of max 25 documents each

Set to 0: No limit (all selected documents in one Batch)
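
The splitting behavior can be sketched as fixed-size chunking. The Python below is an illustration, not Grooper code. `split_into_batches` is a hypothetical name, and the sketch assumes that any remainder ends up in a final, smaller Batch, which the text above does not state explicitly.

```python
def split_into_batches(items, max_size):
    """Split items into consecutive groups of at most max_size each."""
    if max_size < 0:
        raise ValueError("max_size cannot be negative")
    if max_size == 0:
        # 0 means no limit: everything goes into a single Batch.
        return [list(items)] if items else []
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


docs = list(range(1, 1001))                  # 1000 imported documents
batches = split_into_batches(docs, 100)      # Maximum Batch Size: 100
unlimited = split_into_batches(docs, 0)      # Maximum Batch Size: 0

# An uneven count leaves a smaller final Batch (assumed behavior).
uneven = split_into_batches(list(range(1, 1051)), 100)

print(len(batches), len(batches[0]))
print(len(unlimited))
print(len(uneven), len(uneven[-1]))
```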

Starting Step

Routes spawned Batches to a specific Batch Process Step. The examples on this page list this setting as Target Step.

Example – Route to Different Process
  • Target Step: Invoice Processing / Classification
  • Start Paused: unchecked
Result
New Batch begins processing at Classification step
Example – Manual Review
  • Target Step: Review Process / Manual Review
  • Start Paused: checked
Result
New Batch created in Paused state for manual intervention

Leave blank: Creates Batch without launching into a process (test Batch)

Start Paused

When checked, spawned Batches are created in a paused state.

Use cases
  • Manual review before processing
  • Configuration or setup required
  • Quality checks on spawned content
  • Scheduled Batch processing

Copy Batch Fields

When checked, copies index field values from source Batch to spawned Batch.

Example – Preserve Metadata
Source Batch Fields
  • Customer: Acme Corp
  • Department: Accounting
  • Period: Q1 2024
Settings
  • Copy Batch Fields: checked
Result
New Batch inherits all field values
Requirements
  • Source Batch must have a Content Type assigned
  • Target Batch Process must have a Content Type assigned
  • Both Content Types must match
Common Error

"Copy Batch Fields is set to true, but the source and target Batch Process Content Types do not match."

Solution

Ensure both Batch Processes use the same Content Type or leave Copy Batch Fields unchecked.

Source Batch Disposition

Controls what happens to the source Batch after spawning.

None (Default)

Behavior

Source Batch remains unchanged.

Use when
  • Using Copy action (originals should remain)
  • Source Batch needed for other processing
  • Auditing or archival purposes

DeleteIfEmpty

Behavior

Deletes source Batch only if all folders were moved/deleted.

Use when
  • Using Move action with intent to replace source
  • All documents routed to new Batches
  • Source Batch should only exist if documents remain

DeleteAlways

Behavior

Always deletes source Batch after spawning.

Use when
  • Source Batch is temporary/intermediate
  • All documents have been routed
  • Clean up is required regardless of remaining items


Common Use Cases

Use Case 1: Quality Assurance Sampling

Scenario: Extract 10% random sample from production Batch for QA review

Configuration
  • Method: Random
  • Spawn Percentage: 0.10
  • Action: Copy
  • Batch Name Prefix: QA-
  • Batch Name Suffix: SourceBatchName
  • Target Step: QA Review / Review
  • Start Paused: unchecked
  • Source Batch Disposition: None
Result
  • ~10% of documents copied to QA Batch
  • Original Batch continues normal processing
  • QA Batch routed to review process
  • Traceable Batch naming

Use Case 2: Separate by Document Type

Scenario: Split mixed Batch into invoices and purchase orders

Step 1 – Extract Invoices
  • Method: Filter
  • FilterBy: ContentType
  • Included Content Types: Invoice
  • Action: Move
  • Target Step: Invoice Process/Import
  • Batch Name Prefix: INV-
  • Source Batch Disposition: None
Step 2 – Extract Purchase Orders
  • Method: Filter
  • FilterBy: ContentType
  • Included Content Types: PO
  • Action: Move
  • Target Step: PO Process/Import
  • Batch Name Prefix: PO-
  • Source Batch Disposition: DeleteIfEmpty
Result
  • Invoices → Invoice Processing workflow
  • Purchase Orders → PO Processing workflow
  • Source Batch deleted if empty

Use Case 3: Exception Handling

Scenario: Route flagged items to manual review

Configuration
  • Method: Filter
  • FilterBy: FlaggedItems
  • Action: Move
  • Batch Name Prefix: Exceptions-
  • Target Step: Exception Review / Manual Review
  • Start Paused: checked
  • Source Batch Disposition: None
Result
  • Flagged documents moved to exception Batch
  • Exception Batch paused for manual intervention
  • Remaining documents continue processing

Use Case 4: Split Large Batch

Scenario: Split 5000-document Batch into 100-document Batches

Configuration
  • Method: Filter
  • FilterBy: AllItems
  • Action: Move
  • Maximum Batch Size: 100
  • Batch Name Prefix: Batch-
  • Batch Name Suffix: NumberedSuffix
  • Batch Name Suffix Zero Padding: 4
  • Target Step: [Same Process/Import]
  • Source Batch Disposition: DeleteIfEmpty
Result
  • 50 Batches created (Batch-0001 through Batch-0050)
  • Each Batch contains 100 documents
  • Source Batch deleted when empty
  • All Batches continue in same workflow

Use Case 5: Filter by File Type

Scenario: Process only PDF documents and, optionally, delete everything else

Configuration
  • Method: Filter
  • FilterBy: MimeType
  • MimeTypeLexicon:
    • application/pdf
  • Action: Move
  • Target Step: PDF Processing/Import
  • Source Batch Disposition: None
Follow-up (Optional – Remove non-PDFs; run after the Move above, when only non-PDFs remain in the source)
  • Method: Filter
  • FilterBy: AllItems
  • Action: Delete
Result
  • PDFs moved to PDF processing workflow
  • Non-PDF documents deleted from source

Use Case 6: Systematic Validation Sample

Scenario: Every 25th document goes to validation for accuracy tracking

Configuration
  • Method: EveryN
  • Step Size: 25
  • Action: Copy
  • Create Batch: unchecked
  • Batch Name Prefix: Validation-
  • Batch Name Suffix: SourceBatchName
  • Target Step: Validation/Review
  • Copy Batch Fields: checked
  • Source Batch Disposition: None
Result
  • 4% systematic sample (1/25)
  • All samples in single validation Batch
  • Batch metadata preserved
  • Original Batch continues processing

Troubleshooting

Problem: No Batches Created

Symptoms

Activity completes but no new Batches appear.

Possible Causes & Solutions
  • No matching folders
    • Solution: Check filter criteria
      • Verify FilterBy settings
      • Check MimeTypeLexicon entries
      • Confirm ContentTypes are correct
      • Review Processing Level
  • Action set to Delete
    • Solution: Change Action to Copy or Move if you need new Batches
  • Spawn Percentage too low
    • Method: Random; Spawn Percentage: 0.01; Batch with 10 documents ⇒ 0.1 documents expected, so most runs select nothing
    • Solution: Increase percentage or use different method

Problem: “No folders found to process”

Log Message

"No folders found to process"

Diagnosis Steps
  • Check Processing Level:
    • Processing Level: 1
    • Batch Structure: All folders at level 2
    • Solution: Change Processing Level to 2
  • Verify filter criteria:
    • FilterBy: ContentType
    • Included Content Types: [empty]
    • Solution: Add Content Types to list
  • Check Invert Filter:
    • FilterBy: FlaggedItems
    • Invert Filter: checked
    • No unflagged items exist
    • Solution: Uncheck Invert Filter

Problem: Batch Process Not Published

Error

"Batch Process 'ProcessName' is not published."

Cause

Target Step references an unpublished Batch Process.

Solution
  1. Open the target Batch Process
  2. Click Publish in the ribbon
  3. Confirm publication
  4. Re-run the Spawn Batch activity

Problem: Content Type Mismatch

Error

"Copy Batch Fields is set to true, but the source and target Batch Process Content Types do not match."

Cause

Attempting to copy fields between incompatible Content Types.

Solutions
  • Option 1: Match Content Types
    1. Source process Content Type: Invoice
    2. Target process Content Type: Change to Invoice
    3. Re-run activity
  • Option 2: Disable field copying
    1. Copy Batch Fields: uncheck

Problem: Batch Naming Conflicts

Symptom

Batches named "Batch-0001 (1)", "Batch-0001 (2)".

Cause

Batch names already exist.

Solutions
  • Use SourceBatchName suffix:
    • Batch Name Suffix: SourceBatchName
    • Result: Unique names based on source
  • Include a date in the prefix:
    • Batch Name Prefix: Batch-2024-01-15-
    • Result: Date-specific names
  • Clean up existing Batches:
    • Delete or complete old Batches with conflicting names

Problem: Spawned Batches Not Processing

Symptom

Batches created but remain in "Ready" status.

Possible Causes & Solutions
  • Start Paused is checked
    • Solution: Manually resume Batches or uncheck Start Paused
  • No Target Step specified
    • Solution: Select appropriate Batch Process Step
  • Service not running
    • Solution: Check Grooper Service status; start if stopped

Problem: Source Batch Not Deleted

Symptom

Disposition set to DeleteAlways or DeleteIfEmpty, but the source Batch remains.

Possible Causes & Solutions
  • Test Batch
    • Test Batches are typically not deleted, by design
    • Solution: This is expected behavior
  • Folders still remain
    • Disposition: DeleteIfEmpty
    • Some folders don't match filter
    • Solution: Use DeleteAlways or adjust filter
  • Batch locked
    • Solution: Unlock Batch manually and re-run

Best Practices

1. Test with Test Batches First

Always test new Spawn Batch configurations with test Batches

Testing Workflow
  1. Create small test Batch (5–10 documents)
  2. Configure Spawn Batch activity
  3. Run test Batch through process
  4. Verify:
    • Correct documents selected
    • Batches named properly
    • Target workflow correct
    • Source Batch handled correctly
  5. Adjust configuration as needed
  6. Move to production

2. Use Descriptive Batch Name Prefixes

Clear prefixes make Batch management easier

Good Examples
  • "QA-Sample-" — clearly indicates QA Batches
  • "Invoice-" — Document Type identification
  • "Exceptions-" — special handling indicator
  • "Split-" — indicates split from larger Batch
Poor Examples
  • "B-" — not descriptive
  • "" (empty) — no identification
  • "Test123-" — unclear purpose

3. Choose Appropriate Action Type

Match action to use case

Use Case | Recommended Action
QA Sampling | Copy
Document Routing | Move
Cleanup | Delete
Parallel Processing | Copy
Exception Handling | Move

4. Set Reasonable Maximum Batch Size

Balance performance and manageability

Guidelines
  • Small Batches (50–100): Interactive Review processes
  • Medium Batches (100–500): Standard processing
  • Large Batches (500–1000): Bulk processing
  • No limit (0): Use cautiously; can create very large Batches
Example – Balanced Approach
  • Large import: 5000 documents
  • Maximum Batch Size: 250
Result
20 manageable Batches

5. Manage Source Batch Disposition Carefully

Choose disposition based on workflow needs

Decision Tree
  • Using Copy action?
    • Yes → Disposition: None
    • No (Move/Delete) →
      • Need source Batch for reference?
        • Yes → Disposition: None
        • No →
          • Some documents might remain?
            • Yes → Disposition: DeleteIfEmpty
            • No (all documents processed) → Disposition: DeleteAlways
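
The decision tree above can also be written as a small function, which may be handy when documenting or reviewing configurations. This Python sketch only encodes the tree as written. `recommend_disposition` and its parameter names are hypothetical and are not part of Grooper.

```python
def recommend_disposition(action, need_source_for_reference,
                          documents_may_remain):
    """Walk the decision tree and return the suggested disposition."""
    if action not in ("Copy", "Move", "Delete"):
        raise ValueError(f"unknown action: {action}")
    if action == "Copy":
        # Originals stay in the source Batch, so leave it alone.
        return "None"
    if need_source_for_reference:
        return "None"
    if documents_may_remain:
        return "DeleteIfEmpty"
    return "DeleteAlways"


qa_sampling = recommend_disposition("Copy", False, False)
routing = recommend_disposition("Move", False, True)
full_split = recommend_disposition("Move", False, False)

print(qa_sampling, routing, full_split)
```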

6. Use Copy Batch Fields Wisely

Only when Content Types match

Checklist
  • [ ] Source Batch has Content Type
  • [ ] Target process has Content Type
  • [ ] Content Types match exactly
  • [ ] Fields are relevant to spawned Batch

If any condition is false → Leave Copy Batch Fields unchecked.
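
The first three checklist items reduce to one comparison, sketched below in Python. This is an illustration rather than Grooper's actual validation logic. `can_copy_batch_fields` is a hypothetical name, and a missing Content Type is represented here as `None`. The fourth item (whether the fields are relevant) remains a judgment call and is not modeled.

```python
def can_copy_batch_fields(source_content_type, target_content_type):
    """Return True only if both Content Types are set and identical."""
    if source_content_type is None or target_content_type is None:
        return False
    return source_content_type == target_content_type


matching = can_copy_batch_fields("Invoice", "Invoice")
mismatched = can_copy_batch_fields("Invoice", "PO")
missing = can_copy_batch_fields("Invoice", None)

print(matching, mismatched, missing)
```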

7. Leverage Invert Filter

Powerful for exclusion scenarios

Scenario 1 – Process Non-PDFs
  • FilterBy: MimeType
  • MimeTypeLexicon: application/pdf
  • Invert Filter: checked
Result
All non-PDF documents selected
Scenario 2 – Process Valid Items
  • FilterBy: InvalidItems
  • Invert Filter: checked
Result
All valid documents selected

8. Monitor Activity Statistics

Review stats after processing

Available Statistics
  • Folders Moved: Count of folders moved to spawned Batches
  • Folders Copied: Count of folders copied to spawned Batches
Use stats to
  • Verify expected document counts
  • Track QA sample sizes
  • Audit document routing
  • Troubleshoot issues

9. Plan for Batch Process Dependencies

Ensure target processes are ready

Pre-flight Checklist
  • [ ] Target Batch Process is published
  • [ ] Target step exists in published version
  • [ ] Content Types are configured
  • [ ] Downstream activities are configured
  • [ ] Service has permissions to target folder

10. Document Your Configuration

Maintain configuration documentation

Template
  • Activity: Spawn Batch
  • Purpose: [Describe why this spawn is happening]
  • Method: [Filter/EveryN/Random]
  • Filter Criteria: [Specific settings]
  • Action: [Copy/Move/Delete]
  • Target: [Destination process/step]
  • Naming Pattern: [Prefix and suffix pattern]
  • Special Notes: [Any caveats or special handling]
Example
  • Activity: Spawn Batch
  • Purpose: QA Sampling for Invoice Processing
  • Method: Random
  • Filter Criteria: 10% random sample
  • Action: Copy (preserve originals)
  • Target: QA Review Process / Manual Review Step
  • Naming Pattern: "QA-[SourceBatchName]"
  • Special Notes: Runs weekly, samples archived after 30 days

Summary

The Spawn Batch activity is an essential tool for:

  • Splitting Batches by Document Type, status, or attributes
  • Quality assurance through systematic or random sampling
  • Workflow routing to specialized processing paths
  • Batch management by controlling size and organization
  • Exception handling for special cases and errors

Quick Reference

Task | Method | Action | Key Settings
QA Sample | Random | Copy | Spawn Percentage: 0.10
Route by Type | Filter | Move | FilterBy: ContentType
Every Nth | EveryN | Copy | Step Size: 10
Split Large Batch | Filter | Move | Maximum Batch Size: 100
Exception Handling | Filter | Move | FilterBy: FlaggedItems
Remove Invalid | Filter | Delete | FilterBy: InvalidItems