2023.1:Batch Process (Node Type)

From Grooper Wiki

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the set of step-by-step processing instructions given to a Batch. Each step runs either a "Code Activity" or a Review activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
  • A Batch Process is often referred to as simply a "process".

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). They contain a Batch with the example document(s) discussed in this tutorial, as well as a Project configured according to its instructions.

Glossary

Activity Processing: Activity Processing is the execution of a sequence of configured tasks which are performed within a Batch Process to transform raw data from documents into structured and actionable information. Tasks are defined by Grooper Activities, configured to perform document classification, extraction, or data enhancement.

Activity: Grooper Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page. In a Batch Process, each Batch Process Step executes a single Activity (determined by the step's "Activity" property).

  • Batch Process Steps are frequently referred to by the name of their configured Activity followed by the word "step". For example: "Classify step".

Batch Folder: The Batch Folder is an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents. Batch Folder nodes serve two purposes in a Batch. (1) Primarily, they represent "documents" in Grooper. (2) They can also serve more generally as folders, holding other Batch Folders and/or Batch Page nodes as children.

  • Batch Folders are frequently referred to simply as "documents" or "folders" depending on how they are used in the Batch.

Batch Page: Batch Page nodes represent individual pages within a Batch. Batch Pages are created in one of two ways: (1) when images are scanned into a Batch using the Scan Viewer, or (2) when split from a PDF or TIFF file using the Split Pages activity.

  • Batch Pages are frequently referred to simply as "pages".

Batch Process Step: Batch Process Steps are specific actions within a Batch Process sequence. Each Batch Process Step performs an "Activity" specific to some document processing task. Each Activity is either a "Code Activity" or a "Review" activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Process Steps are frequently referred to as simply "steps".
  • Because a single Batch Process Step executes a single Activity configuration, they are often referred to by their referenced Activity as well. For example, a "Recognize step".

Batch Process: Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the set of step-by-step processing instructions given to a Batch. Each step runs either a "Code Activity" or a Review activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
  • A Batch Process is often referred to as simply a "process".

Batch: Batch nodes are fundamental in Grooper's architecture. They are containers of documents that are moved through workflow mechanisms called Batch Processes. Documents and their pages are represented in Batches by a hierarchy of Batch Folders and Batch Pages.

Classification: Classification is the process of identifying and organizing documents into categorical types based on their content or layout. Classification is key for efficient document management and data extraction workflows. Grooper has different methods for classifying documents. These include methods that use machine learning and text pattern recognition. In a Grooper Batch Process, the Classify Activity will assign a Content Type to a Batch Folder.

Classify: Classify is an Activity that "classifies" Batch Folders in a Batch by assigning them a Document Type.

  • Classification is key to Grooper's document processing. It affects how data is extracted from a document (during the Extract activity) and how Behaviors are applied.
  • Classification logic is controlled by a Content Model's "Classify Method". These methods include using text patterns, previously trained document examples, and Label Sets to identify documents.

Code Expressions: Code Expressions (not to be confused with regular expressions) are snippets of VB.NET code that expand Grooper's core functionality.

Content Model: Content Model nodes define a classification taxonomy for document sets in Grooper. This taxonomy is defined by the Content Categories and Document Types they contain. Content Models serve as the root of a Content Type hierarchy, which defines Data Element inheritance and Behavior inheritance. Content Models are crucial for organizing documents for data extraction and more.

Content Type: Content Types are a class of node types used to classify Batch Folders. They represent categories of documents (Content Models and Content Categories) or distinct types of documents (Document Types). Content Types serve an important role in defining Data Elements and Behaviors that apply to a document.

Data Element: Data Elements are a class of node types used to collect data from a document. These include: Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.

Document Viewer: The Grooper Document Viewer is the portal to your documents. It is the UI that allows you to see a Batch Folder's (or a Batch Page's) image, text content, and more.

Execute: Execute is an Activity that runs one or more specified object commands. This gives access to a variety of Grooper commands in a Batch Process for which there is no Activity, such as the "Sort Children" command for Batch Folders or the "Expand Attachments" command for email attachments.

Export Behavior: An Export Behavior defines the parameters for exporting classified Batch Folder content from Grooper to other systems. This includes where they are exported to (what content management system, file system, database, etc.), what content is exported (attached files, images, and/or data), how it is formatted (PDF, CSV, XML, etc.), folder pathing, file naming, and data mappings (for Data Export and CMIS Export).

Export: Export is an Activity that transfers documents and extracted information to external file systems and content management systems, completing the data processing workflow.

Expressions Cookbook: The "Expressions Cookbook" is a reference list for commonly used Code Expressions in Grooper.

Expressions: Expressions (not to be confused with regular expressions) are snippets of VB.NET code that expand Grooper's core functionality.

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

Image Processing: Image Processing is an Activity that enhances Batch Page images and optimizes them for better OCR text recognition and data extraction results.

Node Tree: The Node Tree is the hierarchical list of Grooper node objects found in the left panel in the Design Page. It is the basis for navigation and creation in the Design Page.

Processing Queue: Processing Queues help automate "machine performed tasks" (that is, Code Activity tasks performed by Machines and their Activity Processing services). Processing Queues are assigned to Batch Process Steps to distribute tasks, control the maximum processing rate, and set the "concurrency mode" (specifying if and how parallelism can occur across one or more servers).

  • Processing Queues are used to dedicate Activity Processing services with a capped number of processing threads to resource-intensive activities, such as Recognize. That way, these compute-hungry tasks won't gobble up all available system resources.
  • Processing Queues are also used to manage activities, such as Render, which can only have one activity instance running per machine. (This is done by changing the queue's Concurrency Mode from "Maximum" to "Per Machine".)
  • Processing Queues are also used to throttle Export tasks in scenarios where the export destination can only accept one document at a time.

Project: Projects are the primary containers for configuration nodes within Grooper. The Project is where various processing objects such as Content Models, Batch Processes, and profile objects are stored. This makes resources easier to manage, easier to save, and simplifies how node references are made in a Grooper Repository.

Recognize: Recognize is an Activity that obtains machine-readable text from Batch Pages and Batch Folders. When properly configured with an OCR Profile, Recognize will selectively perform OCR for images and native-text extraction for digital text in PDFs. Recognize can also reference an IP Profile to collect "layout data" like lines, checkboxes, and barcodes. Other Activities then use this machine-readable text and layout data for document analysis and data extraction.

Render: Render is an Activity that converts files of various formats to PDF. It does this by digitally printing the file to PDF using the Grooper Render Printer. This normalizes electronic document content from file formats Grooper cannot read natively to PDF (which it can read natively), allowing Grooper to extract the text via the Recognize Activity.

Review Queue: Review Queues help organize and filter human-performed Review activity tasks. User groups are assigned to each Review Queue, which is then set either on a Batch Process or a Review step. A user's membership in Review Queues affects how Batches are distributed in the Batches page and how Review tasks are distributed in the Tasks page.

Review: Review is an Activity that allows user-attended review of Grooper's results. This allows human operators to validate processed Batch Page and Batch Folder content using specialized user interfaces called "Viewers". Different kinds of Viewers assist users in reviewing Grooper's image processing, document classification, and data extraction, and in operating document scanners.

Root: The Grooper Root node is the topmost element of the Grooper Repository. All other nodes in a Grooper Repository are its children/descendants. The Grooper Root also stores several settings that apply to the Grooper Repository, including the license serial number or license service URL and Repository Options.

Scope: The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.

Service: Grooper Services are various executable programs that run as a Windows Service to facilitate Grooper processing. Service instances are installed, configured, started and stopped using Grooper Command Console (or in older Grooper versions, Grooper Config).

Split Pages: Multi-page PDF and TIF files come into Grooper as files attached to single Batch Folders. Split Pages is an Activity that creates child Batch Pages for each page in the PDF or TIF. This allows Grooper to process and handle these pages as individual objects.

Split: Split is a Collation Provider option for Data Type extractors. Split separates a data instance at each match returned by the Data Type. The results are used as anchor points to "split" text into one or more smaller parts.

Test Batch: "Test Batch" is a specialized Import Provider designed to facilitate the import of content from an existing Batch in the test environment. This provider is most commonly used for testing, development, and validation scenarios, and is not intended for production use.

  • Looking for information on "production" vs "test" Batches in Grooper? See here.

Visual: "Visual" is a Classify Method that uses image analysis instead of text data to determine the Document Type assigned to a Batch Folder during classification. Instead of using text-based extractors, an "Extract Features" IP Command in an IP Profile is used to collect image-based data from a Batch Folder's image(s). This image-based data is compared against that of previously trained document examples of each Document Type to classify the Batch Folder.

About

Batch Processes are highly configurable and reusable any time a new Batch is created. They are comprised of child Batch Process Steps which reference different Grooper Activities to perform document processing tasks.

A Batch Process defines a repeatable sequence of steps which achieve a specific information processing objective in Grooper.

Grooper’s goal is to automate the process of Acquiring, Conditioning, Organizing, Collecting and Delivering data from documents. There are many objects in Grooper that facilitate these "Five Phases" of document data acquisition, but it is the Batch Process that is responsible for determining the flow of documents and accomplishing automation within a Grooper system. A Batch Process acts as the assembly line that takes raw documents and converts them into deliverable data.

A Batch Process object has little meaningful configuration of its own and consequently does nothing by itself. Instead, it acts as a container for one or more Batch Process Steps, which are configured to execute specific Activities. Activities may represent automated system tasks, or human-attended tasks which require operator interaction. Collectively, these steps represent a workflow process through which Batches of a particular class will travel. While most of the configuration items on Batch Process Steps are specific to their function, all Batch Process Steps may have either a Processing Queue or a Review Queue assigned, so that Grooper knows which services or which individuals will process or review that step.

Upon completion of this configuration, you publish the Batch Process which places a read-only copy of the Batch Process into the "Processes" branch of the Grooper "Node Tree". Doing so exposes it to be assigned to Batches created in "Production". This, in turn, allows the tasks of the Batch Process Steps to be submitted to their configured queues, either Processing Queue or Review Queue. Batch Processes automatically submit tasks for their active step, which get picked up by either an Activity Processing Service or a human reviewer, depending on task type. If a Batch Process does not have any attended activities, it generally processes each step sequentially in a completely unattended fashion.

Once a Batch Process has been published, all new Batches will use the current published version. When changes are made to a Batch Process and a new version is published, the changes will apply to new Batches created using that Batch Process but will not impact existing Batches already in progress. One can, however, manually apply the latest published version of a Batch Process to an existing Batch by pausing and updating said Batch.
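
The versioning behavior described above can be pictured with a small model. This is a sketch only, not Grooper's implementation: the `Repository` class, its methods, and the "Invoices" process name are hypothetical, and a published process is assumed to behave like an independent snapshot of the working copy.

```python
import copy


class Repository:
    """Hypothetical stand-in for the published 'Processes' branch."""

    def __init__(self):
        self.published = {}  # process name -> snapshot of the process

    def publish(self, name, working_copy):
        # Publishing stores a separate copy; republishing replaces it.
        self.published[name] = copy.deepcopy(working_copy)

    def unpublish(self, name):
        self.published.pop(name, None)

    def create_batch(self, name):
        if name not in self.published:
            raise LookupError(f"{name!r} is not published")
        # A new Batch is bound to whatever version is published right now.
        return {"process": self.published[name]}

    def update_batch(self, batch, name):
        # Manual step: re-bind an existing Batch to the latest version.
        batch["process"] = self.published[name]


repo = Repository()
working = {"steps": ["Split Pages"]}
repo.publish("Invoices", working)
batch_a = repo.create_batch("Invoices")

working["steps"].append("Recognize")  # edit the working copy
repo.publish("Invoices", working)     # publish a new version
batch_b = repo.create_batch("Invoices")

print(batch_a["process"]["steps"])  # ['Split Pages']
print(batch_b["process"]["steps"])  # ['Split Pages', 'Recognize']
```

In this model, `batch_a` keeps the first snapshot until `update_batch` is called on it, which mirrors pausing and updating an in-progress Batch.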

It is also possible to un-publish a published Batch Process, thus making it unavailable to newly created production Batches.

Batch Process Properties

There are a few properties that can be configured on a Batch Process. These properties are rarely configured, given their limited use, but are worth understanding.

Content Type

This property is a drop-down list of all available "Content Types" found within the parent Project, from which one can be chosen. When set, the root Batch Folder of the Batch will be classified as that type. Consequently, the "Data Elements" of that "Content Type" will be displayed in Review.

Review Queue

This property is a drop-down menu of all available Review Queues of which one can be chosen. This will associate any Review tasks of a Batch that uses this Batch Process to the respective Review Queue. If no Review Queue is selected, a Batch associated with this Batch Process will belong to no queue and be visible to anyone on the "Batch" page.

Priority

This property is an Int32 value. It is inversely related to the priority given to tasks submitted by this Batch Process: the lower the number, the higher the priority.

For example, setting this property to 1 would give tasks submitted for a Batch using this Batch Process a higher priority than the default of 3. To understand what this means in practice, let's take the example further. Assume the following:

  • Server where Grooper is installed has a single Activity Processing Service
    • This Activity Processing Service is set to use 10 CPU threads
  • "Batch A" is currently in production using "Batch Process A" which is set to a Priority of 3
    • "Batch A" has 20 tasks related to the Recognize Activity
    • 10 of the 20 Recognize tasks are being worked by the 10 available CPU threads
  • As "Batch A" is being processed, "Batch B", which uses "Batch Process B" with a Priority of 1, submits 10 Split Pages tasks

"Batch A's" 10 Recognize tasks will complete. However, because "Batch B" submitted tasks at a higher priority than "Batch A", its 10 Split Pages tasks will get picked up and completed before the 10 remaining Recognize tasks associated with "Batch A" get processed.
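
The scenario above can be replayed with a small simulation. This is an illustrative model only, not Grooper's scheduler: the `simulate` function and the task names are hypothetical, and every task is assumed to take exactly one processing round.

```python
import heapq
from itertools import count


def simulate(submissions, threads):
    """Return task names in the order they are picked up.

    submissions: list of (submit_round, priority, name). A lower priority
    number is more urgent. Each task is assumed to take one round.
    """
    pending = sorted(submissions, key=lambda t: t[0])  # by submit round
    queue, order, seq = [], [], count()
    round_no, i = 0, 0
    while i < len(pending) or queue:
        # Enqueue everything submitted up to and including this round.
        while i < len(pending) and pending[i][0] <= round_no:
            _, priority, name = pending[i]
            # seq keeps submission order among tasks of equal priority.
            heapq.heappush(queue, (priority, next(seq), name))
            i += 1
        # Free threads take the most urgent tasks first.
        for _ in range(min(threads, len(queue))):
            order.append(heapq.heappop(queue)[2])
        round_no += 1
    return order


# Batch A: 20 Recognize tasks at Priority 3, submitted in round 0.
# Batch B: 10 Split Pages tasks at Priority 1, submitted while A's
# first 10 tasks are being worked.
tasks = [(0, 3, f"A-Recognize-{n}") for n in range(20)]
tasks += [(1, 1, f"B-SplitPages-{n}") for n in range(10)]
order = simulate(tasks, threads=10)
```

Under these assumptions, `order` holds ten "A" tasks, then all ten "B" tasks, then the remaining ten "A" tasks. Tasks already picked up are not interrupted; only the waiting tasks are reordered.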

Batch Process Steps

Batch Process Steps are objects within a Batch Process. Each is assigned an Activity, which is a type of processing to be executed on all or a portion of a Batch. Activities generally fall into one of two categories:

  • Attended
  • Unattended

Attended activities, such as Review, are completed by human reviewers and are assigned via Review Queues. Unattended activities, such as Image Processing, Recognize, and Classify, are completed by Activity Processing Services and are assigned via Processing Queues. Batch Process Steps also allow for unit testing of their configured Activity on the Design page via their respective “Activity Tester” tab.

As a Batch Process progresses through its individual steps, each step creates a series of tasks that are submitted to either the designated Processing Queue or Review Queue. Once the Activity Processing Services or human reviewers pick up and complete all tasks from their respective queues for an individual step, the Batch Process moves to the next step. Batch Processes will not submit tasks for a step before all tasks from the previous step have been completed.

Batch Process Steps are added to Batch Processes in a top-to-bottom linear fashion, although they may be re-ordered at any time. As such, the execution of these steps is also linear, from top to bottom. The exceptions are steps configured with a "Should Submit Expression" or a "Next Step Expression". A Should Submit Expression determines whether the Activity of the step should be executed for a given item. A Next Step Expression determines, upon completion, what the next step in the Batch Process should be. Together, these expressions can send the contents of the Batch being processed to any Batch Process Step in any (published) Batch Process, including entirely different Batch Processes, and are powerful tools for configuring workflows for exception queues, requesting additional review, and other functions.

Batch Process Step Properties

Activity

This property is a drop-down menu of all available Activities in Grooper, and is the main property to consider on a Batch Process Step. Once the Activity property is set, the properties related to the set Activity are exposed on the Batch Process Step. Every Activity has its own property configurations to consider, therefore please refer to the individual articles related to each activity type for more information on their setup.

Scope

This property is a drop-down menu of the processing scopes available for the Activity of a Batch Process Step. Scope is an important topic that needs more coverage than this article can give it. Please visit the Scope article for a full explanation.

Queue Name

For attended activities, this property is a drop-down menu which can be set to point at a specific Review Queue. This property will override the designated Review Queue of its parent Batch Process.

For unattended activities, this property is a drop-down menu which can be set to point at a specific Processing Queue. Please visit the Processing Queue article for more information on why this property might get used.

Activate Mode

This property is a drop-down menu of several modes of which one can be chosen. The submission of activity tasks related to a Batch Process Step is done when a Batch Process "exits" one step and "enters" another. This property controls how the activity tasks of a Batch Process Step will be submitted as the Batch Process "enters" into a step. The different modes are:

  • Normal - Tasks will be submitted for items which have not already been processed.
  • Retry - Tasks will be submitted for items which previously failed, or have not yet been processed.
  • Always - Tasks will be submitted for all items, overwriting any previous task information.
  • Manual - The batch will be paused when this step is reached, requiring the user to manually start the process step.
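
The four modes can be compared side by side with a small model. This is a sketch under stated assumptions, not Grooper code: the `ActivateMode` enum, the `items_to_submit` function, and the three status strings are hypothetical. It also assumes that "Normal" does not resubmit failed items, since "Retry" is the mode that names them explicitly.

```python
from enum import Enum


class ActivateMode(Enum):
    NORMAL = "Normal"
    RETRY = "Retry"
    ALWAYS = "Always"
    MANUAL = "Manual"


def items_to_submit(mode, statuses):
    """Decide which items get a task when the process enters a step.

    statuses maps an item name to 'unprocessed', 'completed' or 'failed'.
    Returns (items_that_get_a_task, batch_is_paused).
    """
    if mode is ActivateMode.MANUAL:
        # Nothing is submitted until a user starts the step by hand.
        return [], True
    if mode is ActivateMode.NORMAL:
        wanted = {"unprocessed"}
    elif mode is ActivateMode.RETRY:
        wanted = {"unprocessed", "failed"}
    else:  # ALWAYS: every item, regardless of earlier results
        wanted = {"unprocessed", "failed", "completed"}
    return [name for name, s in statuses.items() if s in wanted], False


statuses = {"Page 1": "completed", "Page 2": "failed", "Page 3": "unprocessed"}
normal = items_to_submit(ActivateMode.NORMAL, statuses)
retry = items_to_submit(ActivateMode.RETRY, statuses)
always = items_to_submit(ActivateMode.ALWAYS, statuses)
manual = items_to_submit(ActivateMode.MANUAL, statuses)
```

With the sample statuses, Normal submits only "Page 3", Retry submits "Page 2" and "Page 3", Always submits all three pages, and Manual submits nothing and pauses.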

Should Submit Expression

The Should Submit Expression uses a VB.NET expression to determine whether tasks will be submitted for individual folders or pages within the Batch. The expression is executed for each folder or page in scope, and should return a True or False value indicating whether the item should be processed. If False is returned for every item in the Batch, this will effectively skip the entire step.

For more information and example expressions, visit the Code Expressions or Expressions Cookbook articles.
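
The per-item evaluation can be modeled as a filter. The sketch below is in Python purely for illustration; real Should Submit Expressions are written in VB.NET against Grooper's own objects, which are not shown here. The `submit_tasks` function and the `flagged` field are hypothetical.

```python
def submit_tasks(items, should_submit):
    """Evaluate the predicate once per in-scope item; keep the True ones."""
    return [item for item in items if should_submit(item)]


pages = [
    {"name": "Page 1", "flagged": False},
    {"name": "Page 2", "flagged": True},
]

# Only items for which the expression returns True get a task.
only_flagged = submit_tasks(pages, lambda page: page["flagged"])

# An expression that is False for every item leaves nothing to do,
# which amounts to skipping the step.
skipped = submit_tasks(pages, lambda page: False)
```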

Next Step Expression

The Next Step Expression determines which step, if any, will occur next in the Batch Process. Normally, steps occur in a top-to-bottom linear fashion.
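
The difference between the default order and an overridden order can be sketched as follows. This is a hypothetical model, not Grooper's API: the `walk` function, the use of step names as return values, and the `limit` guard are all assumptions made for illustration.

```python
def walk(steps, next_step=None, limit=50):
    """Return the steps visited, in order.

    steps: step names in top-to-bottom order.
    next_step: optional dict mapping a step name to a callable that
    returns the name of the next step, or None to end the process.
    Steps without an entry simply fall through to the step below.
    """
    next_step = next_step or {}
    current = steps[0] if steps else None
    visited = []
    # limit guards against an expression that loops forever.
    while current is not None and len(visited) < limit:
        visited.append(current)
        if current in next_step:
            current = next_step[current]()
        else:
            idx = steps.index(current) + 1
            current = steps[idx] if idx < len(steps) else None
    return visited


steps = ["Split Pages", "Recognize", "Classify", "Review", "Export"]

linear = walk(steps)
skip_review = walk(steps, {"Classify": lambda: "Export"})
end_early = walk(steps, {"Recognize": lambda: None})
```

Here `linear` visits all five steps, `skip_review` jumps from Classify straight to Export, and `end_early` stops after Recognize.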

For more information and example expressions, visit the Code Expressions or Expressions Cookbook articles.

Activity Processing Options

While these are properties of Activities, their function is specifically related to batch processing.

Error Disposition

This property offers several dispositions. They determine what happens when an error occurs while processing an individual task of an unattended activity. The property can be set to a combination of any of the following:

  • None - The issue will be ignored, and the task will complete successfully.
  • Flag - The object of the scope of the task, either Batch Folder or Batch Page, will be flagged.
  • Log - The issue will be logged to the Grooper log. The log can be viewed from the Grooper Root node under the "Batch Event Viewer" tab.
  • Stop - The Batch will stop processing, be set to an error state, and all pending tasks will be deleted.

The default, and most common, settings are to Flag and Log the error, but otherwise allow further tasks to be processed. However, the Stop option is very useful as it can prevent cascading processing issues for a Batch further in its process. The main objective is to prevent "bad data" from ending up in whatever backend system you use to store data. When a Batch in production is stopped it allows someone to review and resolve any issues. Once assessed, the Batch can be updated and resumed, thus re-submitting its tasks for processing and completion.
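
Because the options combine, they behave like bit flags. The sketch below models that idea with Python's `enum.Flag`. It is an assumption-laden illustration, not Grooper's implementation: the `ErrorDisposition` class, the `handle_error` function, and the dictionary fields are hypothetical.

```python
from enum import Flag


class ErrorDisposition(Flag):
    NONE = 0
    FLAG = 1
    LOG = 2
    STOP = 4


def handle_error(disposition, item, log, batch):
    """Apply each selected disposition to a failed task's item and Batch."""
    if ErrorDisposition.FLAG in disposition:
        item["flagged"] = True
    if ErrorDisposition.LOG in disposition:
        log.append(f"error on {item['name']}")
    if ErrorDisposition.STOP in disposition:
        batch["status"] = "error"
        batch["pending_tasks"].clear()  # pending tasks are deleted


log = []
batch = {"status": "processing", "pending_tasks": ["t2", "t3"]}

# The default described above: flag and log, but keep processing.
page_1 = {"name": "Page 1", "flagged": False}
handle_error(ErrorDisposition.FLAG | ErrorDisposition.LOG, page_1, log, batch)
status_after_default = batch["status"]
pending_after_default = list(batch["pending_tasks"])

# Adding Stop halts the Batch and clears what was still queued.
page_2 = {"name": "Page 2", "flagged": False}
handle_error(ErrorDisposition.LOG | ErrorDisposition.STOP, page_2, log, batch)
```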

Maximum Consecutive Errors

This property is an Int32 value defining the number of consecutive errors to be allowed, after which a "critical stop" will be raised. This critical stop will cause services to stop running.

The main drawback with the Stop option of Error Disposition is that if even one error on a task is encountered, the entire Batch is stopped. Maximum Consecutive Errors allows for some errors to occur. After the designated number is reached, instead of stopping the Batch in process and removing its remaining tasks, it stops the Activity Processing Service doing the processing.

An example of this being useful would be for the Export activity. You may know that one or two errors might occasionally occur during export, but if, say, 10 happen in a row, something is wrong. To prevent wasting processing power and perhaps causing further issues, the Activity Processing Service would be shut down and all processing would cease until the issue is resolved.
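
The key word is "consecutive": a success resets the count. A minimal sketch of that logic follows. The `run_service` function is hypothetical, and it assumes the stop is raised as soon as the streak reaches the configured number; whether Grooper stops at that number or on the next error is not stated here.

```python
def run_service(results, max_consecutive_errors):
    """Process task results until the error streak hits the limit.

    results: iterable of booleans, True meaning the task succeeded.
    Returns (tasks_processed, critical_stop_raised).
    """
    streak = 0
    processed = 0
    for ok in results:
        processed += 1
        streak = 0 if ok else streak + 1  # any success resets the streak
        if streak >= max_consecutive_errors:
            return processed, True
    return processed, False


# Occasional export errors never build a streak of 10.
scattered = run_service([False, True] * 20, max_consecutive_errors=10)

# Ten failures in a row trigger the stop; the final task is never reached.
outage = run_service(
    [True, False, False, True] + [False] * 10 + [True],
    max_consecutive_errors=10,
)
```

In this model, `scattered` processes all 40 tasks without stopping, while `outage` stops after 14 tasks.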

Concurrency Mode

This property is not configurable. It merely reports the type of "concurrency" an activity is capable of. As of now, the only activity that isn't Multiple is Render.

How To

The creation, testing, publishing, and updating of a Batch Process and its child Batch Process Steps is a straightforward process.

Add and Configure a Batch Process

Add a Batch Process

Batch Processes are created as child objects within a Project. To create a Batch Process...

  1. Right-click on a Project or a Folder within the Project
  2. In the pop-out menu choose "Add > Batch Process"
  3. In the "Add" dialog box name the Batch Process whatever you like and click the "Execute" button


Configure the Batch Process

Consider the following UI elements:

  1. Set the Content Type property to any "Content Type" if you wish to have the Batch set to that specific type and have its related "Data Elements" displayed in Review. This property is rarely configured as classifying a Batch isn't common.
  2. Set the Review Queue property to any Review Queue to have the entire Batch related to a specific Review Queue. If no Review Queue is selected, a Batch associated with this Batch Process will belong to no queue and be visible to anyone on the "Batch" page.
  3. Use the "Save" and "Cancel Changes" buttons to perform their respective functions.
  4. Next to the "Save" and "Cancel Changes" buttons are the buttons that allow you to (in order from left to right) "Validate", "Publish", and "Un-Publish" the Batch Process.
    • The "Validate" button will scan the configuration of all child Batch Process Steps and inform if any configurable properties are in an error state.
    • The "Publish" button will create a read-only copy of the Batch Process (and all its child objects) in the "Processes" folder of the "Node Tree". If the Batch Process has previously been published, this button will perform the publish again by overwriting the copy.
    • The "Un-Publish" button will remove the read-only copy of the Batch Process from the "Processes" folder of the "Node Tree".
  5. Use the "Scripting" tab if you are an advanced user familiar with .NET scripting who wants to expand what a Batch Process can do.

Add, Configure, and Test Batch Process Steps

Add a Batch Process Step (Split Pages)

Batch Process Steps are created as child objects within a Batch Process. To create a Batch Process Step...

  1. Right-click on a Batch Process
  2. In the pop-out menu choose "Add Activity > Transform > Split Pages"
    • The "Add Activity" command is different than the "Add" command. "Add" will create a Batch Process Step, which you can name, and the Activity property will be unconfigured. "Add Activity", however, will create a Batch Process Step and, depending on the choice made within the sub-menu, fill in the Step Name property in the "Add Activity" dialog box. It will also set the Activity property on the newly created Batch Process Step to the chosen activity.
    • For this example the Split Pages Activity is being used, but this process can be used for any Activity.
  3. In the "Add Activity" dialog box name the Batch Process Step whatever you like, or leave the name given for the respective Activity, and click the "Execute" button.


Configure the Batch Process Step (Split Pages)

  1. All of the "General" properties' defaults in this example will suffice.
    • Because the "Add Activity" command was used, the Activity property will be set to the chosen Activity. For this example, Split Pages was chosen.
    • Because Activity is set to Split Pages, the Scope property will automatically be set to Folder (the only option for this particular Activity).
    • The default for the Folder Level property is 1. The document in the supplied example Batch is at "Level 1" of the Batch, so the default in this case is fine.
    • No queues will be used in the example provided.
    • The Activate Mode of Normal is used in most cases, and this example is no exception.
  2. No Should Submit Expression or Next Step Expression will be used for this example.
  3. "Activity Properties" determine how the configured Activity will operate. Please refer to articles about specific Activities for more information on their configuration.
  4. Click the "Activity Tester" tab to test how this Batch Process Step will operate against a test Batch.


Test the Batch Process Step (Split Pages)

  1. On the "Activity Tester" tab...
  2. ...click the "Browse" button and in the dialog box that opens, select the Batch supplied with this article.
  3. Click the "OK" button to close the dialog box.
  4. By default, the Batch itself will be selected. Because this step is configured with a Scope of Folder and a Folder Level of 1, the selected object does not match the step's scope, and the "Test" button will be grayed out.
  5. The "Process All" button can be used to submit a job to be processed by a Grooper Activity Processing Service. In doing so, a series of tasks will be created for each object in the Scope of the Activity of the Batch Process Step. As a result, an Activity Processing Service must be installed for this repository and be running. The Activity Processing Service will pick up and process a number of tasks equal to the number of processing threads available to it. This is useful when you want to test in a multi-threaded fashion, such as when running the Recognize Activity on multiple Batch Page objects.
    • For the purposes of this example, this will not be done. The "Test" button will be used instead.


  1. Selecting the document at the scope that matches the configuration of the Batch Process Step, in this case Folder Level 1...
  2. ...will un-gray the "Test" button and allow it to be clicked.


  1. The test produced the desired result for this Batch Process Step configured for Split Pages: a Batch Page has been split out from the document.

More Steps

What follows is the addition, configuration, and testing of more Batch Process Steps to flesh out the Batch Process. While these additional steps give the Batch Process more meaning, adding them won't necessarily enhance your knowledge of the base process. Feel free to skip this portion if you want.

Add a Batch Process Step (Recognize)

  1. Right-click on the Batch Process
  2. In the pop-out menu choose "Add > Cleanup & Recognition > Recognize"
  3. In the "Add Activity" dialog box name the Batch Process Step whatever you like, or leave the name given for the respective Activity, and click the "Execute" button

Configure the Batch Process Step (Recognize)

  1. Using the "Add Activity" command has set the Activity property to the desired setting
  2. The default setting for the Scope property is Page when the Activity is Recognize, which is correct for our Batch
  3. The default settings of the "Activity Properties" will be kept for this step
  4. Click the "Activity Tester" tab to test the configuration of this step

Test the Batch Process Step (Recognize)

  1. Select the appropriate scope in the "Batch Viewer"
  2. Click the "Test" button to test the activity


  1. Using the "Renditions" button you can see a new text file for the recognized text
  2. In the "Document Viewer" you can see the recognized text

Add a Batch Process Step (Classify)

  1. Right-click on the Batch Process
  2. In the pop-out menu choose "Add > Document Processing > Classify"
  3. In the "Add Activity" dialog box name the Batch Process Step whatever you like, or leave the name given for the respective Activity, and click the "Execute" button

Configure the Batch Process Step (Classify)

  1. Using the "Add Activity" command has set the Activity property to the desired setting
  2. The default setting for the Scope property is Folder when the Activity is Classify, which is correct for our Batch
  3. The default setting of 1 for the Folder Level property is correct for this test Batch
  4. Click the drop-down for the Content Model Scope and choose the Content Model provided by this article
  5. Click the "Activity Tester" tab to test the configuration of this step

Test the Batch Process Step (Classify)

  1. Select the appropriate scope in the "Batch Viewer"
  2. Click the "Test" button to test the activity


  1. The name of the document has changed to reflect the classification

Add a Batch Process Step (Extract)

  1. Right-click on the Batch Process
  2. In the pop-out menu choose "Add > Document Processing > Extract"
  3. In the "Add Activity" dialog box name the Batch Process Step whatever you like, or leave the name given for the respective Activity, and click the "Execute" button

Configure the Batch Process Step (Extract)

  1. Using the "Add Activity" command has set the Activity property to the desired setting
  2. The default setting for the Scope property is Folder when the Activity is Extract, which is correct for our Batch
  3. The default setting of 1 for the Folder Level property is correct for this test Batch
  4. The default settings of the "Activity Properties" will be kept for this step
  5. Click the "Activity Tester" tab to test the configuration of this step

Test the Batch Process Step (Extract)

  1. Select the appropriate scope in the "Batch Viewer"
  2. Click the "Test" button to test the activity


  1. With successful extraction the "Diagnostics" button will highlight. You can click it and view the results of extraction.

Add a Batch Process Step (Export)

  1. Right-click on the Batch Process
  2. In the pop-out menu choose "Add > Document Processing > Export"
  3. In the "Add Activity" dialog box name the Batch Process Step whatever you like, or leave the name given for the respective Activity, and click the "Execute" button

Configure the Batch Process Step (Export)

  1. Using the "Add Activity" command has set the Activity property to the desired setting
  2. The default setting for the Scope property is Folder when the Activity is Export, which is correct for our Batch
  3. The default setting of 1 for the Folder Level property is correct for this test Batch
  4. The default settings of the "Activity Properties" will be kept for this step
  5. Click the "Activity Tester" tab to test the configuration of this step

Test the Batch Process Step (Export)

  1. Select the appropriate scope in the "Batch Viewer"
  2. Click the "Test" button to test the activity


  1. An Export Behavior on the Content Model sends a JSON file of the extracted data to C:\ on the Grooper server

Validate, Publish, and Un-Publish a Batch Process

Validate the Batch Process

Validating a Batch Process provides a quick check that no properties on any of its Batch Process Steps are in an error state.

  1. Select the Batch Process
  2. Click the "Validate" button
    • If all properties on all Batch Process Steps are configured properly and not in an error state, the "Validate Branch" dialog box will show and state that no errors were found.
    • If any properties on any child Batch Process Steps are in an error state, the "Validate Branch" dialog box will show and list all properties that are in an error state.
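
The two outcomes above can be pictured as a walk over every step that collects the properties in an error state. The Python sketch below is only a model of that idea: the `Step` class, the `validate_branch` function, and the sample error message are hypothetical and are not part of Grooper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """Hypothetical Batch Process Step: maps each property name to an error message or None."""
    name: str
    property_errors: Dict[str, Optional[str]] = field(default_factory=dict)


def validate_branch(steps: List[Step]) -> List[str]:
    """Return one line per property in an error state; an empty list means no errors."""
    problems = []
    for step in steps:
        for prop, error in step.property_errors.items():
            if error is not None:
                problems.append(f"{step.name}.{prop}: {error}")
    return problems


steps = [
    Step("Recognize", {"Scope": None}),
    Step("Classify", {"Scope": None, "Content Model Scope": "value is required"}),
]

report = validate_branch(steps)
print(report or "No errors found")
```

Here the Classify step is reported because its Content Model Scope was left unset, while the Recognize step contributes nothing to the report.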

Publish the Batch Process

  1. Select the Batch Process
  2. Click the "Publish" button
  3. Click the "Execute" button in the "Publish" dialog box
  4. This will put a read-only copy of the Batch Process in the "Processes" folder.

Un-Publish the Batch Process

  1. Select the Batch Process
  2. Click the "Unpublish" button
  3. In the "Unpublish" dialog box click the "Execute" button


  1. The "Unpublish" button will gray out.
  2. The read-only copy of the Batch Process will be removed from the "Processes" folder.

Update the Batch Process on a Production Batch

Sometimes it may be necessary to change the Batch Process that a Batch currently in production is using. A Batch in production uses its own read-only copy of the published Batch Process, which is itself a read-only copy of the original Batch Process. As a result, changes must be made to the original Batch Process, which must then be re-published. At that point the Batch in production can be paused and updated.


  1. In this example, the change made to the Batch Process was the addition of a Dispose Batch step.
  2. Making this change does not affect the published read-only copy in the "Processes" folder.
  3. It also does not affect the read-only copy of the published Batch Process that the Batch in production is using.


  1. The Batch Process has been re-published
  2. This has put a new read-only copy of the Batch Process in the "Processes" folder of the node Tree
  3. However, the Batch in production is still using an old copy of the original Batch Process


  1. In the "Production" area of "Batches"...
  2. ...select the Batch in production
  3. Click the "Pause" button
  4. Click "Apply" in the "Pause" dialog box


  • Click the "Update" button
    1. Click the drop-down for the Target Step property in the "Update Process" dialog box
    2. Select the newly added step in the drop-down menu
    3. Click the "Apply" button

    1. The copy of the Batch Process for the Batch in production is now updated
    2. Click the "Resume" button to continue processing the Batch
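
The relationship between the original Batch Process, its published copy, and the copy held by a Batch in production can be modeled as a chain of snapshots. The Python sketch below is a simplified illustration under stated assumptions, not Grooper's implementation: `Process`, `Repository`, and `ProductionBatch` are hypothetical names, the requirement to pause before updating mirrors the walkthrough rather than a documented rule, and treating Target Step as the step the Batch continues from is an assumption.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Process:
    """Hypothetical stand-in for a Batch Process and its ordered step names."""
    name: str
    steps: List[str] = field(default_factory=list)


@dataclass
class Repository:
    working: Process                     # the editable original
    published: Optional[Process] = None  # read-only copy in the "Processes" folder

    def publish(self) -> None:
        # Publishing takes a snapshot, so later edits to the original do not change it.
        self.published = copy.deepcopy(self.working)

    def unpublish(self) -> None:
        self.published = None


@dataclass
class ProductionBatch:
    process: Process  # the Batch's own copy of the published process
    paused: bool = False
    current_step: Optional[str] = None

    def update(self, repo: Repository, target_step: str) -> None:
        # Assumption: the walkthrough pauses first, so this model requires it.
        if not self.paused:
            raise RuntimeError("pause the Batch before updating its process")
        if repo.published is None or target_step not in repo.published.steps:
            raise ValueError("target step is not in a published process")
        self.process = copy.deepcopy(repo.published)
        self.current_step = target_step


repo = Repository(Process("Demo", ["Split Pages", "Recognize", "Classify", "Extract", "Export"]))
repo.publish()
batch = ProductionBatch(copy.deepcopy(repo.published))

repo.working.steps.append("Dispose Batch")      # edit the original
print("Dispose Batch" in repo.published.steps)  # False: published copy is unchanged
repo.publish()                                  # re-publish
print("Dispose Batch" in batch.process.steps)   # False: the Batch still has its old copy

batch.paused = True                             # Pause
batch.update(repo, "Dispose Batch")             # Update with a Target Step
batch.paused = False                            # Resume
print(batch.process.steps[-1])                  # Dispose Batch
```

Each copy is independent of the one it was made from, which is why both the re-publish and the update are needed before the Batch in production sees the new step.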