2023.1:Batch Process (Object): Difference between revisions
added object icons // via Wikitext Extension for VSCode |
updated several aspects of the entire content of this article // via Wikitext Extension for VSCode |
Revision as of 14:53, 2 April 2024
Batch Processes are the set of processing instructions given to a Batch.
Batch Processes are highly configurable and reusable any time a new Batch is created. They are composed of child Batch Process Steps, which reference different Grooper Activities to perform document processing tasks.
About
A Batch Process defines a repeatable sequence of steps which achieve a specific information processing objective in Grooper.
Grooper’s goal is to automate the process of Acquiring, Conditioning, Organizing, Collecting and Delivering data from documents. There are many objects in Grooper that facilitate these Five Phases of document data acquisition, but it is the Batch Process that is responsible for determining the flow of documents and accomplishing automation within a Grooper system. A Batch Process acts as the assembly line that takes raw documents and converts them into deliverable data.
A Batch Process object has little meaningful configuration of its own and consequently does nothing by itself. Instead, it acts as a container for one or more Batch Process Steps, which are configured to execute activities, also known as tasks.
As you build out a Batch Process, you will add Batch Process Steps and configure them with activities. Activities may represent automated system tasks or human-attended tasks which require operator interaction. Collectively, these steps represent a workflow process through which Batches of a particular class will travel. While most of the configuration items on Batch Process Steps are specific to their function, all Batch Process Steps may have either a Processing Queue or a Review Queue assigned, so that Grooper knows which cores or which individuals will process that step.
Upon completion of this configuration, you publish the Batch Process, which places a read-only copy of the Batch Process into the "Processes" branch of the Grooper Node Tree. Doing so allows it to be assigned to Batches created in "Production". This, in turn, allows the tasks of the Batch Process Steps to be submitted to their configured queues, either a Processing Queue or a Review Queue. Batch Processes automatically submit tasks for their active step, which get picked up by either an Activity Processing Service or a human reviewer, depending on task type. If a Batch Process does not have any attended review activities, it generally processes each step sequentially in a completely unattended fashion.
Once a Batch Process has been published, all new Batches will use the current published version. When changes are made to a Batch Process and a new version is published, the changes will apply to new Batches created using that Batch Process but will not impact existing Batches already in progress. One can, however, manually apply the latest published version of a Batch Process to an existing Batch by pausing and updating said Batch.
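The versioning behavior described above can be illustrated with a small Python model. This is only a sketch of the rules stated in this article, not Grooper's implementation; the class and method names (PublishedProcess, Batch, update_process) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PublishedProcess:
    """Hypothetical stand-in for a Batch Process and its published versions."""
    name: str
    versions: list = field(default_factory=list)  # published step lists, oldest first

    def publish(self, steps):
        # Publishing stores a read-only copy (a tuple) of the configured steps.
        self.versions.append(tuple(steps))
        return len(self.versions)

    @property
    def current(self):
        # Version numbers start at 1; 0 means nothing has been published.
        return len(self.versions)


@dataclass
class Batch:
    """Hypothetical stand-in for a production Batch."""
    name: str
    process: PublishedProcess
    version: int = 0
    paused: bool = False

    def __post_init__(self):
        # A new Batch uses whatever version is published when it is created.
        if self.version == 0:
            self.version = self.process.current

    def steps(self):
        return self.process.versions[self.version - 1]

    def update_process(self):
        # Existing Batches only move to the latest version when paused and updated.
        if not self.paused:
            raise RuntimeError("pause the Batch before updating its Batch Process")
        self.version = self.process.current
```

In this model, a Batch created before a re-publish keeps its original steps until it is paused and update_process() is called.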
It is also possible to un-publish a published Batch Process, thus making it unavailable to newly created production Batches.
Batch Process Properties
There are a few properties that can be configured on a Batch Process. These properties are rarely configured, given their limited use, but are worth understanding.
Content Type
This property is a drop-down list of all available "Content Types" found within the parent Project, from which one can be chosen. The root Batch Folder of the Batch will then be classified as that type, and the "Data Elements" of that "Content Type" will be displayed in Review.
Review Queue
This property is a drop-down menu of all available Review Queues, from which one can be chosen. Any Review tasks of a Batch that uses this Batch Process will be associated with the chosen Review Queue.
Priority
This property is a 32-bit integer (int32) that is inversely related to the priority given to tasks submitted by this Batch Process: the lower the value, the higher the priority.
For example, setting this property to 1 would give tasks submitted for a Batch using this Batch Process a higher priority than the default of 3. To understand what this means in practice, let's take the example further. Assume the following:
- Server where Grooper is installed has a single Activity Processing Service
  - This Activity Processing Service is set to use 10 CPU threads
- "Batch A" is currently in production using "Batch Process A" which is set to a Priority of 3
  - "Batch A" has 20 tasks related to the Recognize activity
  - 10 of the 20 Recognize tasks are being worked by the 10 available CPU threads
- As "Batch A" is being processed, "Batch B", which uses "Batch Process B" with a Priority of 1, submits 10 Split Pages tasks
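This scenario can be sketched with a simple priority queue in Python. The TaskQueue class is hypothetical and only models "lower Priority value is picked up first, ties in submission order"; Grooper's actual scheduler may differ in its details. Each call to take(10) stands for the 10 CPU threads picking up their next tasks.

```python
import heapq
from itertools import count


class TaskQueue:
    """Hypothetical model: lower priority value first, then submission order."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker so equal priorities stay first-in, first-out

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def take(self, n):
        # Hand out up to n tasks, as if n free threads each picked one up.
        taken = []
        while self._heap and len(taken) < n:
            taken.append(heapq.heappop(self._heap)[2])
        return taken


queue = TaskQueue()

# "Batch A" (Priority 3) submits 20 Recognize tasks; 10 threads take the first 10.
for i in range(20):
    queue.submit(3, f"A Recognize {i + 1}")
wave1 = queue.take(10)

# While those run, "Batch B" (Priority 1) submits 10 Split Pages tasks.
for i in range(10):
    queue.submit(1, f"B Split Pages {i + 1}")

wave2 = queue.take(10)  # the 10 Split Pages tasks
wave3 = queue.take(10)  # the remaining 10 Recognize tasks
```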
"Batch A's" 10 Recognize tasks will complete. However, because "Batch B" submitted tasks at a higher priority than "Batch A", its 10 Split Pages tasks will get picked up and completed before the 10 remaining Recognize tasks associated with "Batch A" get processed.
Batch Process Steps
Batch Process Steps are objects within a Batch Process that are assigned an Activity, which is a type of processing to be executed on all or a portion of a Batch. Activities generally fall into one of two categories:
- Attended
- Unattended
Attended activities, such as Review, are completed by human reviewers and are assigned via Review Queues. Unattended activities, such as Image Processing, Recognize, and Classification, are completed by Activity Processing Services and are assigned via Processing Queues. Batch Process Steps also allow for unit testing of their configured activity on the Design page via their respective “Activity Tester” tab.
As a Batch Process progresses through its individual steps, each step creates a series of tasks that are submitted to either the designated Processing Queue or Review Queue. Once the Activity Processors or human reviewers pick up and complete all tasks from their respective queues for an individual step, the Batch Process moves to the next step. Batch Processes will not submit tasks for a step before all tasks from the previous step have been completed.
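The rule that a step's tasks are only submitted once every task of the previous step has finished can be expressed as a short Python sketch. The data layout (a list of step names paired with task states) and the function name active_step are assumptions made for illustration, not Grooper's internal model.

```python
def active_step(steps):
    """Return the name of the first step that still has unfinished tasks.

    `steps` is an ordered list of (step_name, task_states) pairs, where each
    task state is a string and "done" marks a completed task. Returns None
    when every task of every step is complete.
    """
    for name, task_states in steps:
        if any(state != "done" for state in task_states):
            return name
    return None
```

For example, while a single Recognize task is still running, active_step reports "Recognize", and the following step is not considered until that task is done.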
Batch Process Steps are added to Batch Processes in a top-to-bottom linear fashion, although they may be re-ordered at any time. As such, the execution of these steps is also done in a linear fashion, from top to bottom. The only exception to this is if a Batch Process Step is configured with a "Should Submit" expression. These are simple expressions written on Batch Process Steps that determine whether the activity of the step should be executed, and upon completion what the next step in the Batch Process should be. "Should Submit Expressions" can send the contents of the Batch being processed to any Batch Process Step in any (published) Batch Process, including entirely different Batch Processes, and are powerful tools for configuring workflows for exception queues, requesting additional review, or several other functions.
Batch Process Step Properties
Activity
This property is a drop-down menu of all available Activities in Grooper, and is the main property to consider on a Batch Process Step. Once the Activity property is set, the properties related to the set Activity are exposed on the Batch Process Step. Every activity has its own property configurations to consider, therefore please refer to the individual articles related to each activity type for more information on their setup.
Scope
This property is a drop-down menu of the processing scopes available on the Activity of a Batch Process Step. Scope is an important topic that deserves more detail than can be covered here. Please visit the Scope article for a full explanation.
Queue Name
For attended activities, this property is a drop-down menu which can be set to point at a specific Review Queue. This property will override the designated Review Queue of its parent Batch Process.
For unattended activities, this property is a drop-down menu which can be set to point at a specific Processing Queue. Please visit the Processing Queue article for more information on why this property might get used.
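The override relationship between a step's Queue Name and the Review Queue of its parent Batch Process can be summarized in a few lines of Python. The function and parameter names are hypothetical. The article does not say which queue an unattended step uses when Queue Name is left unset, so the sketch returns None in that case rather than guessing.

```python
from typing import Optional


def resolve_queue(attended: bool,
                  step_queue_name: Optional[str],
                  process_review_queue: Optional[str]) -> Optional[str]:
    """Pick the queue for a step's tasks under the rules described above."""
    if step_queue_name:
        # A Queue Name set on the step always wins.
        return step_queue_name
    if attended:
        # Attended steps fall back to the Batch Process's Review Queue.
        return process_review_queue
    # Unattended fallback is not covered by this article (assumption: unknown).
    return None
```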
Activate Mode
This property is a drop-down menu of several modes of which one can be chosen. The submission of activity tasks related to a Batch Process Step is done when a Batch Process "exits" one step and "enters" another. This property controls how the activity tasks of a Batch Process Step will be submitted as the Batch Process "enters" into a step. The different modes are:
- Normal - Tasks will be submitted for items which have not already been processed.
- Retry - Tasks will be submitted for items which previously failed, or have not yet been processed.
- Always - Tasks will be submitted for all items, overwriting any previous task information.
- Manual - The batch will be paused when this step is reached, requiring the user to manually start the process step.
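The four modes can be compared side by side with a small Python sketch. The ActivateMode enum, the item state strings, and the items_to_submit function are illustrative assumptions based only on the descriptions above. For Manual, the sketch submits nothing because the Batch is paused; what is submitted once a user starts the step is not described here.

```python
from enum import Enum


class ActivateMode(Enum):
    NORMAL = "Normal"
    RETRY = "Retry"
    ALWAYS = "Always"
    MANUAL = "Manual"


def items_to_submit(mode, item_states):
    """Return the items that get tasks when the Batch Process enters a step.

    `item_states` maps an item name to "unprocessed", "completed", or "failed".
    """
    if mode is ActivateMode.MANUAL:
        # The Batch pauses here; nothing is submitted until a user starts the step.
        return []
    if mode is ActivateMode.ALWAYS:
        # Every item is resubmitted, regardless of earlier results.
        return list(item_states)
    if mode is ActivateMode.RETRY:
        wanted = {"unprocessed", "failed"}
    else:  # ActivateMode.NORMAL
        wanted = {"unprocessed"}
    return [item for item, state in item_states.items() if state in wanted]
```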
How To
The creation, testing, and publishing of a Batch Process and its child Batch Process Steps is a straightforward process.