Thread (Concept): Difference between revisions
Revision as of 13:26, 10 May 2024
A Thread is the smallest unit of processing that can be performed within an operating system. In Grooper, threads are allocated for processing by Activity Processing services.
One thread can perform one "task" in Grooper at a time, so multiple threads allow multiple tasks to be processed concurrently. For example, if your system has 8 threads available and the Recognize activity's Scope is set to the Page level, the activity will run on 8 pages at a time. If your system has only 2 threads available, only 2 pages will be processed at a time. If you have 64 threads, 64 pages will be processed at a time.
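The arithmetic above can be sketched in Python. This is only an illustration of one-task-per-thread concurrency using a standard thread pool, not Grooper code: `rounds_needed`, `process_pages`, and `fake_recognize` are hypothetical names, and treating pages as handled in neat rounds is a simplification.

```python
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def rounds_needed(page_count: int, thread_count: int) -> int:
    """Rounds required if every thread handles exactly one page per round."""
    return math.ceil(page_count / thread_count)


def process_pages(page_count: int, thread_count: int) -> int:
    """Run one stand-in task per page and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def fake_recognize(page_number: int) -> None:
        # Hypothetical stand-in for the work done on a single page.
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)
        with lock:
            running -= 1

    # The pool size plays the role of the number of available threads.
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        list(pool.map(fake_recognize, range(page_count)))
    return peak
```

With 64 pages and 8 threads, `rounds_needed(64, 8)` gives 8, and `process_pages(64, 8)` never reports more than 8 pages in flight at once.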
You can control how many threads are used to process an Activity using Processing Queues and the Activity Processing service, following these general steps:
- Create a new Processing Queue.
- Assign it to a step in a Batch Process using its Queue Name property.
- Create an Activity Processing Grooper Service from Grooper Config.
- When configuring the Activity Processing service, select the Processing Queue using the Queue Name property.
- Specify how many threads you want to run using the Number of Threads property.
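The relationship between these settings can be modeled in a few lines of Python. This is a hypothetical data model for illustration only, not Grooper's API (Grooper is configured through its own tools). The class names mirror the terms used in the steps, the queue names "OCR" and "Default" are made up, and the sketch assumes that threads from several services pointed at the same queue simply add up, which the steps above do not state.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProcessingQueue:
    name: str


@dataclass(frozen=True)
class BatchProcessStep:
    activity: str
    queue_name: str  # mirrors the step's Queue Name property


@dataclass(frozen=True)
class ActivityProcessingService:
    queue_name: str  # mirrors the service's Queue Name property
    number_of_threads: int  # mirrors the Number of Threads property


def thread_cap_for_step(
    step: BatchProcessStep, services: Iterable[ActivityProcessingService]
) -> int:
    """Total threads from services whose queue matches the step's queue (assumed additive)."""
    return sum(s.number_of_threads for s in services if s.queue_name == step.queue_name)


# Hypothetical configuration: a dedicated queue for the Recognize step.
ocr_queue = ProcessingQueue(name="OCR")
recognize_step = BatchProcessStep(activity="Recognize", queue_name=ocr_queue.name)
services = [
    ActivityProcessingService(queue_name="OCR", number_of_threads=4),
    ActivityProcessingService(queue_name="Default", number_of_threads=8),
]
print(thread_cap_for_step(recognize_step, services))  # 4
```

In this model, only the service assigned to the "OCR" queue contributes threads to the Recognize step, so the step is capped at 4 threads regardless of what other services are running.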
Glossary
Activity Processing (Concept): Activity Processing is the execution of a sequence of configured tasks which are performed within a Batch Process to transform raw data from documents into structured and actionable information. Tasks are defined by Grooper Activities, configured to perform document classification, extraction, or data enhancement.
Activity Processing (Service): Activity Processing is a Grooper Service that executes Activities assigned to Batch Process Steps in a Batch Process. This allows Grooper to automate Batch Process Steps that do not require a human operator.
Activity: Grooper Activities define specific document processing operations performed on a Batch, Batch Folder, or Batch Page. In a Batch Process, each Batch Process Step executes a single Activity (determined by the step's "Activity" property).
- Batch Process Steps are frequently referred to by the name of their configured Activity followed by the word "step". For example: "Classify step".
Batch Process: Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the step-by-step processing instructions given to a Batch. Each step consists of a "Code Activity" or a Review activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.
- Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
- A Batch Process is often referred to as simply a "process".
Grooper Service:
Processing Queue: Processing Queues help automate "machine performed tasks" (that is, Code Activity tasks performed by Machines and their Activity Processing services). Processing Queues are assigned to Batch Process Steps to distribute tasks, control the maximum processing rate, and set the "concurrency mode" (specifying if and how parallelism can occur across one or more servers).
- Processing Queues are used to dedicate Activity Processing services with a capped number of processing threads to resource-intensive activities, such as Recognize. That way, these compute-hungry tasks won't gobble up all available system resources.
- Processing Queues are also used to manage activities, such as Render, that can only have one activity instance running per machine (this is done by changing the queue's Concurrency Mode from "Maximum" to "Per Machine").
- Processing Queues are also used to throttle Export tasks in scenarios where the export destination can only accept one document at a time.
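The throttling idea in the bullets above (many threads available, but only a limited number of tasks allowed to run at once) can be illustrated with a semaphore in Python. This is a generic concurrency sketch, not how Grooper implements its Concurrency Mode; `run_throttled` and its parameters are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(task_count: int, thread_count: int, max_concurrent: int) -> int:
    """Run tasks on a thread pool but let only `max_concurrent` execute at once.

    Returns the peak number of tasks that were inside the throttled section.
    """
    gate = threading.Semaphore(max_concurrent)  # the throttle
    lock = threading.Lock()
    running = 0
    peak = 0

    def task(_index: int) -> None:
        nonlocal running, peak
        with gate:  # threads wait here while the limit is reached
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(0.005)  # stand-in for an export or render task
            with lock:
                running -= 1

    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        list(pool.map(task, range(task_count)))
    return peak
```

Calling `run_throttled(10, 4, 1)` uses four worker threads, yet the semaphore ensures the tasks run strictly one at a time, which is the behavior wanted when a destination can only accept one document at a time.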
Recognize: Recognize is an Activity that obtains machine-readable text from Batch Pages and Batch Folders. When properly configured with an OCR Profile, Recognize will selectively perform OCR for images and native-text extraction for digital text in PDFs. Recognize can also reference an IP Profile to collect "layout data" like lines, checkboxes, and barcodes. Other Activities then use this machine-readable text and layout data for document analysis and data extraction.
Scope: The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.
Service: Grooper Services are executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started, and stopped using Grooper Command Console (or, in older Grooper versions, Grooper Config).
Thread: A Thread is the smallest unit of processing that can be performed within an operating system. In Grooper, threads are allocated for processing by Activity Processing services.