Thread (Concept)

From Grooper Wiki

A Thread is the smallest unit of processing that can be performed within an operating system. In Grooper, threads are allocated for processing by Activity Processing services.

One thread can perform one "task" in Grooper at a time, so multiple threads allow multiple tasks to be processed concurrently. For example, if your system has 8 threads available and the Recognize activity is set to the Page-level Scope, the activity will run on 8 pages at a time. If your system only has 2 threads available, only 2 pages will be processed at a time. If you have 64 threads, 64 pages will be processed at a time.
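The relationship between thread count and concurrent page tasks can be illustrated with a small, generic sketch. This is not Grooper code: `run_page_tasks` and `process_page` are hypothetical names, and the `time.sleep` call is only a stand-in for real page work such as OCR. It uses Python's standard thread pool to show that the number of pages in flight never exceeds the number of threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_page_tasks(page_count: int, thread_count: int) -> int:
    """Run page_count dummy page tasks on thread_count worker threads.

    Returns the highest number of tasks observed running at the same time.
    (Hypothetical helper for illustration; not part of Grooper.)
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def process_page(page_number: int) -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for real work on one page
        with lock:
            running -= 1

    # The pool never runs more than thread_count tasks at once.
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        list(pool.map(process_page, range(page_count)))
    return peak


if __name__ == "__main__":
    for threads in (2, 8):
        peak = run_page_tasks(page_count=20, thread_count=threads)
        print(f"{threads} threads: at most {peak} pages in flight")
```

With 20 pages, the observed peak stays at or below 2 in the first run and at or below 8 in the second, mirroring the "2 pages at a time" and "8 pages at a time" cases described above.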

You can control how many threads are used to process an Activity using Processing Queues and the Activity Processing service, following these general steps:

  1. Create a new Processing Queue.
  2. Assign it to a step in a Batch Process using its Queue Name property.
  3. Create an Activity Processing Grooper Service from Grooper Command Console (or, in older Grooper versions, Grooper Config).
  4. When configuring the Activity Processing service, select the Processing Queue using the Queue Name property.
  5. Specify how many threads you want to run using the Number of Threads property.
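As a rough mental model of these steps, the sketch below ties a step to a queue by name and a service to the same queue with a thread count. It is a toy model only and does not use any Grooper API: the class names, field names, and the `threads_for_step` helper are all hypothetical, and it assumes that services watching the same queue simply add their threads together.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProcessStep:
    """Hypothetical stand-in for a Batch Process step."""
    name: str
    queue_name: str  # mirrors the step's Queue Name property


@dataclass(frozen=True)
class ActivityProcessingService:
    """Hypothetical stand-in for an Activity Processing service."""
    queue_name: str         # mirrors the service's Queue Name property
    number_of_threads: int  # mirrors the Number of Threads property


def threads_for_step(step: ProcessStep,
                     services: List[ActivityProcessingService]) -> int:
    """Total threads serving the step's queue (assumes threads add up)."""
    return sum(s.number_of_threads
               for s in services
               if s.queue_name == step.queue_name)


if __name__ == "__main__":
    recognize = ProcessStep(name="Recognize", queue_name="OCR Queue")
    services = [
        ActivityProcessingService(queue_name="OCR Queue", number_of_threads=8),
        ActivityProcessingService(queue_name="Export Queue", number_of_threads=1),
    ]
    print(threads_for_step(recognize, services))
```

In this model the Recognize step is served only by the service whose queue name matches, so changing that service's thread count is what changes how many of the step's tasks run at once.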

Glossary

Activity Processing: Activity Processing is the execution of a sequence of configured tasks which are performed within a Batch Process to transform raw data from documents into structured and actionable information. Tasks are defined by Grooper Activities, configured to perform document classification, extraction, or data enhancement.

Activity Processing: Activity Processing is a Grooper Service that executes Activities assigned to Batch Process Steps in a Batch Process. This allows Grooper to automate Batch Steps that do not require a human operator.

Activity: Activity is a property on Batch Process Steps. Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page. Batch Process Steps configured with specific Activities are frequently referred to by the name of the Activity followed by the word "step". For example: Classify step.

Batch Process: Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each Batch of documents is managed systematically and efficiently.

  • Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child Batch Process Steps.
  • A Batch Process is often referred to as simply a "process".

Grooper Service: Grooper Services are various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started and stopped using Grooper Command Console (or, in older Grooper versions, Grooper Config).

Processing Queue: Processing Queue node objects are designed for tasks performed by Machines and their Activity Processing services. Processing Queues are used to distribute Grooper "Code Activity" tasks among different servers and control the concurrency and/or processing rate of these tasks.

  • For example, activities such as Render or Export can be managed so that only one activity instance runs per machine or so multiple instances are processed concurrently, according to the queue configuration.

Recognize: Recognize is an Activity that obtains machine-readable text from Batch Pages and Batch Folders. When properly configured with an OCR Profile, Recognize will selectively perform OCR for images and native-text extraction for digital text in PDFs. Recognize can also reference an IP Profile to collect "layout data" like lines, checkboxes, and barcodes. Other Activities then use this machine-readable text and layout data for document analysis and data extraction.

Scope: The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.
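To make the idea of a Scope level concrete, here is a toy model, not Grooper code, of how the chosen level changes what counts as one unit of work. The `Batch` and `Folder` classes and the `tasks_for_scope` helper are hypothetical, and the sketch assumes a simple hierarchy of one batch containing folders that contain pages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch Folder holding page names."""
    name: str
    pages: List[str] = field(default_factory=list)


@dataclass
class Batch:
    """Hypothetical stand-in for a Batch holding folders."""
    name: str
    folders: List[Folder] = field(default_factory=list)


def tasks_for_scope(batch: Batch, scope: str) -> List[str]:
    """List the units of work an activity would run on at a given level."""
    if scope == "Batch":
        return [batch.name]
    if scope == "Folder":
        return [folder.name for folder in batch.folders]
    if scope == "Page":
        return [page for folder in batch.folders for page in folder.pages]
    raise ValueError(f"unknown scope: {scope}")


if __name__ == "__main__":
    batch = Batch("Batch 1", [
        Folder("Doc A", ["A-1", "A-2"]),
        Folder("Doc B", ["B-1"]),
    ])
    for scope in ("Batch", "Folder", "Page"):
        print(scope, tasks_for_scope(batch, scope))
```

In this model the same batch yields one unit of work at the Batch level, two at the Folder level, and three at the Page level, and each unit is what a single thread would pick up.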

Service: Grooper Service is a conceptual term that refers to the various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, started and stopped using Grooper Command Console.