Test Batch (Import Provider): Difference between revisions

Revision as of 13:23, 10 May 2024

STUB

This article is a stub. It contains minimal information on the topic and should be expanded.

Would you like to see this article expanded? Let us know at groopereducation@bisok.com.

"Test Batch" is a specialized Import Provider designed to facilitate the import of content from an existing Batch in the test environment. This provider is most commonly used for testing, development, and validation scenarios, and is not intended for production use.
Looking for information on "production" vs "test" Batches in Grooper? See here.

Test Batches are only visible to Grooper Design Studio users and are used to test anything configured in Grooper, such as:

Document Classification
OCR Profile configuration
IP Profile configuration
Data Extraction
Whole Batch Processes or portions of them (or Activity configurations more generally)

They are not meant for full-fledged, "real world" document processing. These Batches are not visible to "Production" level users, such as Grooper Dashboard workstations.

Glossary

Activity: Grooper Activities define specific document processing operations performed on a Batch, Batch Folder, or Batch Page. In a Batch Process, each Batch Process Step executes a single Activity (determined by the step's "Activity" property).

Batch Process Steps are frequently referred to by the name of their configured Activity followed by the word "step". For example: "Classify step".

Batch Process: Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the set of step-by-step processing instructions given to a Batch. Each step consists of a "Code Activity" or a Review activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
A Batch Process is often referred to as simply a "process".

Batch: Batch nodes are fundamental in Grooper's architecture. They are containers of documents that are moved through workflow mechanisms called Batch Processes. Documents and their pages are represented in Batches by a hierarchy of Batch Folders and Batch Pages.
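
As a rough illustration of the hierarchy described above, the following Python sketch models a Batch as a container of Batch Folders and Batch Pages. The class, field, and function names (`Batch`, `BatchFolder`, `BatchPage`, `count_pages`) are hypothetical stand-ins chosen for this example; they are not Grooper's actual object model or API, and the assumption that folders can nest inside other folders is made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BatchPage:
    """Hypothetical stand-in for a single page of a document."""
    number: int


@dataclass
class BatchFolder:
    """Hypothetical stand-in for a document (or a folder of documents)."""
    name: str
    pages: List[BatchPage] = field(default_factory=list)
    # Assumption for this sketch: folders may contain other folders.
    folders: List["BatchFolder"] = field(default_factory=list)


@dataclass
class Batch:
    """Hypothetical stand-in for the top-level container of documents."""
    name: str
    folders: List[BatchFolder] = field(default_factory=list)


def count_pages(folder: BatchFolder) -> int:
    """Count pages in a folder, including pages in any nested folders."""
    return len(folder.pages) + sum(count_pages(f) for f in folder.folders)


# Build a small batch holding one two-page document.
invoice = BatchFolder("invoice", pages=[BatchPage(1), BatchPage(2)])
batch = Batch("test batch", folders=[invoice])
total = sum(count_pages(f) for f in batch.folders)
```

Modeling the folder as a recursive type keeps the page count a simple tree walk, which is one plausible way to think about a "hierarchy" of folders and pages.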

Classification: Classification is the process of identifying and organizing documents into categorical types based on their content or layout. Classification is key for efficient document management and data extraction workflows. Grooper has different methods for classifying documents. These include methods that use machine learning and text pattern recognition. In a Grooper Batch Process, the Classify Activity will assign a Content Type to a Batch Folder.

Data Extraction: Data Extraction involves identifying and capturing specific information from documents (represented by Batch Folders in Grooper). Extraction is performed by configurable Data Extractors, which transform unstructured or semi-structured data into a structured, usable format for processing and analysis.

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

IP Profile: IP Profiles are a step-by-step list of image processing operations (IP Commands). They are used for several image processing related operations, but primarily for:

Permanently enhancing an image during the Image Processing activity (usually to get rid of defects in a scanned image, such as skewing or borders).
Cleaning up an image in memory during the Recognize activity to improve OCR accuracy, without altering the stored image.
Computer vision operations that collect layout data (table line locations, OMR checkboxes, barcode values, and more) used in data extraction.

Node Tree: The Node Tree is the hierarchical list of Grooper node objects found in the left panel in the Design Page. It is the basis for navigation and creation in the Design Page.

OCR Profile: OCR Profiles store configuration settings for optical character recognition (OCR). They are used by the Recognize activity to convert images of text on Batch Pages into machine-encoded text. OCR Profiles are highly configurable, allowing fine-grained control over how OCR occurs, how pre-OCR image cleanup occurs, and how Grooper's OCR Synthesis occurs. All of this works toward the end goal of highly accurate OCR text data, which is used to classify documents, extract data, and more.

OCR: OCR stands for Optical Character Recognition. It allows text on paper documents to be digitized so that it can be searched or edited by other software applications. OCR converts typed or printed text from digital images of physical documents into machine-readable, encoded text.

Test Batch: "Test Batch" is a specialized Import Provider designed to facilitate the import of content from an existing Batch in the test environment. This provider is most commonly used for testing, development, and validation scenarios, and is not intended for production use.


@@ Line 8: / Line 8: @@
 * '''[[OCR Profile]]''' configuration
 * '''[[IP Profile]]''' configuration
-* [[Data Extractors|Data Extraction]]
+* [[Data Extraction (Concept)|Data Extraction]]
 * Whole or portions of '''[[Batch Process]]es''' ( or '''[[Activity]]''' configurations more generally)
 They are not meant for full fledged, "real world" document processing.  These '''Batches''' are not visible to "Production" level users, such as [[Grooper Dashboard]] workstations.
+== Glossary ==
+<u><big>'''Activity'''</big></u>: {{#lst:Glossary|Activity}}
+<u><big>'''Batch Process'''</big></u>: {{#lst:Glossary|Batch Process}}
+<u><big>'''Batch'''</big></u>: {{#lst:Glossary|Batch}}
+<u><big>'''Classification'''</big></u>: {{#lst:Glossary|Classification}}
+<u><big>'''Data Extraction'''</big></u>: {{#lst:Glossary|Data Extraction}}
+<u><big>'''Extract'''</big></u>: {{#lst:Glossary|Extract}}
+<u><big>'''IP Profile'''</big></u>: {{#lst:Glossary|IP Profile}}
+<u><big>'''Node Tree'''</big></u>: {{#lst:Glossary|Node Tree}}
+<u><big>'''OCR Profile'''</big></u>: {{#lst:Glossary|OCR Profile}}
+<u><big>'''OCR'''</big></u>: {{#lst:Glossary|OCR}}
+<u><big>'''Test Batch'''</big></u>: {{#lst:Glossary|Test Batch}}
 <!---EDITOR'S NOTE
 Consider deleting this article.  This information should probably just be in the "Batch" article.  Consider leaving this as a redirect only.