Glossary: Difference between revisions
'''Batch Process Steps''' configured with specific '''''Activities''''' are frequently referred to by the name of the '''''Activity''''' followed by the word "step". For example: '''Classify Step'''.<section end="Activity" />
<div style="padding-left: 1.5em;">
=== Apply Rules ===
<section begin="Apply Rules" />'''''[[Apply Rules (Activity)|Apply Rules]]''''' is an '''''[[Activity (Property)|Activity]]''''' that runs '''[[Data Rule (Object)|Data Rules]]''' on data that has already been extracted from a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]'''. A '''[[Batch Process Step (Object)|Batch Process Step]]''' configured with the '''''Apply Rules Activity''''' must always be preceded by a '''Batch Process Step''' configured with the '''''Extract Activity'''''.<section end="Apply Rules" />
== Application ==
<section begin="Application" />A '''Grooper''' [[Repository (Concept)|repository]] consists of a series of [https://en.wikipedia.org/wiki/Table_(information) tables] in a [https://en.wikipedia.org/wiki/Database database], and a '''[[File Store (Object)|File Store]]''' containing relevant files associated with objects that exist within that database. A '''Grooper''' [https://en.wikipedia.org/wiki/Application_software application] is the interface by which a user can interact with that repository of information in an intuitive way.<section end="Application" />
<div style="padding-left: 1.5em;">
=== Grooper Command Console ===
<section begin="Grooper Command Console" />The '''[[Grooper Command Console (Application)|Grooper Command Console]]''' is a [https://en.wikipedia.org/wiki/Command-line_interface command-line interface] that performs system configuration and administration tasks within '''Grooper'''.<section end="Grooper Command Console" />
== Behavior ==
<section begin="Behavior" />'''''[[Behaviors (Property)|Behaviors]]''''' is a property of '''[[Content Type (Concept)|Content Types]]''' and '''''[[Export (Activity)|Export]]''''' '''''[[Activity (Property)|Activities]]''''' that defines configurable actions that automate processing tasks based on the identified '''Content Type''' of a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Behavior" />
<div style="padding-left: 1.5em;">
=== Export Behavior ===
<section begin="Export Behavior" />An '''''[[Export Behavior (Behavior)|Export Behavior]]''''' defines the conditions and actions for exporting [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' and their associated data from '''Grooper''' to other systems.<section end="Export Behavior" />
== CMIS Connection Type ==
<section begin="CMIS Connection Type" />'''''CMIS Connection Type''''', or "binding", establishes the communication protocols used to connect '''Grooper''' with content management systems adhering to the [https://en.wikipedia.org/wiki/Content_Management_Interoperability_Services CMIS] standard.<section end="CMIS Connection Type" />
<div style="padding-left: 1.5em;">
=== AppXtender ===
<section begin="AppXtender" />The '''''[[AppXtender (CMIS Connection Type)|AppXtender]]''''' '''''CMIS Connection Type''''', or "binding", connects '''Grooper''' to the [https://en.wikipedia.org/wiki/OpenText#AppEnhancer_(formerly_ApplicationXtender) ApplicationXtender] [https://en.wikipedia.org/wiki/Content_management_system content management system] for import and export operations.<section end="AppXtender" />
== Classification Method ==
<section begin="Classification Method" />The '''''[[Classification Method (Property)|Classification Method]]''''' property determines the technique used for document [[Classification (Concept)|classification]] within a [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]''', enabling the sorting of [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' into categories based on their content or structure. It can utilize pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.<section end="Classification Method" />
<div style="padding-left: 1.5em;">
=== GPT Embeddings ===
<section begin="GPT Embeddings" />The '''''[[GPT Embeddings (Classification Method)|GPT Embeddings]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' is an [https://en.wikipedia.org/wiki/OpenAI OpenAI] [https://en.wikipedia.org/wiki/Generative_pre-trained_transformer GPT] training-based [[Classification (Concept)|classification]] approach that uses "embeddings" to tell one document from another.<section end="GPT Embeddings" />
== Collation Provider ==
<section begin="Collation Provider" />The '''''[[Collation Provider (Property)|Collation]]''''' property of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' defines the method for converting its raw results into a final result set, governing how lists of matches from the '''Data Type''' are combined and interpreted to produce the output data of the '''Data Type'''.<section end="Collation Provider" />
<div style="padding-left: 1.5em;">
=== AND ===
<section begin="AND" />The '''''[[AND (Collation Provider)|AND]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' returns results only when each individual extractor specified within it gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.<section end="AND" />
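The logical "AND" behavior described above can be sketched in a few lines of Python. This is only an illustration, not Grooper's implementation: the function name <code>and_collate</code>, the dictionary input, and the choice to concatenate hits in extractor order are all assumptions made for the example.

```python
from typing import Dict, List


def and_collate(hits_by_extractor: Dict[str, List[str]]) -> List[str]:
    """Return the combined hits only if every extractor produced at least one.

    hits_by_extractor maps a (hypothetical) extractor name to its list of hits.
    How real results are merged is not specified here; concatenation is assumed.
    """
    if not hits_by_extractor:
        return []  # nothing configured, nothing to return
    if all(hits for hits in hits_by_extractor.values()):
        # Every extractor matched: flatten all hits into one result list.
        return [hit for hits in hits_by_extractor.values() for hit in hits]
    return []  # at least one extractor had no hit, so the whole result is empty
```

If any one extractor comes back empty, the whole result is empty, which is what distinguishes an "AND" style collation from one that simply pools every hit.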
== Concept ==
<section begin="Concept" />There are many objects and properties a user can configure in '''Grooper'''. However, understanding how, why, and when to use these objects and properties depends on understanding the underlying concepts that define what these objects and properties are doing and why.<section end="Concept" />
<div style="padding-left: 1.5em;">
=== Activity Processing ===
<section begin="Activity Processing Concept" />[[Activity Processing (Concept)|Activity Processing]] is a conceptual term that refers to the execution of a sequence of configured tasks, such as [[Classification (Concept)|classification]], [[Data Extraction (Concept)|extraction]], or data enhancement on documents, which are performed within a [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' to transform raw data from documents into structured and actionable information.<section end="Activity Processing Concept" />
== Export Definition ==
<section begin="Export Definition" />'''''[[Export Definitions (Property)|Export Definitions]]''''' is a property of '''''[[Export Behavior (Behavior)|Export Behaviors]]''''' as defined on '''[[Content Type (Concept)|Content Types]]''' or '''''[[Export (Activity)|Export]]''''' '''''[[Activity (Property)|Activities]]'''''. It defines export connectivity to external systems such as [https://en.wikipedia.org/wiki/File_system file systems], [https://en.wikipedia.org/wiki/Content_management_system content management repositories], [https://en.wikipedia.org/wiki/Database databases], [https://en.wikipedia.org/wiki/Message_transfer_agent mail servers], etc.<section end="Export Definition" />
<div style="padding-left: 1.5em;">
=== CMIS Export ===
<section begin="CMIS Export" />'''''[[CMIS Export (Export Definition)|CMIS Export]]''''' is an '''''[[Export Definitions (Property)|Export Definition]]''''' available when configuring an '''''[[Export Behavior (Behavior)|Export Behavior]]'''''. It exports content over a [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to export documents and their [https://en.wikipedia.org/wiki/Metadata metadata] to various [https://en.wikipedia.org/wiki/On-premises_software on-premises] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage platforms].<section end="CMIS Export" />
== Extractor Type ==
<section begin="Extractor Type" />'''''[[Extractor Type (Property)|Extractor Type]]''''', or value extractor, is a property on a wide array of objects that goes by many different names. It defines a primitive operator which reads data values from the text or visual content of a document. '''''Extractor Types''''' are consumed by higher-level objects such as '''[[Data Element (Concept)|Data Elements]]''', [[Object Nomenclature#Extractor Objects|extractor objects]], '''[[Content Type (Concept)|Content Types]]''' and more.<section end="Extractor Type" />
<div style="padding-left: 1.5em;">
=== Detect Signature ===
<section begin="Detect Signature" />The '''''[[Detect Signature (Extractor Type)|Detect Signature]]''''' '''''[[Extractor Type (Property)|Extractor Type]]''''' detects signatures within a specified rectangular region on a document page by measuring the fill percentage, providing a method to identify and validate the presence of handwritten signatures.<section end="Detect Signature" />
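As a rough illustration of what "measuring the fill percentage" of a rectangular region can mean, the Python sketch below counts inked pixels in a region of a binarized page. It is not Grooper's algorithm: the page representation (rows of 0/1 values), the function names, and the <code>min_fill</code> threshold are hypothetical choices for the example.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixel coordinates; right and bottom are exclusive.
Region = Tuple[int, int, int, int]


def fill_percentage(page: List[List[int]], region: Region) -> float:
    """Percentage of pixels inside region that are inked (1 = ink, 0 = blank)."""
    left, top, right, bottom = region
    total = (right - left) * (bottom - top)
    if total <= 0:
        raise ValueError("region must have a positive area")
    inked = sum(page[y][x] for y in range(top, bottom) for x in range(left, right))
    return 100.0 * inked / total


def looks_signed(page: List[List[int]], region: Region, min_fill: float = 5.0) -> bool:
    """Treat the region as signed when enough of it is filled (assumed threshold)."""
    return fill_percentage(page, region) >= min_fill
```

A blank signature box yields a fill percentage near zero, while handwriting inside the box raises it above the chosen threshold.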
== Fill Method ==
<section begin="Fill Method" />The '''''[[Fill Method (Property)|Fill Method]]''''' property on '''[[Data Model (Object)|Data Models]]''', '''[[Data Section (Object)|Data Sections]]''', and '''[[Data Table (Object)|Data Tables]]''' is a collection of various mechanisms that allow for the population of descendant '''[[Data Element (Concept)|Data Elements]]''' of '''Data Models''', '''Data Sections''', and '''Data Tables''' (which can be referred to as "containers"). '''''Fill Methods''''' are secondary extraction operations that run ''after'' normal extraction and populate descendant '''Data Elements'''.<section end="Fill Method" />
<div style="padding-left: 1.5em;">
=== AI Extract ===
<section begin="AI Extract" />'''''[[AI Extract (Fill Method)|AI Extract]]''''' is a '''''[[Fill Method (Property)|Fill Method]]''''' that leverages a [https://en.wikipedia.org/wiki/Large_language_model Large Language Model (LLM)] to quickly and easily return extraction results to the child elements of '''[[Data Model (Object)|Data Models]]''', '''[[Data Section (Object)|Data Sections]]''', and '''[[Data Table (Object)|Data Tables]]''' by using the JSON structure of the relevant '''[[Data Element (Concept)|Data Elements]]''' as part of the instruction set to the LLM.<section end="AI Extract" />
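The idea of passing a JSON structure to an LLM as part of its instructions can be sketched as follows. This is a hypothetical illustration only: the field names, the prompt wording, and the <code>build_prompt</code> helper are invented for the example and do not reflect the prompt '''Grooper''' actually sends. No LLM is called.

```python
import json
from typing import Any, Dict


def build_prompt(schema: Dict[str, Any], document_text: str) -> str:
    """Combine a JSON description of the wanted fields with the document text."""
    return (
        "Extract the following fields from the document and reply with JSON "
        "matching this structure:\n"
        + json.dumps(schema, indent=2)
        + "\n\nDocument:\n"
        + document_text
    )


# Hypothetical container: one field plus a table with two columns.
example_schema = {
    "Invoice Number": "string",
    "Line Items": [{"Description": "string", "Amount": "number"}],
}

if __name__ == "__main__":
    print(build_prompt(example_schema, "Invoice 123 ..."))
```

Because the structure mirrors the container's child elements, a JSON reply in the same shape can be mapped back onto those elements one key at a time.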
== IP Command ==
<section begin="IP Command" />The '''''[[IP Command (Property)|Command]]''''' property of an [[image:GrooperIcon_IPStep.png]] '''[[IP Step (Object)|IP Step]]''' object in '''Grooper''' specifies the [[Image Processing (Concept)|Image Processing (IP)]] command to be executed for that specific step as part of an [[image:GrooperIcon_IPProfile.png]] '''[[IP Profile (Object)|IP Profile]]'''.<section end="IP Command" />
<div style="padding-left: 1.5em;">
=== Barcode Detection ===
<section begin="Barcode Detection" />The '''''[[Barcode Detection (IP Command)|Barcode Detection]]''''' '''''[[IP Command (Property)|IP Command]]''''' detects and reads [https://en.wikipedia.org/wiki/Barcode barcode] data. The detected barcode information is stored as part of the object's [[Layout Data (Concept)|layout data]].<section end="Barcode Detection" />
== Import Provider ==
<section begin="Import Provider" />The '''''[[Import Provider (Property)|Provider]]''''' property is a selection of '''''[[Import Provider (Property)|Import Providers]]''''' which enable import of file-based content from a variety of sources such as [https://en.wikipedia.org/wiki/File_system file systems], [https://en.wikipedia.org/wiki/Message_transfer_agent mail servers], and [https://en.wikipedia.org/wiki/Content_repository content repositories].<section end="Import Provider" />
<div style="padding-left: 1.5em;">
=== CMIS Import ===
<section begin="CMIS Import" />The '''''[[CMIS Import (Import Provider)|CMIS Import]]''''' '''''[[Import Provider (Property)|Import Provider]]''''' is used to import content over a [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to import from various [https://en.wikipedia.org/wiki/On-premises_software on-premises] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage] platforms.<section end="CMIS Import" />
== Lookup ==
<section begin="Lookup" />The '''''[[Lookups (Property)|Lookups]]''''' property is a list of lookup operations to be performed on child elements of the associated container. Each Lookup specification defines a lookup operation, where the value of one or more '''Grooper''' fields will be used to query an external [https://en.wikipedia.org/wiki/Datasource data source], such as a [https://en.wikipedia.org/wiki/Database database]. The results of the query can be used to validate existing field values or populate additional field values.<section end="Lookup" />
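The validate-or-populate pattern described above can be sketched with Python's built-in <code>sqlite3</code> module. The table name <code>vendors</code>, its columns, the field names, and the <code>run_lookup</code> function are assumptions made for this example and are not part of '''Grooper'''.

```python
import sqlite3
from typing import Any, Dict


def run_lookup(conn: sqlite3.Connection, fields: Dict[str, str]) -> Dict[str, Any]:
    """Query an external table with one field value, then validate and populate.

    Returns {"valid": bool, "fields": dict}. The 'vendors' table is hypothetical.
    """
    row = conn.execute(
        "SELECT name, city FROM vendors WHERE vendor_id = ?",
        (fields["Vendor ID"],),
    ).fetchone()
    if row is None:
        # No matching record: nothing to validate against, nothing to populate.
        return {"valid": False, "fields": dict(fields)}
    name, city = row
    updated = dict(fields)
    updated["City"] = city  # populate an additional field from the query result
    # Validate the existing value against the looked-up value.
    return {"valid": fields.get("Vendor Name") == name, "fields": updated}
```

In this sketch, one field value (<code>Vendor ID</code>) drives the query, an existing value (<code>Vendor Name</code>) is checked against the result, and a new value (<code>City</code>) is filled in.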
<div style="padding-left: 1.5em;">
=== CMIS Lookup ===
<section begin="CMIS Lookup" />'''''[[CMIS Lookup (Lookup)|CMIS Lookup]]''''' is a '''''[[Lookups (Property)|Lookup Specification]]''''' that performs a lookup against a [[image:GrooperIcon_CMISRepository.png]] '''[[CMIS Repository (Object)|CMIS Repository]]''' via a [[CMIS Query|CMISQL Query]].<section end="CMIS Lookup" />
== Object ==
<section begin="Object" />In '''Grooper''', objects are defined as configurable elements within its hierarchical tree structure. These include nodes and embedded objects that can be manipulated and edited to define the system's behavior, create workflows, and manage content.<section end="Object" />
<div style="padding-left: 1.5em;">
=== Batch ===
<section begin="Batch" />[[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' objects are fundamental in '''Grooper's''' architecture as they are the containers of documents that get moved through '''Grooper's''' workflow mechanisms known as [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Processes]]'''.<section end="Batch" />
== Property ==
<section begin="Property" />A property is a configurable setting on an object in '''Grooper''' that affects how the object performs its function.<section end="Property" />
<div style="padding-left: 1.5em;">
=== Confidence Multiplier and Output Confidence ===
<section begin="Confidence Multiplier and Output Confidence" />Some results carry more weight than others. The '''''[[Confidence Multiplier and Output Confidence (Property)|Confidence Multiplier]]''''' and '''''[[Confidence Multiplier and Output Confidence (Property)|Output Confidence]]''''' properties allow you to manually adjust an [[Data Extraction (Concept)|extraction]] result's confidence.<section end="Confidence Multiplier and Output Confidence" />
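One plausible reading of these two properties is sketched below in Python. The semantics are assumptions, not confirmed behavior: here a multiplier scales the original confidence (clamped to the range 0 to 1), and an output confidence, when set, replaces the score outright. The function name and parameters are hypothetical.

```python
from typing import Optional


def adjust_confidence(
    confidence: float,
    multiplier: float = 1.0,
    output_confidence: Optional[float] = None,
) -> float:
    """Manually adjust a result's confidence (assumed semantics, see lead-in)."""
    if output_confidence is not None:
        # Assumption: a fixed output confidence overrides the computed score.
        return output_confidence
    # Assumption: the multiplier scales the score, kept within [0.0, 1.0].
    return max(0.0, min(1.0, confidence * multiplier))
```

Under these assumptions, a multiplier below 1 demotes a less trustworthy result, while a fixed output confidence pins a result to a known score regardless of how it was matched.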
<section begin="Vertical Wrap" />'''''[[Vertical Wrap (Property)|Vertical Wrap]]''''' is a property of certain '''''[[Extractor Type (Property)|Extractor Types]]''''' and a '''[[Content Type (Concept)|Content Type's]]''' ''[[Labeling Behavior (Behavior)|Labeling Behavior]]'' used to provide simplified [[Data Extraction (Concept)|extraction]] of vertically wrapped text (typically stacked labels).<section end="Vertical Wrap" />
</div>
== Repository Option ==
<section begin="Repository Option" />The '''''[[Repository Option (Property)|Options]]''''' property of the {{GrooperRootIcon}} '''[[Root (Object)|Root]]''' object is a collection of optional features that affect the entire repository. Each option establishes a connection that an entire collection of functionality depends on; that functionality does not work until the option is configured.<section end="Repository Option" />
<div style="padding-left: 1.5em;">
=== LLM Connector ===
<section begin="LLM Connector" />'''''[[LLM Connector (Repository Option)|LLM Connector]]''''' is a '''''[[Repository Option (Property)|Repository Option]]''''' that enables [https://en.wikipedia.org/wiki/OpenAI OpenAI-based] functionality for the local '''Grooper''' repository.<section end="LLM Connector" />
</div>
| == Section Extract Method == | == Section Extract Method == | ||
| <section begin="Section Extract Method" />The '''''Extract Method''''' property of a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' defines a "Section Extract Method" which specifies how section instances will be identified and extracted.<section end="Section Extract Method" /> | <section begin="Section Extract Method" />The '''''Extract Method''''' property of a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' defines a "Section Extract Method" which specifies how section instances will be identified and extracted.<section end="Section Extract Method" /> | ||
| <div style="padding-left: 1.5em"> | <div style="padding-left: 1.5em;"> | ||
| === Nested Table === | === Nested Table === | ||
| <section begin="Nested Table" />'''''[[Nested Table (Section Extract Method)|Nested Table]]''''' is a "Section Extract Method" enabled for a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' using the '''''Extract Method''''' property. This method divides a document into sections by extracting table data within those sections. This gives '''Grooper''' users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections have the same table (or at least tabular data which can be extracted by a single [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' object).<section end="Nested Table" /> | <section begin="Nested Table" />'''''[[Nested Table (Section Extract Method)|Nested Table]]''''' is a "Section Extract Method" enabled for a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' using the '''''Extract Method''''' property. This method divides a document into sections by extracting table data within those sections. This gives '''Grooper''' users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections have the same table (or at least tabular data which can be extracted by a single [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' object).<section end="Nested Table" /> | ||
| Line 623: | Line 628: | ||
| == Separation Provider == | == Separation Provider == | ||
| <section begin="Separation Provider" />The '''''[[Separation Provider (Property)|Provider]]''''' property of the '''''[[Separate (Activity)|Separate]]''''' '''''[[Activity (Property)|Activity]]''''' defines the type of [[Separation (Concept)|separation]] to be performed at the designated '''''[[Scope (Property)|Scope]]'''''.<section end="Separation Provider" /> | <section begin="Separation Provider" />The '''''[[Separation Provider (Property)|Provider]]''''' property of the '''''[[Separate (Activity)|Separate]]''''' '''''[[Activity (Property)|Activity]]''''' defines the type of [[Separation (Concept)|separation]] to be performed at the designated '''''[[Scope (Property)|Scope]]'''''.<section end="Separation Provider" /> | ||
| <div style="padding-left: 1.5em"> | <div style="padding-left: 1.5em;"> | ||
| === Change in Value Separation === | === Change in Value Separation === | ||
| <section begin="Change in Value Separation" />The '''''[[Change in Value Separation (Separation Provider)|Change in Value]]''''' '''''[[Separation Provider (Property)|Separation Provider]]''''' creates a new folder and separates every time an extracted value changes from one [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' to another.<section end="Change in Value Separation" /> | <section begin="Change in Value Separation" />The '''''[[Change in Value Separation (Separation Provider)|Change in Value]]''''' '''''[[Separation Provider (Property)|Separation Provider]]''''' creates a new folder and separates every time an extracted value changes from one [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' to another.<section end="Change in Value Separation" /> | ||
| Line 650: | Line 655: | ||
| == Service == | == Service == | ||
| <section begin="Service" />[[Grooper Service (Concept)|Grooper Service]] is a conceptual term that refers to the various [https://en.wikipedia.org/wiki/Computer_program executable programs] that run as a [https://en.wikipedia.org/wiki/Windows_service Windows Services] to facilitate '''Grooper''' processing. Service instances are installed, configured, started and stopped using [[Grooper Config (Application)|Grooper Config]].<section end="Service" /> | <section begin="Service" />[[Grooper Service (Concept)|Grooper Service]] is a conceptual term that refers to the various [https://en.wikipedia.org/wiki/Computer_program executable programs] that run as a [https://en.wikipedia.org/wiki/Windows_service Windows Services] to facilitate '''Grooper''' processing. Service instances are installed, configured, started and stopped using [[Grooper Config (Application)|Grooper Config]].<section end="Service" /> | ||
| <div style="padding-left: 1.5em"> | <div style="padding-left: 1.5em;"> | ||
| === API Services === | === API Services === | ||
| <section begin="API Services" />You can perform [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' processing via [https://en.wikipedia.org/wiki/REST REST] [https://en.wikipedia.org/wiki/API API] web calls by installing  '''''[[API Services (Service)|API Services]]'''''.<section end="API Services" /> | <section begin="API Services" />You can perform [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' processing via [https://en.wikipedia.org/wiki/REST REST] [https://en.wikipedia.org/wiki/API API] web calls by installing  '''''[[API Services (Service)|API Services]]'''''.<section end="API Services" /> | ||
| Line 664: | Line 669: | ||
| <section end="Import Watcher Service" /> | <section end="Import Watcher Service" /> | ||
| </div> | </div> | ||
| == Table Extract Method == | == Table Extract Method == | ||
| <section begin="Table Extract Method" />The '''''[[Table Extract Method (Property)|Extract Method]]''''' property of a [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' sets a '''''Table Extract Method''''' which defines the settings and logic for the '''Data Table''' to perform [[Data Extraction (Concept)|extraction]].<section end="Table Extract Method" /> | <section begin="Table Extract Method" />The '''''[[Table Extract Method (Property)|Extract Method]]''''' property of a [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' sets a '''''Table Extract Method''''' which defines the settings and logic for the '''Data Table''' to perform [[Data Extraction (Concept)|extraction]].<section end="Table Extract Method" /> | ||
| <div style="padding-left: 1.5em"> | <div style="padding-left: 1.5em;"> | ||
| === Delimited Extract === | === Delimited Extract === | ||
| <section begin="Delimited Extract" />The '''''[[Delimited Extract (Table Extract Method)|Delimited Extract]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' extracts tabular data from a [https://en.wikipedia.org/wiki/Delimiter-separated_values delimiter-separated] text file, such as a [https://en.wikipedia.org/wiki/Comma-separated_values CSV file].<section end="Delimited Extract" /> | <section begin="Delimited Extract" />The '''''[[Delimited Extract (Table Extract Method)|Delimited Extract]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' extracts tabular data from a [https://en.wikipedia.org/wiki/Delimiter-separated_values delimiter-separated] text file, such as a [https://en.wikipedia.org/wiki/Comma-separated_values CSV file].<section end="Delimited Extract" /> | ||
| Line 685: | Line 689: | ||
| == UI Element == | == UI Element == | ||
| <section begin="UI Element" />A UI Element is a portion of the '''Grooper''' interface that allows users to interact with or otherwise receive information about the application.<section end="UI Element" /> | <section begin="UI Element" />A UI Element is a portion of the '''Grooper''' interface that allows users to interact with or otherwise receive information about the application.<section end="UI Element" /> | ||
| <div style="padding-left: 1.5em"> | <div style="padding-left: 1.5em;"> | ||
| === Document Viewer === | === Document Viewer === | ||
| <section begin="Document Viewer" />The [[Document Viewer (UI Element)|Grooper Document Viewer]] is the portal to your documents. It is the UI that allows you to see a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder's]]''' (or a [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page's]]''') image, text content, and more.<section end="Document Viewer" /> | <section begin="Document Viewer" />The [[Document Viewer (UI Element)|Grooper Document Viewer]] is the portal to your documents. It is the UI that allows you to see a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder's]]''' (or a [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page's]]''') image, text content, and more.<section end="Document Viewer" /> | ||
Revision as of 15:03, 24 July 2024
Activity
Activity is a property on Batch Process Step objects. Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page.
Batch Process Steps configured with specific Activities are frequently referred to by the name of the Activity followed by the word "step". For example: Classify Step.
Apply Rules
Apply Rules is an Activity that runs Data Rules on data that has already been extracted from a Batch. A Batch Process Step configured with the Apply Rules Activity will always need to be preceded by a Batch Process Step configured with the Extract Activity.
Classify
Classify is an Activity that "classifies" Batch Folders in a Batch by assigning them a Content Type using patterns, lexical understanding, or rules as defined by a Content Model.
Clip Frames
The Clip Frames Activity extracts defined areas from microfiche card images, creating new image frames or layers for focused analysis or processing.
Correct
The Correct Activity performs spell correction on the textual content of Batch Folders or specific Data Elements, enhancing the accuracy of data extraction by resolving recognition errors.
Detect Frames
The Detect Frames Activity locates and identifies frame lines on microfiche card images, enabling the isolation of areas within the frames for further data extraction or processing.
Execute
The Execute Activity runs a specified child command, allowing for the modular and controlled execution of tasks within a larger automated workflow.
Export
The Export Activity facilitates the transfer of documents and extracted information to external systems or formats, completing the data processing workflow.
Extract
The Extract Activity retrieves relevant information, defined by Data Elements, from Batch Folders, transforming unstructured or semi-structured content into structured, usable data.
Image Processing
The Image Processing Activity enhances and optimizes Batch Pages for better recognition and data extraction results.
Initialize Card
The Initialize Card Activity prepares and configures microfiche card images for further processing.
Merge
The Merge Activity creates a document from the Page objects in your Batch and saves it to a Batch Folder.
Recognize
The Recognize Activity interprets Batch Pages and Batch Folders, converting them into machine-readable text and capturing layout data for comprehensive analysis and data extraction. This will attach a text and/or layoutData file to the respective object.
Redact
The Redact Activity hides or "redacts" text information on the image or PDF of a document based on data returned from a configured Extractor. It does not alter the text data attached to the image or PDF.
Render
The Render Activity normalizes electronic document content from file formats Grooper cannot read innately to a PDF format. This allows Grooper to extract the text via the Recognize Activity.
Review
The Review Activity facilitates human evaluation and validation of processed Batch Folders and extracted data for accuracy and completeness.
Send Mail
The Send Mail Activity automates the dispatch of emails with or without attachments, based on Batch Process events and conditions.
Separate
The Separate Activity sorts Batch Pages into individual Batch Folders, distinguishing them for independent processing and organization.
Split Pages
Multi-page documents (typically PDFs and TIFFs) come into Grooper represented as single Batch Folders. The Split Pages Activity exposes Batch Pages as child objects of the Batch Folders for individualized processing and handling.
XML Transform
The XML Transform Activity applies XSLT stylesheets to XML data to modify or reformat the output structure for various purposes.
Application
A Grooper repository consists of a series of tables in a database, and a File Store containing relevant files associated to objects that exist within that database. A Grooper application is the interface by which a user can interact with that repository of information in an intuitive way.
Grooper Command Console
The Grooper Command Console is a command-line interface that performs system configuration and administration tasks within Grooper.
Web Client
The Grooper Web Client allows users to connect to Grooper via a web browser using a URL. The URL is pointed at a website hosted by a server on which Grooper is installed and Internet Information Services configured.
Behavior
Behaviors is a property of Content Types and Export Activities that defines configurable actions that automate processing tasks based on the identified Content Type of a Batch Folder.
Export Behavior
An Export Behavior defines the conditions and actions for exporting Batch Folders and their associated data from Grooper to other systems.
Labeling Behavior
A Labeling Behavior is a Content Type Behavior designed to collect and utilize a document's field labels in a variety of ways. This includes functionality for Classification and Extraction.
PDF Data Mapping
PDF Data Mapping is a Content Type Behavior designed to create an exportable PDF file with additional native PDF elements.
CMIS Connection Type
CMIS Connection Type, or "binding", establishes the communication protocols used to connect Grooper with content management systems adhering to the CMIS standard.
AppXtender
The AppXtender CMIS Connection Type, or "binding", connects Grooper to the ApplicationXtender content management system for import and export operations.
Box
The Box CMIS Connection Type, or "binding", connects Grooper to the Box content management system for import and export operations.
Exchange
The Exchange CMIS Connection Type, or "binding", connects Grooper to the Microsoft Exchange Server mail server for import and export operations.
FTP
The FTP CMIS Connection Type, or "binding", connects Grooper to FTP directories for import and export operations.
IMAP
The IMAP CMIS Connection Type, or "binding", connects Grooper to email messages and folders through an IMAP email server.
NTFS
The NTFS CMIS Connection Type, or "binding", connects Grooper to files and folders in the Microsoft Windows NTFS file system.
OneDrive
The OneDrive CMIS Connection Type, or "binding", connects Grooper to Microsoft OneDrive cloud services.
SFTP
The SFTP CMIS Connection Type, or "binding", connects Grooper to SFTP directories for import and export operations.
SharePoint
The SharePoint CMIS Connection Type, or "binding", connects Grooper to Microsoft SharePoint, providing access to content stored in "document libraries" and "picture libraries".
Classification Method
The Classification Method property determines the technique used for document classification within a Content Model, enabling the sorting of Batch Folders into categories based on their content or structure. It can utilize pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.
GPT Embeddings
The GPT Embeddings Classification Method is an OpenAI GPT training-based classification approach that uses "embeddings" to tell one document from another.
Labelset-Based
Labelset-Based is a Classification Method that leverages the labels defined via a Labeling Behavior to classify Batch Folders.
Lexical
The Lexical Classification Method classifies Batch Folders based on their text content by utilizing either pre-configured training or rules. This is achieved through the analysis of word frequencies or defined rules that identify document types.
Rules-Based
The Rules-Based Classification Method employs defined "rules" on Document Types to classify Batch Folders, utilizing Positive Extractor and Negative Extractor properties to accurately categorize them through rule application, thereby ensuring Batch Folders match predefined criteria.
Visual
The Visual Classification Method uses image data instead of text data to determine the Document Type assigned to a Batch Folder during classification. Instead of using text-based extractors, an IP Profile is used with an Extract Features IP Command to obtain data pertaining to a Batch Folder's image(s). Document samples are trained as examples of a Document Type.
Collation Provider
The Collation property of a Data Type defines the method for converting its raw results into a final result set, governing how lists of matches from the Data Type are combined and interpreted to produce the output data of the Data Type.
AND
The AND Collation Provider of a Data Type returns results only when each individual extractor specified within it gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.
Array
The Array Collation Provider of a Data Type matches a list of values arranged in horizontal, vertical, or flow order, combining instances that qualify into a single result.
Combine
The Combine Collation Provider of a Data Type combines instances from returned results based on a specified grouping, controlling how extractor results are assembled together for output.
Key-Value List
The Key-Value List Collation Provider of a Data Type matches instances where a key and a list of one or more values appear together on the document, adhering to a specific layout pattern.
Key-Value Pair
The Key-Value Pair Collation Provider of a Data Type matches instances where a key is paired with a value on the document in a specific layout, essential for extracting label-value pairs.
Multi-Column
The Multi-Column Collation Provider of a Data Type combines multiple columns on a page into a single column for extraction.
Ordered Array
The Ordered Array Collation Provider of a Data Type finds sequences of values where one result is present for each extractor, in the order they appear.
Pattern-Based
The Pattern-Based Collation Provider of a Data Type uses regular expressions to sequence returned results into a final result set.
Split
The Split Collation Provider of a Data Type separates a data instance at each match returned by the Data Type.
Concept
There are many objects and properties a user can configure in Grooper. However, knowing how, why, and when to use these objects and properties depends on understanding the underlying concepts that define what these objects and properties are doing and why.
Activity Processing
Activity Processing is a conceptual term that refers to the execution of a sequence of configured tasks, such as classification, extraction, or data enhancement on documents, which are performed within a Batch Process to transform raw data from documents into structured and actionable information.
CMIS+
CMIS+ is a conceptual term that refers to Grooper's CMIS+ architecture that provides a standardized access to document content and metadata across a variety of external storage platforms.
CMIS
CMIS is a conceptual term that refers to Content Management Interoperability Services: an open standard allowing different content management systems to share information over the Internet.
CMIS Query
CMIS Query is a conceptual term that refers to the fact that CMIS Queries are utilized to search documents in CMIS Repositories and to filter documents upon import when using the Import Query Results Import Provider.
CSS Data Viewer Styling
CSS Data Viewer Styling refers to using CSS to custom style the Review activity's Data Viewer interface. This gives you a great deal of control over a Data Model's appearance and layout during document review.
Classification
Classification is a conceptual term that refers to the process of identifying and organizing documents into categorical types based on their content or layout, often using machine learning, rules, or pattern recognition for efficient document management and data extraction workflows. Specifically, the Classify Activity will assign a Content Type to a Batch Folder.
Code Expressions
Code Expressions (not to be confused with regular expressions) is a conceptual term that refers to snippets of VB.Net code that expand Grooper's core functionality.
Combined Methods
Combined Methods is a conceptual term that refers to the idea that a user can leverage multiple Classification Methods to overcome the shortcomings of an individual method.
Content Type
Content Type is a conceptual term that refers to the grouping of three Grooper objects: Content Models, Content Categories, and Document Types.
Data Context
Data Context is a conceptual term that gives definition to data that, without it, is otherwise meaningless.
Data Element
Data Element is a conceptual term that refers to the grouping of five Grooper objects: Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.
Data Extraction
Data Extraction is a conceptual term that involves identifying and capturing specific information from Batch Folders like forms or invoices using a set of configurable Data Extractors, which transform unstructured or semi-structured data into a structured, usable format for processing and analysis.
Data Extractor
Data Extractor (or just "extractor") refers to all Extractor Types and extractor objects. Extractors define the logic used to return data from a document's text content, including general data (such as a date) and specific data (such as an agreement date on a contract).
Data Instance
Data Instance is a conceptual term that refers to an encapsulation of text data within a document. Data instances are the hierarchy of text data that Grooper's extraction mechanisms create.
EDI Integration
EDI Integration is a conceptual term that refers to Grooper's ability to process EDI files.
Expressions
Expressions (not to be confused with regular expressions) is a conceptual term that refers to snippets of VB.Net code that expand Grooper's core functionality.
Expressions Cookbook
Expressions Cookbook is a conceptual term that refers to a reference list for commonly used expressions in Grooper.
Field Mapping
Field Mapping is a conceptual term that refers to how logical connections are made between metadata content in Grooper and an external storage platform.
Five Phases of Grooper
Five Phases of Grooper is a conceptual term that seeks to build understanding of how documents are processed through Grooper.
Flow Collation
Flow Collation is a conceptual term used to define a type of layout used in Collation Providers of Data Types.
Footer Rows and Footer Modes
Footer Rows and Footer Modes is a conceptual term that refers to how a "footer row" (enabled by the Generate Footer Row property of a Data Table) provides Grooper users a quick way to validate numerical data in a Data Column. The Data Column's Footer Mode property controls if and how a total is determined for numerical values in a Data Column.
Fuzzy RegEx
Fuzzy RegEx is a conceptual term that refers to the usage of fuzzy logic within Extractor Types that leverage regular expressions to match patterns, turned on by enabling the Fuzzy Matching property.
GPT Integration
GPT Integration is a conceptual term that refers to the usage of OpenAI's GPT models within Grooper to enhance the capabilities of data extractors, classification, and lookups.
Grooper Infrastructure
Grooper Infrastructure is a conceptual term that refers to the computing underpinnings that make up a Grooper repository and the software that allows users to interface with it.
Grooper Repository
Grooper Repository is a conceptual term that refers to the environment used to create, configure and execute objects in Grooper. It provides the framework to "do work" in Grooper.
Grooper Service
Grooper Services are various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started and stopped using Grooper Config.
Image Processing
Image Processing is a conceptual term that refers to how Grooper applies a variety of techniques to enhance scanned documents' quality, improving OCR accuracy by removing imperfections and adjusting visual characteristics to prepare images for data extraction and classification.
Import Mode and Document Linking
Import Mode and Document Linking is a conceptual term that refers to the usage of the Import Mode property. This affects whether or not an imported document maintains a link to its original file and/or if a copy of the file is made on import or not.
LINQ to Grooper Objects
LINQ to Grooper Objects is a conceptual term that refers to the ability of Grooper to leverage LINQ syntax in expressions.
Layered OCR
Layered OCR is a conceptual term that refers to the usage of the Layered OCR setting of the OCR Engine property of an OCR Profile. The use of this setting enables the usage of secondary OCR Profiles on a single page. The OCR results from these secondary OCR Profiles are merged with (or layered on top of) the primary OCR Profile's results.
Layout Data
Layout Data is a conceptual term that refers to information such as line locations, OMR checkbox locations and states, barcode values, and detected shapes captured by certain image processing commands. This data is stored as an attached file on a Batch Folder or Batch Page object and can later be recalled by various functions within Grooper that rely on the presence of that data to function.
Microfiche Processing
Microfiche Processing is a conceptual term that refers to how Grooper leverages several IP Commands to accurately process microform documents.
Microsoft Office Integration
Microsoft Office Integration is a conceptual term that refers to Grooper's ability to convert Microsoft Word and Microsoft Excel files into formats that Grooper can read.
OCR
OCR is a conceptual term that stands for Optical Character Recognition. It allows text from paper documents to be digitized, in order to be searched or edited by other software applications. OCR converts typed or printed text from digital images of physical documents into machine readable, encoded text.
OCR Synthesis
OCR Synthesis is a conceptual term that refers to Grooper's unique method of pre-processing and re-processing raw results from the OCR Engine to get better results out of it.
Object Nomenclature
Object Nomenclature is a conceptual term that refers to the idea that mastery of a Grooper environment is greatly enhanced by understanding the myriad of objects that can exist and how they are related.
PDF Page Types
PDF Page Types is a conceptual term that refers to specific types of PDF pages. Page types describe the kind of content in a PDF page and inform Grooper how certain Activities should process the page. For example, "single image" pages are OCR'd by the Recognize activity, whereas "text only" pages have their native text extracted.
Regular Expression
Regular Expression is a conceptual term that refers to a standard syntax designed to parse text strings. This is a way of finding information in a block of text. It is the primary method by which Grooper extracts and returns data from documents.
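As a minimal illustration of the general idea (not Grooper's own configuration or regex dialect), the sketch below uses Python's `re` module to find date-like values in a block of text. The sample text and the pattern are assumptions made for this example.

```python
import re

# Hypothetical sample text standing in for a document's recognized text.
text = "Invoice Date: 03/15/2024  Due Date: 04/14/2024  Total: $1,250.00"

# Pattern for dates written as MM/DD/YYYY (an assumption for this example).
date_pattern = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

# findall returns every non-overlapping match, in order of appearance.
dates = date_pattern.findall(text)
print(dates)
```

The pattern describes the shape of the data (two digits, a slash, two digits, a slash, four digits) rather than a literal value, which is what makes the approach useful for data that follows a known format.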
Repository
Repository is a conceptual term that refers to a location where files and/or data are stored and managed.
Separation
Separation is a conceptual term that refers to the process of taking an unorganized Batch of loose Batch Pages and organizing them into document folders. This is done so Grooper can later assign a Document Type to each document folder in a process known as Classification.
TF-IDF
TF-IDF (term frequency-inverse document frequency) is a conceptual term that refers to a numerical statistic intended to reflect how important a word is to a document within a collection (or document set or corpus). It is how Grooper uses machine learning for training-based document classification (via the Lexical method) and data extraction (via the Field Class extractor).
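The statistic itself can be sketched in a few lines of Python. This is one common textbook variant (term frequency as count divided by document length, inverse document frequency as the natural log of N over document frequency), not a description of Grooper's internal weighting. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df).
    Other weighting schemes exist; this is only one of them.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical three-document corpus.
docs = [
    "invoice number total amount due",
    "purchase order number ship date",
    "invoice total tax amount",
]
scores = tf_idf(docs)
```

In this corpus, "due" appears in only one document, so it receives a higher weight in that document than "invoice", which appears in two. Words that are rare across the collection but present in a given document are the ones that best characterize it.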
Table Extraction
Table Extraction is a conceptual term that refers to Grooper's functionality to extract data from cells in tables. This is accomplished by configuring the Data Table and its child Data Column Data Elements in a Data Model.
Test Batch
Test Batch is a conceptual term that refers to any Batch created in the Test folder of the Batches folder in the Node Tree.
Thread
Thread is a conceptual term that refers to the smallest unit of processing that can be performed within an operating system.
Training-Based Approaches to Document Classification
Training-Based Approaches to Document Classification is a conceptual term that refers to an approach to document classification that classifies Batch Folders by comparing unclassified Batch Folders to trained examples of each Document Type and measuring their similarity.
Training Batch
Training Batch is a conceptual term that refers to a more convenient way to work with all of the samples a Content Model has been trained against. You can still look at the Form Types underneath each Content Type, but the Training Set can show you all the samples in one place.
UNC Path
UNC Path is a conceptual term that refers to the Universal Naming Convention (UNC), a standard used in Microsoft Windows for accessing shared network folders.
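A UNC path takes the general form `\\server\share\path\to\file`. As a small sketch, Python's standard `pathlib` module can split such a path into its parts. The server, share, and file names below are hypothetical.

```python
from pathlib import PureWindowsPath

# Hypothetical server, share, and file names used only for illustration.
unc = PureWindowsPath(r"\\fileserver01\scans\incoming\batch_001.tif")

print(unc.drive)  # the \\server\share portion of the path
print(unc.name)   # the file name at the end of the path
```

`PureWindowsPath` only parses the string, so this runs on any operating system and does not touch the network.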
Waterfall Classification
Waterfall Classification is a conceptual term that refers to a classification technique in Grooper that manipulates the Positive Extractor property to prioritize training similarity in order to achieve a middle ground between high specificity and accuracy, and generality with minimal accuracy. This is helpful whenever Batch Folders get misclassified and simply retraining won't help.
XML Schema Integration
XML Schema Integration is a conceptual term that refers to Grooper's ability to interact with XML schemas and the configuration required to do so.
Export Definition
Export Definitions is a property of Export Behaviors as defined on Content Types or Export Activities. It defines export connectivity to external systems such as file systems, content management repositories, databases, mail servers, etc.
CMIS Export
CMIS Export is an Export Definition available when configuring an Export Behavior. It exports content over a CMIS Connection, allowing users to export documents and their metadata to various on-premise and cloud-based storage platforms.
Data Export
Data Export is an Export Definition available when configuring an Export Behavior. It exports extracted document data over a Data Connection, allowing users to export data to a Microsoft SQL Server or ODBC-compliant database.
Extractor Type
Extractor Type, or value extractor, is a property on a wide array of objects that goes by many different names. It defines a primitive operator which reads data values from the text or visual content of a document. Extractor Types are consumed by higher-level objects such as Data Elements, extractor objects, Content Types and more.
Detect Signature
The Detect Signature Extractor Type detects signatures within a specified rectangular region on a document page by measuring the fill percentage, providing a method to identify and validate the presence of handwritten signatures.
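The fill-percentage idea can be sketched as follows. This is an assumption-laden illustration, not Grooper's implementation: the page is modeled as a small grid of 1 (dark) and 0 (light) pixels, and the function name `fill_percentage` and the 10% cutoff are hypothetical.

```python
def fill_percentage(image, left, top, width, height):
    """Percentage of dark pixels inside a rectangular region.

    `image` is a list of rows; each pixel is 1 (dark) or 0 (light).
    """
    region = [row[left:left + width] for row in image[top:top + height]]
    total = sum(len(row) for row in region)
    dark = sum(sum(row) for row in region)
    return 100.0 * dark / total if total else 0.0


# Tiny hypothetical binary page: 4 rows by 6 columns.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

pct = fill_percentage(page, left=1, top=1, width=4, height=2)
# The 10% cutoff is an arbitrary value chosen for this example.
signature_present = pct >= 10.0
```

A blank signature box yields a fill percentage near zero, while handwriting inside the box raises it, so comparing the value against a cutoff gives a simple presence test.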
Field Match
The Field Match Extractor Type matches the value stored in a previously-extracted Data Field or Data Column, allowing for consistency and reference across different parts of a document or dataset.
Find Barcode
The Find Barcode Extractor Type searches the Batch Folder layout data for a barcode, capturing its value upon detection.
GPT Complete
The GPT Complete Extractor Type leverages OpenAI's GPT model to generate completions for inputs, returning one hit for each result choice provided by the model's response.
Highlight Zone
The Highlight Zone Extractor Type sets a highlight region on a document without performing any actual data extraction, effectively marking areas of interest or importance.
Label Match
The Label Match Extractor Type matches a list of one or more label values using matching options defined by a Labeling Behavior. It works similarly to List Match, but uses shared settings defined in a Labeling Behavior for Fuzzy Matching, Vertical Wrap, and Constrained Wrap.
Labeled OMR
The Labeled OMR Extractor Type is used to output OMR checkbox labels. It determines whether labeled checkboxes are checked or not. If checked, it outputs the label(s) as the result.
Labeled Value
The Labeled Value Extractor Type identifies and extracts information from a field presented as a label-value pair on a document, by matching a set of labels and a set of values, and determining pairs based on their geometric clustering on the document.
List Match
The List Match Extractor Type is designed to return values matching one or more items in a defined list. By default, the List Match extractor does not use or require regular expressions.
Ordered OMR
The Ordered OMR Extractor Type is similar to a Labeled OMR in that it is used to return OMR check box information. Rather than relying on a label for the extraction, the Ordered OMR returns information for multiple check boxes within a given zone based on their order and layout.
Pattern Match
The Pattern Match Extractor Type extracts values from a document that match a specified regular expression, allowing for the detection of data following a known format or pattern.
Query HTML
The Query HTML Extractor Type queries an HTML document using a CSS selector and returns the inner text of each matching element.
Read Barcode
The Read Barcode Extractor Type uses barcode recognition technology to read and extract values from barcodes found in the document content.
Read Meta Data
The Read Meta Data Extractor Type retrieves metadata values associated with a document.
Read Zone
The Read Zone Extractor Type allows you to extract text data in a rectangular region (called an "extraction zone" or just "zone") on a document. This can be a fixed zone, extracting text from the same location on a document, or a zone relative to an extracted text anchor or shape location on the document.
Reference
The Reference Extractor Type allows for the referencing of an external extractor object to be used within a Grooper object's configuration, enabling consistent extraction logic across different objects.
Word Match
The Word Match Extractor Type extracts individual words or phrases containing multiple words from documents. It is designed to collect full words and is often used in n-gram extraction.
Zonal OMR
The Zonal OMR Extractor Type reads one or more checkboxes using manually-configured zones. It is mostly an outdated tool and should only be used if all other OMR extractor options have been exhausted. It requires the most manual setup of any OMR extractor to configure.
Fill Method
The Fill Method property on Data Models, Data Sections, and Data Tables is a collection of various mechanisms that allow for the population of descendant Data Elements of Data Models, Data Sections, and Data Tables (which can be referred to as "containers"). Fill Methods are secondary extraction operations which populate descendant Data Elements as they run after normal extraction.
AI Extract
AI Extract is a Fill Method that leverages a Large Language Model (LLM) to quickly and easily return extraction results to the child elements of Data Models, Data Sections, and Data Tables by using the JSON structure of the relevant Data Elements as part of the instruction set to the LLM.
IP Command
The Command property of an IP Step object in Grooper specifies the Image Processing (IP) command to be executed for that specific step as part of an IP Profile.
Barcode Detection
The Barcode Detection IP Command detects and reads barcode data. The detected barcode information is stored as part of the object's layout data.
Binarize
The Binarize IP Command converts a color or grayscale image to black and white using various thresholding methods.
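The simplest thresholding method is a fixed global threshold, sketched below. The source does not say which thresholding methods Grooper offers, so this is only an illustration of the general technique. The function name `binarize`, the default threshold of 128, and the sample pixel values are assumptions.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 0 (black) or 255 (white)."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# Hypothetical 2x4 grayscale image.
gray = [
    [12, 200, 127, 128],
    [255, 0, 90, 180],
]
bw = binarize(gray)
```

Pixels at or above the threshold become white and everything else becomes black. Adaptive methods pick the threshold per region instead of using one value for the whole image, which tends to cope better with uneven lighting.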
Extract Page
The Extract Page IP Command removes an image from a carrier image while simultaneously removing any image warping or skewing.
Line Removal
The Line Removal IP Command removes horizontal and vertical lines from documents.
Scratch Removal
The Scratch Removal IP Command detects and removes or repairs scratches from film-based images.
Shape Detection
The Shape Detection IP Command detects shapes on a document matching sample images given by the user.
Shape Removal
The Shape Removal IP Command detects and removes shapes from documents.
Import Provider
The Provider property is a selection of Import Providers which enable import of file-based content from a variety of sources such as file systems, mail servers, and content repositories.
CMIS Import
The CMIS Import Import Provider is used to import content over a CMIS Connection, allowing users to import from various on-premise and cloud-based storage platforms.
Import Descendants
Import Descendants is one of two Import Providers that use CMIS Connections to import document content into Grooper.
Import Query Results
Import Query Results is one of two Import Providers that use CMIS Connections to import document content into Grooper.
Lookup
The Lookups property is a list of lookup operations to be performed on child elements of the associated container. Each Lookup specification defines a lookup operation, where the value of one or more Grooper fields will be used to query an external data source, such as a database. The results of the query can be used to validate existing field values or populate additional field values.
CMIS Lookup
CMIS Lookup is a Lookup Specification that performs a lookup against a CMIS Repository via a CMISQL query.
Database Lookup
Database Lookup is a Lookup Specification that performs a lookup against a Data Connection via a SQL query.
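The general pattern of a database lookup (use an extracted field value to query a table, then validate or populate other fields from the result) can be sketched with Python's built-in `sqlite3` module. This is not Grooper configuration. The table, column names, field names, and the `lookup_vendor` function are all hypothetical, and an in-memory SQLite database stands in for the external data source.

```python
import sqlite3

# In-memory database standing in for the external data source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vendors (vendor_id TEXT PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO vendors VALUES ('V-1001', 'Acme Supply', 'Tulsa')")


def lookup_vendor(conn, fields):
    """Use the extracted vendor_id to fill in name and city, if found."""
    row = conn.execute(
        "SELECT name, city FROM vendors WHERE vendor_id = ?",
        (fields["vendor_id"],),
    ).fetchone()
    if row is None:
        # No match: the extracted value could not be validated.
        return dict(fields, lookup_hit=False)
    return dict(fields, name=row[0], city=row[1], lookup_hit=True)


result = lookup_vendor(conn, {"vendor_id": "V-1001"})
```

The query uses a parameter placeholder (`?`) rather than string concatenation, so the extracted value is passed to the database safely even if it contains unexpected characters.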
GPT Lookup
GPT Lookup is a Lookup Specification that performs a lookup using an OpenAI GPT model.
Web Service Lookup
Web Service Lookup is a Lookup Specification that looks up external data at an API endpoint by calling a web service.
Object
In Grooper, objects are defined as configurable elements within its hierarchical tree structure. These include nodes and embedded objects that can be manipulated and edited to define the system's behavior, create workflows, and manage content.
Batch
Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms known as Batch Processes.
Batch Folder
Batch Folder objects are defined as container objects within a Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.
- Batch Folders are frequently referred to simply as "documents".
Batch Page
Batch Page objects represent individual pages within a Batch. The Batch Page object is the most granular unit in the hierarchy of Batch Objects in Grooper.
- Batch Pages are frequently referred to simply as "pages".
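The containment relationship described in the Batch, Batch Folder, and Batch Page entries can be pictured as a simple tree. The classes below are a hypothetical model written for illustration only. They are not Grooper's object model or API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BatchPage:
    name: str


@dataclass
class BatchFolder:
    name: str
    # A folder may hold pages and/or nested folders.
    children: List[Union["BatchFolder", BatchPage]] = field(default_factory=list)

    def page_count(self) -> int:
        """Count the pages at any depth below this folder."""
        return sum(
            1 if isinstance(child, BatchPage) else child.page_count()
            for child in self.children
        )


@dataclass
class Batch:
    name: str
    children: List[Union[BatchFolder, BatchPage]] = field(default_factory=list)

    def page_count(self) -> int:
        """Count every page in the Batch, loose or inside folders."""
        return sum(
            1 if isinstance(child, BatchPage) else child.page_count()
            for child in self.children
        )


# One two-page document folder plus one loose page.
document = BatchFolder("Document 1", [BatchPage("Page 1"), BatchPage("Page 2")])
batch = Batch("Example Batch", [document, BatchPage("Loose page")])
```

Here `batch.page_count()` counts both the pages inside the folder and the loose page, reflecting that pages are the most granular unit while folders group them into documents.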
Batch Process
Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each Batch of documents is managed systematically and efficiently.
- Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child Batch Process Steps.
- A Batch Process is often referred to as simply a "process".
Batch Process Step
Batch Process Step objects are specific actions within the sequence defined by a Batch Process. A Batch Process Step plays a critical role in automating and managing the flow of documents through the various stages of processing within Grooper.
- Batch Process Steps are frequently referred to as simply "steps".
- Because a single Batch Process Step executes a single Activity configuration, steps are often referred to by their referenced Activity as well. For example, a "Recognize step".
CMIS Connection
CMIS Connection objects provide a standardized way of connecting to various content management systems (CMS). These objects allow Grooper to communicate with multiple external storage platforms, enabling access to documents and content that reside outside of Grooper's immediate environment.
- For those that support the CMIS standard, the CMIS Connection connects to the CMS using the CMIS standard.
- For those that do not, the CMIS Connection normalizes connection and transfer protocol as if they were a CMIS platform.
CMIS Repository
CMIS Repository objects in Grooper allow access to external documents through a CMIS Connection. They allow managing and interacting with those documents within Grooper's framework as if they were local. They are created as a child object of a CMIS Connection and used for various Activities.
Content Category
Content Category objects are containers within a Content Model that hold other Content Categories and Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases Content Categories are used as organizational buckets to group like Document Types together.
Content Model
Content Model objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.
Data Column
Data Column objects are child objects of a Data Table, representing individual columns and defining the type of data each column holds along with its data extraction properties.
Data Connection
Data Connection objects define the settings for connecting to and interacting with a database. These interactions may include conducting lookups, exports, or other actions that relate to database management systems (DBMS). Once configured, a Data Connection object can be referenced by other components in Grooper for various DBMS-related activities.
Data Field
Data Field objects are created as child objects of a Data Model. A Data Field is a representation of a single piece of data targeted for extraction on a document.
- Data Fields are frequently referred to simply as "fields".
Data Model
Data Model objects serve as the top-tier structure defining the taxonomy for Data Elements and are leveraged during the Extract Activity to extract data from Batch Folders. They are a hierarchy of Data Elements that sets the stage for the extraction logic and review of data collected from documents.
Data Rule
Data Rule objects define the logic for automated data manipulation which occurs after data has been extracted from Batch Folders. These rules are applied to normalize or otherwise prepare data collected in a Data Model for downstream processes. Data Rules ensure that extracted data conforms to expected formats or meets certain quality standards.
Data Section
Data Section objects are grouping mechanisms for related Data Fields. Data Sections organize and segment child Data Elements into logical divisions of a document based on the structure and semantics of the information the documents contain.
Data Table
Data Table objects are utilized for extracting repeating data that's formatted in rows and columns, allowing for complex multi-instance data organization that would be present in table-formatted content.
Data Type
Data Type objects hold a collection of child, referenced, and locally defined Data Extractors and settings that manage how multiple (even differing) matches from Data Extractors are consolidated (via Collation) into a result set.
Document Type
Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.
Field Class
Field Class objects are trainable extractors that distinguish between multiple instances of similar data within a document by understanding the context in which they occur. Field Classes can be configured to distinguish values within highly structured documents, but this type of extraction is better suited to simpler "Extractor Objects" like Value Readers or Data Types.
Field Classes are most useful when attempting to find values within the flow of natural language. This method involves training with positive and negative examples to distinguish the right context. You'd opt for a Field Class when the value you're after is an entire clause within a contract, or a specific value defined within the flow of text.
File Store
File Store objects define a storage location within Grooper where file content associated with nodes is saved. They are crucial for managing the content that forms the basis of Grooper's processing tasks, allowing for the storage and retrieval of documents, images, and other "files". Not every object in Grooper will have files connected to it, but if it does, those files are stored in the location defined by this object.
Form Type
Form Type objects represent trained variations of a Document Type. These objects store machine learning training data for Lexical and Visual document classification methods.
IP Group
IP Group objects are child objects within IP Profiles that create a hierarchical structure for organizing image processing commands. IP Groups may contain other IP Groups or IP Step objects.
IP Profile
IP Profile objects detail the operations and parameters for image enhancement and cleanup. These operations improve the accuracy of further processing steps, like the Recognize and Classify Activities.
IP Step
IP Step objects are the basic units within an IP Profile that define a single image processing operation. IP Steps are performed sequentially within their parent IP Group or IP Profile.
Lexicon
Lexicon objects are dictionary objects that store a list of keys or key-value pairs. Lexicons can define local entries and/or import entries from other Lexicons and even import entries using a Data Connection. The entries in a Lexicon can be utilized in different areas of Grooper, such as data extraction, Fuzzy Matching, or OCR Correction, providing a reference point that enhances the accuracy and consistency of the software's operations.
Machine
Machine objects represent servers that have connected to the Grooper repository. They allow for the management of Grooper Service instances and serve as connection points for processing jobs to be executed on the server hardware. Machine objects are essential for the scaling of processing capabilities and for distributing processing loads across multiple servers.
OCR Profile
OCR Profile objects configure the settings for optical character recognition (OCR) leveraged by the Recognize activity. OCR converts images of text into machine-encoded text. OCR Profile objects influence how effectively textual content is recognized from Batch Pages.
Object Library
Object Library objects are .NET libraries that contain code files for customizing the functionality of Grooper. These libraries are used for a range of customization and integration tasks, allowing users to extend Grooper's capabilities.
- Examples include:
- Adding custom activities that execute within Batch Processes
- Creating custom commands available during the Review Activity and in the Design page
- Defining custom methods that can be called from expressions on Data Field and Batch Process Step objects
- Establishing custom services that perform automated background tasks at regular intervals
 
Processing Queue
Processing Queue objects are designed for tasks performed by Machines, which include automated steps in the document processing lifecycle. Processing Queues are used to distribute machine tasks among different servers and control the concurrency or processing rate of these tasks.
- For example, activities such as Render or Export can be managed so that only one activity instance runs per machine or so multiple instances are processed concurrently, according to the queue configuration.
Project
Project objects are collections of resources and serve as the primary containers for design components within Grooper. The Project object is where various processing objects such as Content Models, Batch Processes, Profile Objects, and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.
Resource File
A Resource File object in Grooper is essentially a file that is stored as part of a Grooper Project. It can include various types of files such as text files or XML schema files.
Review Queue
Review Queue objects are designated for human-performed tasks. They organize the Review tasks that require human attention and can distribute these tasks among different groups of users based on the queue's settings. Review Queues can be assigned at the Batch Process level to filter work by an entire process, or to Review Activities at the Batch Process Step level to filter tasks at a more granular, step-based level.
Root
The Root object represents the topmost element of the Grooper repository. It serves as the starting point from which all other objects branch out. It is the anchor point for all other structures within the repository and a necessary element for the organization and linkage of all other objects within Grooper.
Scanner Profile
Scanner Profile objects outline the specifications for scanning physical documents into digital forms. This includes settings like resolution, color mode, and any post-scan image processing or enhancement functions.
See Desktop Scanning in Grooper for more information.
Separation Profile
Separation Profile objects contain rules and settings that determine how groupings of scanned pages are separated into individual Batch Folders, often using barcodes, blank pages, or patch codes as indicators for separation points.
Value Reader
Value Reader objects define a single data extraction operation. You set the Extractor Type on the Value Reader that matches the specific data you're aiming to capture. For example, you would use the Pattern Match Extractor Type to return data using a regular expression. You would use a Value Reader when you need to extract a single result or list of simple results from a document.
Property
A property is a setting on a Grooper object that configures how the object performs its function.
Confidence Multiplier and Output Confidence
Some results carry more weight than others. The Confidence Multiplier and Output Confidence properties allow you to manually adjust an extraction result's confidence.
Constrained Wrap
The Constrained Wrap property allows certain Extractor Types and the Labeling Behavior to match values which wrap from one line to the next inside a box (such as a table cell).
Content Type Filter
The Content Type Filter property restricts Activities to specific Content Categories and/or Document Types.
Document Quoting
Document Quoting is a property of the AI Extract Fill Method that limits the text fed to the AI to reduce the number of tokens consumed. Controlling specifically what is given can reduce not only the monetary cost of using the AI, but also the time cost of running the Fill Method.
OCR Engine
An OCR Engine is the part of OCR software that does the actual character recognition, analyzing the pixels on an image and figuring out what characters they represent. This raw result can be further processed using Grooper's OCR Synthesis capabilities, producing the final OCR result used by Data Extractors to match text in a document and return the result.
Output Extractor Key
The Output Extractor Key property is another weapon in the arsenal of powerful Grooper classification techniques. It allows Data Types to return results normalized in a way more beneficial to document classification.
Paragraph Marking
Paragraph Marking alters the normal text data in a document by placing the carriage return and line feed pairs at the end of each paragraph, instead of at the end of each line. This allows users to break up a document's text flow into segments of paragraphs instead of segments of lines.
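The effect can be illustrated with a minimal Python sketch. It is not Grooper's implementation: the function name is hypothetical, and it assumes that a blank line marks a paragraph boundary, which the glossary does not state.

```python
def mark_paragraphs(text: str) -> str:
    """Move CR/LF pairs from the end of each line to the end of each paragraph.

    Assumption (not from the glossary): a blank line separates paragraphs.
    """
    paragraphs = []
    current = []
    for line in text.split("\r\n"):
        if line.strip():
            # Non-blank line: it belongs to the paragraph being built.
            current.append(line.strip())
        elif current:
            # Blank line: close the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    # Each paragraph now ends with exactly one CR/LF pair.
    return "".join(p + "\r\n" for p in paragraphs)


sample = (
    "This agreement is made\r\nbetween the parties.\r\n"
    "\r\n"
    "Payment is due\r\nin 30 days.\r\n"
)
marked = mark_paragraphs(sample)
```

After this transformation, a pattern anchored to line breaks sees whole paragraphs rather than individual printed lines.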
Parameters
Parameters is a collection of properties used in the configuration of LLM constructs. Temperature, TopP, Presence Penalty, and Frequency Penalty are parameters that influence text generation in models. Temperature and TopP control the diversity and probability distribution of generated text, while Presence Penalty and Frequency Penalty help manage repetition by discouraging the reuse of words or phrases.
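A rough Python sketch of how such parameters commonly act on a model's token scores follows. It uses a toy three-token vocabulary and a widely used formulation of the penalties; the function names and numbers are illustrative assumptions, not Grooper's or any provider's actual code.

```python
import math


def apply_penalties(logits, counts, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower the score of tokens that already appeared in the generated text."""
    return {
        tok: logit
        - presence_penalty * (1 if counts.get(tok, 0) > 0 else 0)  # flat, once
        - frequency_penalty * counts.get(tok, 0)                   # per occurrence
        for tok, logit in logits.items()
    }


def softmax_with_temperature(logits, temperature=1.0):
    """Turn scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    kept, running = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        running += p
        if running >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}  # renormalize survivors


# Toy scores for the next token (hypothetical values).
logits = {"invoice": 2.0, "receipt": 1.0, "memo": 0.0}
probs = softmax_with_temperature(logits, temperature=1.0)
sharper = softmax_with_temperature(logits, temperature=0.5)
nucleus = top_p_filter(probs, top_p=0.9)
penalized = apply_penalties(
    logits, {"invoice": 2}, presence_penalty=0.5, frequency_penalty=0.25
)
```

In this sketch, lowering the temperature concentrates probability on "invoice", a TopP of 0.9 drops the least likely token, and the penalties reduce the score of a token that has already been used twice.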
Permission Sets
A Permission Set is a property that allows you to restrict user access to repositories, pages, and certain activities. This helps prevent unauthorized individuals from editing or deleting information or Batches.
Preprocessing
The Preprocessing grouping of properties consists of settings that adjust how text is formatted and interpreted before any Data Extraction process begins. These properties are crucial for ensuring that the text data is in the most optimal format for subsequent extraction tasks, which could involve complex regular expressions or precise data parsing.
Scope
The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.
Secondary Types
Secondary Types allow the application of multiple Content Types to a single Batch Folder.
Tab Marking
Tab Marking allows you to insert tab characters into a document's text data.
Vertical Wrap
Vertical Wrap is a property of certain Extractor Types and a Content Type's Labeling Behavior used to provide simplified extraction of vertically wrapped text (typically stacked labels).
Repository Option
The Options property of the database Root object is a collection of optional features that affect the entire repository. These optional features enable entire collections of functionality that otherwise do not work without first establishing the connections these options provide.
LLM Connector
LLM Connector is a Repository Option that enables OpenAI-based functionality for the local Grooper repository.
Section Extract Method
The Extract Method property of a Data Section defines a "Section Extract Method" which specifies how section instances will be identified and extracted.
Nested Table
Nested Table is a "Section Extract Method" enabled for a Data Section using the Extract Method property. This method divides a document into sections by extracting table data within those sections. This gives Grooper users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections has the same table (or at least tabular data which can be extracted by a single Data Table object).
Transaction Detection
Transaction Detection is a Data Section Extract Method. This extraction method produces section instances by detecting repeating patterns of text around the Data Section's child Data Fields.
Separation Provider
The Provider property of the Separate Activity defines the type of separation to be performed at the designated Scope.
Change in Value Separation
The Change in Value Separation Provider creates a new folder and separates every time an extracted value changes from one Batch Page to another.
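The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, each page is represented by a single already-extracted value, and pages with no value are not treated specially.

```python
def separate_on_value_change(page_values):
    """Group 1-based page numbers into folders, starting a new folder
    whenever the extracted value differs from the previous page's value."""
    folders = []
    previous = None
    for page_number, value in enumerate(page_values, start=1):
        if not folders or value != previous:
            folders.append([])  # value changed: start a new folder
        folders[-1].append(page_number)
        previous = value
    return folders


# Hypothetical account numbers extracted from six consecutive pages.
values = ["A-100", "A-100", "B-200", "B-200", "B-200", "C-300"]
folders = separate_on_value_change(values)
```

Here pages 1 and 2 share a value and land in the first folder, pages 3 to 5 in the second, and page 6 in the third.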
Control Sheet Separation
Control Sheet Separation is a Separation Provider that uses Grooper Control Sheets to separate documents.
EPI Separation
The EPI Separation Separation Provider uses embedded page information ("EPI") to Separate loose pages into document folders. A Data Extractor is used to find page numbers from the text on a page and Grooper uses this information to separate the pages.
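As a simplified Python sketch of this approach: the "Page N of M" pattern, the rule that page 1 starts a new document, and all names below are assumptions made for illustration, not Grooper's actual logic.

```python
import re

# Assumed page-number format; real documents may use other conventions.
PAGE_PATTERN = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def separate_by_page_numbers(page_texts):
    """Group 0-based page indexes into folders, opening a new folder
    whenever the text on a page reports page number 1."""
    folders = []
    for index, text in enumerate(page_texts):
        match = PAGE_PATTERN.search(text)
        starts_document = match is not None and int(match.group(1)) == 1
        if starts_document or not folders:
            folders.append([])
        folders[-1].append(index)
    return folders


pages = [
    "Invoice Page 1 of 2",
    "Totals Page 2 of 2",
    "Cover letter page 1 of 1",
    "Statement Page 1 of 3",
    "Page 2 of 3",
    "Page 3 of 3",
]
documents = separate_by_page_numbers(pages)
```

In this sketch, a page with no detectable page number simply stays in the current folder.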
ESP Auto Separation
ESP Auto Separation is a Separation Provider used for document separation. It is unique in that it both separates and classifies documents at the same time. It uses page-level classification training examples (among other things) to determine where to insert document folders in a Batch.
Event-Based Separation
Event-Based Separation is a Separation Provider that Separates documents using one or more "Separation Events". Each Separation Event triggers the creation of a new folder.
Multi Separator
The Multi Separator Separation Provider performs separation using multiple Separation Providers. It allows users to create a list of any of the other Separation Providers. If the first provider on the list fails to separate a page (or, as more often is the case, a series of pages), the next one will be applied. If that fails, the next, and so on.
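The fallback behavior can be sketched in Python as a chain of provider functions. Everything here is hypothetical: each toy provider returns a list of folders or `None` when it cannot separate, and for simplicity the fallback applies to the whole run of pages rather than page by page.

```python
from typing import Callable, List, Optional, Sequence

Folders = List[List[int]]
Provider = Callable[[Sequence[str]], Optional[Folders]]


def by_marker(marker: str) -> Provider:
    """Build a toy provider that starts a new folder at each page containing
    `marker`, and reports failure (None) if no page contains it."""
    def provider(pages: Sequence[str]) -> Optional[Folders]:
        if not any(marker in text for text in pages):
            return None
        folders: Folders = []
        for index, text in enumerate(pages):
            if marker in text or not folders:
                folders.append([])
            folders[-1].append(index)
        return folders
    return provider


def multi_separate(pages: Sequence[str], providers: Sequence[Provider]) -> Optional[Folders]:
    """Try each provider in list order and return the first successful result."""
    for provider in providers:
        result = provider(pages)
        if result is not None:
            return result
    return None  # every provider failed


pages = ["COVER SHEET", "body text", "COVER SHEET", "more body text"]
result = multi_separate(pages, [by_marker("PATCH CODE"), by_marker("COVER SHEET")])
```

The first provider finds no patch code and fails, so the second provider's result is used.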
Pattern-Based Separation
Pattern-Based Separation is a Separation Provider that creates a new document folder every time a value returned by a defined pattern is encountered on a page.
Undo Separation
Undo Separation is a Separation Provider. Instead of putting loose Batch Pages into Batch Folders, this Separation Provider removes Batch Folders, leaving only loose pages.
Service
Grooper Service is a conceptual term that refers to the various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started, and stopped using Grooper Config.
API Services
You can perform Batch processing via REST API web calls by installing API Services.
Activity Processing
Activity Processing is a Grooper Service that executes Activities assigned to Batch Process Steps in a Batch Process. This allows Grooper to automate Batch Steps that do not require a human operator.
Grooper Licensing
Grooper Licensing is a Grooper Service that distributes licenses to multiple workstations running Grooper applications.
Import Watcher
An Import Watcher Service schedules and runs import jobs. It periodically executes an Import Provider to query or poll for documents that meet specific criteria. When the matching documents are found, they are imported into Grooper. Afterward, the imported objects are moved, deleted, or modified to prevent repeated imports in the next polling cycle. This ensures that the same set of files is not imported over and over again.
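One polling cycle of this kind can be sketched in Python against a local folder. This is an illustration only: the function name, the folder layout, and the choice to move files (rather than delete or modify them) are assumptions, and the "import" is reduced to recording the file name.

```python
import shutil
import tempfile
from pathlib import Path


def poll_once(watch_dir: Path, processed_dir: Path, pattern: str = "*.pdf"):
    """Run one polling cycle: import every matching file, then move it out
    of the watched folder so the next cycle does not pick it up again."""
    processed_dir.mkdir(exist_ok=True)
    imported = []
    for source in sorted(watch_dir.glob(pattern)):
        imported.append(source.name)  # stand-in for the real import step
        shutil.move(str(source), str(processed_dir / source.name))
    return imported


with tempfile.TemporaryDirectory() as root:
    watch = Path(root) / "inbox"
    done = Path(root) / "processed"
    watch.mkdir()
    (watch / "a.pdf").write_bytes(b"")
    (watch / "b.pdf").write_bytes(b"")
    (watch / "notes.txt").write_text("does not match the criteria")

    first_cycle = poll_once(watch, done)   # imports both PDF files
    second_cycle = poll_once(watch, done)  # finds nothing new
```

A real service would repeat the cycle on a schedule; the second call shows why moving the files prevents repeated imports.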
Table Extract Method
The Extract Method property of a Data Table sets a Table Extract Method which defines the settings and logic for the Data Table to perform extraction.
Delimited Extract
The Delimited Extract Table Extract Method extracts tabular data from a delimiter-separated text file, such as a CSV file.
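For readers unfamiliar with delimiter-separated data, the following Python sketch shows the general idea using the standard `csv` module. It is not how Grooper performs the extraction; the function name and sample data are hypothetical.

```python
import csv
import io


def extract_delimited(text: str, delimiter: str = ","):
    """Read delimiter-separated text into a list of rows keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# The quoted first cell contains a comma that must not split the column.
sample_csv = 'Item,Qty,Price\n"Widget, blue",2,9.99\nShipping,1,5.00\n'
table = extract_delimited(sample_csv)
```

Each row becomes a mapping from column header to cell value, which is the tabular shape a Data Table's columns would hold.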
Fluid Layout
The Fluid Layout Table Extract Method will choose between Tabular Layout and Flow Layout configurations, depending on how labels are collected for a Document Type.
Grid Layout
The Grid Layout Table Extract Method uses the positional location of row and column headers to interpret where a tabular grid would be around each value in a table and extract values from each cell in the interpreted grid.
Row Match
The Row Match Table Extract Method uses regular expression pattern matching to determine a table's structure based on the pattern of each row and extract cell data from each column.
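A small Python sketch of row-pattern matching follows. The pattern, column names, and sample text are assumptions chosen for illustration, and Grooper's own regular expression syntax and configuration may differ.

```python
import re

# One named group per column; only lines shaped like a table row match.
ROW_PATTERN = re.compile(
    r"^(?P<qty>\d+)\s+(?P<description>.+?)\s+(?P<amount>\d+\.\d{2})$"
)


def extract_rows(text: str):
    """Return one dict of column values for each line that matches the row pattern."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


sample = (
    "Qty Description Amount\n"
    "2 Widget, blue 19.98\n"
    "1 Shipping 5.00\n"
    "Total 24.98\n"
)
rows = extract_rows(sample)
```

The header and total lines do not fit the row pattern, so only the two item rows are returned.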
Tabular Layout
The Tabular Layout Table Extract Method uses column header values determined by the Data Columns' Header Extractor results (or labels collected for the Data Columns when a Labeling Behavior is enabled) as well as Data Column Value Extractor results to model a table's structure and return its values.
UI Element
A UI Element is a portion of the Grooper interface that allows users to interact with or otherwise receive information about the application.
Document Viewer
The Grooper Document Viewer is the portal to your documents. It is the UI that allows you to see a Batch Folder's (or a Batch Page's) image, text content, and more.
Node Tree
The Node Tree is the hierarchical list of objects found in the left panel in the "Design" page. It is the basis for navigation and creation in Design.
Overrides
Overrides is a tab provided to allow overriding of default properties set to a Data Element.
Summary Tabs
Content Models and Content Categories have a Summary tab where you can view "Descendant Node Types", Document Types, and Expressions.
Miscellaneous Features
URL Endpoints for Review
Three different URL endpoints can be used to open Review tasks in the Grooper Web Client, given certain information like the Grooper Repository ID, Batch Process name, Batch Id and more. This allows Grooper users to link directly to a Batch in Review with a URL.
