Glossary: Difference between revisions

From Grooper Wiki
updated icons // via Wikitext Extension for VSCode
Line 1: Line 1:
== Activity ==
== Activity ==
<section begin="Activity" />'''''[[Activity (Property)|Activity]]'''''  is a property on [[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Step]]''' objects. '''''Activities''''' define specific document processing operations done to a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''', [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]''', or [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]'''.
<section begin="Activity" />'''''[[Activity (Property)|Activity]]'''''  is a property on {{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Step]]''' objects. '''''Activities''''' define specific document processing operations done to a {{BatchIcon}} '''[[Batch (Object)|Batch]]''', {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]''', or {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page]]'''.


'''Batch Process Steps''' configured with specific '''''Activities''''' are frequently referred to by the name of the '''''Activity''''' followed by the word "step". For example: '''Classify Step'''.<section end="Activity" />
'''Batch Process Steps''' configured with specific '''''Activities''''' are frequently referred to by the name of the '''''Activity''''' followed by the word "step". For example: '''Classify Step'''.<section end="Activity" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Apply Rules ===
=== Apply Rules ===
<section begin="Apply Rules" />'''''[[Apply Rules (Activity)|Apply Rules]]''''' is an '''''[[Activity (Property)|Activity]]''''' that runs '''[[Data Rule (Object)|Data Rules]]''' on data that has already been extracted from a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]'''. A '''[[Batch Process Step (Object)|Batch Process Step]]''' configured with the '''''Apply Rules Activity''''' must always be preceded by a '''Batch Process Step''' configured with the '''''Extract Activity'''''.<section end="Apply Rules" />
<section begin="Apply Rules" />'''''[[Apply Rules (Activity)|Apply Rules]]''''' is an '''''[[Activity (Property)|Activity]]''''' that runs '''[[Data Rule (Object)|Data Rules]]''' on data that has already been extracted from a {{BatchIcon}} '''[[Batch (Object)|Batch]]'''. A '''[[Batch Process Step (Object)|Batch Process Step]]''' configured with the '''''Apply Rules Activity''''' must always be preceded by a '''Batch Process Step''' configured with the '''''Extract Activity'''''.<section end="Apply Rules" />
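The ordering requirement above can be illustrated with a minimal sketch. It assumes a '''Batch Process''' can be represented as a plain list of '''''Activity''''' names. The function name and this list representation are hypothetical and are not part of Grooper.

```python
from typing import List


def apply_rules_is_preceded_by_extract(activities: List[str]) -> bool:
    """Check that every "Apply Rules" step has an "Extract" step before it.

    `activities` is a hypothetical, simplified stand-in for the ordered
    steps of a Batch Process; it is not a Grooper data structure.
    """
    seen_extract = False
    for activity in activities:
        if activity == "Extract":
            seen_extract = True
        elif activity == "Apply Rules" and not seen_extract:
            # Apply Rules would run before any data has been extracted.
            return False
    return True


valid_order = ["Recognize", "Classify", "Extract", "Apply Rules"]
invalid_order = ["Recognize", "Classify", "Apply Rules", "Extract"]
```

Here `valid_order` passes the check, while `invalid_order` fails because '''''Apply Rules''''' appears before any '''''Extract''''' step.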


=== Classify ===
=== Classify ===
<section begin="Classify" />'''''[[Classify (Activity)|Classify]]''''' is an '''''[[Activity (Property)|Activity]]''''' that "classifies" [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' in a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' by assigning them a '''[[Content Type (Concept)|Content Type]]''' using patterns, lexical understanding, or rules as defined by a [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]'''.<section end="Classify" />
<section begin="Classify" />'''''[[Classify (Activity)|Classify]]''''' is an '''''[[Activity (Property)|Activity]]''''' that "classifies" {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' in a {{BatchIcon}} '''[[Batch (Object)|Batch]]''' by assigning them a '''[[Content Type (Concept)|Content Type]]''' using patterns, lexical understanding, or rules as defined by a {{ContentModelIcon}} '''[[Content Model (Object)|Content Model]]'''.<section end="Classify" />


=== Clip Frames ===
=== Clip Frames ===
Line 14: Line 14:


=== Correct ===
=== Correct ===
<section begin="Correct" />The '''''[[Correct (Activity)|Correct]]''''' '''''[[Activity (Property)|Activity]]''''' performs spell correction on the textual content of [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' or specific '''[[Data Element (Concept)|Data Elements]]''', enhancing the accuracy of [[Data Extraction (Concept)|data extraction]] by resolving '''''[[Recognize (Activity)|recognition]]''''' errors.<section end="Correct" />
<section begin="Correct" />The '''''[[Correct (Activity)|Correct]]''''' '''''[[Activity (Property)|Activity]]''''' performs spell correction on the textual content of {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' or specific '''[[Data Element (Concept)|Data Elements]]''', enhancing the accuracy of [[Data Extraction (Concept)|data extraction]] by resolving '''''[[Recognize (Activity)|recognition]]''''' errors.<section end="Correct" />


=== Detect Frames ===
=== Detect Frames ===
Line 26: Line 26:


=== Extract ===
=== Extract ===
<section begin="Extract" />The '''''[[Extract (Activity)|Extract]]''''' '''''[[Activity (Property)|Activity]]''''' retrieves relevant information, defined by '''[[Data Element (Concept)|Data Elements]]''', from [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', transforming unstructured or semi-structured content into structured, usable data.<section end="Extract" />
<section begin="Extract" />The '''''[[Extract (Activity)|Extract]]''''' '''''[[Activity (Property)|Activity]]''''' retrieves relevant information, defined by '''[[Data Element (Concept)|Data Elements]]''', from {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', transforming unstructured or semi-structured content into structured, usable data.<section end="Extract" />


=== Image Processing ===
=== Image Processing ===
<section begin="Image Processing" />The '''''[[Image Processing (Activity)|Image Processing]]''''' '''''[[Activity (Property)|Activity]]''''' enhances and optimizes [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' for better recognition and [[Data Extraction (Concept)|data extraction]] results.<section end="Image Processing" />
<section begin="Image Processing" />The '''''[[Image Processing (Activity)|Image Processing]]''''' '''''[[Activity (Property)|Activity]]''''' enhances and optimizes {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' for better recognition and [[Data Extraction (Concept)|data extraction]] results.<section end="Image Processing" />


=== Initialize Card ===
=== Initialize Card ===
Line 38: Line 38:


=== Recognize ===
=== Recognize ===
<section begin="Recognize" />The '''''[[Recognize (Activity)|Recognize]]''''' '''''[[Activity (Property)|Activity]]''''' interprets [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' and [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', converting them into machine-readable text and capturing layout data for comprehensive analysis and [[Data Extraction (Concept)|data extraction]]. This will attach a text and/or layoutData file to the respective object.<section end="Recognize" />
<section begin="Recognize" />The '''''[[Recognize (Activity)|Recognize]]''''' '''''[[Activity (Property)|Activity]]''''' interprets {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' and {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', converting them into machine-readable text and capturing layout data for comprehensive analysis and [[Data Extraction (Concept)|data extraction]]. This will attach a text and/or layoutData file to the respective object.<section end="Recognize" />


=== Redact ===
=== Redact ===
Line 47: Line 47:


=== Review ===
=== Review ===
<section begin="Review" />The '''''[[Review (Activity)|Review]]''''' '''''[[Activity (Property)|Activity]]''''' facilitates human evaluation and validation of processed [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' and extracted data for accuracy and completeness.<section end="Review" />
<section begin="Review" />The '''''[[Review (Activity)|Review]]''''' '''''[[Activity (Property)|Activity]]''''' facilitates human evaluation and validation of processed {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' and extracted data for accuracy and completeness.<section end="Review" />


=== Send Mail ===
=== Send Mail ===
<section begin="Send Mail" />The '''''[[Send Mail (Activity)|Send Mail]]''''' '''''[[Activity (Property)|Activity]]''''' automates the dispatch of emails with or without attachments, based on [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' events and conditions.<section end="Send Mail" />
<section begin="Send Mail" />The '''''[[Send Mail (Activity)|Send Mail]]''''' '''''[[Activity (Property)|Activity]]''''' automates the dispatch of emails with or without attachments, based on {{BatchProcessIcon}} '''[[Batch Process (Object)|Batch Process]]''' events and conditions.<section end="Send Mail" />


=== Separate ===
=== Separate ===
<section begin="Separate" />The '''''[[Separate (Activity)|Separate]]''''' '''''[[Activity (Property)|Activity]]''''' sorts [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' into individual [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', distinguishing them for independent processing and organization.<section end="Separate" />
<section begin="Separate" />The '''''[[Separate (Activity)|Separate]]''''' '''''[[Activity (Property)|Activity]]''''' sorts {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' into individual {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', distinguishing them for independent processing and organization.<section end="Separate" />


=== Split Pages ===
=== Split Pages ===
<section begin="Split Pages" />Multi-page documents (typically [https://en.wikipedia.org/wiki/PDF PDFs] and [https://en.wikipedia.org/wiki/TIFF TIFFs]) come into '''Grooper''' represented as single [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]'''. The '''''[[Split Pages (Activity)|Split Pages]]''''' '''''[[Activity (Property)|Activity]]''''' exposes [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' as child objects of the '''Batch Folders''' for individualized processing and handling.<section end="Split Pages" />
<section begin="Split Pages" />Multi-page documents (typically [https://en.wikipedia.org/wiki/PDF PDFs] and [https://en.wikipedia.org/wiki/TIFF TIFFs]) come into '''Grooper''' represented as single {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]'''. The '''''[[Split Pages (Activity)|Split Pages]]''''' '''''[[Activity (Property)|Activity]]''''' exposes {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' as child objects of the '''Batch Folders''' for individualized processing and handling.<section end="Split Pages" />


=== XML Transform ===
=== XML Transform ===
Line 72: Line 72:
</div>
</div>
== Behavior ==
== Behavior ==
<section begin="Behavior" />'''''[[Behaviors (Property)|Behaviors]]''''' is a property of '''[[Content Type (Concept)|Content Types]]''' and '''''[[Export (Activity)|Export]]''''' '''''[[Activity (Property)|Activities]]''''' that defines configurable actions for automating processing tasks based on the identified '''Content Type''' of a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Behavior" />
<section begin="Behavior" />'''''[[Behaviors (Property)|Behaviors]]''''' is a property of '''[[Content Type (Concept)|Content Types]]''' and '''''[[Export (Activity)|Export]]''''' '''''[[Activity (Property)|Activities]]''''' that defines configurable actions for automating processing tasks based on the identified '''Content Type''' of a {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Behavior" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Export Behavior ===
=== Export Behavior ===
<section begin="Export Behavior" />An '''''[[Export Behavior (Behavior)|Export Behavior]]''''' defines the conditions and actions for exporting [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' and their associated data from '''Grooper''' to other systems.<section end="Export Behavior" />
<section begin="Export Behavior" />An '''''[[Export Behavior (Behavior)|Export Behavior]]''''' defines the conditions and actions for exporting {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' and their associated data from '''Grooper''' to other systems.<section end="Export Behavior" />


=== Labeling Behavior ===
=== Labeling Behavior ===
Line 114: Line 114:
</div>
</div>
== Classification Method ==
== Classification Method ==
<section begin="Classification Method" />The '''''[[Classification Method (Property)|Classification Method]]''''' property determines the technique used for document [[Classification (Concept)|classification]] within a [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]''', enabling the sorting of [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' into categories based on their content or structure. It can utilize pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.<section end="Classification Method" />
<section begin="Classification Method" />The '''''[[Classification Method (Property)|Classification Method]]''''' property determines the technique used for document [[Classification (Concept)|classification]] within a {{ContentModelIcon}} '''[[Content Model (Object)|Content Model]]''', enabling the sorting of {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' into categories based on their content or structure. It can utilize pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.<section end="Classification Method" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== GPT Embeddings ===
=== GPT Embeddings ===
Line 120: Line 120:


=== Labelset-Based ===
=== Labelset-Based ===
<section begin="Labelset-Based" />'''''[[Labeling Behavior (Behavior)#About Labelset-Based Classification|Labelset-Based]]''''' is a '''''[[Classification Method (Property)|Classification Method]]''''' that leverages the labels defined via a '''''[[Labeling Behavior (Behavior)|Labeling Behavior]]''''' to classify [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]'''.<section end="Labelset-Based" />
<section begin="Labelset-Based" />'''''[[Labeling Behavior (Behavior)#About Labelset-Based Classification|Labelset-Based]]''''' is a '''''[[Classification Method (Property)|Classification Method]]''''' that leverages the labels defined via a '''''[[Labeling Behavior (Behavior)|Labeling Behavior]]''''' to classify {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]'''.<section end="Labelset-Based" />


=== Lexical ===
=== Lexical ===
<section begin="Lexical" />The '''''[[Lexical (Classification Method)|Lexical]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' classifies [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' based on their text content by utilizing either pre-configured training or rules. This is achieved through the analysis of word frequencies or defined rules that identify document types.<section end="Lexical" />
<section begin="Lexical" />The '''''[[Lexical (Classification Method)|Lexical]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' classifies {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' based on their text content by utilizing either pre-configured training or rules. This is achieved through the analysis of word frequencies or defined rules that identify document types.<section end="Lexical" />


=== Rules-Based ===
=== Rules-Based ===
<section begin="Rules-Based" />The '''''[[Rules-Based (Classification Method)|Rules-Based]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' employs defined "rules" on [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Types]]''' to classify [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', utilizing ''Positive Extractor'' and ''Negative Extractor'' properties to accurately categorize them through rule application, thereby ensuring '''Batch Folders''' match predefined criteria.<section end="Rules-Based" />
<section begin="Rules-Based" />The '''''[[Rules-Based (Classification Method)|Rules-Based]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' employs defined "rules" on {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Types]]''' to classify {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', utilizing ''Positive Extractor'' and ''Negative Extractor'' properties to accurately categorize them through rule application, thereby ensuring '''Batch Folders''' match predefined criteria.<section end="Rules-Based" />


=== Visual ===
=== Visual ===
<section begin="Visual" />The '''''[[Visual (Classification Method)|Visual]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' uses image data instead of text data to determine the [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Type]]''' assigned to a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]''' during [[Classification (Concept)|classification]].  Instead of using text-based extractors, an [[image:GrooperIcon_IPProfile.png]] '''[[IP Profile (Object)|IP Profile]]''' is used with an '''''[[Extract Features (IP Command)|Extract Features]]''''' '''''[[IP Command (Property)|IP Command]]''''' to obtain data pertaining to a '''Batch Folder's''' image(s).  Document samples are trained as examples of a '''Document Type'''.<section end="Visual" />
<section begin="Visual" />The '''''[[Visual (Classification Method)|Visual]]''''' '''''[[Classification Method (Property)|Classification Method]]''''' uses image data instead of text data to determine the {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Type]]''' assigned to a {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]''' during [[Classification (Concept)|classification]].  Instead of using text-based extractors, an {{IPProfileIcon}} '''[[IP Profile (Object)|IP Profile]]''' is used with an '''''[[Extract Features (IP Command)|Extract Features]]''''' '''''[[IP Command (Property)|IP Command]]''''' to obtain data pertaining to a '''Batch Folder's''' image(s).  Document samples are trained as examples of a '''Document Type'''.<section end="Visual" />
</div>
</div>
== Collation Provider ==
== Collation Provider ==
<section begin="Collation Provider" />The '''''[[Collation Provider (Property)|Collation]]''''' property of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' defines the method for converting its raw results into a final result set, governing how lists of matches from the '''Data Type''' are combined and interpreted to produce the output data of the '''Data Type'''.<section end="Collation Provider" />
<section begin="Collation Provider" />The '''''[[Collation Provider (Property)|Collation]]''''' property of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' defines the method for converting its raw results into a final result set, governing how lists of matches from the '''Data Type''' are combined and interpreted to produce the output data of the '''Data Type'''.<section end="Collation Provider" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== AND ===
=== AND ===
<section begin="AND" />The '''''[[AND (Collation Provider)|AND]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' returns results only when each individual extractor specified within it gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.<section end="AND" />
<section begin="AND" />The '''''[[AND (Collation Provider)|AND]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' returns results only when each individual extractor specified within it gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.<section end="AND" />
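As a rough illustration of the logical "AND" behavior described above, the following sketch models each extractor's hits as a list of strings keyed by a made-up extractor name. The function name, the dictionary representation, and the choice to return all hits concatenated are assumptions made for illustration only. They do not describe Grooper's actual output format.

```python
from typing import Dict, List


def and_collation(hits_by_extractor: Dict[str, List[str]]) -> List[str]:
    """Return hits only if every extractor produced at least one hit.

    Hypothetical model: keys are extractor names, values are their hits.
    Returning the concatenated hits is an assumption of this sketch.
    """
    if not hits_by_extractor:
        return []
    if any(len(hits) == 0 for hits in hits_by_extractor.values()):
        # One extractor missed, so the whole Data Type returns nothing.
        return []
    return [hit for hits in hits_by_extractor.values() for hit in hits]


all_hit = and_collation({"invoice_no": ["INV-001"], "date": ["2024-01-05"]})
one_missed = and_collation({"invoice_no": ["INV-001"], "date": []})
```

In this sketch `all_hit` contains both values, while `one_missed` is empty because the second extractor returned no hits.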


=== Array ===
=== Array ===
<section begin="Array" />The '''''[[Array (Collation Provider)|Array]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' matches a list of values arranged in horizontal, vertical, or flow order, combining instances that qualify into a single result.<section end="Array" />
<section begin="Array" />The '''''[[Array (Collation Provider)|Array]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' matches a list of values arranged in horizontal, vertical, or flow order, combining instances that qualify into a single result.<section end="Array" />


=== Combine ===
=== Combine ===
<section begin="Combine" />The '''''[[Combine (Collation Provider)|Combine]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' combines instances from returned results based on a specified grouping, controlling how extractor results are assembled together for output.<section end="Combine" />
<section begin="Combine" />The '''''[[Combine (Collation Provider)|Combine]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' combines instances from returned results based on a specified grouping, controlling how extractor results are assembled together for output.<section end="Combine" />


=== Key-Value List ===
=== Key-Value List ===
<section begin="Key-Value List" />The '''''[[Key-Value List (Collation Provider)|Key-Value List]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' matches instances where a key and a list of one or more values appear together on the document, adhering to a specific layout pattern.<section end="Key-Value List" />
<section begin="Key-Value List" />The '''''[[Key-Value List (Collation Provider)|Key-Value List]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' matches instances where a key and a list of one or more values appear together on the document, adhering to a specific layout pattern.<section end="Key-Value List" />


=== Key-Value Pair ===
=== Key-Value Pair ===
<section begin="Key-Value Pair" />The '''''[[Key-Value Pair (Collation Provider)|Key-Value Pair]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' matches instances where a key is paired with a value on the document in a specific layout, essential for extracting label-value pairs.<section end="Key-Value Pair" />
<section begin="Key-Value Pair" />The '''''[[Key-Value Pair (Collation Provider)|Key-Value Pair]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' matches instances where a key is paired with a value on the document in a specific layout, essential for extracting label-value pairs.<section end="Key-Value Pair" />


=== Multi-Column ===
=== Multi-Column ===
<section begin="Multi-Column" />The '''''[[Multi-Column (Collation Provider)|Multi-Column]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' combines multiple columns on a page into a single column for [[Data Extraction (Concept)|extraction]].<section end="Multi-Column" />
<section begin="Multi-Column" />The '''''[[Multi-Column (Collation Provider)|Multi-Column]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' combines multiple columns on a page into a single column for [[Data Extraction (Concept)|extraction]].<section end="Multi-Column" />


=== Ordered Array ===
=== Ordered Array ===
<section begin="Ordered Array" />The '''''[[Ordered Array (Collation Provider)|Ordered Array]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' finds sequences of values where one result is present for each extractor, in the order they appear.<section end="Ordered Array" />
<section begin="Ordered Array" />The '''''[[Ordered Array (Collation Provider)|Ordered Array]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' finds sequences of values where one result is present for each extractor, in the order they appear.<section end="Ordered Array" />


=== Pattern-Based ===
=== Pattern-Based ===
<section begin="Pattern-Based" />The '''''[[Pattern-Based (Collation Provider)|Pattern-Based]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' uses [https://en.wikipedia.org/wiki/Regular_expression regular expressions] to sequence returned results into a final result set.<section end="Pattern-Based" />
<section begin="Pattern-Based" />The '''''[[Pattern-Based (Collation Provider)|Pattern-Based]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' uses [https://en.wikipedia.org/wiki/Regular_expression regular expressions] to sequence returned results into a final result set.<section end="Pattern-Based" />


=== Split ===
=== Split ===
<section begin="Split" />The '''''[[Split (Collation Provider)|Split]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' separates a [[Data Instance (Concept)|data instance]] at each match returned by the '''Data Type'''.<section end="Split" />
<section begin="Split" />The '''''[[Split (Collation Provider)|Split]]''''' '''''[[Collation Provider (Property)|Collation Provider]]''''' of a {{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' separates a [[Data Instance (Concept)|data instance]] at each match returned by the '''Data Type'''.<section end="Split" />
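The idea of separating an instance at each match can be sketched with a regular expression standing in for the '''Data Type's''' extractor. The function name, the use of Python's `re` module, and the decision to start each segment at a match and keep any leading text as its own segment are assumptions of this sketch. Grooper's actual behavior may differ.

```python
import re
from typing import List


def split_at_matches(text: str, pattern: str) -> List[str]:
    """Split `text` into segments, each beginning at a match of `pattern`.

    Hypothetical sketch: text before the first match is kept as its own
    segment; if nothing matches, the whole text is returned unchanged.
    """
    starts = [m.start() for m in re.finditer(pattern, text)]
    if not starts:
        return [text]
    segments = []
    if starts[0] > 0:
        segments.append(text[:starts[0]])
    for i, start in enumerate(starts):
        # Each segment runs up to the next match, or to the end of the text.
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        segments.append(text[start:end])
    return segments


segments = split_at_matches("Item 1 apples Item 2 pears", r"Item \d")
```

Here the sample text is separated into one segment per "Item" label.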
</div>
</div>
== Concept ==
== Concept ==
Line 165: Line 165:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Activity Processing ===
=== Activity Processing ===
<section begin="Activity Processing Concept" />[[Activity Processing (Concept)|Activity Processing]] is a conceptual term that refers to the execution of a sequence of configured tasks, such as [[Classification (Concept)|classification]], [[Data Extraction (Concept)|extraction]], or data enhancement on documents, which are performed within a [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' to transform raw data from documents into structured and actionable information.<section end="Activity Processing Concept" />
<section begin="Activity Processing Concept" />[[Activity Processing (Concept)|Activity Processing]] is a conceptual term that refers to the execution of a sequence of configured tasks, such as [[Classification (Concept)|classification]], [[Data Extraction (Concept)|extraction]], or data enhancement on documents, which are performed within a {{BatchProcessIcon}} '''[[Batch Process (Object)|Batch Process]]''' to transform raw data from documents into structured and actionable information.<section end="Activity Processing Concept" />


=== CMIS+ ===
=== CMIS+ ===
Line 177: Line 177:


=== CSS Data Viewer Styling ===
=== CSS Data Viewer Styling ===
<section begin="CSS Data Viewer Styling" />[[CSS Data Viewer Styling (Concept)|CSS Data Viewer Styling]] refers to using [https://en.wikipedia.org/wiki/CSS CSS] to custom style the '''''Review''''' activity's '''''Data Viewer''''' interface. This gives you a great deal of control over a [[image:GrooperIcon_DataModel.png]] '''[[Data Model (Object)|Data Model's]]''' appearance and layout during document review.<section end="CSS Data Viewer Styling" />
<section begin="CSS Data Viewer Styling" />[[CSS Data Viewer Styling (Concept)|CSS Data Viewer Styling]] refers to using [https://en.wikipedia.org/wiki/CSS CSS] to custom style the '''''Review''''' activity's '''''Data Viewer''''' interface. This gives you a great deal of control over a {{DataModelIcon}} '''[[Data Model (Object)|Data Model's]]''' appearance and layout during document review.<section end="CSS Data Viewer Styling" />


=== Classification ===
=== Classification ===
<section begin="Classification" />[[Classification (Concept)|Classification]] is a conceptual term that refers to the process of identifying and organizing documents into categorical types based on their content or layout, often using [https://en.wikipedia.org/wiki/Machine_learning machine learning], rules, or pattern recognition for efficient document management and [[Data Extraction (Concept)|data extraction]] workflows. Specifically, the '''''[[Classify (Activity)|Classify]]''''' '''''[[Activity (Property)|Activity]]''''' will assign a '''[[Content Type (Concept)|Content Type]]''' to a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Classification" />
<section begin="Classification" />[[Classification (Concept)|Classification]] is a conceptual term that refers to the process of identifying and organizing documents into categorical types based on their content or layout, often using [https://en.wikipedia.org/wiki/Machine_learning machine learning], rules, or pattern recognition for efficient document management and [[Data Extraction (Concept)|data extraction]] workflows. Specifically, the '''''[[Classify (Activity)|Classify]]''''' '''''[[Activity (Property)|Activity]]''''' will assign a '''[[Content Type (Concept)|Content Type]]''' to a {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Classification" />


=== Code Expressions ===
=== Code Expressions ===
Line 189: Line 189:


=== Content Type ===
=== Content Type ===
<section begin="Content Type" />'''[[Content Type (Concept)|Content Type]]''' is a conceptual term that refers to the grouping of three '''Grooper''' objects: [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Models]]''', [[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Categories]]''', and [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Types]]'''.<section end="Content Type" />
<section begin="Content Type" />'''[[Content Type (Concept)|Content Type]]''' is a conceptual term that refers to the grouping of three '''Grooper''' objects: {{ContentModelIcon}} '''[[Content Model (Object)|Content Models]]''', {{ContentCategoryIcon}} '''[[Content Category (Object)|Content Categories]]''', and {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Types]]'''.<section end="Content Type" />


=== Data Context ===
=== Data Context ===
Line 195: Line 195:


=== Data Element ===
=== Data Element ===
<section begin="Data Element" />'''[[Data Element (Concept)|Data Element]]''' is a conceptual term that refers to the grouping of five '''Grooper''' objects: [[image:GrooperIcon_DataModel.png]] '''[[Data Model (Object)|Data Models]]''', [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Sections]]''', [[image:GrooperIcon_DataField.png]] '''[[Data Field (Object)|Data Fields]]''', [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Tables]]''', and [[image:GrooperIcon_DataColumn.png]] '''[[Data Column|Data Columns]]'''.<section end="Data Element" />
<section begin="Data Element" />'''[[Data Element (Concept)|Data Element]]''' is a conceptual term that refers to the grouping of five '''Grooper''' objects: {{DataModelIcon}} '''[[Data Model (Object)|Data Models]]''', {{DataSectionIcon}} '''[[Data Section (Object)|Data Sections]]''', {{DataFieldIcon}} '''[[Data Field (Object)|Data Fields]]''', {{DataTableIcon}} '''[[Data Table (Object)|Data Tables]]''', and {{DataColumnIcon}} '''[[Data Column|Data Columns]]'''.<section end="Data Element" />


=== Data Extraction ===
=== Data Extraction ===
<section begin="Data Extraction" />[[Data Extraction (Concept)|Data Extraction]] is a conceptual term that involves identifying and capturing specific information from [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' like forms or invoices using a set of configurable [[Data Extractor (Concept)|Data Extractors]], which transform unstructured or semi-structured data into a structured, usable format for processing and analysis.<section end="Data Extraction" />
<section begin="Data Extraction" />[[Data Extraction (Concept)|Data Extraction]] is a conceptual term that involves identifying and capturing specific information from {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' like forms or invoices using a set of configurable [[Data Extractor (Concept)|Data Extractors]], which transform unstructured or semi-structured data into a structured, usable format for processing and analysis.<section end="Data Extraction" />


=== Data Extractor ===
=== Data Extractor ===
Line 222: Line 222:


=== Flow Collation ===
=== Flow Collation ===
<section begin="Flow Collation" />[[Flow Collation (Concept)|Flow Collation]] is a conceptual term used to define a type of layout used in '''''[[Collation Provider (Property)|Collation Providers]]''''' of [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Types]]'''.<section end="Flow Collation" />
<section begin="Flow Collation" />[[Flow Collation (Concept)|Flow Collation]] is a conceptual term used to define a type of layout used in '''''[[Collation Provider (Property)|Collation Providers]]''''' of {{DataTypeIcon}} '''[[Data Type (Object)|Data Types]]'''.<section end="Flow Collation" />


=== Footer Rows and Footer Modes ===
=== Footer Rows and Footer Modes ===
<section begin="Footer Rows and Footer Modes" />[[Footer Rows and Footer Modes (Concept)|Footer Rows and Footer Modes]] is a conceptual term that refers to how a "footer row" (enabled by the '''''Generate Footer Row''''' property of a [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''') provides '''Grooper''' users a quick way to validate numerical data in a [[image:GrooperIcon_DataColumn.png]] '''[[Data Column|Data Column]]'''. The '''Data Column's''' '''''Footer Mode''''' property controls if and how a total is determined for numerical values in a '''Data Column'''.<section end="Footer Rows and Footer Modes" />
<section begin="Footer Rows and Footer Modes" />[[Footer Rows and Footer Modes (Concept)|Footer Rows and Footer Modes]] is a conceptual term that refers to how a "footer row" (enabled by the '''''Generate Footer Row''''' property of a {{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''') provides '''Grooper''' users a quick way to validate numerical data in a {{DataColumnIcon}} '''[[Data Column|Data Column]]'''. The '''Data Column's''' '''''Footer Mode''''' property controls if and how a total is determined for numerical values in a '''Data Column'''.<section end="Footer Rows and Footer Modes" />


=== Fuzzy RegEx ===
=== Fuzzy RegEx ===
Line 252: Line 252:


=== Layered OCR ===
=== Layered OCR ===
<section begin="Layered OCR" />[[Layered OCR (Concept)|Layered OCR]] is a conceptual term that refers to the usage of the ''Layered OCR'' setting of the '''''OCR Engine''''' property of an [[image:GrooperIcon_OCRProfile.png]] '''[[OCR Profile (Object)|OCR Profile]]'''. The use of this setting enables the usage of secondary '''OCR Profiles''' on a single page.  The [[OCR (Concept)|OCR]] results from these secondary '''OCR Profiles''' are merged with (or ''layered'' on top of) the primary '''OCR Profile's''' results.<section end="Layered OCR" />
<section begin="Layered OCR" />[[Layered OCR (Concept)|Layered OCR]] is a conceptual term that refers to the usage of the ''Layered OCR'' setting of the '''''OCR Engine''''' property of an {{OCRProfileIcon}} '''[[OCR Profile (Object)|OCR Profile]]'''. The use of this setting enables the usage of secondary '''OCR Profiles''' on a single page.  The [[OCR (Concept)|OCR]] results from these secondary '''OCR Profiles''' are merged with (or ''layered'' on top of) the primary '''OCR Profile's''' results.<section end="Layered OCR" />


=== Layout Data ===
=== Layout Data ===
<section begin="Layout Data" />[[Layout Data (Concept)|Layout Data]] is a conceptual term that refers to information such as line locations, [https://en.wikipedia.org/wiki/Optical_mark_recognition OMR] checkbox locations and states, [https://en.wikipedia.org/wiki/Barcode barcode] values, and detected shapes captured by certain [[Image Processing (Concept)|image processing]] commands. This data is stored as an attached file on a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]''' or [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' object and can later be recalled by various functions within '''Grooper''' that rely on the presence of that data to function.<section end="Layout Data" />
<section begin="Layout Data" />[[Layout Data (Concept)|Layout Data]] is a conceptual term that refers to information such as line locations, [https://en.wikipedia.org/wiki/Optical_mark_recognition OMR] checkbox locations and states, [https://en.wikipedia.org/wiki/Barcode barcode] values, and detected shapes captured by certain [[Image Processing (Concept)|image processing]] commands. This data is stored as an attached file on a {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]''' or {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page]]''' object and can later be recalled by various functions within '''Grooper''' that rely on the presence of that data to function.<section end="Layout Data" />


=== Microfiche Processing ===
=== Microfiche Processing ===
Line 282: Line 282:


=== Separation ===
=== Separation ===
<section begin="Separation" />[[Separation (Concept)|Separation]] is a conceptual term that refers to the process of taking an unorganized [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' of loose [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' and organizing them into document folders. This is done so Grooper can later assign a Document Type to each document folder in a process known as Classification.<section end="Separation" />
<section begin="Separation" />[[Separation (Concept)|Separation]] is a conceptual term that refers to the process of taking an unorganized {{BatchIcon}} '''[[Batch (Object)|Batch]]''' of loose {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' and organizing them into document folders. This is done so Grooper can later assign a Document Type to each document folder in a process known as Classification.<section end="Separation" />


=== TF-IDF ===
=== TF-IDF ===
<section begin="TF-IDF" />[[TF-IDF (Concept)|TF-IDF]] is a conceptual term that refers to [https://en.wikipedia.org/wiki/Tf%E2%80%93idf term frequency-inverse document frequency], a numerical statistic intended to reflect how important a word is to a document within a collection (or document set or [https://en.wikipedia.org/wiki/Text_corpus corpus]). It is how '''Grooper''' uses [https://en.wikipedia.org/wiki/Machine_learning machine learning] for training-based document [[Classification (Concept)|classification]] (via the [[Lexical (Classification Method)|Lexical]] method) and [[Data Extraction (Concept)|data extraction]] (via the [[image:GrooperIcon_FieldClass.png]] [[Field Class (Object)|Field Class]] extractor).<section end="TF-IDF" />
<section begin="TF-IDF" />[[TF-IDF (Concept)|TF-IDF]] is a conceptual term that refers to [https://en.wikipedia.org/wiki/Tf%E2%80%93idf term frequency-inverse document frequency], a numerical statistic intended to reflect how important a word is to a document within a collection (or document set or [https://en.wikipedia.org/wiki/Text_corpus corpus]). It is how '''Grooper''' uses [https://en.wikipedia.org/wiki/Machine_learning machine learning] for training-based document [[Classification (Concept)|classification]] (via the [[Lexical (Classification Method)|Lexical]] method) and [[Data Extraction (Concept)|data extraction]] (via the {{FieldClassIcon}} [[Field Class (Object)|Field Class]] extractor).<section end="TF-IDF" />


=== Table Extraction ===
=== Table Extraction ===
<section begin="Table Extraction" />[[Table Extraction (Concept)|Table Extraction]] is a conceptual term that refers to '''Grooper's''' functionality to extract data from [https://en.wikipedia.org/wiki/Table_cell cells] in [https://en.wikipedia.org/wiki/Table_(information) tables].  This is accomplished by configuring the [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' and its child [[image:GrooperIcon_DataColumn.png]] '''[[Data Column|Data Column]]''' '''[[Data Element (Concept)|Data Elements]]''' in a [[image:GrooperIcon_DataModel.png]] '''[[Data Model (Object)|Data Model]]'''.<section end="Table Extraction" />
<section begin="Table Extraction" />[[Table Extraction (Concept)|Table Extraction]] is a conceptual term that refers to '''Grooper's''' functionality to extract data from [https://en.wikipedia.org/wiki/Table_cell cells] in [https://en.wikipedia.org/wiki/Table_(information) tables].  This is accomplished by configuring the {{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''' and its child {{DataColumnIcon}} '''[[Data Column|Data Column]]''' '''[[Data Element (Concept)|Data Elements]]''' in a {{DataModelIcon}} '''[[Data Model (Object)|Data Model]]'''.<section end="Table Extraction" />


=== Test Batch ===
=== Test Batch ===
<section begin="Test Batch" />[[Test Batch (Concept)|Test Batch]] is a conceptual term that refers to any [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' created in the '''Test''' folder of the '''Batches''' folder in the [[Node Tree (UI Element)|Node Tree]].<section end="Test Batch" />
<section begin="Test Batch" />[[Test Batch (Concept)|Test Batch]] is a conceptual term that refers to any {{BatchIcon}} '''[[Batch (Object)|Batch]]''' created in the '''Test''' folder of the '''Batches''' folder in the [[Node Tree (UI Element)|Node Tree]].<section end="Test Batch" />


=== Thread ===
=== Thread ===
Line 297: Line 297:


=== Training-Based Approaches to Document Classification ===
=== Training-Based Approaches to Document Classification ===
<section begin="Training-Based Approaches to Document Classification" />[[Training-Based Approaches to Document Classification (Concept)|Training-Based Approaches to Document Classification]] is a conceptual term that refers to an approach to document [[Classification (Concept)|classification]] that classifies [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' according to the similarity of unclassified '''Batch Folders''' to trained examples of that kind of '''[[Document Type (Object)|Document Type]]'''.<section end="Training-Based Approaches to Document Classification" />
<section begin="Training-Based Approaches to Document Classification" />[[Training-Based Approaches to Document Classification (Concept)|Training-Based Approaches to Document Classification]] is a conceptual term that refers to an approach to document [[Classification (Concept)|classification]] that classifies {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' according to the similarity of unclassified '''Batch Folders''' to trained examples of that kind of '''[[Document Type (Object)|Document Type]]'''.<section end="Training-Based Approaches to Document Classification" />


=== Training Batch ===
=== Training Batch ===
<section begin="Training Batch" />[[Training Batch (Concept)|Training Batch]] is a conceptual term that refers to a more convenient way to work with all of the samples a [[image:GrooperIcon_ContentModel.png]] [[Content Model (Object)|Content Model]] has been trained against. You can also still look at the '''[[Form Type (Object)|Form Types]]''' underneath each '''[[Content Type (Concept)|Content Type]]''', but the '''Training Set''' can show you all the samples in one place.<section end="Training Batch" />
<section begin="Training Batch" />[[Training Batch (Concept)|Training Batch]] is a conceptual term that refers to a more convenient way to work with all of the samples a {{ContentModelIcon}} [[Content Model (Object)|Content Model]] has been trained against. You can also still look at the '''[[Form Type (Object)|Form Types]]''' underneath each '''[[Content Type (Concept)|Content Type]]''', but the '''Training Set''' can show you all the samples in one place.<section end="Training Batch" />


=== UNC Path ===
=== UNC Path ===
Line 306: Line 306:


=== Waterfall Classification ===
=== Waterfall Classification ===
<section begin="Waterfall Classification" />[[Waterfall Classification (Concept)|Waterfall Classification]] is a conceptual term that refers to a [[Classification (Concept)|classification]] notion in '''Grooper''' that manipulates the '''''Positive Extractor''''' property to prioritize training similarity in order to achieve a middle ground between high specificity and accuracy, and generality with minimal accuracy. This is helpful whenever [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''' get misclassified, and simply retraining won't help.<section end="Waterfall Classification" />
<section begin="Waterfall Classification" />[[Waterfall Classification (Concept)|Waterfall Classification]] is a conceptual term that refers to a [[Classification (Concept)|classification]] notion in '''Grooper''' that manipulates the '''''Positive Extractor''''' property to prioritize training similarity in order to achieve a middle ground between high specificity and accuracy, and generality with minimal accuracy. This is helpful whenever {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''' get misclassified, and simply retraining won't help.<section end="Waterfall Classification" />


=== XML Schema Integration ===
=== XML Schema Integration ===
Line 315: Line 315:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== CMIS Export ===
=== CMIS Export ===
<section begin="CMIS Export" />'''''[[CMIS Export (Export Definition)|CMIS Export]]''''' is an '''''[[Export Definitions (Property)|Export Definition]]''''' available when configuring an '''''[[Export Behavior (Behavior)|Export Behavior]]'''''.  It exports content over a [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to export documents and their [https://en.wikipedia.org/wiki/Metadata metadata] to various [https://en.wikipedia.org/wiki/On-premises_software on-premise] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage platforms].<section end="CMIS Export" />
<section begin="CMIS Export" />'''''[[CMIS Export (Export Definition)|CMIS Export]]''''' is an '''''[[Export Definitions (Property)|Export Definition]]''''' available when configuring an '''''[[Export Behavior (Behavior)|Export Behavior]]'''''.  It exports content over a {{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to export documents and their [https://en.wikipedia.org/wiki/Metadata metadata] to various [https://en.wikipedia.org/wiki/On-premises_software on-premise] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage platforms].<section end="CMIS Export" />


=== Data Export ===
=== Data Export ===
<section begin="Data Export" />'''''[[Data Export (Export Definition)|Data Export]]''''' is an '''''[[Export Definitions (Property)|Export Definition]]''''' available when configuring an '''''[[Export Behavior (Behavior)|Export Behavior]]'''''.  It exports extracted document data over a [[image:GrooperIcon_DataConnection.png]] '''[[Data Connection (Object)|Data Connection]]''', allowing users to export data to a [https://en.wikipedia.org/wiki/Microsoft_SQL_Server Microsoft SQL Server] or [https://en.wikipedia.org/wiki/Open_Database_Connectivity ODBC] compliant [https://en.wikipedia.org/wiki/Database database].<section end="Data Export" />
<section begin="Data Export" />'''''[[Data Export (Export Definition)|Data Export]]''''' is an '''''[[Export Definitions (Property)|Export Definition]]''''' available when configuring an '''''[[Export Behavior (Behavior)|Export Behavior]]'''''.  It exports extracted document data over a {{DataConnectionIcon}} '''[[Data Connection (Object)|Data Connection]]''', allowing users to export data to a [https://en.wikipedia.org/wiki/Microsoft_SQL_Server Microsoft SQL Server] or [https://en.wikipedia.org/wiki/Open_Database_Connectivity ODBC] compliant [https://en.wikipedia.org/wiki/Database database].<section end="Data Export" />
</div>
</div>
== Extractor Type ==
== Extractor Type ==
Line 327: Line 327:


=== Field Match ===
=== Field Match ===
<section begin="Field Match" />The '''''[[Field Match (Extractor Type)|Field Match]]''''' '''''[[Extractor Type (Property)|Extractor Type]]''''' matches the value stored in a previously-extracted [[image:GrooperIcon_DataField.png]] '''[[Data Field (Object)|Data Field]]''' or [[image:GrooperIcon_DataColumn.png]] '''[[Data Column|Data Column]]''', allowing for consistency and reference across different parts of a document or dataset.<section end="Field Match" />
<section begin="Field Match" />The '''''[[Field Match (Extractor Type)|Field Match]]''''' '''''[[Extractor Type (Property)|Extractor Type]]''''' matches the value stored in a previously-extracted {{DataFieldIcon}} '''[[Data Field (Object)|Data Field]]''' or {{DataColumnIcon}} '''[[Data Column|Data Column]]''', allowing for consistency and reference across different parts of a document or dataset.<section end="Field Match" />


=== Find Barcode ===
=== Find Barcode ===
<section begin="Find Barcode" />The '''''[[Find Barcode (Extractor Type)|Find Barcode]]''''' '''''[[Extractor Type (Property)|Extractor Type]]''''' searches the [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder]]''' [[Layout Data (Concept)|layout data]] for a [https://en.wikipedia.org/wiki/Barcode barcode], capturing its value upon detection.<section end="Find Barcode" />
<section begin="Find Barcode" />The '''''[[Find Barcode (Extractor Type)|Find Barcode]]''''' '''''[[Extractor Type (Property)|Extractor Type]]''''' searches the {{BatchFolderIcon}} '''[[Batch Folder]]''' [[Layout Data (Concept)|layout data]] for a [https://en.wikipedia.org/wiki/Barcode barcode], capturing its value upon detection.<section end="Find Barcode" />


=== GPT Complete ===
=== GPT Complete ===
Line 385: Line 385:


== IP Command ==
== IP Command ==
<section begin="IP Command" />The '''''[[IP Command (Property)|Command]]''''' property of an [[image:GrooperIcon_IPStep.png]] '''[[IP Step (Object)|IP Step]]''' object in '''Grooper''' specifies the [[Image Processing (Concept)|Image Processing (IP)]] command to be executed for that specific step as part of an [[image:GrooperIcon_IPProfile.png]] '''[[IP Profile (Object)|IP Profile]]'''.<section end="IP Command" />
<section begin="IP Command" />The '''''[[IP Command (Property)|Command]]''''' property of an {{IPStepIcon}} '''[[IP Step (Object)|IP Step]]''' object in '''Grooper''' specifies the [[Image Processing (Concept)|Image Processing (IP)]] command to be executed for that specific step as part of an {{IPProfileIcon}} '''[[IP Profile (Object)|IP Profile]]'''.<section end="IP Command" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Barcode Detection ===
=== Barcode Detection ===
Line 412: Line 412:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== CMIS Import ===
=== CMIS Import ===
<section begin="CMIS Import" />The '''''[[CMIS Import (Import Provider)|CMIS Import]]''''' '''''[[Import Provider (Property)|Import Provider]]''''' is used to import content over a [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to import from various [https://en.wikipedia.org/wiki/On-premises_software on-premise] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage] platforms.<section end="CMIS Import" />
<section begin="CMIS Import" />The '''''[[CMIS Import (Import Provider)|CMIS Import]]''''' '''''[[Import Provider (Property)|Import Provider]]''''' is used to import content over a {{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connection]]''', allowing users to import from various [https://en.wikipedia.org/wiki/On-premises_software on-premise] and [https://en.wikipedia.org/wiki/Cloud_storage cloud-based storage] platforms.<section end="CMIS Import" />


=== Import Descendants ===
=== Import Descendants ===
<section begin="Import Descendants" />'''''[[Import Descendants (Import Provider)|Import Descendants]]''''' is one of two '''''[[Import Provider (Property)|Import Providers]]''''' that use [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connections]]''' to import document content into '''Grooper'''.<section end="Import Descendants" />
<section begin="Import Descendants" />'''''[[Import Descendants (Import Provider)|Import Descendants]]''''' is one of two '''''[[Import Provider (Property)|Import Providers]]''''' that use {{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connections]]''' to import document content into '''Grooper'''.<section end="Import Descendants" />


=== Import Query Results ===
=== Import Query Results ===
<section begin="Import Query Results" />'''''[[Import Query Results (Import Provider)|Import Query Results]]''''' is one of two '''''[[Import Provider (Property)|Import Providers]]''''' that use [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connections]]''' to import document content into '''Grooper'''.<section end="Import Query Results" />
<section begin="Import Query Results" />'''''[[Import Query Results (Import Provider)|Import Query Results]]''''' is one of two '''''[[Import Provider (Property)|Import Providers]]''''' that use {{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connections]]''' to import document content into '''Grooper'''.<section end="Import Query Results" />
</div>
</div>
== Lookup ==
== Lookup ==
Line 424: Line 424:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== CMIS Lookup ===
=== CMIS Lookup ===
<section begin="CMIS Lookup" />'''''[[CMIS Lookup (Lookup)|CMIS Lookup]]''''' is a '''''[[Lookups (Property)|Lookup Specification]]''''' that performs a lookup against a [[image:GrooperIcon_CMISRepository.png]] '''[[CMIS Repository (Object)|CMIS Repository]]''' via a [[CMIS Query|CMISQL Query]].<section end="CMIS Lookup" />
<section begin="CMIS Lookup" />'''''[[CMIS Lookup (Lookup)|CMIS Lookup]]''''' is a '''''[[Lookups (Property)|Lookup Specification]]''''' that performs a lookup against a {{CMISRepositoryIcon}} '''[[CMIS Repository (Object)|CMIS Repository]]''' via a [[CMIS Query|CMISQL Query]].<section end="CMIS Lookup" />


=== Database Lookup ===
=== Database Lookup ===
<section begin="Database Lookup" />'''''[[Database Lookup (Lookup)|Database Lookup]]''''' is a '''''[[Lookups (Property)|Lookup Specification]]''''' that performs a lookup against a [[image:GrooperIcon_DataConnection.png]] '''[[Data Connection (Object)|Data Connection]]''' via a [https://en.wikipedia.org/wiki/SQL SQL query].<section end="Database Lookup" />
<section begin="Database Lookup" />'''''[[Database Lookup (Lookup)|Database Lookup]]''''' is a '''''[[Lookups (Property)|Lookup Specification]]''''' that performs a lookup against a {{DataConnectionIcon}} '''[[Data Connection (Object)|Data Connection]]''' via a [https://en.wikipedia.org/wiki/SQL SQL query].<section end="Database Lookup" />


=== GPT Lookup ===
=== GPT Lookup ===
Line 439: Line 439:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Batch ===
=== Batch ===
<section begin="Batch" />[[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' objects are fundamental in '''Grooper's''' architecture as they are the containers of documents that get moved through '''Grooper's''' workflow mechanisms known as [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Processes]]'''.<section end="Batch" />
<section begin="Batch" />{{BatchIcon}} '''[[Batch (Object)|Batch]]''' objects are fundamental in '''Grooper's''' architecture as they are the containers of documents that get moved through '''Grooper's''' workflow mechanisms known as {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Processes]]'''.<section end="Batch" />


=== Batch Folder ===
=== Batch Folder ===
<section begin="Batch Folder" />[[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]''' objects are defined as container objects within a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' that are used to represent and organize both folders and pages. They can hold other '''Batch Folders''' or [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' objects as children. The '''Batch Folder''' acts as an organizational unit within a '''Batch''', allowing for a structured approach to managing and processing a collection of documents.
<section begin="Batch Folder" />{{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]''' objects are defined as container objects within a {{BatchIcon}} '''[[Batch (Object)|Batch]]''' that are used to represent and organize both folders and pages. They can hold other '''Batch Folders''' or {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page]]''' objects as children. The '''Batch Folder''' acts as an organizational unit within a '''Batch''', allowing for a structured approach to managing and processing a collection of documents.
* '''Batch Folders''' are frequently referred to simply as "documents".<section end="Batch Folder" />
* '''Batch Folders''' are frequently referred to simply as "documents".<section end="Batch Folder" />


=== Batch Page ===
=== Batch Page ===
<section begin="Batch Page" />[[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' objects represent individual pages within a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]'''. The '''Batch Page''' object is the most granular unit in the hierarchy of [[Object_Nomenclature_(Concept)#Batch_Objects|Batch Objects]] in '''Grooper'''.
<section begin="Batch Page" />{{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page]]''' objects represent individual pages within a {{BatchIcon}} '''[[Batch (Object)|Batch]]'''. The '''Batch Page''' object is the most granular unit in the hierarchy of [[Object_Nomenclature_(Concept)#Batch_Objects|Batch Objects]] in '''Grooper'''.
* '''Batch Pages''' are frequently referred to simply as "pages".<section end="Batch Page" />
* '''Batch Pages''' are frequently referred to simply as "pages".<section end="Batch Page" />


=== Batch Process ===
=== Batch Process ===
<section begin="Batch Process" />[[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' objects are crucial components in '''Grooper's''' architecture. A '''Batch Process''' orchestrates the document processing strategy and ensures each [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' of documents is managed systematically and efficiently.
<section begin="Batch Process" />{{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Process]]''' objects are crucial components in '''Grooper's''' architecture. A '''Batch Process''' orchestrates the document processing strategy and ensures each {{BatchIcon}} '''[[Batch (Object)|Batch]]''' of documents is managed systematically and efficiently.
* '''Batch Processes''' by themselves do nothing.  Instead, the workflows they execute are designed by adding child [[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Steps]]'''.
* '''Batch Processes''' by themselves do nothing.  Instead, the workflows they execute are designed by adding child {{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Steps]]'''.
* A '''Batch Process''' is often referred to as simply a "process".<section end="Batch Process" />
* A '''Batch Process''' is often referred to as simply a "process".<section end="Batch Process" />


=== Batch Process Step ===
=== Batch Process Step ===
<section begin="Batch Process Step" />[[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Step]]''' objects are specific actions within the sequence defined by a [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]'''. A '''Batch Process Step''' plays a critical role in automating and managing the flow of documents through the various stages of processing within '''Grooper'''.
<section begin="Batch Process Step" />{{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Step]]''' objects are specific actions within the sequence defined by a {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Process]]'''. A '''Batch Process Step''' plays a critical role in automating and managing the flow of documents through the various stages of processing within '''Grooper'''.
* '''Batch Process Steps''' are frequently referred to as simply "steps".
* '''Batch Process Steps''' are frequently referred to as simply "steps".
* Because a single '''Batch Process Step''' executes a single '''''Activity''''' configuration, they are often referred to by their referenced '''''Activity''''' as well.  For example, a "'''''Recognize''''' step".<section end="Batch Process Step" />
* Because a single '''Batch Process Step''' executes a single '''''Activity''''' configuration, they are often referred to by their referenced '''''Activity''''' as well.  For example, a "'''''Recognize''''' step".<section end="Batch Process Step" />


=== CMIS Connection ===
=== CMIS Connection ===
<section begin="CMIS Connection" />[[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]''' objects provide a standardized way of connecting to various [https://en.wikipedia.org/wiki/Content_management_system content management systems (CMS)]. These objects allow '''Grooper''' to communicate with multiple external storage platforms, enabling access to documents and content that reside outside of '''Grooper's''' immediate environment.
<section begin="CMIS Connection" />{{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connection]]''' objects provide a standardized way of connecting to various [https://en.wikipedia.org/wiki/Content_management_system content management systems (CMS)]. These objects allow '''Grooper''' to communicate with multiple external storage platforms, enabling access to documents and content that reside outside of '''Grooper's''' immediate environment.
* For those that support the [https://en.wikipedia.org/wiki/Content_Management_Interoperability_Services CMIS] standard, the '''CMIS Connection''' connects to the CMS using the CMIS standard.
* For those that support the [https://en.wikipedia.org/wiki/Content_Management_Interoperability_Services CMIS] standard, the '''CMIS Connection''' connects to the CMS using the CMIS standard.
* For those that do not, the '''CMIS Connection''' normalizes [https://en.wikipedia.org/wiki/Comparison_of_file_transfer_protocols connection and transfer protocol] as if they ''were'' a CMIS platform.<section end="CMIS Connection" />
* For those that do not, the '''CMIS Connection''' normalizes [https://en.wikipedia.org/wiki/Comparison_of_file_transfer_protocols connection and transfer protocol] as if they ''were'' a CMIS platform.<section end="CMIS Connection" />


=== CMIS Repository ===
=== CMIS Repository ===
<section begin="CMIS Repository" />[[image:GrooperIcon_CMISRepository.png]] '''[[CMIS Repository (Object)|CMIS Repository]]''' objects in '''Grooper''' allow access to external documents through a [[image:GrooperIcon_CMISConnection.png]] '''[[CMIS Connection (Object)|CMIS Connection]]'''. They allow managing and interacting with those documents within '''Grooper's''' framework as if they were local. They are created as child objects of a '''CMIS Connection''' and used for various '''''[[Activity (Property)|Activities]]'''''.<section end="CMIS Repository" />
<section begin="CMIS Repository" />{{CMISRepositoryIcon}} '''[[CMIS Repository (Object)|CMIS Repository]]''' objects in '''Grooper''' allow access to external documents through a {{CMISConnectionIcon}} '''[[CMIS Connection (Object)|CMIS Connection]]'''. They allow managing and interacting with those documents within '''Grooper's''' framework as if they were local. They are created as child objects of a '''CMIS Connection''' and used for various '''''[[Activity (Property)|Activities]]'''''.<section end="CMIS Repository" />


=== Content Category ===
=== Content Category ===
<section begin="Content Category" />[[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Category]]''' objects are containers within a [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]''' that hold other '''Content Categories''' and [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Type]]''' objects. They allow for further [[Classification (Concept)|classification]] and grouping of '''Document Types''' within a [https://en.wikipedia.org/wiki/Taxonomy taxonomy], aiding in the logical structuring of complex document sets. Besides grouping '''Document Types''' together, '''Content Categories''' also serve to create new branches in a '''[[Data Element (Concept)|Data Element]]''' hierarchy. In most cases '''Content Categories''' are used as organizational buckets to group like '''Document Types''' together.<section end="Content Category" />
<section begin="Content Category" />{{ContentCategoryIcon}} '''[[Content Category (Object)|Content Category]]''' objects are containers within a {{ContentModelIcon}} '''[[Content Model (Object)|Content Model]]''' that hold other '''Content Categories''' and {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Type]]''' objects. They allow for further [[Classification (Concept)|classification]] and grouping of '''Document Types''' within a [https://en.wikipedia.org/wiki/Taxonomy taxonomy], aiding in the logical structuring of complex document sets. Besides grouping '''Document Types''' together, '''Content Categories''' also serve to create new branches in a '''[[Data Element (Concept)|Data Element]]''' hierarchy. In most cases '''Content Categories''' are used as organizational buckets to group like '''Document Types''' together.<section end="Content Category" />


=== Content Model ===
=== Content Model ===
<section begin="Content Model" />[[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]''' objects define the [https://en.wikipedia.org/wiki/Taxonomy taxonomy] of document sets in terms of the [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Types]]''' they contain. They also house the '''[[Data Element (Concept)|Data Elements]]''' that appear on each [[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Category]]''' and '''Document Type''' within them. '''Content Models''' serve as the root of a '''[[Content Type (Concept)|Content Type]]''' hierarchy and are crucial for organizing the different types of documents that '''Grooper''' can recognize and process.<section end="Content Model" />
<section begin="Content Model" />{{ContentModelIcon}} '''[[Content Model (Object)|Content Model]]''' objects define the [https://en.wikipedia.org/wiki/Taxonomy taxonomy] of document sets in terms of the {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Types]]''' they contain. They also house the '''[[Data Element (Concept)|Data Elements]]''' that appear on each {{ContentCategoryIcon}} '''[[Content Category (Object)|Content Category]]''' and '''Document Type''' within them. '''Content Models''' serve as the root of a '''[[Content Type (Concept)|Content Type]]''' hierarchy and are crucial for organizing the different types of documents that '''Grooper''' can recognize and process.<section end="Content Model" />


=== Data Column ===
=== Data Column ===
<section begin="Data Column" />[[image:GrooperIcon_DataColumn.png]] '''[[Data Column (Object)|Data Column]]''' objects are child objects of a [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''', representing individual columns and defining the type of data each column holds along with its [[Data Extraction (Concept)|data extraction]] properties.<section end="Data Column" />
<section begin="Data Column" />{{DataColumnIcon}} '''[[Data Column (Object)|Data Column]]''' objects are child objects of a {{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''', representing individual columns and defining the type of data each column holds along with its [[Data Extraction (Concept)|data extraction]] properties.<section end="Data Column" />


=== Data Connection ===
=== Data Connection ===
<section begin="Data Connection" />[[image:GrooperIcon_DataConnection.png]] '''[[Data Connection (Object)|Data Connection]]''' objects define the settings for connecting to and interacting with a [https://en.wikipedia.org/wiki/Database database]. These interactions may include conducting lookups, exports, or other actions that relate to [https://en.wikipedia.org/wiki/Database#Database_management_system database management systems (DBMS)]. Once configured, a '''Data Connection''' object can be referenced by other components in '''Grooper''' for various DBMS-related activities.<section end="Data Connection" />
<section begin="Data Connection" />{{DataConnectionIcon}} '''[[Data Connection (Object)|Data Connection]]''' objects define the settings for connecting to and interacting with a [https://en.wikipedia.org/wiki/Database database]. These interactions may include conducting lookups, exports, or other actions that relate to [https://en.wikipedia.org/wiki/Database#Database_management_system database management systems (DBMS)]. Once configured, a '''Data Connection''' object can be referenced by other components in '''Grooper''' for various DBMS-related activities.<section end="Data Connection" />


=== Data Field ===
=== Data Field ===
<section begin="Data Field" />[[image:GrooperIcon_DataField.png]] '''[[Data Field (Object)|Data Field]]''' objects are created as child objects of a [[image:GrooperIcon_DataModel.png]] '''[[Data Model (Object)|Data Model]]'''. A '''Data Field''' is a representation of a single piece of data targeted for [[Data Extraction (Concept)|extraction]] on a document.
<section begin="Data Field" />{{DataFieldIcon}} '''[[Data Field (Object)|Data Field]]''' objects are created as child objects of a {{DataModelIcon}} '''[[Data Model (Object)|Data Model]]'''. A '''Data Field''' is a representation of a single piece of data targeted for [[Data Extraction (Concept)|extraction]] on a document.


'''Data Fields''' are frequently referred to simply as "fields".<section end="Data Field" />
'''Data Fields''' are frequently referred to simply as "fields".<section end="Data Field" />


=== Data Model ===
=== Data Model ===
<section begin="Data Model" />[[image:GrooperIcon_DataModel.png]] '''[[Data Model (Object)|Data Model]]''' objects serve as the top-tier structure defining the [https://en.wikipedia.org/wiki/Taxonomy taxonomy] for '''[[Data Element (Concept)|Data Elements]]''' and are leveraged during the '''[[Extract (Activity)|Extract]]''' '''''[[Activity (Property)|Activity]]''''' to extract data from [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]'''. They are a hierarchy of '''Data Elements''' that sets the stage for the [[Data Extraction (Concept)|extraction]] logic and review of data collected from documents.<section end="Data Model" />
<section begin="Data Model" />{{DataModelIcon}} '''[[Data Model (Object)|Data Model]]''' objects serve as the top-tier structure defining the [https://en.wikipedia.org/wiki/Taxonomy taxonomy] for '''[[Data Element (Concept)|Data Elements]]''' and are leveraged during the '''[[Extract (Activity)|Extract]]''' '''''[[Activity (Property)|Activity]]''''' to extract data from {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]'''. They are a hierarchy of '''Data Elements''' that sets the stage for the [[Data Extraction (Concept)|extraction]] logic and review of data collected from documents.<section end="Data Model" />


=== Data Rule ===
=== Data Rule ===
<section begin="Data Rule" />[[image:GrooperIcon_DataRule.png]] '''[[Data Rule (Object)|Data Rule]]''' objects define the logic for automated data manipulation which occurs after data has been extracted from [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]'''. These rules are applied to normalize or otherwise prepare data collected in a [[image:GrooperIcon_DataModel.png]] '''[[Data Model|Data Model]]''' for downstream processes. '''Data Rules''' ensure that extracted data conforms to expected formats or meets certain quality standards.<section end="Data Rule" />
<section begin="Data Rule" />{{DataRuleIcon}} '''[[Data Rule (Object)|Data Rule]]''' objects define the logic for automated data manipulation which occurs after data has been extracted from {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]'''. These rules are applied to normalize or otherwise prepare data collected in a {{DataModelIcon}} '''[[Data Model|Data Model]]''' for downstream processes. '''Data Rules''' ensure that extracted data conforms to expected formats or meets certain quality standards.<section end="Data Rule" />


=== Data Section ===
=== Data Section ===
<section begin="Data Section" />[[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' objects are grouping mechanisms for related [[image:GrooperIcon_DataField.png]] '''[[Data Field (Object)|Data Fields]]'''. '''Data Sections''' organize and segment child '''[[Data Element (Concept)|Data Elements]]''' into logical divisions of a document based on the structure and semantics of the information the documents contain.<section end="Data Section" />
<section begin="Data Section" />{{DataSectionIcon}} '''[[Data Section (Object)|Data Section]]''' objects are grouping mechanisms for related {{DataFieldIcon}} '''[[Data Field (Object)|Data Fields]]'''. '''Data Sections''' organize and segment child '''[[Data Element (Concept)|Data Elements]]''' into logical divisions of a document based on the structure and semantics of the information the documents contain.<section end="Data Section" />


=== Data Table ===
=== Data Table ===
<section begin="Data Table" />[[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' objects are utilized for extracting repeating data that's formatted in [https://en.wikipedia.org/wiki/Table_(information) rows and columns], allowing for complex multi-instance data organization that would be present in table-formatted content.<section end="Data Table" />
<section begin="Data Table" />{{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''' objects are utilized for extracting repeating data that's formatted in [https://en.wikipedia.org/wiki/Table_(information) rows and columns], allowing for complex multi-instance data organization that would be present in table-formatted content.<section end="Data Table" />


=== Data Type ===
=== Data Type ===
<section begin="Data Type" />[[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Type]]''' objects hold a collection of child, referenced, and locally defined [[Data Extractor (Concept)|Data Extractors]] and settings that manage how multiple (even differing) matches from Data Extractors are consolidated (via '''''[[Collation Provider (Property)|Collation]]''''') into a result set.<section end="Data Type" />
<section begin="Data Type" />{{DataTypeIcon}} '''[[Data Type (Object)|Data Type]]''' objects hold a collection of child, referenced, and locally defined [[Data Extractor (Concept)|Data Extractors]] and settings that manage how multiple (even differing) matches from Data Extractors are consolidated (via '''''[[Collation Provider (Property)|Collation]]''''') into a result set.<section end="Data Type" />


=== Document Type ===
=== Document Type ===
<section begin="Document Type" />[[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Type]]''' objects represent a distinct type of document, like an invoice or contract. '''Document Types''' are created as children of a [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Model]]''' or a [[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Category]]''' and are used to classify individual [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]'''. Each '''Document Type''' in the hierarchy defines the '''[[Data Element (Concept)|Data Elements]]''' and '''''[[Behaviors (Property)|Behaviors]]''''' that apply to '''Batch Folders''' of that specific classification.<section end="Document Type" />
<section begin="Document Type" />{{DocumentTypeIcon}} '''[[Document Type (Object)|Document Type]]''' objects represent a distinct type of document, like an invoice or contract. '''Document Types''' are created as children of a {{ContentModelIcon}} '''[[Content Model (Object)|Content Model]]''' or a {{ContentCategoryIcon}} '''[[Content Category (Object)|Content Category]]''' and are used to classify individual {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]'''. Each '''Document Type''' in the hierarchy defines the '''[[Data Element (Concept)|Data Elements]]''' and '''''[[Behaviors (Property)|Behaviors]]''''' that apply to '''Batch Folders''' of that specific classification.<section end="Document Type" />


=== Field Class ===
=== Field Class ===
<section begin="Field Class" />[[image:GrooperIcon_FieldClass.png]] '''[[Field Class (Object)|Field Class]]''' objects are trainable extractors that distinguish between multiple instances of similar data within a document by understanding the context in which they occur. '''Field Classes''' ''can'' be configured to distinguish values within highly structured documents, but this type of [[Data Extraction (Concept)|extraction]] is better suited to simpler "Extractor Objects" like [[image:GrooperIcon_ValueReader.png]] '''[[Value Reader (Object)|Value Readers]]''' or [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Types]]'''.  
<section begin="Field Class" />{{FieldClassIcon}} '''[[Field Class (Object)|Field Class]]''' objects are trainable extractors that distinguish between multiple instances of similar data within a document by understanding the context in which they occur. '''Field Classes''' ''can'' be configured to distinguish values within highly structured documents, but this type of [[Data Extraction (Concept)|extraction]] is better suited to simpler "Extractor Objects" like {{ValueReaderIcon}} '''[[Value Reader (Object)|Value Readers]]''' or {{DataTypeIcon}} '''[[Data Type (Object)|Data Types]]'''.  


'''Field Classes''' are most useful when attempting to find values within the flow of natural language. This method involves training with positive and negative examples to distinguish the right context. You'd opt for a '''Field Class''' when the value you're after is an entire clause within a contract, or a specific value defined within the flow of text.<section end="Field Class" />
'''Field Classes''' are most useful when attempting to find values within the flow of natural language. This method involves training with positive and negative examples to distinguish the right context. You'd opt for a '''Field Class''' when the value you're after is an entire clause within a contract, or a specific value defined within the flow of text.<section end="Field Class" />


=== File Store ===
=== File Store ===
<section begin="File Store" />[[image:GrooperIcon_FileStore.png]] '''[[File Store (Object)|File Store]]''' objects define a storage location within '''Grooper''' where file content associated with nodes is saved. They are crucial for managing the content that forms the basis of '''Grooper's''' processing tasks, allowing for the storage and retrieval of documents, images, and other "files". Not every object in '''Grooper''' will have files connected to it, but if it does, those files are stored in the location defined by this object.<section end="File Store" />
<section begin="File Store" />{{FileStoreIcon}} '''[[File Store (Object)|File Store]]''' objects define a storage location within '''Grooper''' where file content associated with nodes is saved. They are crucial for managing the content that forms the basis of '''Grooper's''' processing tasks, allowing for the storage and retrieval of documents, images, and other "files". Not every object in '''Grooper''' will have files connected to it, but if it does, those files are stored in the location defined by this object.<section end="File Store" />


=== Form Type ===
=== Form Type ===
<section begin="Form Type" />[[image:GrooperIcon_FormType.png]] '''[[Form Type (Object)|Form Type]]''' objects represent trained variations of a [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Type]]'''.  These objects store [https://en.wikipedia.org/wiki/Machine_learning machine learning] training data for '''''[[Lexical (Classification Method)|Lexical]]''''' and '''''[[Visual (Classification Method)|Visual]]''''' document classification methods.<section end="Form Type" />
<section begin="Form Type" />{{FormTypeIcon}} '''[[Form Type (Object)|Form Type]]''' objects represent trained variations of a {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Type]]'''.  These objects store [https://en.wikipedia.org/wiki/Machine_learning machine learning] training data for '''''[[Lexical (Classification Method)|Lexical]]''''' and '''''[[Visual (Classification Method)|Visual]]''''' document classification methods.<section end="Form Type" />


=== IP Group ===
=== IP Group ===
<section begin="IP Group" />[[image:GrooperIcon_IPGroup.png]] '''[[IP Group (Object)|IP Group]]''' objects are child objects within [[image:GrooperIcon_IPProfile.png]] '''[[IP Profile (Object)|IP Profiles]]''' that create a hierarchical structure for organizing [[Image Processing (Concept)|image processing]] commands. '''IP Groups''' may contain other '''IP Groups''' or [[image:GrooperIcon_IPStep.png]] '''[[IP Step|IP Step]]''' objects.<section end="IP Group" />
<section begin="IP Group" />{{IPGroupIcon}} '''[[IP Group (Object)|IP Group]]''' objects are child objects within {{IPProfileIcon}} '''[[IP Profile (Object)|IP Profiles]]''' that create a hierarchical structure for organizing [[Image Processing (Concept)|image processing]] commands. '''IP Groups''' may contain other '''IP Groups''' or {{IPStepIcon}} '''[[IP Step|IP Step]]''' objects.<section end="IP Group" />


=== IP Profile ===
=== IP Profile ===
<section begin="IP Profile" />[[image:GrooperIcon_IPProfile.png]] '''[[IP Profile|IP Profile]]''' objects detail the operations and parameters for image enhancement and cleanup. These operations improve the accuracy of further processing steps, like  the '''''[[Recognize (Activity)|Recognize]]''''' and '''''[[Classify (Activity)|Classify]]''''' '''''[[Activity (Property)|Activities]]'''''.<section end="IP Profile" />
<section begin="IP Profile" />{{IPProfileIcon}} '''[[IP Profile|IP Profile]]''' objects detail the operations and parameters for image enhancement and cleanup. These operations improve the accuracy of further processing steps, like  the '''''[[Recognize (Activity)|Recognize]]''''' and '''''[[Classify (Activity)|Classify]]''''' '''''[[Activity (Property)|Activities]]'''''.<section end="IP Profile" />


=== IP Step ===
=== IP Step ===
<section begin="IP Step" />[[image:GrooperIcon_IPStep.png]] '''[[IP Step (Object)|IP Step]]''' objects are the basic units within an [[image:GrooperIcon_IPProfile.png]] '''[[IP Profile|IP Profile]]''' that define a single [[Image Processing (Concept)|image processing]] operation. '''IP Steps''' are performed sequentially within their parent [[image:GrooperIcon_IPGroup.png]] '''[[IP Group (Object)|IP Group]]''' or '''IP Profile'''.<section end="IP Step" />
<section begin="IP Step" />{{IPStepIcon}} '''[[IP Step (Object)|IP Step]]''' objects are the basic units within an {{IPProfileIcon}} '''[[IP Profile|IP Profile]]''' that define a single [[Image Processing (Concept)|image processing]] operation. '''IP Steps''' are performed sequentially within their parent {{IPGroupIcon}} '''[[IP Group (Object)|IP Group]]''' or '''IP Profile'''.<section end="IP Step" />


=== Lexicon ===
=== Lexicon ===
<section begin="Lexicon" />[[image:GrooperIcon_Lexicon.png]] '''[[Lexicon (Object)|Lexicon]]''' objects are dictionary objects that store a list of keys or key-value pairs. '''Lexicons''' can define local entries and/or import entries from other '''Lexicons''' and even import entries using a '''Data Connection'''. The entries in a '''Lexicon''' can be utilized in different areas of '''Grooper''', such as [[Data Extraction (Concept)|data extraction]], '''''[[Fuzzy Matching (Property)|Fuzzy Matching]]''''', or [https://en.wikipedia.org/wiki/OCR OCR] '''''[[Correct (Activity)|Correction]]''''', providing a reference point that enhances the accuracy and consistency of the [https://en.wikipedia.org/wiki/Software software's] operations.<section end="Lexicon" />
<section begin="Lexicon" />{{LexiconIcon}} '''[[Lexicon (Object)|Lexicon]]''' objects are dictionary objects that store a list of keys or key-value pairs. '''Lexicons''' can define local entries and/or import entries from other '''Lexicons''' and even import entries using a '''Data Connection'''. The entries in a '''Lexicon''' can be utilized in different areas of '''Grooper''', such as [[Data Extraction (Concept)|data extraction]], '''''[[Fuzzy Matching (Property)|Fuzzy Matching]]''''', or [https://en.wikipedia.org/wiki/OCR OCR] '''''[[Correct (Activity)|Correction]]''''', providing a reference point that enhances the accuracy and consistency of the [https://en.wikipedia.org/wiki/Software software's] operations.<section end="Lexicon" />


=== Machine ===
=== Machine ===
<section begin="Machine" />[[image:GrooperIcon_Machine.png]] '''[[Machine (Object)|Machine]]''' objects represent [https://en.wikipedia.org/wiki/Server_(computing) servers] that have connected to the '''Grooper''' [[Repository (Concept)|repository]]. They allow for the management of [[Grooper Service (Concept)|Grooper Service]] instances and serve as connection points for processing jobs to be executed on the server hardware. '''Machine''' objects are essential for the scaling of processing capabilities and for distributing processing loads across multiple servers.<section end="Machine" />
<section begin="Machine" />{{MachineIcon}} '''[[Machine (Object)|Machine]]''' objects represent [https://en.wikipedia.org/wiki/Server_(computing) servers] that have connected to the '''Grooper''' [[Repository (Concept)|repository]]. They allow for the management of [[Grooper Service (Concept)|Grooper Service]] instances and serve as connection points for processing jobs to be executed on the server hardware. '''Machine''' objects are essential for the scaling of processing capabilities and for distributing processing loads across multiple servers.<section end="Machine" />


=== OCR Profile ===
=== OCR Profile ===
<section begin="OCR Profile" />[[image:GrooperIcon_OCRProfile.png]] '''[[OCR Profile (Object)|OCR Profile]]''' objects configure the settings for optical character recognition ([https://en.wikipedia.org/wiki/Optical_character_recognition OCR]) leveraged by the '''[[Recognize (Activity)|Recognize]]''' activity. OCR converts images of text into machine-encoded text. '''OCR Profile''' objects influence how effectively textual content is '''''[[Recognize (Activity)|recognized]]''''' from [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]'''.<section end="OCR Profile" />
<section begin="OCR Profile" />{{OCRProfileIcon}} '''[[OCR Profile (Object)|OCR Profile]]''' objects configure the settings for optical character recognition ([https://en.wikipedia.org/wiki/Optical_character_recognition OCR]) leveraged by the '''[[Recognize (Activity)|Recognize]]''' activity. OCR converts images of text into machine-encoded text. '''OCR Profile''' objects influence how effectively textual content is '''''[[Recognize (Activity)|recognized]]''''' from {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]'''.<section end="OCR Profile" />


=== Object Library ===
=== Object Library ===
<section begin="Object Library" />[[image:GrooperIcon_ObjectLibrary.png]] '''[[Object Library (Object)|Object Library]]''' objects are [https://en.wikipedia.org/wiki/.NET_Framework .NET] [https://en.wikipedia.org/wiki/Library_(computing) libraries] that contain code files for customizing the functionality of '''Grooper'''. These libraries are used for a range of customization and integration tasks, allowing users to extend '''Grooper's''' capabilities.
<section begin="Object Library" />{{ObjectLibraryIcon}} '''[[Object Library (Object)|Object Library]]''' objects are [https://en.wikipedia.org/wiki/.NET_Framework .NET] [https://en.wikipedia.org/wiki/Library_(computing) libraries] that contain code files for customizing the functionality of '''Grooper'''. These libraries are used for a range of customization and integration tasks, allowing users to extend '''Grooper's''' capabilities.
: Examples include:
: Examples include:
:* Adding custom '''''[[Activity (Property)|activities]]''''' that execute within '''[[Batch Process (Object)|Batch Processes]]'''
:* Adding custom '''''[[Activity (Property)|activities]]''''' that execute within '''[[Batch Process (Object)|Batch Processes]]'''
Line 540: Line 540:


=== Processing Queue ===
=== Processing Queue ===
<section begin="Processing Queue" />[[image:GrooperIcon_ProcessingQueue.png]] '''[[Processing Queue (Object)|Processing Queue]]''' objects are designed for tasks performed by [[image:GrooperIcon_Machine.png]] '''[[Machine (Object)|Machines]]''', which include automated steps in the document processing lifecycle. '''Processing Queues''' are used to distribute machine tasks among different [https://en.wikipedia.org/wiki/Server_(computing) servers] and control the concurrency or processing rate of these tasks.  
<section begin="Processing Queue" />{{ProcessingQueueIcon}} '''[[Processing Queue (Object)|Processing Queue]]''' objects are designed for tasks performed by {{MachineIcon}} '''[[Machine (Object)|Machines]]''', which include automated steps in the document processing lifecycle. '''Processing Queues''' are used to distribute machine tasks among different [https://en.wikipedia.org/wiki/Server_(computing) servers] and control the concurrency or processing rate of these tasks.  
* For example, activities such as '''''[[Render (Activity)|Render]]''''' or '''''[[Export (Activity)|Export]]''''' can be managed so that only one activity instance runs per machine or so multiple instances are processed concurrently, according to the queue configuration.<section end="Processing Queue" />
* For example, activities such as '''''[[Render (Activity)|Render]]''''' or '''''[[Export (Activity)|Export]]''''' can be managed so that only one activity instance runs per machine or so multiple instances are processed concurrently, according to the queue configuration.<section end="Processing Queue" />


=== Project ===
=== Project ===
<section begin="Project" />[[image:GrooperIcon_Project.png]] '''[[Project (Object)|Project]]''' objects are collections of resources and serve as the primary containers for design components within '''Grooper'''. The '''Project''' object is where various processing objects such as [[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Models]]''', [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Processes]]''', [[Object Nomenclature (Concept)#Profile_Objects|Profile Objects]], and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.<section end="Project" />
<section begin="Project" />{{ProjectIcon}} '''[[Project (Object)|Project]]''' objects are collections of resources and serve as the primary containers for design components within '''Grooper'''. The '''Project''' object is where various processing objects such as {{ContentModelIcon}} '''[[Content Model (Object)|Content Models]]''', {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Processes]]''', [[Object Nomenclature (Concept)#Profile_Objects|Profile Objects]], and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.<section end="Project" />


=== Resource File ===
=== Resource File ===
<section begin="Resource File" />A '''[[Resource File (Object)|Resource File]]''' object in '''Grooper''' is essentially a file that is stored as part of a '''Grooper''' [[image:GrooperIcon_Project.png]] '''[[Project (Object)|Project]]'''. It can include various types of files such as text files or [[XML Schema Integration|XML schema files]].<section end="Resource File" />
<section begin="Resource File" />A '''[[Resource File (Object)|Resource File]]''' object in '''Grooper''' is essentially a file that is stored as part of a '''Grooper''' {{ProjectIcon}} '''[[Project (Object)|Project]]'''. It can include various types of files such as text files or [[XML Schema Integration|XML schema files]].<section end="Resource File" />


=== Review Queue ===
=== Review Queue ===
<section begin="Review Queue" />[[image:GrooperIcon_ReviewQueue.png]] '''[[Review Queue (Object)|Review Queue]]''' objects are designated for human-performed tasks. They organize the '''''[[Review (Activity)|Review]]''''' tasks that require human attention and can distribute these tasks among different groups of users based on the queue's settings. '''Review Queues''' can be assigned at the [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' level to filter work by an entire process, or to '''''[[Review (Activity)|Review]]''''' '''''[[Activity (Property)|Activities]]''''' at the [[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Step]]''' level to filter tasks at a more granular step-based level.<section end="Review Queue" />
<section begin="Review Queue" />{{ReviewQueueIcon}} '''[[Review Queue (Object)|Review Queue]]''' objects are designated for human-performed tasks. They organize the '''''[[Review (Activity)|Review]]''''' tasks that require human attention and can distribute these tasks among different groups of users based on the queue's settings. '''Review Queues''' can be assigned at the {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Process]]''' level to filter work by an entire process, or to '''''[[Review (Activity)|Review]]''''' '''''[[Activity (Property)|Activities]]''''' at the {{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Step]]''' level to filter tasks at a more granular step-based level.<section end="Review Queue" />


=== Root ===
=== Root ===
<section begin="Root" />The [[image:GrooperIcon_GrooperRoot.png]] '''[[Root (Object)|Root]]''' object represents the topmost element of the '''Grooper''' [[Grooper Repository (Concept)|repository]]. It serves as the starting point from which all other objects branch out. It is the anchor point for all other structures within the repository and a necessary element for the organization and linkage of all other objects within '''Grooper'''.<section end="Root" />
<section begin="Root" />The {{GrooperRootIcon}} '''[[Root (Object)|Root]]''' object represents the topmost element of the '''Grooper''' [[Grooper Repository (Concept)|repository]]. It serves as the starting point from which all other objects branch out. It is the anchor point for all other structures within the repository and a necessary element for the organization and linkage of all other objects within '''Grooper'''.<section end="Root" />


=== Scanner Profile ===
=== Scanner Profile ===
<section begin="Scanner Profile" />[[image:GrooperIcon_ScannerProfile.png]] '''[[Scanner Profile (Object)|Scanner Profile]]''' objects outline the specifications for [https://en.wikipedia.org/wiki/Image_scanner scanning] physical documents into digital forms. This includes settings like [https://en.wikipedia.org/wiki/Image_resolution resolution], [https://en.wikipedia.org/wiki/Color_model color mode], and any post-scan image processing or enhancement functions.
<section begin="Scanner Profile" />{{ScannerProfileIcon}} '''[[Scanner Profile (Object)|Scanner Profile]]''' objects outline the specifications for [https://en.wikipedia.org/wiki/Image_scanner scanning] physical documents into digital forms. This includes settings like [https://en.wikipedia.org/wiki/Image_resolution resolution], [https://en.wikipedia.org/wiki/Color_model color mode], and any post-scan image processing or enhancement functions.


See [[Desktop Scanning in Grooper]] for more information.<section end="Scanner Profile" />
See [[Desktop Scanning in Grooper]] for more information.<section end="Scanner Profile" />


=== Separation Profile ===
=== Separation Profile ===
<section begin="Separation Profile" />[[image:GrooperIcon_SeparationProfile.png]] '''[[Separation Profile (Object)|Separation Profile]]''' objects contain rules and settings that determine how groupings of scanned pages are '''''[[Separate (Activity)|separated]]''''' into individual [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', often using [https://en.wikipedia.org/wiki/Barcode barcodes], blank pages, or [https://en.wikipedia.org/wiki/Patch_Code patch codes] as indicators for separation points.<section end="Separation Profile" />
<section begin="Separation Profile" />{{SeparationProfileIcon}} '''[[Separation Profile (Object)|Separation Profile]]''' objects contain rules and settings that determine how groupings of scanned pages are '''''[[Separate (Activity)|separated]]''''' into individual {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', often using [https://en.wikipedia.org/wiki/Barcode barcodes], blank pages, or [https://en.wikipedia.org/wiki/Patch_Code patch codes] as indicators for separation points.<section end="Separation Profile" />


=== Value Reader ===
=== Value Reader ===
<section begin="Value Reader" />[[image:GrooperIcon_ValueReader.png]] '''[[Value Reader (Object)|Value Reader]]''' objects define a single [[Data Extraction (Concept)|data extraction]] operation. You set the '''''[[Extractor Type (Property)|Extractor Type]]''''' on the '''Value Reader''' that matches the specific data you're aiming to capture. For example, you would use the '''''[[Pattern Match (Extractor Type)|Pattern Match]]''''' '''''Extractor Type''''' to return data using [https://en.wikipedia.org/wiki/Regular_expression regular expression]. You would use a '''Value Reader''' when you need to extract a single result or list of simple results from a document.<section end="Value Reader" />
<section begin="Value Reader" />{{ValueReaderIcon}} '''[[Value Reader (Object)|Value Reader]]''' objects define a single [[Data Extraction (Concept)|data extraction]] operation. You set the '''''[[Extractor Type (Property)|Extractor Type]]''''' on the '''Value Reader''' that matches the specific data you're aiming to capture. For example, you would use the '''''[[Pattern Match (Extractor Type)|Pattern Match]]''''' '''''Extractor Type''''' to return data using [https://en.wikipedia.org/wiki/Regular_expression regular expression]. You would use a '''Value Reader''' when you need to extract a single result or list of simple results from a document.<section end="Value Reader" />
</div>
</div>
== Property ==
== Property ==
Line 579: Line 579:


=== Content Type Filter ===
=== Content Type Filter ===
<section begin="Content Type Filter" />The '''''[[Content Type Filter (Property)|Content Type Filter]]''''' property restricts '''''[[Activity (Property)|Activities]]''''' to specific [[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Categories]]''' and/or [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Types]]'''.<section end="Content Type Filter" />
<section begin="Content Type Filter" />The '''''[[Content Type Filter (Property)|Content Type Filter]]''''' property restricts '''''[[Activity (Property)|Activities]]''''' to specific {{ContentCategoryIcon}} '''[[Content Category (Object)|Content Categories]]''' and/or {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Types]]'''.<section end="Content Type Filter" />


=== Document Quoting ===
=== Document Quoting ===
Line 588: Line 588:


=== Output Extractor Key ===
=== Output Extractor Key ===
<section begin="Output Extractor Key" />The '''''[[Output Extractor Key (Property)|Output Extractor Key]]''''' property is another weapon in the arsenal of powerful '''Grooper''' [[Classification (Concept)|classification]] techniques.  It allows [[image:GrooperIcon_DataType.png]] '''[[Data Type (Object)|Data Types]]''' to return results normalized in a way more beneficial to document classification.<section end="Output Extractor Key" />
<section begin="Output Extractor Key" />The '''''[[Output Extractor Key (Property)|Output Extractor Key]]''''' property is another weapon in the arsenal of powerful '''Grooper''' [[Classification (Concept)|classification]] techniques.  It allows {{DataTypeIcon}} '''[[Data Type (Object)|Data Types]]''' to return results normalized in a way more beneficial to document classification.<section end="Output Extractor Key" />


=== Paragraph Marking ===
=== Paragraph Marking ===
Line 597: Line 597:


=== Permission Sets ===
=== Permission Sets ===
<section begin="Permission Sets" />A '''''[[Permission Sets (Property)|Permission Set]]''''' is a property that allows you to restrict user access to repositories, pages, and certain activities. This helps eliminate the possibility of an unauthorized individual from editing or deleting information or [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batches]]'''.<section end="Permission Sets" />
<section begin="Permission Sets" />A '''''[[Permission Sets (Property)|Permission Set]]''''' is a property that allows you to restrict user access to repositories, pages, and certain activities. This helps eliminate the possibility of an unauthorized individual from editing or deleting information or {{BatchIcon}} '''[[Batch (Object)|Batches]]'''.<section end="Permission Sets" />


=== Preprocessing ===
=== Preprocessing ===
Line 603: Line 603:


=== Scope ===
=== Scope ===
<section begin="Scope" />The '''''[[Scope (Property)|Scope]]''''' property of a [[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Step]]''', as it relates to an '''''[[Activity (Property)|Activity]]''''', determines at which level in a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' hierarchy the '''''Activity''''' runs.<section end="Scope" />
<section begin="Scope" />The '''''[[Scope (Property)|Scope]]''''' property of a {{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Step]]''', as it relates to an '''''[[Activity (Property)|Activity]]''''', determines at which level in a {{BatchIcon}} '''[[Batch (Object)|Batch]]''' hierarchy the '''''Activity''''' runs.<section end="Scope" />


=== Secondary Types ===
=== Secondary Types ===
<section begin="Secondary Types" />'''''[[Secondary Types (Property)|Secondary Types]]''''' allow the application of multiple '''[[Content Type (Concept)|Content Types]]''' to a single [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Secondary Types" />
<section begin="Secondary Types" />'''''[[Secondary Types (Property)|Secondary Types]]''''' allow the application of multiple '''[[Content Type (Concept)|Content Types]]''' to a single {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder]]'''.<section end="Secondary Types" />


=== Tab Marking ===
=== Tab Marking ===
Line 622: Line 622:
</div>
</div>
== Section Extract Method ==
== Section Extract Method ==
<section begin="Section Extract Method" />The '''''Extract Method''''' property of a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' defines a "Section Extract Method" which specifies how section instances will be identified and extracted.<section end="Section Extract Method" />
<section begin="Section Extract Method" />The '''''Extract Method''''' property of a {{DataSectionIcon}} '''[[Data Section (Object)|Data Section]]''' defines a "Section Extract Method" which specifies how section instances will be identified and extracted.<section end="Section Extract Method" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Nested Table ===
=== Nested Table ===
<section begin="Nested Table" />'''''[[Nested Table (Section Extract Method)|Nested Table]]''''' is a "Section Extract Method" enabled for a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' using the '''''Extract Method''''' property. This method divides a document into sections by extracting table data within those sections. This gives '''Grooper''' users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections have the same table (or at least tabular data which can be extracted by a single [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' object).<section end="Nested Table" />
<section begin="Nested Table" />'''''[[Nested Table (Section Extract Method)|Nested Table]]''''' is a "Section Extract Method" enabled for a {{DataSectionIcon}} '''[[Data Section (Object)|Data Section]]''' using the '''''Extract Method''''' property. This method divides a document into sections by extracting table data within those sections. This gives '''Grooper''' users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections have the same table (or at least tabular data which can be extracted by a single {{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''' object).<section end="Nested Table" />


=== Transaction Detection ===
=== Transaction Detection ===
<section begin="Transaction Detection" />'''''[[Transaction Detection (Section Extract Method)|Transaction Detection]]''''' is a [[image:GrooperIcon_DataSection.png]] '''[[Data Section (Object)|Data Section]]''' '''''Extract Method'''''.  This [[Data Extraction (Concept)|extraction]] method produces section instances by detecting repeating patterns of text around the '''Data Section's''' child [[image:GrooperIcon_DataField.png]] '''[[Data Field (Object)|Data Fields]]'''.<section end="Transaction Detection" />
<section begin="Transaction Detection" />'''''[[Transaction Detection (Section Extract Method)|Transaction Detection]]''''' is a {{DataSectionIcon}} '''[[Data Section (Object)|Data Section]]''' '''''Extract Method'''''.  This [[Data Extraction (Concept)|extraction]] method produces section instances by detecting repeating patterns of text around the '''Data Section's''' child {{DataFieldIcon}} '''[[Data Field (Object)|Data Fields]]'''.<section end="Transaction Detection" />
</div>
</div>
== Separation Provider ==
== Separation Provider ==
Line 634: Line 634:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Change in Value Separation ===
=== Change in Value Separation ===
<section begin="Change in Value Separation" />The '''''[[Change in Value Separation (Separation Provider)|Change in Value]]''''' '''''[[Separation Provider (Property)|Separation Provider]]''''' creates a new folder and separates every time an extracted value changes from one [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page]]''' to another.<section end="Change in Value Separation" />
<section begin="Change in Value Separation" />The '''''[[Change in Value Separation (Separation Provider)|Change in Value]]''''' '''''[[Separation Provider (Property)|Separation Provider]]''''' creates a new folder and separates every time an extracted value changes from one {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page]]''' to another.<section end="Change in Value Separation" />


=== Control Sheet Separation ===
=== Control Sheet Separation ===
<section begin="Control Sheet Separation" />'''''[[Control Sheet Separation (Separation Provider)|Control Sheet Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]''''' that uses '''Grooper''' [[image:GrooperIcon_ControlSheet.png]] '''Control Sheets''' to separate documents.<section end="Control Sheet Separation" />
<section begin="Control Sheet Separation" />'''''[[Control Sheet Separation (Separation Provider)|Control Sheet Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]''''' that uses '''Grooper''' {{ControlSheetIcon}} '''Control Sheets''' to separate documents.<section end="Control Sheet Separation" />


=== EPI Separation ===
=== EPI Separation ===
Line 643: Line 643:


=== ESP Auto Separation ===
=== ESP Auto Separation ===
<section begin="ESP Auto Separation" />'''''[[ESP Auto Separation (Separation Provider)|ESP Auto Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]''''' used for document separation.  It is unique in that it ''both'' separates ''and'' classifies documents at the same time.  It uses page-level classification training examples (among other things) to determine where to insert document folders in a [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]'''.<section end="ESP Auto Separation" />
<section begin="ESP Auto Separation" />'''''[[ESP Auto Separation (Separation Provider)|ESP Auto Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]''''' used for document separation.  It is unique in that it ''both'' separates ''and'' classifies documents at the same time.  It uses page-level classification training examples (among other things) to determine where to insert document folders in a {{BatchIcon}} '''[[Batch (Object)|Batch]]'''.<section end="ESP Auto Separation" />


=== Event-Based Separation ===
=== Event-Based Separation ===
Line 655: Line 655:


=== Undo Separation ===
=== Undo Separation ===
<section begin="Undo Separation" />'''''[[Undo Separation (Separation Provider)|Undo Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]'''''.  Instead of putting loose [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Pages]]''' into [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folders]]''', this '''''Separation Provider''''' removes '''Batch Folders''', leaving only loose pages.<section end="Undo Separation" />
<section begin="Undo Separation" />'''''[[Undo Separation (Separation Provider)|Undo Separation]]''''' is a '''''[[Separation Provider (Property)|Separation Provider]]'''''.  Instead of putting loose {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Pages]]''' into {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folders]]''', this '''''Separation Provider''''' removes '''Batch Folders''', leaving only loose pages.<section end="Undo Separation" />
</div>
</div>
== Service ==
== Service ==
Line 661: Line 661:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== API Services ===
=== API Services ===
<section begin="API Services" />You can perform [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' processing via [https://en.wikipedia.org/wiki/REST REST] [https://en.wikipedia.org/wiki/API API] web calls by installing  '''''[[API Services (Service)|API Services]]'''''.<section end="API Services" />
<section begin="API Services" />You can perform {{BatchIcon}} '''[[Batch (Object)|Batch]]''' processing via [https://en.wikipedia.org/wiki/REST REST] [https://en.wikipedia.org/wiki/API API] web calls by installing  '''''[[API Services (Service)|API Services]]'''''.<section end="API Services" />


=== Activity Processing ===
=== Activity Processing ===
<section begin="Activity Processing Service" />'''[[Activity Processing (Service)|Activity Processing]]''' is a [[Grooper Service (Concept)|Grooper Service]] that executes '''[[Activity (Property)|Activities]]''' assigned to [[image:GrooperIcon_BatchProcessStep.png]] '''[[Batch Process Step (Object)|Batch Process Steps]]''' in a [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]'''. This allows '''Grooper''' to automate '''Batch Steps''' that do not require a human operator.<section end="Activity Processing Service" />
<section begin="Activity Processing Service" />'''[[Activity Processing (Service)|Activity Processing]]''' is a [[Grooper Service (Concept)|Grooper Service]] that executes '''[[Activity (Property)|Activities]]''' assigned to {{BatchProcessStepIcon}} '''[[Batch Process Step (Object)|Batch Process Steps]]''' in a {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Process]]'''. This allows '''Grooper''' to automate '''Batch Steps''' that do not require a human operator.<section end="Activity Processing Service" />


=== Grooper Licensing ===
=== Grooper Licensing ===
Line 674: Line 674:
</div>
</div>
== Table Extract Method ==
== Table Extract Method ==
<section begin="Table Extract Method" />The '''''[[Table Extract Method (Property)|Extract Method]]''''' property of a [[image:GrooperIcon_DataTable.png]] '''[[Data Table (Object)|Data Table]]''' sets a '''''Table Extract Method''''' which defines the settings and logic for the '''Data Table''' to perform [[Data Extraction (Concept)|extraction]].<section end="Table Extract Method" />
<section begin="Table Extract Method" />The '''''[[Table Extract Method (Property)|Extract Method]]''''' property of a {{DataTableIcon}} '''[[Data Table (Object)|Data Table]]''' sets a '''''Table Extract Method''''' which defines the settings and logic for the '''Data Table''' to perform [[Data Extraction (Concept)|extraction]].<section end="Table Extract Method" />
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Delimited Extract ===
=== Delimited Extract ===
Line 680: Line 680:


=== Fluid Layout ===
=== Fluid Layout ===
<section begin="Fluid Layout" />The '''''[[Fluid Layout (Table Extract Method)|Fluid Layout]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' will choose between '''''[[Tabular Layout (Table Extract Method)|Tabular Layout]]''''' and '''''Flow Layout''''' configurations, depending on how labels are collected for a [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Type]]'''.<section end="Fluid Layout" />
<section begin="Fluid Layout" />The '''''[[Fluid Layout (Table Extract Method)|Fluid Layout]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' will choose between '''''[[Tabular Layout (Table Extract Method)|Tabular Layout]]''''' and '''''Flow Layout''''' configurations, depending on how labels are collected for a {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Type]]'''.<section end="Fluid Layout" />


=== Grid Layout ===
=== Grid Layout ===
Line 689: Line 689:


=== Tabular Layout ===
=== Tabular Layout ===
<section begin="Tabular Layout" />The '''''[[Tabular Layout (Table Extract Method)|Tabular Layout]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' uses column header values determined by the [[image:GrooperIcon_DataColumn.png]] '''[[Data Column (Object)|Data Columns]]''''' '''''Header Extractor''''' results (or labels collected for the '''Data Columns''' when a '''''[[Labeling Behavior (Behavior)|Labeling Behavior]]''''' is enabled) as well as '''Data Column''' '''''Value Extractor''''' results to model a table's structure and return its values.<section end="Tabular Layout" />
<section begin="Tabular Layout" />The '''''[[Tabular Layout (Table Extract Method)|Tabular Layout]]''''' '''''[[Table Extract Method (Property)|Table Extract Method]]''''' uses column header values determined by the {{DataColumnIcon}} '''[[Data Column (Object)|Data Columns]]''''' '''''Header Extractor''''' results (or labels collected for the '''Data Columns''' when a '''''[[Labeling Behavior (Behavior)|Labeling Behavior]]''''' is enabled) as well as '''Data Column''' '''''Value Extractor''''' results to model a table's structure and return its values.<section end="Tabular Layout" />
</div>
</div>
== UI Element ==
== UI Element ==
Line 695: Line 695:
<div style="padding-left: 1.5em;">
<div style="padding-left: 1.5em;">
=== Document Viewer ===
=== Document Viewer ===
<section begin="Document Viewer" />The [[Document Viewer (UI Element)|Grooper Document Viewer]] is the portal to your documents. It is the UI that allows you to see a [[image:GrooperIcon_BatchFolder.png]] '''[[Batch Folder (Object)|Batch Folder's]]''' (or a [[image:GrooperIcon_BatchPage.png]] '''[[Batch Page (Object)|Batch Page's]]''') image, text content, and more.<section end="Document Viewer" />
<section begin="Document Viewer" />The [[Document Viewer (UI Element)|Grooper Document Viewer]] is the portal to your documents. It is the UI that allows you to see a {{BatchFolderIcon}} '''[[Batch Folder (Object)|Batch Folder's]]''' (or a {{BatchPageIcon}} '''[[Batch Page (Object)|Batch Page's]]''') image, text content, and more.<section end="Document Viewer" />


=== Node Tree ===
=== Node Tree ===
Line 704: Line 704:


=== Summary Tabs ===
=== Summary Tabs ===
<section begin="Summary Tabs" />[[image:GrooperIcon_ContentModel.png]] '''[[Content Model (Object)|Content Models]]''' and [[image:GrooperIcon_ContentCategory.png]] '''[[Content Category (Object)|Content Categories]]''' have a [[Summary Tabs (UI Element)|Summary]] tab where you can view "Descendant Node Types", [[image:GrooperIcon_DocumentType.png]] '''[[Document Type (Object)|Document Types]]''', and '''[[Expressions (Concept)|Expressions]]'''.<section end="Summary Tabs" />
<section begin="Summary Tabs" />{{ContentModelIcon}} '''[[Content Model (Object)|Content Models]]''' and {{ContentCategoryIcon}} '''[[Content Category (Object)|Content Categories]]''' have a [[Summary Tabs (UI Element)|Summary]] tab where you can view "Descendant Node Types", {{DocumentTypeIcon}} '''[[Document Type (Object)|Document Types]]''', and '''[[Expressions (Concept)|Expressions]]'''.<section end="Summary Tabs" />
</div>
</div>


Line 710: Line 710:


=== URL Endpoints for Review ===
=== URL Endpoints for Review ===
<section begin="URL Endpoints for Review" />Three different URL endpoints can be used to open '''''[[Review (Activity)|Review]]''''' tasks in the '''[[Web Client (Application)|Grooper Web Client]]''', given certain information like the '''Grooper''' '''''Repository ID''''', [[image:GrooperIcon_BatchProcess.png]] '''[[Batch Process (Object)|Batch Process]]''' name, [[image:GrooperIcon_Batch.png]] '''[[Batch (Object)|Batch]]''' '''''Id''''' and more. This allows Grooper users to link directly to a '''Batch''' in '''''Review''''' with a URL.<section end="URL Endpoints for Review" />
<section begin="URL Endpoints for Review" />Three different URL endpoints can be used to open '''''[[Review (Activity)|Review]]''''' tasks in the '''[[Web Client (Application)|Grooper Web Client]]''', given certain information like the '''Grooper''' '''''Repository ID''''', {{BatchProcesslIcon}} '''[[Batch Process (Object)|Batch Process]]''' name, {{BatchIcon}} '''[[Batch (Object)|Batch]]''' '''''Id''''' and more. This allows Grooper users to link directly to a '''Batch''' in '''''Review''''' with a URL.<section end="URL Endpoints for Review" />

Revision as of 15:56, 24 July 2024

Activity

Activity is a property on Batch Process Step objects. Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page.

Batch Process Steps configured with specific Activities are frequently referred to by the name of the Activity followed by the word "step". For example: Classify Step.

Apply Rules

Apply Rules is an Activity that runs Data Rules on data that has already been extracted from a Batch. A Batch Process Step configured with the Apply Rules Activity will always need to be preceded by a Batch Process Step configured with the Extract Activity.

Classify

Classify is an Activity that "classifies" Batch Folders in a Batch by assigning them a Content Type using patterns, lexical understanding, or rules as defined by a Content Model.

Clip Frames

The Clip Frames Activity extracts defined areas from microfiche card images, creating new image frames or layers for focused analysis or processing.

Correct

The Correct Activity performs spell correction on the textual content of Batch Folders or specific Data Elements, enhancing the accuracy of data extraction by resolving recognition errors.

Detect Frames

The Detect Frames Activity locates and identifies frame lines on microfiche card images, enabling the isolation of areas within the frames for further data extraction or processing.

Execute

The Execute Activity runs a specified child command, allowing for the modular and controlled execution of tasks within a larger automated workflow.

Export

The Export Activity facilitates the transfer of documents and extracted information to external systems or formats, completing the data processing workflow.

Extract

The Extract Activity retrieves relevant information, defined by Data Elements, from Batch Folders, transforming unstructured or semi-structured content into structured, usable data.

Image Processing

The Image Processing Activity enhances and optimizes Batch Pages for better recognition and data extraction results.

Initialize Card

The Initialize Card Activity prepares and configures microfiche card images for further processing.

Merge

The Merge Activity creates a document from the Page objects in your Batch and saves it to a Batch Folder.

Recognize

The Recognize Activity interprets Batch Pages and Batch Folders, converting them into machine-readable text and capturing layout data for comprehensive analysis and data extraction. This will attach a text and/or layoutData file to the respective object.

Redact

The Redact Activity hides or "redacts" text information on the image or PDF of a document based on data returned from a configured Extractor. It does not alter the text data attached to the image or PDF.

Render

The Render Activity normalizes electronic document content from file formats Grooper cannot read innately to a PDF format. This allows Grooper to extract the text via the Recognize Activity.

Review

The Review Activity facilitates human evaluation and validation of processed Batch Folders and extracted data for accuracy and completeness.

Send Mail

The Send Mail Activity automates the dispatch of emails with or without attachments, based on Batch Process events and conditions.

Separate

The Separate Activity sorts Batch Pages into individual Batch Folders, distinguishing them for independent processing and organization.

Split Pages

Multi-page documents (typically PDFs and TIFFs) come into Grooper represented as single Batch Folders. The Split Pages Activity exposes Batch Pages as child objects of the Batch Folders for individualized processing and handling.

XML Transform

The XML Transform Activity applies XSLT stylesheets to XML data to modify or reformat the output structure for various purposes.

Application

A Grooper repository consists of a series of tables in a database and a File Store containing the files associated with objects in that database. A Grooper application is the interface through which a user can interact with that repository in an intuitive way.

Grooper Command Console

The Grooper Command Console is a command-line interface that performs system configuration and administration tasks within Grooper.

Web Client

The Grooper Web Client allows users to connect to Grooper via a web browser using a URL. The URL is pointed at a website hosted by a server on which Grooper is installed and Internet Information Services configured.

Behavior

Behaviors is a property of Content Types and Export Activities that defines configurable actions that automate processing tasks based on the identified Content Type of a Batch Folder.

Export Behavior

An Export Behavior defines the conditions and actions for exporting Batch Folders and their associated data from Grooper to other systems.

Labeling Behavior

A Labeling Behavior is a Content Type Behavior designed to collect and utilize a document's field labels in a variety of ways. This includes functionality for Classification and Extraction.

PDF Data Mapping

PDF Data Mapping is a Content Type Behavior designed to create an exportable PDF file with additional native PDF elements.

CMIS Connection Type

CMIS Connection Type, or "binding", establishes the communication protocols used to connect Grooper with content management systems adhering to the CMIS standard.

AppXtender

The AppXtender CMIS Connection Type, or "binding", connects Grooper to the ApplicationXtender content management system for import and export operations.

Box

The Box CMIS Connection Type, or "binding", connects Grooper to the Box content management system for import and export operations.

Exchange

The Exchange CMIS Connection Type, or "binding", connects Grooper to a Microsoft Exchange mail server for import and export operations.

FTP

The FTP CMIS Connection Type, or "binding", connects Grooper to FTP directories for import and export operations.

IMAP

The IMAP CMIS Connection Type, or "binding", connects Grooper to email messages and folders through an IMAP email server.

NTFS

The NTFS CMIS Connection Type, or "binding", connects Grooper to files and folders in the Microsoft Windows NTFS file system.

OneDrive

The OneDrive CMIS Connection Type, or "binding", connects Grooper to Microsoft OneDrive cloud services.

SFTP

The SFTP CMIS Connection Type, or "binding", connects Grooper to SFTP directories for import and export operations.

SharePoint

The SharePoint CMIS Connection Type, or "binding", connects Grooper to Microsoft SharePoint, providing access to content stored in "document libraries" and "picture libraries".

Classification Method

The Classification Method property determines the technique used for document classification within a Content Model, enabling the sorting of Batch Folders into categories based on their content or structure. It can utilize pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.

GPT Embeddings

The GPT Embeddings Classification Method is an OpenAI GPT training-based classification approach that uses "embeddings" to tell one document from another.
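
As a rough illustration of how embeddings can tell one document from another, the sketch below compares a document vector against one example vector per Document Type using cosine similarity. This is an assumption for illustration, not Grooper's implementation: the function names and the three-dimensional vectors are hypothetical, and real embeddings would come from an embedding model and have far more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_type(doc_vec, examples):
    """Return the name whose example vector is most similar to doc_vec."""
    return max(examples, key=lambda name: cosine(doc_vec, examples[name]))

# Hypothetical, hand-written vectors standing in for trained examples.
examples = {
    "Invoice": [0.9, 0.1, 0.0],
    "Resume": [0.1, 0.8, 0.3],
}
best = nearest_type([0.8, 0.2, 0.1], examples)
```

Here `best` is the type whose example vector points in the direction closest to the document's vector.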

Labelset-Based

Labelset-Based is a Classification Method that leverages the labels defined via a Labeling Behavior to classify folder Batch Folders.

Lexical

The Lexical Classification Method classifies folder Batch Folders based on their text content by utilizing either pre-configured training or rules. This is achieved through the analysis of word frequencies or defined rules that identify document types.

Rules-Based

The Rules-Based Classification Method employs defined "rules" on description Document Types to classify folder Batch Folders, utilizing Positive Extractor and Negative Extractor properties to accurately categorize them through rule application, thereby ensuring Batch Folders match predefined criteria.

Visual

The Visual Classification Method uses image data instead of text data to determine the description Document Type assigned to a folder Batch Folder during classification. Instead of using text-based extractors, an perm_media IP Profile is used with an Extract Features IP Command to obtain data pertaining to a Batch Folder's image(s). Document samples are trained as examples of a Document Type.

Collation Provider

The Collation property of a pin Data Type defines the method for converting its raw results into a final result set, governing how lists of matches from the Data Type are combined and interpreted to produce the output data of the Data Type.

AND

The AND Collation Provider of a pin Data Type returns results only when each individual extractor specified within it gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.
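The logical "AND" behavior described above can be sketched in a few lines of Python. This is only an illustration, not Grooper's implementation: the function name `and_collate` and the use of plain regular expressions as stand-in extractors are assumptions made for the example.

```python
import re

def and_collate(text, patterns):
    """Return all hits only if every pattern matches at least once; otherwise return []."""
    # One list of hits per (hypothetical) extractor.
    hits_per_extractor = [re.findall(p, text) for p in patterns]
    # An empty list is falsy, so a single extractor with no hits fails the AND.
    if all(hits_per_extractor):
        return [hit for hits in hits_per_extractor for hit in hits]
    return []

text = "Invoice No. 4521 dated 2024-03-15"
both = and_collate(text, [r"\d{4}-\d{2}-\d{2}", r"Invoice"])
missing = and_collate(text, [r"\d{4}-\d{2}-\d{2}", r"Receipt"])
```

Here `both` contains the hits from both patterns, while `missing` is empty because the second pattern never matches.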

Array

The Array Collation Provider of a pin Data Type matches a list of values arranged in horizontal, vertical, or flow order, combining instances that qualify into a single result.

Combine

The Combine Collation Provider of a pin Data Type combines instances from returned results based on a specified grouping, controlling how extractor results are assembled together for output.

Key-Value List

The Key-Value List Collation Provider of a pin Data Type matches instances where a key and a list of one or more values appear together on the document, adhering to a specific layout pattern.

Key-Value Pair

The Key-Value Pair Collation Provider of a pin Data Type matches instances where a key is paired with a value on the document in a specific layout, essential for extracting label-value pairs.

Multi-Column

The Multi-Column Collation Provider of a pin Data Type combines multiple columns on a page into a single column for extraction.

Ordered Array

The Ordered Array Collation Provider of a pin Data Type finds sequences of values where one result is present for each extractor, in the order they appear.

Pattern-Based

The Pattern-Based Collation Provider of a pin Data Type uses regular expressions to sequence returned results into a final result set.

Split

The Split Collation Provider of a pin Data Type separates a data instance at each match returned by the Data Type.

Concept

There are many objects and properties a user can configure in Grooper. However, knowing how, why, and when to use these objects and properties depends on understanding the underlying concepts that define what they are doing and why.

Activity Processing

Activity Processing is a conceptual term that refers to the execution of a sequence of configured tasks, such as classification, extraction, or data enhancement on documents, which are performed within a Batch Process to transform raw data from documents into structured and actionable information.

CMIS+

CMIS+ is a conceptual term that refers to Grooper's CMIS+ architecture, which provides standardized access to document content and metadata across a variety of external storage platforms.

CMIS

CMIS is a conceptual term that refers to Content Management Interoperability Services: an open standard allowing different content management systems to share information over the Internet.

CMIS Query

CMIS Query is a conceptual term that refers to the queries used to search documents in CMIS Repositories and to filter documents upon import when using the Import Query Results Import Provider.

CSS Data Viewer Styling

CSS Data Viewer Styling refers to using CSS to custom style the Review activity's Data Viewer interface. This gives you a great deal of control over a data_table Data Model's appearance and layout during document review.

Classification

Classification is a conceptual term that refers to the process of identifying and organizing documents into categorical types based on their content or layout, often using machine learning, rules, or pattern recognition for efficient document management and data extraction workflows. Specifically, the Classify Activity will assign a Content Type to a folder Batch Folder.

Code Expressions

Code Expressions (not to be confused with regular expressions) is a conceptual term that refers to snippets of VB.Net code that expand Grooper's core functionality.

Combined Methods

Combined Methods is a conceptual term that refers to the idea that a user can leverage multiple Classification Methods to overcome the shortcomings of an individual method.

Content Type

Content Type is a conceptual term that refers to the grouping of three Grooper objects: stacks Content Models, collections_bookmark Content Categories, and description Document Types.

Data Context

Data Context is a conceptual term that gives definition to data that, without it, is otherwise meaningless.

Data Element

Data Element is a conceptual term that refers to the grouping of five Grooper objects: data_table Data Models, insert_page_break Data Sections, variables Data Fields, table Data Tables, and view_column Data Columns.

Data Extraction

Data Extraction is a conceptual term that involves identifying and capturing specific information from folder Batch Folders like forms or invoices using a set of configurable Data Extractors, which transform unstructured or semi-structured data into a structured, usable format for processing and analysis.

Data Extractor

Data Extractor (or just "extractor") refers to all Extractor Types and extractor objects. Extractors define the logic used to return data from a document's text content, including general data (such as a date) and specific data (such as an agreement date on a contract).

Data Instance

Data Instance is a conceptual term that refers to an encapsulation of text data within a document. Data instances are the hierarchy of text data that Grooper's extraction mechanisms create.

EDI Integration

EDI Integration is a conceptual term that refers to Grooper's ability to process EDI files.

Expressions

Expressions (not to be confused with regular expressions) is a conceptual term that refers to snippets of VB.Net code that expand Grooper's core functionality.

Expressions Cookbook

Expressions Cookbook is a conceptual term that refers to a reference list for commonly used expressions in Grooper.

Field Mapping

Field Mapping is a conceptual term that refers to how logical connections are made between metadata content in Grooper and an external storage platform.

Five Phases of Grooper

Five Phases of Grooper is a conceptual term that seeks to build understanding of how documents are processed through Grooper.

Flow Collation

Flow Collation is a conceptual term used to define a type of layout used in Collation Providers of pin Data Types.

Footer Rows and Footer Modes

Footer Rows and Footer Modes is a conceptual term that refers to how a "footer row" (enabled by the Generate Footer Row property of a table Data Table) provides Grooper users a quick way to validate numerical data in a view_column Data Column. The Data Column's Footer Mode property controls if and how a total is determined for numerical values in a Data Column.

Fuzzy RegEx

Fuzzy RegEx is a conceptual term that refers to the use of fuzzy logic within Extractor Types that leverage regular expressions to match patterns, enabled via the Fuzzy Matching property.

GPT Integration

GPT Integration is a conceptual term that refers to the usage of OpenAI's GPT models within Grooper to enhance the capabilities of data extractors, classification, and lookups.

Grooper Infrastructure

Grooper Infrastructure is a conceptual term that refers to the computing underpinnings of a Grooper repository and the software that interfaces with it.

Grooper Repository

Grooper Repository is a conceptual term that refers to the environment used to create, configure and execute objects in Grooper. It provides the framework to "do work" in Grooper.

Grooper Service

Grooper Services are various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started and stopped using Grooper Config.

Image Processing

Image Processing is a conceptual term that refers to how Grooper applies a variety of techniques to enhance scanned documents' quality, improving OCR accuracy by removing imperfections and adjusting visual characteristics to prepare images for data extraction and classification.

Import Mode and Document Linking

Import Mode and Document Linking is a conceptual term that refers to the usage of the Import Mode property. This affects whether or not an imported document maintains a link to its original file and/or if a copy of the file is made on import or not.

LINQ to Grooper Objects

LINQ to Grooper Objects is a conceptual term that refers to the ability of Grooper to leverage LINQ syntax in expressions.

Layered OCR

Layered OCR is a conceptual term that refers to the usage of the Layered OCR setting of the OCR Engine property of an library_books OCR Profile. The use of this setting enables the usage of secondary OCR Profiles on a single page. The OCR results from these secondary OCR Profiles are merged with (or layered on top of) the primary OCR Profile's results.

Layout Data

Layout Data is a conceptual term that refers to information such as line locations, OMR checkbox locations and states, barcode values, and detected shapes captured by certain image processing commands. This data is stored as an attached file on a folder Batch Folder or contract Batch Page object and can later be recalled by various functions within Grooper that rely on the presence of that data to function.

Microfiche Processing

Microfiche Processing is a conceptual term that refers to how Grooper leverages several IP Commands to accurately process microform documents.

Microsoft Office Integration

Microsoft Office Integration is a conceptual term that refers to Grooper's ability to convert Microsoft Word and Microsoft Excel files into formats that Grooper can read.

OCR

OCR is a conceptual term that stands for Optical Character Recognition. It allows text from paper documents to be digitized, in order to be searched or edited by other software applications. OCR converts typed or printed text from digital images of physical documents into machine readable, encoded text.

OCR Synthesis

OCR Synthesis is a conceptual term that refers to Grooper's unique method of pre-processing and re-processing raw results from the OCR Engine to get better results out of it.

Object Nomenclature

Object Nomenclature is a conceptual term that refers to the idea that mastery of a Grooper environment is greatly enhanced by understanding the myriad of objects that can exist and how they are related.

PDF Page Types

PDF Page Types is a conceptual term that refers to specific types of PDF pages. Page types describe the kind of content in a PDF page and inform Grooper how certain Activities should process the page. For example, "single image" pages are OCR'd by the Recognize activity, whereas "text only" pages have their native text extracted.

Regular Expression

Regular Expression is a conceptual term that refers to a standard syntax designed to parse text strings. This is a way of finding information in a block of text. It is the primary method by which Grooper extracts and returns data from documents.
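As a minimal sketch of the idea, the Python snippet below uses a regular expression to find date-like values in a block of text. The pattern and the helper name `find_dates` are assumptions chosen for illustration; Python's `re` module is used here only as a stand-in, and its regex dialect may differ from the one Grooper uses.

```python
import re

# Matches dates such as 03/15/2024 or 4/1/2024 (one or two digit month and day, four digit year).
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")

def find_dates(text):
    """Return (matched_text, start_offset) for every date-like value in the text."""
    return [(m.group(), m.start()) for m in DATE_PATTERN.finditer(text)]

sample = "Agreement signed 03/15/2024 and effective 4/1/2024."
dates = find_dates(sample)
```

Each result pairs the matched text with its character offset, which mirrors the general idea of returning both a value and its location in the document's text.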

Repository

Repository is a conceptual term that refers to a location where files and/or data are stored and managed.

Separation

Separation is a conceptual term that refers to the process of taking an unorganized inventory_2 Batch of loose contract Batch Pages and organizing them into document folders. This is done so Grooper can later assign a Document Type to each document folder in a process known as Classification.

TF-IDF

TF-IDF is a conceptual term that stands for term frequency-inverse document frequency, a numerical statistic intended to reflect how important a word is to a document within a collection (or document set or corpus). It is how Grooper uses machine learning for training-based document classification (via the Lexical method) and data extraction (via the input Field Class extractor).
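To make the statistic concrete, here is a minimal Python sketch of one common, unsmoothed TF-IDF variant: term frequency within a document multiplied by log(N / document frequency). The exact weighting Grooper applies is not stated here, so the formula, the whitespace tokenizer, and the function name `tf_idf` are all assumptions for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    "invoice total amount due",
    "invoice number and date",
    "lease agreement term and date",
]
scores = tf_idf(docs)
```

In this sample, "total" appears in only one document and so outweighs "invoice", which appears in two. A term present in every document scores zero, since it does nothing to tell one document from another.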

Table Extraction

Table Extraction is a conceptual term that refers to Grooper's functionality to extract data from cells in tables. This is accomplished by configuring the table Data Table and its child view_column Data Column Data Elements in a data_table Data Model.

Test Batch

Test Batch is a conceptual term that refers to any inventory_2 Batch created in the Test folder of the Batches folder in the Node Tree.

Thread

Thread is a conceptual term that refers to the smallest unit of processing that can be performed within an operating system.

Training-Based Approaches to Document Classification

Training-Based Approaches to Document Classification is a conceptual term that refers to an approach to document classification that classifies folder Batch Folders according to the similarity of unclassified Batch Folders to trained examples of that kind of Document Type.

Training Batch

Training Batch is a conceptual term that refers to a more convenient way to work with all of the samples a stacks Content Model has been trained against. You can also still look at the Form Types underneath each Content Type, but the Training Set can show you all the samples in one place.

UNC Path

UNC Path is a conceptual term that refers to the Universal Naming Convention (UNC), a standard used in Microsoft Windows for accessing shared network folders.
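A UNC path takes the general form `\\server\share\folder\file`. The Python sketch below splits such a path into its parts using the standard library's `PureWindowsPath`. The helper name `split_unc` and the server and share names are hypothetical, and the helper assumes its input really is a UNC path.

```python
from pathlib import PureWindowsPath

def split_unc(path):
    """Split a UNC path into (server, share, [remaining path parts])."""
    p = PureWindowsPath(path)
    # For a UNC path, .drive holds the "\\server\share" prefix.
    server, share = p.drive.lstrip("\\").split("\\", 1)
    # parts[0] is the anchor ("\\server\share\"); the rest are folders and the file name.
    return server, share, list(p.parts[1:])

# Hypothetical server and share names, for illustration only.
server, share, rest = split_unc(r"\\fileserver01\scans\inbox\batch001.pdf")
```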

Waterfall Classification

Waterfall Classification is a conceptual term that refers to a classification notion in Grooper that manipulates the Positive Extractor property to prioritize training similarity in order to achieve a middle ground between high specificity and accuracy, and generality with minimal accuracy. This is helpful whenever folder Batch Folders get misclassified, and simply retraining won't help.

XML Schema Integration

XML Schema Integration is a conceptual term that refers to Grooper's ability to interact with XML schemas and the configuration required to do so.

Export Definition

Export Definitions is a property of Export Behaviors as defined on Content Types or Export Activities. It defines export connectivity to external systems such as file systems, content management repositories, databases, mail servers, etc.

CMIS Export

CMIS Export is an Export Definition available when configuring an Export Behavior. It exports content over a cloud CMIS Connection, allowing users to export documents and their metadata to various on-premise and cloud-based storage platforms.

Data Export

Data Export is an Export Definition available when configuring an Export Behavior. It exports extracted document data over a database Data Connection, allowing users to export data to a Microsoft SQL Server or ODBC compliant database.

Extractor Type

Extractor Type, or value extractor, is a property on a wide array of objects that goes by many different names. It defines a primitive operator which reads data values from the text or visual content of a document. Extractor Types are consumed by higher-level objects such as Data Elements, extractor objects, Content Types and more.

Detect Signature

The Detect Signature Extractor Type detects signatures within a specified rectangular region on a document page by measuring the fill percentage, providing a method to identify and validate the presence of handwritten signatures.

Field Match

The Field Match Extractor Type matches the value stored in a previously-extracted variables Data Field or view_column Data Column, allowing for consistency and reference across different parts of a document or dataset.

Find Barcode

The Find Barcode Extractor Type searches the folder Batch Folder layout data for a barcode, capturing its value upon detection.

GPT Complete

The GPT Complete Extractor Type leverages OpenAI's GPT model to generate completions for inputs, returning one hit for each result choice provided by the model's response.

Highlight Zone

The Highlight Zone Extractor Type sets a highlight region on a document without performing any actual data extraction, effectively marking areas of interest or importance.

Label Match

The Label Match Extractor Type matches a list of one or more label values using matching options defined by a Labeling Behavior. It works similarly to List Match, but uses shared settings defined in a Labeling Behavior for Fuzzy Matching, Vertical Wrap, and Constrained Wrap.

Labeled OMR

The Labeled OMR Extractor Type is used to output OMR checkbox labels. It determines whether labeled checkboxes are checked or not. If checked, it outputs the label(s) as the result.

Labeled Value

The Labeled Value Extractor Type identifies and extracts information from a field presented as a label-value pair on a document, by matching a set of labels and a set of values, and determining pairs based on their geometric clustering on the document.

List Match

The List Match Extractor Type is designed to return values matching one or more items in a defined list. By default, the List Match extractor does not use or require regular expressions.

Ordered OMR

The Ordered OMR Extractor Type is similar to a Labeled OMR in that it is used to return OMR check box information. Rather than relying on a label for the extraction, the Ordered OMR returns information for multiple check boxes within a given zone based on their order and layout.

Pattern Match

The Pattern Match Extractor Type extracts values from a document that match a specified regular expression, allowing for the detection of data following a known format or pattern.

Query HTML

The Query HTML Extractor Type queries an HTML document using a CSS selector and returns the inner text of each matching element.

Read Barcode

The Read Barcode Extractor Type uses barcode recognition technology to read and extract values from barcodes found in the document content.

Read Meta Data

The Read Meta Data Extractor Type retrieves metadata values associated with a document.

Read Zone

The Read Zone Extractor Type allows you to extract text data in a rectangular region (called an "extraction zone" or just "zone") on a document. This can be a fixed zone, extracting text from the same location on a document, or a zone relative to an extracted text anchor or shape location on the document.

Reference

The Reference Extractor Type allows for the referencing of an external extractor object to be used within a Grooper object's configuration, enabling consistent extraction logic across different objects.

Word Match

The Word Match Extractor Type extracts individual words or phrases containing multiple words from documents. It is designed to collect full words and is often used in n-gram extraction.

Zonal OMR

The Zonal OMR Extractor Type reads one or more checkboxes using manually-configured zones. It is mostly an outdated tool and should only be used if all other OMR extractor options have been exhausted. It requires the most manual setup of any OMR extractor to configure.

Fill Method

The Fill Method property on data_table Data Models, insert_page_break Data Sections, and table Data Tables is a collection of various mechanisms that allow for the population of descendant Data Elements of Data Models, Data Sections, and Data Tables (which can be referred to as "containers"). Fill Methods are secondary extraction operations which populate descendant Data Elements as they run after normal extraction.

AI Extract

AI Extract is a Fill Method that leverages a Large Language Model (LLM) to quickly and easily return extraction results to the child elements of data_table Data Models, insert_page_break Data Sections, and table Data Tables by using the JSON structure of the relevant Data Elements as part of the instruction set to the LLM.

IP Command

The Command property of an image IP Step object in Grooper specifies the Image Processing (IP) command to be executed for that specific step as part of an perm_media IP Profile.

Barcode Detection

The Barcode Detection IP Command detects and reads barcode data. The detected barcode information is stored as part of the object's layout data.

Binarize

The Binarize IP Command converts a color or grayscale image to black and white using various thresholding methods.
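The definition mentions "various thresholding methods". As a minimal sketch of the simplest one, a fixed global threshold, the Python snippet below maps each grayscale pixel to black or white. It is not the IP Command's implementation: the function name `binarize`, the default threshold of 128, and the nested-list image representation are assumptions for illustration.

```python
def binarize(gray_rows, threshold=128):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white) using a fixed threshold."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray_rows]

# A tiny 2x3 grayscale "image" as nested lists of pixel intensities.
image = [
    [12, 200, 130],
    [127, 128, 255],
]
bw = binarize(image)
```

Adaptive methods choose the threshold per region instead of using one global value, which tends to cope better with uneven lighting or background shading.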

Extract Page

The Extract Page IP Command removes an image from a carrier image while simultaneously removing any image warping or skewing.

Line Removal

The Line Removal IP Command removes horizontal and vertical lines from documents.

Scratch Removal

The Scratch Removal IP Command detects and removes or repairs scratches from film-based images.

Shape Detection

The Shape Detection IP Command detects shapes on a document matching sample images given by the user.

Shape Removal

The Shape Removal IP Command detects and removes shapes from documents.

Import Provider

The Provider property is a selection of Import Providers which enable import of file-based content from a variety of sources such as file systems, mail servers, and content repositories.

CMIS Import

The CMIS Import Import Provider is used to import content over a cloud CMIS Connection, allowing users to import from various on-premise and cloud-based storage platforms.

Import Descendants

Import Descendants is one of two Import Providers that use cloud CMIS Connections to import document content into Grooper.

Import Query Results

Import Query Results is one of two Import Providers that use cloud CMIS Connections to import document content into Grooper.

Lookup

The Lookups property is a list of lookup operations to be performed on child elements of the associated container. Each Lookup specification defines a lookup operation, where the value of one or more Grooper fields will be used to query an external data source, such as a database. The results of the query can be used to validate existing field values or populate additional field values.

CMIS Lookup

CMIS Lookup is a Lookup Specification that performs a lookup against a settings_system_daydream CMIS Repository via a CMISQL Query.

Database Lookup

Database Lookup is a Lookup Specification that performs a lookup against a database Data Connection via a SQL query.
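The general pattern of a database lookup, using an extracted field value to query a table and populate other fields from the result, can be sketched as follows. This is an assumption-laden illustration, not Grooper's configuration or API: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in data source, and the `vendors` table, its columns, and the `lookup_vendor` helper are hypothetical.

```python
import sqlite3

def lookup_vendor(conn, vendor_id):
    """Query the (hypothetical) vendors table by ID; return field values or None if no match."""
    row = conn.execute(
        "SELECT name, city FROM vendors WHERE vendor_id = ?",  # parameterized query
        (vendor_id,),
    ).fetchone()
    return None if row is None else {"name": row[0], "city": row[1]}

# Stand-in data source: an in-memory SQLite database with one sample row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendors (vendor_id TEXT PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO vendors VALUES ('V100', 'Acme Supply', 'Tulsa')")

result = lookup_vendor(conn, "V100")       # a match can populate additional fields
no_match = lookup_vendor(conn, "V999")     # no match can flag the value for review
```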

GPT Lookup

GPT Lookup is a Lookup Specification that performs a lookup using an OpenAI GPT model.

Web Service Lookup

Web Service Lookup is a Lookup Specification that looks up external data at an API endpoint by calling a web service.

Object

In Grooper, objects are defined as configurable elements within its hierarchical tree structure. These include nodes and embedded objects that can be manipulated and edited to define the system's behavior, create workflows, and manage content.

Batch

inventory_2 Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms known as Batch Processes.

Batch Folder

folder Batch Folder objects are defined as container objects within a inventory_2 Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or contract Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.

  • Batch Folders are frequently referred to simply as "documents".

Batch Page

contract Batch Page objects represent individual pages within a inventory_2 Batch. The Batch Page object is the most granular unit in the hierarchy of Batch Objects in Grooper.

  • Batch Pages are frequently referred to simply as "pages".

Batch Process

Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each inventory_2 Batch of documents is managed systematically and efficiently.

  • Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child edit_document Batch Process Steps.
  • A Batch Process is often referred to as simply a "process".

Batch Process Step

edit_document Batch Process Step objects are specific actions within the sequence defined by a Batch Process. A Batch Process Step plays a critical role in automating and managing the flow of documents through the various stages of processing within Grooper.

  • Batch Process Steps are frequently referred to as simply "steps".
  • Because a single Batch Process Step executes a single Activity configuration, they are often referred to by their referenced Activity as well. For example, a "Recognize step".

CMIS Connection

cloud CMIS Connection objects provide a standardized way of connecting to various content management systems (CMS). These objects allow Grooper to communicate with multiple external storage platforms, enabling access to documents and content that reside outside of Grooper's immediate environment.

  • For those that support the CMIS standard, the CMIS Connection connects to the CMS using the CMIS standard.
  • For those that do not, the CMIS Connection normalizes connection and transfer protocol as if they were a CMIS platform.

CMIS Repository

settings_system_daydream CMIS Repository objects in Grooper allow access to external documents through a cloud CMIS Connection. They allow managing and interacting with those documents within Grooper's framework as if they were local. They are created as a child object of a CMIS Connection and used for various Activities.

Content Category

collections_bookmark Content Category objects are containers within a stacks Content Model that hold other Content Categories and description Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases Content Categories are used as organizational buckets to group like Document Types together.

Content Model

stacks Content Model objects define the taxonomy of document sets in terms of the description Document Type they contain. They also house the Data Elements that appear on each collections_bookmark Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.

Data Column

view_column Data Column objects are child objects of a table Data Table, representing individual columns and defining the type of data each column holds along with its data extraction properties.

Data Connection

database Data Connection objects define the settings for connecting to and interacting with a database. These interactions may include conducting lookups, exports, or other actions that relate to database management systems (DBMS). Once configured, a Data Connection object can be referenced by other components in Grooper for various DBMS-related activities.

Data Field

variables Data Field objects are created as child objects of a data_table Data Model. A Data Field is a representation of a single piece of data targeted for extraction on a document.

  • Data Fields are frequently referred to simply as "fields".

Data Model

data_table Data Model objects serve as the top-tier structure defining the taxonomy for Data Elements and are leveraged during the Extract Activity to extract data from folder Batch Folders. They are a hierarchy of Data Elements that sets the stage for the extraction logic and review of data collected from documents.

Data Rule

flowsheet Data Rule objects define the logic for automated data manipulation which occurs after data has been extracted from folder Batch Folders. These rules are applied to normalize or otherwise prepare data collected in a data_table Data Model for downstream processes. Data Rules ensure that extracted data conforms to expected formats or meets certain quality standards.

Data Section

insert_page_break Data Section objects are grouping mechanisms for related variables Data Fields. Data Sections organize and segment child Data Elements into logical divisions of a document based on the structure and semantics of the information the documents contain.

Data Table

table Data Table objects are utilized for extracting repeating data that's formatted in rows and columns, allowing for complex multi-instance data organization that would be present in table-formatted content.

Data Type

pin Data Type objects hold a collection of child, referenced, and locally defined Data Extractors and settings that manage how multiple (even differing) matches from Data Extractors are consolidated (via Collation) into a result set.

Document Type

description Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a stacks Content Model or a collections_bookmark Content Category and are used to classify individual folder Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.

Field Class

input Field Class objects are trainable extractors that distinguish between multiple instances of similar data within a document by understanding the context in which they occur. Field Classes can be configured to distinguish values within highly structured documents, but this type of extraction is better suited to simpler "Extractor Objects" like quick_reference_all Value Readers or pin Data Types.

Field Classes are most useful when attempting to find values within the flow of natural language. This method involves training with positive and negative examples to distinguish the right context. You'd opt for a Field Class when the value you're after is an entire clause within a contract, or a specific value defined within the flow of text.

File Store

hard_drive File Store objects define a storage location within Grooper where file content associated with nodes is saved. They are crucial for managing the content that forms the basis of Grooper's processing tasks, allowing for the storage and retrieval of documents, images, and other "files". Not every object in Grooper will have files connected to it, but if it does, those files are stored in the location defined by this object.

Form Type

two_pager Form Type objects represent trained variations of a description Document Type. These objects store machine learning training data for Lexical and Visual document classification methods.

IP Group

gallery_thumbnail IP Group objects are child objects within perm_media IP Profiles that create a hierarchical structure for organizing image processing commands. IP Groups may contain other IP Groups or image IP Step objects.

IP Profile

perm_media IP Profile objects detail the operations and parameters for image enhancement and cleanup. These operations improve the accuracy of further processing steps, like the Recognize and Classify Activities.

IP Step

image IP Step objects are the basic units within an perm_media IP Profile that define a single image processing operation. IP Steps are performed sequentially within their parent gallery_thumbnail IP Group or IP Profile.

Lexicon

dictionary Lexicon objects are dictionary objects that store a list of keys or key-value pairs. Lexicons can define local entries and/or import entries from other Lexicons and even import entries using a Data Connection. The entries in a Lexicon can be utilized in different areas of Grooper, such as data extraction, Fuzzy Matching, or OCR Correction, providing a reference point that enhances the accuracy and consistency of the software's operations.
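
As a rough illustration of local and imported entries, the sketch below models a Lexicon as a plain Python class. The `Lexicon` class, its field names, and the rule that local entries override imported ones are assumptions made for the example, not Grooper's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lexicon:
    """Hypothetical stand-in for a Lexicon: local entries plus imported Lexicons."""
    local: Dict[str, str] = field(default_factory=dict)
    imports: List["Lexicon"] = field(default_factory=list)

    def entries(self) -> Dict[str, str]:
        """Merge imported entries first, then local ones (assumed precedence)."""
        merged: Dict[str, str] = {}
        for imported in self.imports:
            merged.update(imported.entries())
        merged.update(self.local)
        return merged


states = Lexicon(local={"OK": "Oklahoma"})
combined = Lexicon(local={"TX": "Texas"}, imports=[states])
all_entries = combined.entries()
```

Here `all_entries` holds both the imported `OK` entry and the local `TX` entry.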

Machine

computer Machine objects represent servers that have connected to the Grooper repository. They allow for the management of Grooper Service instances and serve as connection points for processing jobs to be executed on the server hardware. Machine objects are essential for the scaling of processing capabilities and for distributing processing loads across multiple servers.

OCR Profile

library_books OCR Profile objects configure the settings for optical character recognition (OCR) leveraged by the Recognize Activity. OCR converts images of text into machine-encoded text. OCR Profile objects influence how effectively textual content is recognized from contract Batch Pages.

Object Library

extension Object Library objects are .NET libraries that contain code files for customizing the functionality of Grooper. These libraries are used for a range of customization and integration tasks, allowing users to extend Grooper's capabilities.

Examples include:
  • Adding custom activities that execute within Batch Processes
  • Creating custom commands available during the Review Activity and in the Design page
  • Defining custom methods that can be called from expressions on Data Field and Batch Process Step objects
  • Establishing custom services that perform automated background tasks at regular intervals

Processing Queue

memory Processing Queue objects are designed for tasks performed by computer Machines, which include automated steps in the document processing lifecycle. Processing Queues are used to distribute machine tasks among different servers and control the concurrency or processing rate of these tasks.

  • For example, activities such as Render or Export can be managed so that only one activity instance runs per machine or so multiple instances are processed concurrently, according to the queue configuration.
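
The concurrency idea can be sketched with a semaphore that caps how many tasks run at once. This is only an analogy in Python: `run_queue` and `max_concurrent` are hypothetical names, and Grooper's actual queue settings are not shown here.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_queue(tasks: List[Callable[[], None]], max_concurrent: int) -> int:
    """Run all tasks, never more than max_concurrent at once; return the peak seen."""
    gate = threading.Semaphore(max_concurrent)
    lock = threading.Lock()
    running = 0
    peak = 0

    def run(task: Callable[[], None]) -> None:
        nonlocal running, peak
        with gate:  # blocks while max_concurrent tasks are already running
            with lock:
                running += 1
                peak = max(peak, running)
            try:
                task()
            finally:
                with lock:
                    running -= 1

    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        list(pool.map(run, tasks))
    return peak


# Hypothetical workload: four short tasks limited to one at a time.
peak = run_queue([lambda: time.sleep(0.01)] * 4, max_concurrent=1)
```

With `max_concurrent=1` the tasks run one after another; raising the limit lets several run side by side.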

Project

package_2 Project objects are collections of resources and serve as the primary containers for design components within Grooper. The Project object is where various processing objects such as stacks Content Models, Batch Processes, Profile Objects, and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.

Resource File

A Resource File object in Grooper is essentially a file that is stored as part of a Grooper package_2 Project. It can include various types of files such as text files or XML schema files.

Review Queue

person_play Review Queue objects are designated for human-performed tasks. They organize the Review tasks that require human attention and can distribute these tasks among different groups of users based on the queue's settings. Review Queues can be assigned at the Batch Process level to filter work by an entire process, or on Review Activities at the edit_document Batch Process Step level to filter tasks at a more granular, step-based level.

Root

The database Root object represents the topmost element of the Grooper repository. It serves as the starting point from which all other objects branch out. It is the anchor point for all other structures within the repository and a necessary element for the organization and linkage of all other objects within Grooper.

Scanner Profile

scanner Scanner Profile objects outline the specifications for scanning physical documents into digital forms. This includes settings like resolution, color mode, and any post-scan image processing or enhancement functions.

See Desktop Scanning in Grooper for more information.

Separation Profile

insert_page_break Separation Profile objects contain rules and settings that determine how groupings of scanned pages are separated into individual folder Batch Folders, often using barcodes, blank pages, or patch codes as indicators for separation points.

Value Reader

quick_reference_all Value Reader objects define a single data extraction operation. You set the Extractor Type on the Value Reader that matches the specific data you're aiming to capture. For example, you would use the Pattern Match Extractor Type to return data using a regular expression. You would use a Value Reader when you need to extract a single result or list of simple results from a document.
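
To make the regular-expression case concrete, here is a minimal Python sketch of a single pattern-based extraction returning a list of simple results. The pattern, the sample text, and the `read_values` helper are invented for illustration and are not part of Grooper.

```python
import re
from typing import List

# Hypothetical pattern: an invoice number of 4 to 10 digits after a label.
INVOICE_NUMBER = re.compile(r"Invoice\s+No\.?\s*:?\s*(\d{4,10})")


def read_values(text: str) -> List[str]:
    """Return every invoice number the pattern finds, in document order."""
    return [match.group(1) for match in INVOICE_NUMBER.finditer(text)]


sample = "Invoice No: 004217\nPlease quote Invoice No. 004217 on payment."
results = read_values(sample)
```

The same operation yields a single result when the pattern matches once and a list when it matches several times.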

Property

A property is a configurable setting on a Grooper object that affects how the object performs its function.

Alignment

Alignment is a grouping of properties found on Fill Methods and Data Elements that manipulate the prompt provided to an LLM chatbot in an attempt to provide accurate highlighting of values displayed within the Document Viewer.

Confidence Multiplier and Output Confidence

Some results carry more weight than others. The Confidence Multiplier and Output Confidence properties allow you to manually adjust an extraction result's confidence.

Constrained Wrap

The Constrained Wrap property allows certain Extractor Types and the Labeling Behavior to match values which wrap from one line to the next inside a box (such as a table cell).

Content Type Filter

The Content Type Filter property restricts Activities to specific collections_bookmark Content Categories and/or description Document Types.

Document Quoting

Document Quoting is a property of the AI Extract Fill Method that limits the text fed to the AI to reduce the number of tokens consumed. Controlling exactly which text is supplied can reduce not only the monetary cost of using the AI, but also the time cost of running the Fill Method.

OCR Engine

An OCR Engine is the part of OCR software that does the actual character recognition, analyzing the pixels on an image and figuring out what characters they represent. This raw result can be further processed using Grooper's OCR Synthesis capabilities, producing the final OCR result used by Data Extractors to match text in a document and return the result.

Output Extractor Key

The Output Extractor Key property is another weapon in the arsenal of powerful Grooper classification techniques. It allows pin Data Types to return results normalized in a way more beneficial to document classification.

Paragraph Marking

Paragraph Marking alters the normal text data in a document by placing the carriage return and new line feed pairs at the end of each paragraph, instead of the end of each line. This allows users to break up a document's text flow into segments of paragraphs instead of segments of lines.
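
The effect on the text can be illustrated with a small Python sketch. It assumes paragraphs are separated by blank lines in the input, which is a simplification for the example; how Grooper actually detects paragraph boundaries is not described here, and `mark_paragraphs` is a hypothetical helper.

```python
def mark_paragraphs(text: str) -> str:
    """Move CRLF pairs from the end of each line to the end of each paragraph.

    Assumption: a blank line marks a paragraph boundary in the input.
    """
    paragraphs = []
    current = []
    for line in text.split("\r\n"):
        if line.strip():
            current.append(line.strip())
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "".join(paragraph + "\r\n" for paragraph in paragraphs)


raw = "The quick brown\r\nfox jumps.\r\n\r\nA second\r\nparagraph.\r\n"
marked = mark_paragraphs(raw)
```

After marking, each paragraph occupies one line, so a line-oriented pattern can match text that originally wrapped across several lines.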

Parameters

Parameters is a collection of properties used in the configuration of LLM constructs. Temperature, TopP, Presence Penalty, and Frequency Penalty are parameters that influence text generation in models. Temperature and TopP control the diversity and probability distribution of generated text, while Presence Penalty and Frequency Penalty help manage repetition by discouraging the reuse of words or phrases.
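
The following Python sketch shows, in general terms, how these four parameters are commonly applied to a model's token scores. It is a toy illustration with invented token scores and hypothetical function names, not the code of any particular model or of Grooper.

```python
import math
import random
from typing import Dict, Optional


def apply_penalties(logits: Dict[str, float], counts: Dict[str, int],
                    presence_penalty: float = 0.0,
                    frequency_penalty: float = 0.0) -> Dict[str, float]:
    """Lower the score of tokens that already appeared in the generated text."""
    adjusted = {}
    for token, logit in logits.items():
        seen = counts.get(token, 0)
        adjusted[token] = (logit
                           - frequency_penalty * seen               # grows with each reuse
                           - presence_penalty * (1 if seen else 0))  # flat, once seen
    return adjusted


def sample_token(logits: Dict[str, float], temperature: float = 1.0,
                 top_p: float = 1.0, rng: Optional[random.Random] = None) -> str:
    """Pick one token using temperature scaling and top-p (nucleus) filtering."""
    rng = rng or random.Random()
    # Temperature (must be > 0): values below 1 sharpen the distribution.
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())
    weights = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, token) for token, w in weights.items()), reverse=True)
    # Top-p: keep the most likely tokens until their probabilities sum to top_p.
    kept, cumulative = [], 0.0
    for probability, token in ranked:
        kept.append((probability, token))
        cumulative += probability
        if cumulative >= top_p:
            break
    return rng.choices([t for _, t in kept], weights=[p for p, _ in kept], k=1)[0]


scores = {"the": 2.0, "a": 1.0, "zebra": -3.0}  # invented example scores
choice = sample_token(scores, temperature=1.0, top_p=0.5)
```

With these scores, "the" alone carries more than half of the probability, so a `top_p` of 0.5 leaves it as the only candidate.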

Permission Sets

A Permission Set is a property that allows you to restrict user access to repositories, pages, and certain activities. This helps prevent unauthorized individuals from editing or deleting information or inventory_2 Batches.

Preprocessing

The Preprocessing grouping of properties consists of settings that adjust how text is formatted and interpreted before any Data Extraction process begins. These properties are crucial for ensuring that the text data is in the most optimal format for subsequent extraction tasks, which could involve complex regular expressions or precise data parsing.

Scope

The Scope property of a edit_document Batch Process Step, as it relates to an Activity, determines at which level in a inventory_2 Batch hierarchy the Activity runs.

Secondary Types

Secondary Types allow the application of multiple Content Types to a single folder Batch Folder.

Tab Marking

Tab Marking allows you to insert tab characters into a document's text data.

Vertical Wrap

Vertical Wrap is a property of certain Extractor Types and a Content Type's Labeling Behavior used to provide simplified extraction of vertically wrapped text (typically stacked labels).

Repository Option

The Options property of the database Root object is a collection of optional features that affect the entire repository. These options enable whole groups of functionality that do not work until the connections the options provide have been established.

LLM Connector

LLM Connector is a Repository Option that enables OpenAI-based functionality for the local Grooper repository.

Section Extract Method

The Extract Method property of a insert_page_break Data Section defines a "Section Extract Method" which specifies how section instances will be identified and extracted.

Nested Table

Nested Table is a "Section Extract Method" enabled for a insert_page_break Data Section using the Extract Method property. This method divides a document into sections by extracting table data within those sections. This gives Grooper users a method for extracting hierarchical tables as well as dividing up a document into sections where each of those sections has the same table (or at least tabular data which can be extracted by a single table Data Table object).

Transaction Detection

Transaction Detection is a insert_page_break Data Section Extract Method. This extraction method produces section instances by detecting repeating patterns of text around the Data Section's child variables Data Fields.

Separation Provider

The Provider property of the Separate Activity defines the type of separation to be performed at the designated Scope.

Change in Value Separation

The Change in Value Separation Provider creates a new folder and separates every time an extracted value changes from one contract Batch Page to another.
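
A minimal sketch of the idea in Python, assuming one value has already been extracted per page. `separate_on_change` is a hypothetical helper and the values are invented; folders are represented as lists of page indexes.

```python
from typing import List


def separate_on_change(page_values: List[str]) -> List[List[int]]:
    """Group page indexes into folders, starting a new folder whenever the value changes."""
    folders: List[List[int]] = []
    previous = object()  # sentinel that never equals a real value
    for index, value in enumerate(page_values):
        if value != previous:
            folders.append([])
        folders[-1].append(index)
        previous = value
    return folders


# Hypothetical extracted values, e.g. an account number found on each page.
folders = separate_on_change(["A", "A", "B", "B", "B", "A"])
```

The final page starts its own folder even though "A" appeared earlier, because only a change between neighboring pages matters.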

Control Sheet Separation

Control Sheet Separation is a Separation Provider that uses Grooper document_scanner Control Sheets to separate documents.

EPI Separation

The EPI Separation Separation Provider uses embedded page information ("EPI") to Separate loose pages into document folders. A Data Extractor is used to find page numbers from the text on a page and Grooper uses this information to separate the pages.
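
As a rough sketch, the Python below starts a new folder whenever a page reports itself as page 1. The "Page X of Y" pattern, the rule that pages without a number stay in the current folder, and the `separate_by_epi` helper are all assumptions for the example.

```python
import re
from typing import List

# Hypothetical extractor: matches text such as "Page 2 of 5".
PAGE_NUMBER = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def separate_by_epi(page_texts: List[str]) -> List[List[int]]:
    """Group page indexes into folders, opening a folder at every 'Page 1'."""
    folders: List[List[int]] = []
    for index, text in enumerate(page_texts):
        match = PAGE_NUMBER.search(text)
        page_number = int(match.group(1)) if match else None
        if page_number == 1 or not folders:
            folders.append([])
        folders[-1].append(index)  # unnumbered pages join the current folder
    return folders


pages = ["Statement Page 1 of 2", "Page 2 of 2", "Notice page 1 of 1", "(no page number)"]
folders = separate_by_epi(pages)
```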

ESP Auto Separation

ESP Auto Separation is a Separation Provider used for document separation. It is unique in that it both separates and classifies documents at the same time. It uses page-level classification training examples (among other things) to determine where to insert document folders in a inventory_2 Batch.

Event-Based Separation

Event-Based Separation is a Separation Provider that Separates documents using one or more "Separation Events". Each Separation Event triggers the creation of a new folder.

Multi Separator

The Multi Separator Separation Provider performs separation using multiple Separation Providers. It allows users to create a list of any of the other Separation Providers. If the first provider on the list fails to separate a page (or, as more often is the case, a series of pages), the next one will be applied. If that fails, the next, and so on.
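
The fallback order can be sketched as a loop over a list of providers. In this simplified Python example each provider either separates the whole set of pages or returns `None` to signal failure; the real provider works on pages or runs of pages, and `marker_provider` and `multi_separate` are hypothetical names.

```python
from typing import Callable, List, Optional

Folders = List[List[int]]
Provider = Callable[[List[str]], Optional[Folders]]


def marker_provider(marker: str) -> Provider:
    """Build a toy provider that opens a folder at each page containing `marker`."""
    def provider(pages: List[str]) -> Optional[Folders]:
        if not any(marker in page for page in pages):
            return None  # nothing to separate on: this provider "fails"
        folders: Folders = []
        for index, page in enumerate(pages):
            if marker in page or not folders:
                folders.append([])
            folders[-1].append(index)
        return folders
    return provider


def multi_separate(pages: List[str], providers: List[Provider]) -> Optional[Folders]:
    """Try each provider in list order and return the first successful result."""
    for provider in providers:
        result = provider(pages)
        if result is not None:
            return result
    return None


pages = ["INVOICE 1001", "continued", "INVOICE 1002"]
folders = multi_separate(pages, [marker_provider("[PATCH]"), marker_provider("INVOICE")])
```

The first provider finds no patch marker, so the second one separates the pages.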

Pattern-Based Separation

Pattern-Based Separation is a Separation Provider that creates a new document folder every time a value returned by a defined pattern is encountered on a page.

Undo Separation

Undo Separation is a Separation Provider. Instead of putting loose contract Batch Pages into folder Batch Folders, this Separation Provider removes Batch Folders, leaving only loose pages.

Service

Grooper Service is a conceptual term that refers to the various executable programs that run as Windows Services to facilitate Grooper processing. Service instances are installed, configured, started and stopped using Grooper Config.

API Services

You can perform inventory_2 Batch processing via REST API web calls by installing API Services.

Activity Processing

Activity Processing is a Grooper Service that executes Activities assigned to edit_document Batch Process Steps in a Batch Process. This allows Grooper to automate Batch Steps that do not require a human operator.

Grooper Licensing

Grooper Licensing is a Grooper Service that distributes licenses to multiple workstations running Grooper applications.

Import Watcher

An Import Watcher Service schedules and runs import jobs. It periodically executes an Import Provider to query or poll for documents that meet specific criteria. When the matching documents are found, they are imported into Grooper. Afterward, the imported objects are moved, deleted, or modified to prevent repeated imports in the next polling cycle. This ensures that the same set of files is not imported over and over again.
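
One polling cycle of this kind can be sketched in Python with a watched folder on disk. The folder layout, the `*.pdf` criterion, and the `poll_once` function are assumptions for illustration; the "import" here is only a recorded file name.

```python
import shutil
from pathlib import Path
from typing import List


def poll_once(watch_dir: Path, processed_dir: Path, pattern: str = "*.pdf") -> List[str]:
    """Import files matching `pattern`, then move them so the next poll skips them."""
    processed_dir.mkdir(parents=True, exist_ok=True)
    imported = []
    for path in sorted(watch_dir.glob(pattern)):
        imported.append(path.name)  # stand-in for the real import step
        shutil.move(str(path), str(processed_dir / path.name))
    return imported
```

Calling `poll_once` on a schedule gives the polling behavior: a file is picked up in one cycle and, because it has been moved, is not seen in the next.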

Table Extract Method

The Extract Method property of a table Data Table sets a Table Extract Method which defines the settings and logic for the Data Table to perform extraction.

Delimited Extract

The Delimited Extract Table Extract Method extracts tabular data from a delimiter-separated text file, such as a CSV file.
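
For comparison, reading delimiter-separated text into rows looks like this with Python's standard `csv` module. The column names and data are invented, and `extract_rows` is a hypothetical helper rather than anything Grooper exposes.

```python
import csv
import io
from typing import Dict, List


def extract_rows(text: str, delimiter: str = ",") -> List[Dict[str, str]]:
    """Parse delimited text; the first line supplies the column names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]


data = "Item,Qty,Price\nWidget,2,9.50\nGadget,1,24.00\n"
rows = extract_rows(data)
```

Each row comes back as a mapping from column name to cell value, which is the same shape a table extraction produces: one record per row, one value per column.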

Fluid Layout

The Fluid Layout Table Extract Method will choose between Tabular Layout and Flow Layout configurations, depending on how labels are collected for a description Document Type.

Grid Layout

The Grid Layout Table Extract Method uses the positional location of row and column headers to interpret where a tabular grid would be around each value in a table and extract values from each cell in the interpreted grid.

Row Match

The Row Match Table Extract Method uses regular expression pattern matching to determine a table's structure based on the pattern of each row and extract cell data from each column.
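
A small Python sketch of row-pattern matching: one regular expression describes a whole row, and its named groups stand in for the columns. The pattern, group names, and sample text are invented for the example.

```python
import re
from typing import Dict, List

# Hypothetical row pattern: item description, quantity, then a price with two decimals.
ROW = re.compile(
    r"^(?P<item>[A-Za-z ]+?)[ \t]+(?P<qty>\d+)[ \t]+(?P<price>\d+\.\d{2})$",
    re.MULTILINE,
)


def extract_table(text: str) -> List[Dict[str, str]]:
    """Return one dict per line that matches the row pattern."""
    return [match.groupdict() for match in ROW.finditer(text)]


text = "Item Qty Price\nWidget 2 9.50\nBlue Gadget 1 24.00\nTotal due on receipt\n"
table = extract_table(text)
```

The header and footer lines do not fit the row pattern, so only the two data rows are returned.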

Tabular Layout

The Tabular Layout Table Extract Method uses column header values determined by the view_column Data Columns' Header Extractor results (or labels collected for the Data Columns when a Labeling Behavior is enabled) as well as Data Column Value Extractor results to model a table's structure and return its values.

UI Element

A UI Element is a portion of the Grooper interface that allows users to interact with or otherwise receive information about the application.

Document Viewer

The Grooper Document Viewer is the portal to your documents. It is the UI that allows you to see a folder Batch Folder's (or a contract Batch Page's) image, text content, and more.

Node Tree

The Node Tree is the hierarchical list of objects found in the left panel in the "Design" page. It is the basis for navigation and creation in Design.

Overrides

Overrides is a tab that allows you to override the default properties set on a Data Element.

Summary Tabs

stacks Content Models and collections_bookmark Content Categories have a Summary tab where you can view "Descendant Node Types", description Document Types, and Expressions.

Miscellaneous Features

URL Endpoints for Review

Three different URL endpoints can be used to open Review tasks in the Grooper Web Client, given certain information like the Grooper Repository ID, Batch Process name, inventory_2 Batch ID and more. This allows Grooper users to link directly to a Batch in Review with a URL.