2023:Overrides (UI Element): Difference between revisions


Revision as of 10:14, 27 August 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


Overrides is a tab that allows the default properties set on a Data Element to be overridden.

FYI

The ability to override Data Model and Data Element property configurations by Document Type has evolved throughout Grooper's version history.

Prior to version 2023, "Overrides" were referred to as "Data Element Overrides". If you see older documentation referring to "Data Element Overrides", you may assume it is talking about "Overrides".

You may download and import the file(s) below into your own Grooper environment (version 2023). There is a Batch with the example document(s) discussed in this tutorial, as well as a Project configured according to its instructions.
Please upload the Project to your Grooper environment before uploading the Batch. This will allow the documents within the Batch to maintain their classification status.

About

Grooper solutions can range from simple scan-and-archive processes to extremely complex solutions. Overrides allow discrete control of Data Elements on a per Content Type basis. This extends Grooper’s inheritance-based architecture and allows for more robust and scalable Data Models: you no longer need to copy a Data Element just to modify one property for an oddball Document Type. Eliminating those copies saves time when building the solution and reduces its complexity. You can also Test Extraction directly in the Overrides tab. After modifying any Data Element property, you can test the result against a test Batch without leaving the tab.

How To

The following example shows how to set up Overrides. It uses three different document formats, all of which collect the same data. Formats A and B follow a similar enough structure that they do not need an override to extract. Format C is different enough that it overrides the default extractor to get its data.

Understanding the Forms

In the image on the right you can see that "YES" / "NO" values are being returned from "Document Format A". The Data Model's child Data Elements are set up with simple Labeled Value extraction logic to find the "Value (1,2,3)" labels and return a "YES" or a "NO".

The same can be seen for "Document Format B". Even though the orientation of the results is different from "Document Format A", the Labeled Value approach still works here.

However, with "Document Format C" we are still returning results, but not from the simple Labeled Value settings used on the Data Elements of the Data Model. An override on the "Document Format C" Document Type is being leveraged to allow the OMR checkboxes to be detected and return "YES" / "NO" results.

Setting up the Override

Setting up an Override is quite simple.

Select a Content Type, in this case, a Document Type.
- Overrides can also be applied to Content Categories.
Select the Overrides tab.
Select a Data Element you want to set overrides for, in this case a Data Field.
- Note that Data Elements that have had properties overridden will have an orange asterisk near them, as well as an orange numeric value in parentheses indicating the number of properties overridden.
Properties overridden will also have an orange asterisk near them. You will see here that the Labeled Value setting of the Value Extractor has been changed to Labeled OMR. Its sub-properties have been set to accommodate this.
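
As a rough mental model only (this is not Grooper's API, and every class, function, and field name below is hypothetical), an override can be pictured as a sparse set of property changes stored per Content Type and layered over the Data Element's defaults. The sketch assumes that a Document Type also picks up overrides from its parent Content Types, with the nearest one winning; the article does not spell out that order.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category, or Document Type."""
    name: str
    parent: Optional["ContentType"] = None
    # Sparse overrides: {data_element_name: {property_name: new_value}}
    overrides: dict = field(default_factory=dict)


def effective_properties(defaults: dict, element: str, content_type: ContentType) -> dict:
    """Layer overrides over the defaults, root first, so the nearest Content Type wins (assumed)."""
    chain = []
    node = content_type
    while node is not None:
        chain.append(node)
        node = node.parent
    resolved = dict(defaults)
    for node in reversed(chain):
        resolved.update(node.overrides.get(element, {}))
    return resolved


def override_count(element: str, content_type: ContentType) -> int:
    """Count the properties overridden directly on this Content Type for one Data Element."""
    return len(content_type.overrides.get(element, {}))


model = ContentType("Content Model")
format_a = ContentType("Document Format A", parent=model)
format_c = ContentType(
    "Document Format C",
    parent=model,
    overrides={"Value 1": {"Value Extractor": "Labeled OMR"}},
)

defaults = {"Value Extractor": "Labeled Value", "Visible": True}
print(effective_properties(defaults, "Value 1", format_a))
print(effective_properties(defaults, "Value 1", format_c))
print(override_count("Value 1", format_c))
```

In this model, "Document Format A" keeps the default Labeled Value extractor, while "Document Format C" resolves to Labeled OMR, and the count corresponds to the numeric value shown in parentheses next to the Data Element.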

Testing the Results

The upshot is that you can now use the main Data Model, with the same established Data Elements, and get results from all the forms.

Click on the Data Model.
Click the "Tester" tab.
Click on the document you want to extract from.
Click Test Extraction.
- Repeat for the other documents. Document Format C will now successfully extract due to the overrides.

It's important to note that, because Overrides are applied to a Content Type, a document must be properly classified for the Data Model to know which overrides to use when extracting from that document. You may be able to test results successfully from the Overrides interface without a classified document, but doing so on the Data Model will result in no extraction.
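
The dependency on classification can be pictured with a short sketch. It is a conceptual model with hypothetical names, not Grooper code: the override lookup is keyed by the document's assigned Content Type, so a document with no Content Type has nothing to look up. Following the behavior described above, the sketch returns no extractor at all for an unclassified document.

```python
from typing import Optional

# Hypothetical default and per-Document-Type overrides for a single Data Field.
DEFAULT_EXTRACTOR = "Labeled Value"
EXTRACTOR_OVERRIDES = {"Document Format C": "Labeled OMR"}


def extractor_for(document_type: Optional[str]) -> Optional[str]:
    """Return the extractor the Data Model would use, or None if unclassified."""
    if document_type is None:
        # Unclassified: the Data Model cannot tell which Content Type applies.
        return None
    return EXTRACTOR_OVERRIDES.get(document_type, DEFAULT_EXTRACTOR)


for doc_type in ("Document Format A", "Document Format C", None):
    print(doc_type, "->", extractor_for(doc_type))
```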

A simpler, perhaps more common, example of where Overrides come in handy is the visibility of Data Elements. One of the properties of a Data Element is the Visible property, which defaults to True. Imagine a Data Model that has five Data Fields, and a Content Model that has three Document Types. Document1 uses Data Fields 1-3, Document2 uses Data Fields 2-4, and Document3 uses Data Fields 3-5. In Data Review you want to simplify the job for the person reviewing, so you do not want them to concern themselves with fields that are not relevant. To accomplish this, you could use Overrides on each of these hypothetical Document Types and set the Visible property to False on all the fields that Document Type does not need. This would keep only the relevant Data Fields visible upon review.
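
The field arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of which Visible overrides each Document Type would need; the dictionary layout and function name are hypothetical and do not reflect how Grooper stores overrides.

```python
ALL_FIELDS = [1, 2, 3, 4, 5]

# Fields each hypothetical Document Type actually uses, taken from the example.
USED_FIELDS = {
    "Document1": {1, 2, 3},
    "Document2": {2, 3, 4},
    "Document3": {3, 4, 5},
}


def visibility_overrides(document_type: str) -> dict:
    """Fields whose Visible property would be overridden to False for this Document Type."""
    used = USED_FIELDS[document_type]
    return {f"Data Field {n}": {"Visible": False} for n in ALL_FIELDS if n not in used}


for doc_type in USED_FIELDS:
    hidden = sorted(visibility_overrides(doc_type))
    print(f"{doc_type}: hide {hidden}")
```

Each Document Type ends up with two overrides: Document1 hides Data Fields 4 and 5, Document2 hides 1 and 5, and Document3 hides 1 and 2. Data Field 3 is used by all three, so it never needs an override.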

Glossary

Batch: Batch nodes are fundamental in Grooper's architecture. They are containers of documents that are moved through workflow mechanisms called Batch Processes. Documents and their pages are represented in Batches by a hierarchy of Batch Folders and Batch Pages.

Content Model: Content Model nodes define a classification taxonomy for document sets in Grooper. This taxonomy is defined by the Content Categories and Document Types they contain. Content Models serve as the root of a Content Type hierarchy, which defines Data Element inheritance and Behavior inheritance. Content Models are crucial for organizing documents for data extraction and more.

Content Type: Content Types are a class of node types used to classify Batch Folders. They represent categories of documents (Content Models and Content Categories) or distinct types of documents (Document Types). Content Types serve an important role in defining Data Elements and Behaviors that apply to a document.

Data Element: Data Elements are a class of node types used to collect data from a document. These include: Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.

Data Field: Data Fields represent a single value targeted for data extraction on a document. Data Fields are created as child nodes of a Data Model and/or Data Sections.

Data Fields are frequently referred to simply as "fields".

Data Model: Data Models are leveraged during the Extract activity to collect data from documents (Batch Folders). Data Models are the root of a Data Element hierarchy. The Data Model and its child Data Elements define a schema for data present on a document. The configuration of the Data Model and its child Data Elements defines data extraction logic and settings for how data is reviewed in a Data Viewer.

Document Type: Document Type nodes represent a distinct type of document, such as an invoice or a contract. Document Types are created as child nodes of a Content Model or a Content Category. They serve three primary purposes:

They are used to classify documents. Documents are considered "classified" when the Batch Folder is assigned a Content Type (most typically, a Document Type).
The Document Type's Data Model defines the Data Elements extracted by the Extract activity (including any Data Elements inherited from parent Content Types).
The Document Type defines all "Behaviors" that apply (whether from the Document Type's Behavior settings or those inherited from a parent Content Type).

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

Labeled OMR: Labeled OMR is a Value Extractor used to output OMR checkbox labels. It determines whether labeled checkboxes are checked or not. If checked, it outputs the label(s) or a Boolean true/false value as the result.

Labeled Value: Labeled Value is a Value Extractor that identifies and extracts a value next to a label. This is one of the most commonly used extractors to extract data from structured documents (such as a standardized form) and static values on semi-structured documents (such as the header details on an invoice).

Overrides: Overrides is a tab that allows the default properties set on a Data Element to be overridden.

Project: Projects are the primary containers for configuration nodes within Grooper. The Project is where various processing objects such as Content Models, Batch Processes, and profile objects are stored. This makes resources easier to manage, easier to save, and simplifies how node references are made in a Grooper Repository.

Review: Review is an Activity that allows user-attended review of Grooper's results. This allows human operators to validate processed Batch Page and Batch Folder content using specialized user interfaces called "Viewers". Different kinds of Viewers assist users in reviewing Grooper's image processing, document classification, and data extraction, and in operating document scanners.

UI Element: A UI Element is a portion of the Grooper interface that allows users to interact with or otherwise receive information about the application.

@@ Line 25: / Line 25: @@
 * [[Media:2023_Wiki_Overrides_Batch.zip]]
 |}
-== Glossary ==
-<u><big>'''Batch'''</big></u>: {{#lst:Glossary|Batch}}
-<u><big>'''Content Model'''</big></u>: {{#lst:Glossary|Content Model}}
-<u><big>'''Content Type'''</big></u>: {{#lst:Glossary|Content Type}}
-<u><big>'''Data Element'''</big></u>: {{#lst:Glossary|Data Element}}
-<u><big>'''Data Field'''</big></u>: {{#lst:Glossary|Data Field}}
-<u><big>'''Data Model'''</big></u>: {{#lst:Glossary|Data Model}}
-<u><big>'''Document Type'''</big></u>: {{#lst:Glossary|Document Type}}
-<u><big>'''Extract'''</big></u>: {{#lst:Glossary|Extract}}
-<u><big>'''Labeled OMR'''</big></u>: {{#lst:Glossary|Labeled OMR}}
-<u><big>'''Labeled Value'''</big></u>: {{#lst:Glossary|Labeled Value}}
-<u><big>'''Overrides'''</big></u>: {{#lst:Glossary|Overrides}}
-<u><big>'''Project'''</big></u>: {{#lst:Glossary|Project}}
-<u><big>'''Review'''</big></u>: {{#lst:Glossary|Review}}
-<u><big>'''UI Element'''</big></u>: {{#lst:Glossary|UI Element}}
 == About ==
@@ Line 105: / Line 76: @@
 A simpler, perhaps more common, example of where '''Overrides''' very much come in handy is with the visibility of '''Data Elements'''. One of the properties of a '''Data Element''' is the '''''Visible''''' property which is default ''True''. Imagine a '''Data Model''' that has five '''Data Fields''', and the '''Content Model''' has 3 '''Document Types'''. '''Document1''' uses '''Data Fields''' 1-3, '''Document2''' uses '''Data Fields''' 2-4, and '''Document3''' uses '''Data Fields''' 3-5. In '''Data Review''' you want to simplify the job for the person reviewing, so you do not want them to concern themselves with fields that are not relevant. To accomplish this you could use '''Overrides''' on each of the aforementioned hypothetical '''Document Types''' and set the '''''Visible''''' property to ''False'' on all the fields you don't need. This would keep only relevant '''Data Fields''' visibile upon review.
+== Glossary ==
+<u><big>'''Batch'''</big></u>: {{#lst:Glossary|Batch}}
+<u><big>'''Content Model'''</big></u>: {{#lst:Glossary|Content Model}}
+<u><big>'''Content Type'''</big></u>: {{#lst:Glossary|Content Type}}
+<u><big>'''Data Element'''</big></u>: {{#lst:Glossary|Data Element}}
+<u><big>'''Data Field'''</big></u>: {{#lst:Glossary|Data Field}}
+<u><big>'''Data Model'''</big></u>: {{#lst:Glossary|Data Model}}
+<u><big>'''Document Type'''</big></u>: {{#lst:Glossary|Document Type}}
+<u><big>'''Extract'''</big></u>: {{#lst:Glossary|Extract}}
+<u><big>'''Labeled OMR'''</big></u>: {{#lst:Glossary|Labeled OMR}}
+<u><big>'''Labeled Value'''</big></u>: {{#lst:Glossary|Labeled Value}}
+<u><big>'''Overrides'''</big></u>: {{#lst:Glossary|Overrides}}
+<u><big>'''Project'''</big></u>: {{#lst:Glossary|Project}}
+<u><big>'''Review'''</big></u>: {{#lst:Glossary|Review}}
+<u><big>'''UI Element'''</big></u>: {{#lst:Glossary|UI Element}}