This is a snippet of the Grooper Design Studio UI showing the Data Element Overrides tab.
The Data Element Overrides tab lets you override the default properties set on a Data Element.
A completed Content Model and accompanying Batch for what will be built can be found here. Downloading it is not required to understand this article, but it can help you follow along. The file was exported from, and is meant for use in, Grooper 2.9.
Grooper solutions can range from simple scan-and-archive processes to extremely complex solutions. Data Element Overrides allow discrete control of Data Elements on a per Content Type basis. This extends Grooper's inheritance-based architecture and allows for more robust and scalable Data Models. You are no longer required to copy a Data Element just to modify one property for an oddball Document Type. This saves time when building the solution and reduces complexity by eliminating those copied Data Elements. You can also run Test Extraction directly in the Overrides tab: after modifying any Data Element property, you can test the result against a test Batch without leaving the tab.
Following is an example of how to set up Data Element Overrides. The example uses three different document formats, all of which collect the same data. Formats A and B follow a similar enough structure that they do not need an override to extract. Format C is different enough that it overrides the default extractor to get its data.
Understanding the Forms
In the image on the right you can see that Format A and Format B have values that can be captured with simple key-value pair extractors. In fact, the Value Extractor Data Type for the Value 1 Data Field is simply referencing two different extractors, each in either a horizontal or vertical layout. This one extractor is successfully extracting values for both Format A and Format B, but it fails on Format C because that form is using OMR boxes instead of YES/NO values.
Setting up the Override
Setting up a Data Element Override is quite simple.
1. Select a Content Type, in this case, a Document Type.
Note that Data Element Overrides can also be applied to Content Categories.
2. Select the Data Element Overrides tab.
3. Select a Data Element you want to set overrides for, in this case a Data Field.
Note that Data Elements that have had properties overridden will be underlined.
4. Select the Property Overrides tab.
5. Adjust properties. Any property available to the Data Element can be changed here. The default settings reflect those of the original Data Element; changing any property is considered overriding that property as established on the original Data Element.
In this example the properties were adjusted to allow for the reading of the OMR box, as opposed to the default setup which leveraged two different key-value pair extractors.
6. Click the Test Extraction button to see the results.
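The effect of the steps above can be pictured as a lookup that falls back to the original Data Element's settings whenever a Content Type has not changed a property. The following is a minimal Python sketch of that idea only. It is not Grooper's implementation or API; the `DataElement` class, its `overrides` dictionary, and the property names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """Hypothetical stand-in for a Data Element and its default properties."""
    name: str
    defaults: dict
    # Maps a Content Type name to only the properties changed for that type.
    overrides: dict = field(default_factory=dict)

    def set_override(self, content_type, prop, value):
        """Record a changed property for one Content Type."""
        if prop not in self.defaults:
            raise KeyError(f"{prop!r} is not a property of {self.name}")
        self.overrides.setdefault(content_type, {})[prop] = value

    def effective(self, content_type):
        """Defaults merged with any overrides for the given Content Type."""
        return {**self.defaults, **self.overrides.get(content_type, {})}

    def is_overridden(self, content_type):
        """True when the element would appear underlined for this type."""
        return bool(self.overrides.get(content_type))


value_1 = DataElement("Value 1", {"extractor": "key-value pair", "visible": True})
value_1.set_override("Format C", "extractor", "OMR box")

print(value_1.effective("Format A"))  # unchanged defaults
print(value_1.effective("Format C"))  # extractor replaced, visible inherited
```

Storing only the changed properties keeps the original Data Element as the single source of defaults, which mirrors the point that no copy of the Data Element is needed.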
Testing the Results
The crux of all this is that you can now use the main Data Model, with the same established Data Elements, and get results from all the forms.
1. Click on the Data Model.
2. Click on the document you want to extract from.
3. Click Test Extraction.
Rinse and repeat for the other documents. Document Format C will now successfully extract due to the overrides.
It's important to note that because Data Element Overrides are applied to a Content Type, a document must be properly classified in order for the Data Model to know which overrides to use when extracting from that document. You may be able to successfully test results from the Data Element Overrides interface without a classified document, but doing so on the Data Model will result in no extraction.
It is worth noting that one could have accomplished the above by simply making another extractor, setting it up for OMR, and then having the Value Extractor Data Types for each Data Field reference it as a third element. Overrides would not be necessary in that case. This example, however, sufficed to provide something to show. As with many things in Grooper, there isn't always a right or wrong way. There is perhaps a best practice, and in this case, making the third extractor would be the better thing to do.
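The dependence on classification can be illustrated with a small Python sketch. It assumes, purely for illustration, that overrides are looked up by Document Type name and that an unclassified document falls back to the default extractor. The extractor functions, the sample text, and the `OVERRIDES` table are hypothetical and do not represent Grooper's internals.

```python
import re


def key_value_extractor(text):
    """Default: read a YES/NO value that follows a 'Value 1:' label."""
    match = re.search(r"Value 1:\s*(YES|NO)", text)
    return match.group(1) if match else None


def omr_extractor(text):
    """Override: treat '[X]' directly before a label as a checked box."""
    match = re.search(r"\[[xX]\]\s*(YES|NO)", text)
    return match.group(1) if match else None


DEFAULT_EXTRACTOR = key_value_extractor
OVERRIDES = {"Format C": omr_extractor}  # keyed by Document Type name


def extract_value_1(text, document_type=None):
    # Without a Document Type there is nothing to key the override on,
    # so the default extractor is used.
    extractor = OVERRIDES.get(document_type, DEFAULT_EXTRACTOR)
    return extractor(text)


format_c_text = "Value 1  [X] YES  [ ] NO"
print(extract_value_1(format_c_text, "Format C"))  # classified document
print(extract_value_1(format_c_text))              # unclassified document
```

In this sketch the classified call returns a value while the unclassified call returns nothing, which is the same symptom described above.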
A simpler, perhaps more common, example of where Data Element Overrides come in handy is the visibility of Data Elements. One of the properties of a Data Element is the Visible property, which defaults to True. Imagine a Data Model that has five Data Fields and a Content Model that has three Document Types. Document1 uses Data Fields 1-3, Document2 uses Data Fields 2-4, and Document3 uses Data Fields 3-5. In Data Review you want to simplify the reviewer's job, so you do not want them to concern themselves with fields that are not relevant. To accomplish this you could use Data Element Overrides on each of these hypothetical Document Types and set the Visible property to False on all the fields that type does not need. This would keep only the relevant Data Fields visible upon review.
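The visibility scenario above can be worked through in a few lines of Python. This is only a sketch of the bookkeeping, not of how Grooper stores overrides; the field names and the helper functions are hypothetical.

```python
ALL_FIELDS = [f"Data Field {n}" for n in range(1, 6)]

# Fields each Document Type actually uses, taken from the scenario above.
USED_FIELDS = {
    "Document1": {"Data Field 1", "Data Field 2", "Data Field 3"},
    "Document2": {"Data Field 2", "Data Field 3", "Data Field 4"},
    "Document3": {"Data Field 3", "Data Field 4", "Data Field 5"},
}


def visibility_overrides(document_type):
    """Fields whose Visible property would be overridden to False."""
    used = USED_FIELDS[document_type]
    return {name: False for name in ALL_FIELDS if name not in used}


def visible_fields(document_type):
    """Fields a reviewer would see: Visible defaults to True unless overridden."""
    hidden = visibility_overrides(document_type)
    return [name for name in ALL_FIELDS if hidden.get(name, True)]


for doc_type in USED_FIELDS:
    print(doc_type, visible_fields(doc_type))
# Document1 ['Data Field 1', 'Data Field 2', 'Data Field 3']
# Document2 ['Data Field 2', 'Data Field 3', 'Data Field 4']
# Document3 ['Data Field 3', 'Data Field 4', 'Data Field 5']
```

Each Document Type ends up with two overrides, and the Data Model itself still holds a single copy of all five fields.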
The following is a use case that heavily leverages Data Element Overrides.
The three images below are examples of three different Document Types in the Revenue Statements model. Notice how different their formats are. As a result, the techniques used for extracting their information are quite different.
The image on the right is an expanded look at the Data Model of the Revenue Statements model. It is quite complex. This model is used across all the different Document Types in spite of their extreme variation in formatting. Because the information being collected from the different formats is normalized via this Data Model, but the formats themselves are so widely varied, Data Element Overrides ended up playing a huge role in overcoming this obstacle.
The main means of circumventing the varied formats was to not set any extractor settings on the Data Elements of the main Data Model (and its subsequent hierarchy).
Instead, each Document Type gets its own (local resources) which contains extractors configured to specifically work with that format.
Data Element Overrides are then set to replace the default, empty extractor settings and instead use the extractors built in the (local resources). Notice the underline on the Data Elements indicating an override has been set. These overrides are especially important for Data Elements like multi-instance Data Sections or Data Tables.
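The pattern described above, a shared Data Model with no extractors plus format-specific extractors supplied through overrides, can be sketched in Python as follows. Everything here is illustrative: the field names, the Document Type names, the regular expressions, and the sample text are hypothetical and are not taken from the actual Revenue Statements model.

```python
import re

# Shared Data Model: field names only, with no extractor configured.
DATA_MODEL = {"Owner Name": None, "Net Amount": None}

# Per-Document-Type "local resources": extractors built for one format each.
LOCAL_RESOURCES = {
    "Operator A": {
        "Owner Name": r"Owner:\s*(.+)",
        "Net Amount": r"Net\s+\$([\d,.]+)",
    },
    "Operator B": {
        "Owner Name": r"PAYEE NAME\s+(.+)",
        "Net Amount": r"TOTAL NET:\s*([\d,.]+)",
    },
}


def extract(document_type, text):
    """Run each field with the override pattern for this Document Type."""
    overrides = LOCAL_RESOURCES.get(document_type, {})
    results = {}
    for field_name, default_pattern in DATA_MODEL.items():
        pattern = overrides.get(field_name, default_pattern)
        match = re.search(pattern, text) if pattern else None
        results[field_name] = match.group(1).strip() if match else None
    return results


statement_a = "Owner: Jane Doe\nNet $1,234.50"
statement_b = "PAYEE NAME  Jane Doe\nTOTAL NET: 1,234.50"
print(extract("Operator A", statement_a))
print(extract("Operator B", statement_b))
```

Both formats yield the same normalized field names, while a Document Type with no overrides would return empty values because the shared model carries no extractors of its own.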
Versions prior to Grooper 2.9 had an early version of overrides in the Data Element Profiles tab located on the Content Model or Document Type. These profiles only allowed modification of a limited number of properties on the Data Element, as opposed to Grooper 2.9, where all properties can be overridden.
Where Did Zonal Properties Go?
All the zonal extraction properties are now set directly on the Data Element.