Data Model (Node Type): Difference between revisions

Latest revision as of 15:12, 20 November 2025

This article is about the current version of Grooper.

Note that some content may still need to be updated.

Data Models are leveraged during the Extract activity to collect data from documents (Batch Folders). Data Models are the root of a Data Element hierarchy. The Data Model and its child Data Elements define a schema for data present on a document. The configuration of the Data Model and its child Data Elements defines data extraction logic and settings for how data is reviewed in a Data Viewer.

You may download the ZIP files below and upload them into your own Grooper environment (version 2025). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

2025_Data_Model_Batch.zip
2025_Data_Model_Project.zip

What is a Data Model?

The Data Model defines the data structure for a Content Type and can live at varying levels of structure, allowing for inheritance if a hierarchy exists. This can be a simple list of Data Fields or a complex hierarchy of sections, subsections, tables and fields.

The Data Model is leveraged by Grooper to extract data from a Batch. All extraction logic (e.g., referencing a Data Extractor to fill a field, performing a Database Lookup, or generating a calculated Field Expression) is set on the Data Model or the Data Elements related to the Data Model. It also provides information to the Data Review activity, setting expectations for field appearance and behavior (e.g., whether a field is required before completing batch validation).

One Data Model can be created for each:

Content Model
Content Category
Document Type

A Data Model is a critical component of data hierarchy and extraction in general. You cannot have Data Fields/Tables/Sections without first having a Data Model. You cannot set up extraction without first having a Data Model.

Data Models in Grooper

Data Models are mainly used to organize and extract data. Inheritance of child elements (Data Fields, Data Tables, Data Sections) from parent Data Models plays a vital part in extraction, along with any overrides that may need to be applied. In addition, users who want a more visually appealing view of their extracted data can customize the appearance of any Data Fields, Data Sections, and Data Tables within the Data Model using CSS styling. These three topics (extraction, inheritance and overrides, and appearance and styling) are explained below.

Extraction

A Data Model organizes its data using three child element types:

Data Field: Captures a single value, such as an invoice number or date.
Data Section: Groups related fields and tables, supporting hierarchical and repeating structures.
Data Table: Extracts tabular data, such as line items or transaction logs.

During extraction, Grooper uses the Data Model to guide the process. Each child element extracts its data from the document, and the results are fed up to the parent Data Model. The extraction process is typically performed by the "Extract" activity on a Batch Folder. For more information, see the Extract (Activity) article. In essence, a Data Model acts as a container for all Data Elements configured for extraction. Think of it as a bucket into which you can throw all of the Data Fields, Data Tables, and Data Sections you create. Each element holds the data it extracts (a Data Field configured to extract an invoice number, for example), but each element also feeds its data up into the parent Data Model, ultimately creating what you see depicted below.

Here, we have a Data Model whose children are Data Fields, a Data Table, and two Data Sections, all configured to extract employee information from the Information Sheets within the Batch. Each element extracts its own piece of data and passes it up to the Data Model, resulting in the detailed, structured Data Model depicted.
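
The "bucket" idea above can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's object model or API: the class names `DataField`, `DataTable`, `DataSection`, and `DataModel`, the `collect` method, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class DataField:
    """Hypothetical stand-in for a Data Field: holds one extracted value."""
    name: str
    value: Any = None

    def collect(self) -> Any:
        return self.value


@dataclass
class DataTable:
    """Hypothetical stand-in for a Data Table: holds rows of column values."""
    name: str
    rows: List[Dict[str, Any]] = field(default_factory=list)

    def collect(self) -> List[Dict[str, Any]]:
        return [dict(row) for row in self.rows]


@dataclass
class DataSection:
    """Hypothetical stand-in for a Data Section: groups child elements."""
    name: str
    children: List[Union[DataField, DataTable, "DataSection"]] = field(default_factory=list)

    def collect(self) -> Dict[str, Any]:
        # Each child reports its own data; the container gathers the results.
        return {child.name: child.collect() for child in self.children}


class DataModel(DataSection):
    """Root container: gathers whatever its child elements extracted."""


# Made-up sample values, for illustration only.
model = DataModel("Employee Information", [
    DataField("First Name", "Ada"),
    DataSection("Employment", [DataField("Employment Status", "Active")]),
    DataTable("Dependents", [{"Name": "Sam", "Relationship": "Child"}]),
])
result = model.collect()
print(result)
```

Each element only knows about its own data, while the root produces one nested result, which mirrors how the child elements feed their values up to the Data Model.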

Inheritance and overrides

Suppose you have more than one Data Model. Specifically, let's say your Node Tree looks like this:

Content Model - HR
  Data Model - HR
    Data Fields such as: First Name, Middle Name, Last Name, Employment Status, Status Date
  Content Category - Benefits
    Data Model - Benefits (Inherits all data from the Content Model's primary Data Model as well as extracting its own data such as...)
      Data Fields: Eligible Date
    Document Type - Health Insurance
      Data Model - Health Insurance (Inherits all data from the Content Model and parent Content Category as well as extracting its own data such as...)
        Data Fields: Enrolled Date, Covered Parties

How does everything work? What exactly is happening here? Are these just three separate Data Models each doing their own thing? Not quite.

Inheritance allows one Data Model to reuse and extend the data structure of another. When a Content Type is based on a parent (such as a Content Model or Content Category), its Data Model automatically includes the parent's Data Fields, Data Sections, and Data Tables. This creates a layered approach: shared elements are defined once, then refined only where needed.
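
As a rough sketch of this layering, the HR example above can be modeled by walking the parent chain. The `ContentType` class and its `effective_fields` method are hypothetical names used for illustration, not part of Grooper, and the ordering (inherited fields first) is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical Content Type node with the Data Fields its own Data Model defines."""
    name: str
    own_fields: List[str] = field(default_factory=list)
    parent: Optional["ContentType"] = None

    def effective_fields(self) -> List[str]:
        # Inherited fields come first (root-most ancestor first), then local ones.
        inherited = self.parent.effective_fields() if self.parent else []
        return inherited + self.own_fields


hr = ContentType("HR", ["First Name", "Middle Name", "Last Name",
                        "Employment Status", "Status Date"])
benefits = ContentType("Benefits", ["Eligible Date"], parent=hr)
health = ContentType("Health Insurance", ["Enrolled Date", "Covered Parties"],
                     parent=benefits)

for name in health.effective_fields():
    print(name)
```

In this sketch, a document classified as "Health Insurance" ends up with eight Data Fields: five from the Content Model's Data Model, one from the "Benefits" Content Category's Data Model, and two of its own.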

What is inherited

Data Fields (names, display settings, validation settings, extraction logic)
Data Sections (their contained Data Fields and nested structure)
Data Tables (columns and their properties)
Style resources defined through "Included Style Sheets"
Default behavior such as automatic child extraction controlled by "Run Child Extractors"

Overriding inherited elements

You may change selected properties of inherited elements without altering the parent. Common overrides include:

"Required" – make a previously optional Data Field mandatory
"Display Name" – adjust how a value appears to users
"CSS Class" or styling via local style sheet content
Extraction or validation settings
Visibility or formatting choices

When you edit an inherited element locally:

The modified property becomes an override.
The change applies only to the current Content Type (and its descendants).
The parent remains unchanged.

Clearing an override restores the inherited value.
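
The override behavior described above resembles a layered lookup. Here is a minimal sketch using Python's standard `collections.ChainMap`; the `InheritedElement` class, its methods, and the property values are hypothetical and only illustrate the idea, not how Grooper stores properties.

```python
from collections import ChainMap


class InheritedElement:
    """Hypothetical view of an inherited Data Element's properties."""

    def __init__(self, parent_props):
        self.overrides = {}
        # Lookups check local overrides first, then fall back to the parent.
        self.props = ChainMap(self.overrides, parent_props)

    def set(self, name, value):
        # ChainMap writes go to the first map only, so the parent is untouched.
        self.props[name] = value

    def clear(self, name):
        # Dropping the override exposes the inherited value again.
        self.overrides.pop(name, None)


parent = {"Required": False, "Display Name": "Status Date"}
element = InheritedElement(parent)

element.set("Required", True)
print(element.props["Required"], parent["Required"])  # True False

element.clear("Required")
print(element.props["Required"])  # False
```

A descendant could be modeled the same way by chaining another override layer in front of `element.props`.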

Why use inheritance

Reduces duplication across multiple similar document types
Keeps naming and structure consistent
Simplifies maintenance: updates to shared elements flow automatically to descendants
Allows precise specialization for edge cases (for example, making one Data Field required only in a regulated form variant)

Data Models in Grooper support inheritance, allowing a child Data Model to inherit structure and properties from a parent. This enables you to define common Data Elements once and reuse them across multiple Document Types. Properties and behaviors can be overridden in child Data Models, providing flexibility to customize extraction, validation, or appearance for specialized scenarios.

For example, you might have a base Data Model for generic invoices, with child Data Models for different invoice formats from different companies. Each child can override specific properties or add new fields as needed, while still inheriting the shared structure.

Here is an example scenario. Suppose we have two types of invoices. One is a generic invoice whose Data Model consists of the following child elements:

Invoice Number
Invoice Date
Vendor Information (Section)
  Vendor Name
  Vendor Address
Line Items (Table)
  Description
  Quantity
  Unit Price

The other is a separate invoice that also requires a PO Number in addition to everything listed above, so its Data Model looks something like:

Invoice Number
Invoice Date
Vendor Information (Section)
  Vendor Name
  Vendor Address
Line Items (Table)
  Description
  Quantity
  Unit Price
PO Number

This situation is depicted below. Here, we can see that, in addition to the parent Data Model, the Envoy Content Type has its own Data Model, which is a child of the Data Model at the top of the hierarchy. It inherits the child Data Elements of the parent Data Model and also has its own Data Field (PO Number), which it extracts from the Envoy invoice.
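
The two invoice schemas above can also be written down as nested data. The following sketch is purely illustrative: the tuple-based representation and the `outline` helper are assumptions made for this example, not a Grooper format.

```python
from typing import List, Tuple, Union

# An element is either a field name or a (container label, children) pair.
Element = Union[str, Tuple[str, List["Element"]]]

generic_invoice: List[Element] = [
    "Invoice Number",
    "Invoice Date",
    ("Vendor Information (Section)", ["Vendor Name", "Vendor Address"]),
    ("Line Items (Table)", ["Description", "Quantity", "Unit Price"]),
]

# The child model reuses the parent's elements and adds one of its own.
envoy_invoice: List[Element] = generic_invoice + ["PO Number"]


def outline(elements: List[Element], depth: int = 0) -> List[str]:
    """Return the schema as indented lines, two spaces per nesting level."""
    lines: List[str] = []
    for element in elements:
        if isinstance(element, str):
            lines.append("  " * depth + element)
        else:
            label, children = element
            lines.append("  " * depth + label)
            lines.extend(outline(children, depth + 1))
    return lines


print("\n".join(outline(envoy_invoice)))
```

Printing the outline for `envoy_invoice` reproduces the second listing above: the shared elements are defined once in `generic_invoice`, and only "PO Number" is added locally.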

Appearance and styling

Data Models can be styled to improve visualization and usability in the Grooper UI. You can customize the appearance of Data Fields, Data Sections, and Data Tables by setting properties such as "Display Width", "Alignment", "Placeholder", and "Caption". Styling options help users review and edit extracted data more efficiently, and can make complex documents easier to navigate.

Resource Files (such as CSS) can be linked to a Data Model to further customize the look and feel of data grids and review screens. This allows for tailored user experiences and supports branding or compliance requirements.

For more information on CSS styling, see the CSS Data Viewer Styling article.

Here, we can see how a customized Style Sheet property in Grooper can help manage the visual aspect of extracted data, making it cleaner and easier to read.

How to create and configure a Data Model

Creating a Data Model:

To create a Data Model in Grooper, perform the following:

1. Right-click the desired Content Type in the Grooper tree.
2. Select the "Create Data Model" command.
3. A new Data Model will be added as a child of the Content Type.

Configuring a Data Model (adding child elements):

1. Right-click the Data Model and choose "Add", then choose one of the three child objects (Data Field, Data Section, or Data Table).
2. Name the object.
3. Click "Execute" to create the child object in the Node Tree.
4. Use the property grid to set appearance, validation, and extraction options.

To add multiple child elements:

1. Right-click the Data Model.
2. Click "Contents".
3. Select "Add Multiple Items...".
   - Note that you can only add multiples of one child object type at a time: Data Fields, Data Tables, or Data Sections.
4. Within the Add window, click "☰" to open the dropdown menu for the Item Type.
5. Select Data Field, Data Table, or Data Section.
6. Next, click "..." to add each entry.
7. List the names of each object.
8. Click "OK" when finished.
9. Click "Execute" to add the multiple objects.

Example: Closing Disclosures

Data Models are useful for a wide variety of documents, such as invoices, HR documents (such as EOB forms), medical records, etc. Long story short, does your document have text data you want extracted? Then you're going to want a Data Model. Let's walk through the setup, configuration, and testing of extraction with a simple example featuring a Closing Disclosures document.

@@ Line 1: / Line 1: @@
-{{stubs}}
+{{AutoVersion}}
-<section begin="glossary" />
-<blockquote>
-Data Models are digital representations of data targeted for extraction on a document.
-</blockquote>
-<section end="glossary" />
-The Data Model defines the data structure for a [[Content Type]] and can live at varying levels of structure, allowing for inheritance if a hierarchy exists.  This can be a simple list of data fields or a complex hierarchy of sections, subsections, tables and fields.
-The Data Model is leveraged by Grooper to extract data from a [[Batch]].  All extraction logic (i.e. referencing a [[Data Extractor]] to fill a field, performing a database lookup, or generating a calculated field expression) is set on the Data Model or the [[Data Element]]s related to the Data Model.  It also provides information to the [[Data Review]] activity setting expectations for field appearance and behavior (i.e. if a field is required before completing batch validation).
+<blockquote>{{#lst:Glossary|Data Model}}</blockquote>
+{|class="download-box"
+|
+[[File:Asset 22@4x.png]]
+|
+You may download the ZIP(s) below and upload it into your own Grooper environment (version 2025). The first contains one or more '''Batches''' of sample documents. The second contains one or more '''Projects''' with resources used in examples throughout this article.
+* [[Media:2025_Data_Model_Batch.zip]]
+* [[Media:2025_Data_Model_Project.zip]]
+|}
+== What is a Data Model? ==
+The Data Model defines the data structure for a [[Content Type]] and can live at varying levels of structure, allowing for inheritance if a hierarchy exists.  This can be a simple list of Data Fields or a complex hierarchy of sections, subsections, tables and fields.
+The Data Model is leveraged by Grooper to extract data from a [[Batch]].  All extraction logic (i.e. referencing a [[Data Extractor]] to fill a field, performing a Database Lookup, or generating a calculated Field Expression) is set on the Data Model or the [[Data Element]]s related to the Data Model.  It also provides information to the [[Data Review]] activity setting expectations for field appearance and behavior (i.e. if a field is required before completing batch validation).
 One Data Model can be created for each:
@@ Line 15: / Line 24: @@
 * [[Document Type]]
-Data Models also inherit data elements from parent [[Content Type]]s.  For example, if a [[Content Model]]'s Data Model has a child [[Data Field]] named "Date" and a [[Content Category]]'s Data Model has a child [[Data Field]] named "Time", the [[Content Category]]'s Data Model will actually have both "Date" and "Time" as fields.  It has it's child field "Time" and inherits the parent field "Date" as well. See below for a typical hierarchical structure exemplifying such:
-* Content Model - HR
+A Data Model is a critical component of data hierarchy and extraction in general. You cannot have Data Fields/Tables/Sections without first having a Data Model. You cannot set up extraction without first having a Data Model.
-** Data Model - HR
-*** Data fields such as: First Name, Middle Name, Last Name, Employment Status, Status Date
+== Data Models in Grooper ==
-*** Content Category - Benefits
+Data Models are mainly used in the organization and extraction of data. Inheritance of said data from child elements (Data Fields, Data Tables, Data Sections) plays a vital part of the extraction along with any overrides that may need to be performed. In addition, should users desire a more visually appealing view of their extracted data, they can customize the appearance of any Data Fields, Sections, and/or Tables within the Data Model through the use of CSS Styling. These three topics, extraction, inheritance and overrides, and appearance and styling, are explained below.
-**** Data Model - Benefits (Inherits all data from the Content Model's primary Data Model as well extracting its own data such as...)
+=== Extraction ===
-***** Data Fields: Eligible Date
+A Data Model organizes its data using three child element types:
-**** Document Type - Health Insurance
+* '''Data Field''': Captures a single value, such as an invoice number or date.
-***** Data Model - Health Insurance (Inherits all data from the Content Model and parent Content Category as well as extracting its own data such as...)
+* '''Data Section''': Groups related fields and tables, supporting hierarchical and repeating structures.
-****** Data Fields: Enrolled Date, Covered Parties
+* '''Data Table''': Extracts tabular data, such as line items or transaction logs.
+During extraction, Grooper uses the Data Model to guide the process. Each child element extracts its data from the document, and the results are fed up to the parent Data Model. The extraction process is typically performed by the "Extract" activity on a [[Batch Folder]]. For more information about the Extract Activity, click [[Extract (Activity)|here]]. In essence, a Data Model acts as a container for all Data Elements configured for extraction. Think of it like a bucket that you can throw all of the Data Fields/Tables/Sections that you create into. Each element contains its own data that it extracts (a Data Field configured to extract an invoice number, for example), but each element will feed its data up into the parent Data Model, ultimately creating what you see depicted below.
+Here, we have a Data Model, whose children are comprised of Data Fields, a Data Table, and two Data Sections, all configured to extract employee information on the Information Sheets within the Batch. Each  element extracts its own piece of data and passes it up to the Data Model, resulting in the detailed, structured Data Model being depicted.
+[[file:2025_Data_Model_Data_Models_in_Grooper_Extraction_01(2).png]]
+=== Inheritance and overrides ===
+Suppose you have an instance where you have more than one Data Model. Specifically, let's say your Node Tree looked like this:
+: {{IconName|Content Model}} Content Model - HR
+:: {{IconName|Data Model}} Data Model - HR
+::: {{IconName|Data Field}} Data fields such as: First Name, Middle Name, Last Name, Employment Status, Status Date
+:: {{IconName|Content Category}} Content Category - Benefits
+::: {{IconName|Data Model}} Data Model - Benefits (Inherits all data from the Content Model's primary Data Model as well extracting its own data such as...)
+:::: {{IconName|Data Field}}  Data Fields: Eligible Date
+::: {{IconName|Document Type}} Document Type - Health Insurance
+:::: {{IconName|Data Model}} Data Model - Health Insurance (Inherits all data from the Content Model and parent Content Category as well as extracting its own data such as...)
+::::: {{IconName|Data Field}} Data Fields: Enrolled Date, Covered Parties
+How does everything work? What exactly is happening here? Are these just three separate Data Models each doing their own thing? Not quite.
+Inheritance allows one [[Data Model]] to reuse and extend the data structure of another. When a [[Content Type]] is based on a parent (such as a [[Content Model]] or category), its Data Model automatically includes the parent’s Data Fields, Data Sections, and Data Tables. This creates a layered approach: shared elements are defined once, then refined only where needed.
+<big> What is inherited </big>
+* Data Fields (names, display settings, validation settings, extraction logic)
+* Data Sections (their contained Data Fields and nested structure)
+* Data Tables (columns and their properties)
+* Style resources defined through "Included Style Sheets"
+* Default behavior such as automatic child extraction controlled by "Run Child Extractors"
+<big> Overriding inherited elements </big>
+You may change selected properties of inherited elements without altering the parent. Common overrides include:
+* "Required" – make a previously optional Data Field mandatory
+* "Display Name" – adjust how a value appears to users
+* "CSS Class" or styling via local style sheet content
+* Extraction or validation settings
+* Visibility or formatting choices
+When you edit an inherited element locally:
+# The modified property becomes an override.
+# The change applies only to the current Content Type (and its descendants).
+# The parent remains unchanged.
+Clearing an override restores the inherited value.
+<big> Why use inheritance </big>
+* Reduces duplication across multiple similar document types
+* Keeps naming and structure consistent
+* Simplifies maintenance: updates to shared elements flow automatically to descendants
+* Allows precise specialization for edge cases (for example, making one Data Field required only in a regulated form variant)
+Data Models in Grooper support inheritance, allowing a child Data Model to inherit structure and properties from a parent. This enables you to define common Data Elements once and reuse them across multiple Document Types. Properties and Behaviors can be overridden in child Data Models, providing flexibility to customize extraction, validation, or appearance for specialized scenarios.
+For example, you might have a base Data Model for generic invoices, with child Data Models for different invoice formats from different companies. Each child can override specific properties or add new fields as needed, while still inheriting the shared structure.
+Here is an example scenario: suppose we have two types of invoices. A generic invoice whose Data Model consists of the following child elements:
+<pre>
+Invoice Number
+Invoice Date
+Vendor Information (Section)
+  Vendor Name
+  Vendor Address
+Line Items (Table)
+  Description
+  Quantity
+  Unit Price
+</pre>
+And a separate invoice that also requires a PO Number in addition to everything listed above. So, something like:
+<pre>
+Invoice Number
+Invoice Date
+Vendor Information (Section)
+  Vendor Name
+  Vendor Address
+Line Items (Table)
+  Description
+  Quantity
+  Unit Price
+PO Number
+</pre>
+This situation is depicted below. Here, we can see in addition to the parent Data Model, the Envoy Content Type has its own Data Model, making it a child of the Data Model at the top of the hierarchy. As you can see, it inherits the child Data Elements of the parent Data Model in addition to having its own Data Field (PO Number) that it extracts from the Envoy invoice.
+[[file:2025_Data_Model_Data_Models_in_Grooper_Inheritance_and_Overrides_01.png]]
+=== Appearance and styling ===
+Data Models can be styled to improve visualization and usability in the Grooper UI. You can customize the appearance of Data Fields, Data Sections, and Data Tables by setting properties such as "Display Width", "Alignment", "Placeholder", and "Caption". Styling options help users review and edit extracted data more efficiently, and can make complex documents easier to navigate.
+Resource Files (such as CSS) can be linked to a Data Model to further customize the look and feel of data grids and review screens. This allows for tailored user experiences and supports branding or compliance requirements.
+For more information on CSS styling, click [[CSS Data Viewer Styling|here]].
+Here, we can see how a customized Style Sheet property in Grooper can help manage the visual aspect of extracted data, making it cleaner and easier to read.
+[[file:2025_Data_Model_Data_Models_in_Grooper_Appearance_and_Styling_01.png]]
+== How to create and configure a Data Model ==
+<big>Creating a Data Model:</big>
+To create a Data Model in Grooper, perform the following:
+# Right-click the desired [[Content Type]] in the Grooper tree.
+# Select the "Create Data Model" command.
+# A new Data Model will be added as a child of the Content Type.
+<big>Configuring a Data Model (adding child elements):</big>
+# Right-click the Data Model and choose "Add", then choose between the three child objects (Data Field, Data Section, or Data Table).
+# Name the object.
+# Click "Execute" to create the child object in the Node Tree.
+# Use the property grid to set appearance, validation, and extraction options.
+<big>To add multiple child elements:</big>
+# Right-click the Data Model.
+# Click "Contents"
+# Select  "Add Multiple Items..."
+#* Note that you can only add multiples of one child object type - Data Fields, Data Tables, or Data Sections.
+# Within the Add Window, click "☰" to open the dropdown menu for the Item Type.
+# Select from Data Field, Data Table, or Data Section.
+# Next, click "..." to add each entry.
+# List the names of each object.
+# Click OK when finished.
+# Click Execute to add the multiple objects.
+<div style="position: relative; box-sizing: content-box; max-height: 80vh; max-height: 80svh; width: 100%; aspect-ratio: 1.7777777777777777; padding: 40px 0 40px 0;"><iframe src="https://app.supademo.com/embed/cmhtlcw6f000kwj0jit20w7hz?embed_v=2&utm_source=embed" loading="lazy" title="2025 - Adding and configuring a Data Model." allow="clipboard-write" frameborder="0" webkitallowfullscreen="true" mozallowfullscreen="true" allowfullscreen style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;"></iframe></div>
+== Example: Closing Disclosures ==
+Data Models are useful for a wide variety of types of documents, such as invoices, HR documents (such as EOB forms), medical records, etc. Long story short, does your document have text data you want extracted? Then you're going to want a Data Model. Let's walk through the set up, configuration, and testing of extraction with a simple example featuring a Closing Disclosures document.
+<div style="position: relative; box-sizing: content-box; max-height: 80vh; max-height: 80svh; width: 100%; aspect-ratio: 1.7777777777777777; padding: 40px 0 40px 0;"><iframe src="https://app.supademo.com/embed/cmhux4s1700hex60ixn6xq7pb?embed_v=2&utm_source=embed" loading="lazy" title="2025 - Data Model Example: Closing Disclosures" allow="clipboard-write" frameborder="0" webkitallowfullscreen="true" mozallowfullscreen="true" allowfullscreen style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;"></iframe></div>
-So, a document classified as a "Health Insurance" Document Type would have eight total Data Fields:  Two from its own Data Model (Enrolled Date and Covered Parties), One from its parent Content Category's (named "Benefits") Data Model (Eligible Date), and five from the Content Model's Data Model (First Name, Middle Name, Last Name, Employment Status, Status Date).
+== Related Articles ==
-Data context can be critical to build the '''Data Type''' and '''Field Class''' extractors to populate a '''Data Model'''.  For more information on this topic, visit the [[Data Context]] article.
+*[[Data Element]]
+*[[Data Field (Node Type)]]
+*[[Data Table (Node Type)]]
+*[[Data Section (Node Type)]]
+*[[Extract (Activity)]]