Data Model

Data Models are digital representations of the data targeted for extraction from a document.

The Data Model defines the data structure for a Content Type. It can be defined at different levels of a Content Type hierarchy, and where a hierarchy exists, lower levels inherit from higher ones. A Data Model can be a simple list of Data Fields or a complex hierarchy of sections, subsections, tables and fields.

Grooper uses the Data Model to extract data from a Batch. All extraction logic (e.g. referencing a Data Extractor to fill a field, performing a database lookup, or generating a calculated field expression) is set on the Data Model or on its Data Elements. The Data Model also tells the Data Review activity how fields should appear and behave (e.g. whether a field is required before batch validation can be completed).
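As a rough illustration of the idea above, the following Python sketch models a Data Model as a list of field definitions, each carrying its own extraction logic and a "required" flag that a review step could check. This is not Grooper's object model or API: the names `DataField`, `regex_extractor`, `extract` and `missing_required` are hypothetical, and a simple regular expression stands in for a Data Extractor.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class DataField:
    """Hypothetical field definition (illustrative only, not Grooper's API)."""
    name: str
    # Assumed shape: takes the document text, returns a value or None.
    extractor: Callable[[str], Optional[str]]
    required: bool = False


def regex_extractor(pattern: str) -> Callable[[str], Optional[str]]:
    """Build an extractor that returns the first capture group of `pattern`."""
    compiled = re.compile(pattern)

    def run(text: str) -> Optional[str]:
        match = compiled.search(text)
        return match.group(1) if match else None

    return run


def extract(fields: List[DataField], text: str) -> Dict[str, Optional[str]]:
    """Run every field's extractor against the document text."""
    return {f.name: f.extractor(text) for f in fields}


def missing_required(fields: List[DataField],
                     values: Dict[str, Optional[str]]) -> List[str]:
    """List required fields that are still empty (what a reviewer must fix)."""
    return [f.name for f in fields if f.required and not values.get(f.name)]


# A tiny, made-up data model and document.
model = [
    DataField("Last Name", regex_extractor(r"Last Name:\s*(\w+)"), required=True),
    DataField("Status Date",
              regex_extractor(r"Status Date:\s*(\d{4}-\d{2}-\d{2})"),
              required=True),
]
values = extract(model, "Last Name: Rivera\nEmployment Status: Active")
print(values)                           # {'Last Name': 'Rivera', 'Status Date': None}
print(missing_required(model, values))  # ['Status Date']
```

The point of the sketch is only that extraction rules and review expectations both hang off the field definitions, so one structure drives both steps.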

One Data Model can be created for each:

  • Content Model
  • Content Category
  • Document Type

Data Models also inherit Data Elements from parent Content Types. For example, if a Content Model's Data Model has a child Data Field named "Date" and a Content Category's Data Model has a child Data Field named "Time", the Content Category's Data Model effectively has both "Date" and "Time" as fields: its own child field "Time" and the inherited parent field "Date". The following hierarchy shows a typical structure:

  • Content Model - HR
    • Data Model - HR
      • Data fields such as: First Name, Middle Name, Last Name, Employment Status, Status Date
      • Content Category - Benefits
        • Data Model - Benefits (Inherits all data from the Content Model's primary Data Model as well as extracting its own data such as...)
          • Data Fields: Eligible Date
        • Document Type - Health Insurance
          • Data Model - Health Insurance (Inherits all data from the Content Model and parent Content Category as well as extracting its own data such as...)
            • Data Fields: Enrolled Date, Covered Parties

So, a document classified as a "Health Insurance" Document Type would have eight Data Fields in total: two from its own Data Model (Enrolled Date and Covered Parties), one from the Data Model of its parent Content Category, "Benefits" (Eligible Date), and five from the Content Model's Data Model (First Name, Middle Name, Last Name, Employment Status, Status Date).
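The inheritance rule in this example can be sketched in a few lines of Python. This is a conceptual model only, not how Grooper implements it: the `ContentType` class and its `effective_fields` method are hypothetical names, and a single class stands in for Content Models, Content Categories and Document Types alike.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category or Document Type."""
    name: str
    own_fields: List[str] = field(default_factory=list)  # fields of its own Data Model
    parent: Optional["ContentType"] = None

    def effective_fields(self) -> List[str]:
        """Inherited fields (outermost ancestor first), then this level's own fields."""
        inherited = self.parent.effective_fields() if self.parent else []
        return inherited + self.own_fields


# The HR hierarchy from the example above.
hr = ContentType(
    "HR",
    ["First Name", "Middle Name", "Last Name", "Employment Status", "Status Date"],
)
benefits = ContentType("Benefits", ["Eligible Date"], parent=hr)
health_insurance = ContentType(
    "Health Insurance", ["Enrolled Date", "Covered Parties"], parent=benefits
)

fields = health_insurance.effective_fields()
print(len(fields))  # 8
print(fields)
```

Walking up the parent chain gives five fields from "HR", one from "Benefits" and two from "Health Insurance", which matches the count of eight given above.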

Data context can be critical when building the Data Type and Field Class extractors that populate a Data Model. For more information on this topic, visit the Data Context article.