Data Model Compiler

From Grooper Wiki

Grooper's Data Model Compiler compiles a Data Model into a dynamic .NET type, enabling strongly-typed access and advanced expression evaluation.

The Data Model Compiler is essential for code expressions that reference fields in the Data Model. This includes (but is not limited to):

  • Data Model Expressions - Calculated Value Expressions, Is Valid Expressions, and Is Required Expressions
  • Data Rules - Trigger expressions and various Data Actions (such as Calculate Value)
  • Import/Export Mapping expressions

Overview

The "Data Model Compiler" transforms a configured Data Model—including its Data Elements (fields, sections, and tables) and Variable Definitions—into a dynamic .NET class at runtime. This enables Grooper to provide strongly-typed, code-friendly access to extracted data, support advanced expression evaluation, and facilitate integration with custom logic or automation.

The compiler generates source code for the data model structure, compiles it into an in-memory assembly, and exposes the resulting types for use in extraction, validation, and scripting scenarios. This approach allows for high performance, type safety, and flexibility when working with complex document schemas.

Role and usage in Grooper

  • Used internally by Grooper to generate dynamic types for each Data Model, supporting property access, variable evaluation, and navigation of hierarchical data.
  • Enables advanced features such as calculated fields, validation rules, and custom expressions that reference model elements by name.
  • Supports inheritance, parent/child relationships, and variable definitions, ensuring that all aspects of the data model are accessible in code.

How it works

  1. The compiler parses the Data Model and its descendants, generating source code that defines a .NET class structure mirroring the model's hierarchy.
  2. Each field (Data Field), section (Data Section), table (Data Table), and variable (Variable Definition) is represented as a property or nested type, with appropriate type information and documentation.
  3. The generated code is compiled into a .NET assembly, which is loaded into memory and made available for use by Grooper's extraction and scripting engines.
  4. At runtime, Grooper uses the compiled types to provide strongly-typed access to data instances, evaluate expressions, and execute custom logic.
  5. The compiler manages versioning, assembly cleanup, and dynamic type resolution to ensure consistency and performance.
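
The steps above can be sketched in a few lines. This is a Python analogy under stated assumptions, not Grooper's .NET implementation: it generates class source from a list of element descriptions, compiles that source in memory, and instantiates the resulting type. The names ElementSpec, code_name, generate_source, and compile_model are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ElementSpec:
    """Hypothetical description of one Data Element (not a Grooper type)."""
    name: str
    type_name: str = "str"


def code_name(display_name: str) -> str:
    """Turn a display name into an identifier by replacing spaces."""
    return display_name.replace(" ", "_")


def generate_source(class_name: str, elements: list) -> str:
    """Steps 1-2: emit source text with one attribute per element."""
    lines = [
        f"class {class_name}:",
        "    def __init__(self, **values):",
        "        self._values = dict(values)",
    ]
    for element in elements:
        attr = code_name(element.name)
        # The declared type is only recorded as a comment in this sketch.
        lines.append(
            f"        self.{attr} = values.get({attr!r})  # {element.type_name}"
        )
    return "\n".join(lines) + "\n"


def compile_model(class_name: str, elements: list) -> type:
    """Steps 3-4: compile the source in memory and return the new type."""
    source = generate_source(class_name, elements)
    namespace = {}
    exec(compile(source, f"<generated {class_name}>", "exec"), namespace)
    return namespace[class_name]


# Usage: build a type from two illustrative elements and read a property.
Invoice = compile_model(
    "Invoice",
    [ElementSpec("Invoice Number"), ElementSpec("Total Amount", "float")],
)
record = Invoice(Invoice_Number="INV-001", Total_Amount=125.5)
print(record.Invoice_Number)  # prints INV-001
```

Versioning and assembly cleanup (step 5) are left out of the sketch because the source does not describe how they are handled.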

Automatic compilation

In some cases, it may be necessary to hard-refresh browser sessions or restart services after changing the Data Model structure.

  • The compiler is invoked automatically when a Data Model is created or modified.
  • The generated types are used by Grooper's expression editors, validation logic, and automation features to provide IntelliSense, error checking, and code navigation.

Exposing parent and sibling data

The "Child Of", "Sibling Of", and "Relative Of" properties on a Content Type (such as a Document Type) enable advanced data access scenarios by exposing related data elements to the expression environment of a Data Model.

Child Of

When a Content Type is configured with a "Child Of" parent, all data elements from the parent type are made available in expressions within the child. This allows you to reference parent fields directly in validation rules, calculated values, export mappings, and other expressions. For example, if a "Benefits Change Form" is a child of "Personnel File", you can reference Personnel_File.Employee_ID in any expression on the child.
  • Use for multi-level Batch structures where parent-child Batch Folder relationships are required.
  • Enables export and processing logic to include data from parent elements.
  • The selected parent cannot be a descendant or ancestor of this type.
Example: Examine the Content Type structure and the Batch structure below.
  • Content Type structure:
    HR Processing
      Personnel File (Parent Content Type)
        Data Model
          Employee_ID
          Employee_Name
      Benefits Change Form (Child Content Type)
        Data Model
          Request_Date
          Change_Reason
  • Batch structure:
    Batch
      Personnel File (Batch Folder)
        Benefits Change Form (Batch Folder)
With the "Child Of" property configured, the "Benefits Change Form" Document Type is a child Content Type of the "Personnel File" Document Type.
  • The "Benefits Change Form" document will have access to all fields in the "Personnel File" Data Model in the expression environment.
  • For example, when the Benefits Change Form is exported it could utilize the "Employee_Name" value of its parent in its Export Mappings.
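
The effect of "Child Of" on the expression environment can be pictured with a small sketch. It is a Python analogy, not Grooper's .NET expression engine: the child's own fields are visible by name, and the parent's fields sit behind the parent's code name. The functions build_environment and evaluate are hypothetical, and the field values are made-up sample data.

```python
from types import SimpleNamespace


def build_environment(own_fields: dict, parents: dict) -> dict:
    """Return the names visible to an expression on the child type.

    own_fields: the child's own field values, exposed by name.
    parents:    parent code name -> that parent's field values.
    """
    env = dict(own_fields)
    for parent_code_name, fields in parents.items():
        # Parent data sits behind the parent's code name, e.g. Personnel_File.
        env[parent_code_name] = SimpleNamespace(**fields)
    return env


def evaluate(expression: str, env: dict):
    """Evaluate a small expression against the environment (demo only)."""
    return eval(expression, {"__builtins__": {}}, env)


# Sample values are invented for illustration.
env = build_environment(
    own_fields={"Request_Date": "2024-03-01", "Change_Reason": "Marriage"},
    parents={
        "Personnel_File": {"Employee_ID": "E-1001", "Employee_Name": "Ada Park"}
    },
)

# An export-mapping style expression that mixes parent and child data.
export_value = evaluate("Personnel_File.Employee_Name + ' - ' + Change_Reason", env)
```

Here export_value combines the parent's Employee_Name with the child's Change_Reason, mirroring the Export Mapping scenario described above.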

Sibling Of

Setting the "Sibling Of" property makes the Data Elements of the sibling Content Types directly accessible in the expression environment of this type. This means you can reference fields, sections, or tables from sibling types in calculated values, validation rules, or export mappings. For example, if "Type A" is a sibling of "Type B", expressions in "Type A" can reference data from "Type B" using the syntax Type_B.FieldName.
  • Use "Sibling Of" when a document (Batch Folder) has one or more Secondary Types assigned and must be treated as multiple Document Types for extraction, validation, or export.
  • The "Sibling Of" property should list the Content Types that are expected to be assigned alongside this type on the same Batch Folder, using Secondary Type assignments.
  • Sibling relationships cannot extend across multiple documents in a Batch. The word "sibling" does not apply to sibling Batch Folders within a Batch. It applies to sibling Content Types assigned to the same Batch Folder.
  • Sibling relationships are not hierarchical; all Content Types listed are considered peers for the purposes of data access and processing.
  • Avoid circular or redundant sibling assignments, as this can complicate configuration and data access logic.

Relative Of

The "Relative Of" property defines a set of related Content Types whose data elements are exposed in the expression environment of this type. Unlike "Child Of" (which establishes a strict parent-child relationship) or "Sibling Of" (which exposes data from peer Secondary Types on the same Batch Folder), "Relative Of" is used for scenarios where related Content Types may appear as siblings, ancestors, or in varying positions within the Batch structure, but their data should always be available for reference in expressions, validation, or export logic.
  • Use "Relative Of" when the Content Type could be a parent or a sibling.
  • This is useful when you need to reference data from other Content Types that are not guaranteed to be direct parents or assigned as Secondary Types.
  • This is especially useful for flexible or variable Batch structures, where the relationship between types may change, but cross-type data access is still required.
  • Data Elements from the listed Content Types are made available in the expression environment, using the code name of each related type as a property.
  • The relationship is non-hierarchical and does not affect Batch organization or processing order.
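
One way to picture "Relative Of" is as a lookup that does not care where the related type sits. The sketch below is a Python analogy with assumptions stated plainly: Folder and find_relative are hypothetical, and the search order (same folder first, then ancestors) is chosen for illustration and is not documented Grooper behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch Folder (not a Grooper class)."""
    # Code name of each Content Type assigned to the folder -> field values.
    data_by_type: dict
    parent: Optional["Folder"] = None


def find_relative(folder: Folder, type_code_name: str) -> Optional[dict]:
    """Look for a related type on the same folder, then on its ancestors."""
    current = folder
    while current is not None:
        if type_code_name in current.data_by_type:
            return current.data_by_type[type_code_name]
        current = current.parent
    return None


# Case 1: the related type is assigned to the parent folder.
parent = Folder({"Personnel_File": {"Employee_ID": "E-1001"}})
child = Folder({"Benefits_Change_Form": {"Change_Reason": "Marriage"}}, parent)
from_parent = find_relative(child, "Personnel_File")

# Case 2: the related type is a secondary type on the same folder.
combined = Folder({
    "Benefits_Change_Form": {"Change_Reason": "Marriage"},
    "Personnel_File": {"Employee_ID": "E-2002"},
})
from_same_folder = find_relative(combined, "Personnel_File")
```

In both cases the expression author would reference the same code name, which is the point of the property: the reference stays stable while the Batch structure varies.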

Example Data Model and generated code environment

The Data Model Compiler generates a strongly-typed .NET class structure that mirrors the hierarchy of your Data Model. This enables intuitive property access and expression evaluation for all fields, sections, tables, and variables.

Example Data Model Structure

Suppose you have the following Data Model for an invoice:

Invoice Data Model
  Invoice Number (Data Field, string)
  Invoice Date (Data Field, DateTime)
  Vendor (Data Section)
    Vendor Name (Data Field, string)
    Vendor Address (Data Field, string)
  Line Items (Data Table)
    Description (Data Column, string)
    Quantity (Data Column, int)
    Unit Price (Data Column, decimal)
    Line Total (Data Column, decimal)
  Total Amount (Data Field, decimal)

Additional Variable Definition defined on the Data Model: Is Paid (Variable Definition, bool)

Generated Code Environment

The compiler transforms this hierarchy into a .NET class with nested types and properties, making each element accessible by name. Any spaces in Data Element names are converted to underscores. For example:

The root class (e.g., Invoice_Type) exposes properties for each top-level field, section, and table:

  • Invoice_Number (string)
  • Invoice_Date (DateTime)
  • Vendor (nested class representing the Vendor section)
  • Line_Items (collection of nested row classes)
  • Total_Amount (decimal)
  • Is_Paid (bool, variable)

Nested sections and tables are represented as their own classes or collections:

  • Vendor.Vendor_Name and Vendor.Vendor_Address
  • Line_Items is a collection, with each row exposing Description, Quantity, Unit_Price, and Line_Total

Parent, sibling, and relative data (if exposed via "Child Of", "Sibling Of", or "Relative Of") are available as additional properties.
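
The shape described above can be approximated with Python dataclasses. This is an analogy only: the real output is a .NET class, and the nested class names Vendor_Section and Line_Items_Row, along with the two helper functions, are hypothetical. The helpers stand in for a Calculated Value Expression and an Is Valid Expression written against the typed structure.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Vendor_Section:
    Vendor_Name: str = ""
    Vendor_Address: str = ""


@dataclass
class Line_Items_Row:
    Description: str = ""
    Quantity: int = 0
    Unit_Price: Decimal = Decimal("0")
    Line_Total: Decimal = Decimal("0")


@dataclass
class Invoice_Type:
    Invoice_Number: str = ""
    Invoice_Date: Optional[date] = None
    Vendor: Vendor_Section = field(default_factory=Vendor_Section)
    Line_Items: List[Line_Items_Row] = field(default_factory=list)
    Total_Amount: Decimal = Decimal("0")
    Is_Paid: bool = False  # Variable Definition rather than a Data Field


def calculated_total(invoice: Invoice_Type) -> Decimal:
    """Analogous to a Calculated Value Expression: sum of the line totals."""
    return sum((row.Line_Total for row in invoice.Line_Items), Decimal("0"))


def total_is_valid(invoice: Invoice_Type) -> bool:
    """Analogous to an Is Valid Expression: stated total matches the rows."""
    return invoice.Total_Amount == calculated_total(invoice)


# Sample data invented for illustration.
invoice = Invoice_Type(
    Invoice_Number="INV-001",
    Vendor=Vendor_Section(Vendor_Name="Acme Supply"),
    Line_Items=[
        Line_Items_Row("Widgets", 2, Decimal("10.00"), Decimal("20.00")),
        Line_Items_Row("Shipping", 1, Decimal("5.50"), Decimal("5.50")),
    ],
    Total_Amount=Decimal("25.50"),
)
```

With this sample, invoice.Vendor.Vendor_Name and invoice.Line_Items[0].Quantity resolve by name, and total_is_valid(invoice) compares the stated total against the summed rows.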