Data Rule (Node Type)

From Grooper Wiki

This article is about the current version of Grooper.

Note that some content may still need to be updated.



Data Rules are used to normalize or otherwise prepare data collected in a Data Model for downstream processes. Data Rules apply conditional logic to data extracted from documents (Batch Folders).

  • Each Data Rule executes a "Data Action", which does things like compute a field's value, parse a field into other fields, perform lookups, and more.
  • Data Actions can be conditionally executed based on a Data Rule's "Trigger" expression.
  • A hierarchy of Data Rules can be created to execute multiple Data Actions and perform complex data transformation tasks.

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

Introduction

A Data Rule is a configurable rule that applies conditional logic to extracted data. Data Rules are typically used after extraction to transform values into the format your downstream systems require, or to enforce business requirements that are easier to express as a rule than as extraction logic.

Unlike extractors (which focus on finding data on the document), Data Rules focus on fixing, moving, combining, and validating the data once it has been captured into a Data Model.

What a Data Rule runs against (scope)

A Data Rule runs against instances of a selected container. The container is chosen using the rule's "Scope" property:

  • If the "Scope" is set to a Data Model, the rule executes once on the root document instance.
  • If the "Scope" is set to a Data Section (and the section is multi-instance), the rule executes for each section record (each Section Instance).
  • If the "Scope" is set to a Data Table, the rule executes for each row (each Table Row Instance).

This is important because many Data Actions operate on fields within the current record/row, and the meaning of “source” and “target” depends on what the rule is scoped to.
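
The scoping behavior above can be pictured with a small Python model. This is a conceptual sketch only, not Grooper's API: the dictionary layout and the names `instances_for_scope`, `run_rule`, and `set_line_total` are hypothetical and exist purely for illustration.

```python
# Conceptual model only: a document's data as nested Python structures.
# None of these names come from Grooper; they are illustrative assumptions.
document = {
    "Invoice Number": "INV-001",
    "Line Items": [  # stands in for a Data Table (one dict per row)
        {"Qty": 2, "Unit Price": 5.0},
        {"Qty": 1, "Unit Price": 12.5},
    ],
}


def instances_for_scope(doc, scope):
    """Return the instances a rule would visit for a given scope.

    scope=None models a Data Model scope (the single root instance).
    Otherwise scope names a collection and every row/record is visited.
    """
    if scope is None:
        return [doc]
    return list(doc[scope])


def run_rule(doc, scope, action):
    """Apply `action` once per instance and return how many times it ran."""
    instances = instances_for_scope(doc, scope)
    for instance in instances:
        action(instance)
    return len(instances)


def set_line_total(row):
    # "Qty" and "Unit Price" resolve against the current row only.
    row["Line Total"] = row["Qty"] * row["Unit Price"]


runs = run_rule(document, "Line Items", set_line_total)
```

In this model the same action runs twice when scoped to the table and once when scoped to the root, which is why field references inside an action are resolved relative to the current instance.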

Conditional execution (Trigger, True Action, False Action)

A Data Rule can optionally use conditional logic to decide what should happen:

  • The "Trigger" property must be configured with a Boolean expression.
  • When "Trigger" evaluates to True:
    • The "True Action" runs.
    • Any child Data Rules (nested under the current rule) also run.
  • When "Trigger" evaluates to False:
    • The "False Action" runs (if configured).
    • Child Data Rules are skipped.
    • If no "False Action" is configured, no action is taken for that instance.

If the Trigger property is blank, the Data Rule will always run.
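
The decision flow described above can be sketched as a small Python class. This is an illustrative model, not Grooper code: `RuleSketch` and its attributes are hypothetical names, and the ordering (True Action first, then child rules) is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RuleSketch:
    """Hypothetical stand-in for a Data Rule's trigger/action behavior."""
    name: str
    trigger: Optional[Callable[[dict], bool]] = None   # None models a blank Trigger
    true_action: Optional[Callable[[dict], None]] = None
    false_action: Optional[Callable[[dict], None]] = None
    children: List["RuleSketch"] = field(default_factory=list)

    def execute(self, instance: dict) -> None:
        # A blank Trigger is treated as "always True".
        triggered = True if self.trigger is None else bool(self.trigger(instance))
        if triggered:
            if self.true_action is not None:
                self.true_action(instance)
            for child in self.children:      # children run only on the True branch
                child.execute(instance)
        elif self.false_action is not None:  # children are skipped on the False branch
            self.false_action(instance)


log = []
child = RuleSketch("child", true_action=lambda d: log.append("child"))
parent = RuleSketch(
    "parent",
    trigger=lambda d: d.get("Document Type") == "Invoice",
    true_action=lambda d: log.append("true"),
    false_action=lambda d: log.append("false"),
    children=[child],
)
parent.execute({"Document Type": "Invoice"})   # True branch: action, then child
parent.execute({"Document Type": "Receipt"})   # False branch: False Action only
```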

How Data Rules differ from other conditional components

Data Rules are purpose-built for manipulating and validating extracted values inside the Data Model hierarchy (document, section record, or table row). There are other Grooper components that also use conditional logic (for example, conditional paths in a Batch Process), but those typically control workflow routing or when a step runs. Data Rules control how extracted data is modified within the document's data structure.

How to configure a Data Rule (step-by-step)

  1. In Grooper Design, create or select a Data Rule (commonly stored in a Content Type's Local Resources Folder).
  2. Select the Data Rule and set:
    1. "Scope" to the Data Model, Data Section, or Data Table you want to process.
    2. (Optional) "Trigger" to define when the rule should apply.
    3. "True Action" to define what happens when the trigger is met (or always, if Trigger is blank).
    4. (Optional) "False Action" to define what happens when the trigger is not met.
    5. (Optional) "Shortcut Keys" if you want a manual shortcut in review tools that support it.
  3. (Optional) Add child Data Rules under the rule to build a multi-step decision tree.


Testing and troubleshooting

  • If you need to test a rule interactively during Review, you can run it from a field container using the Run Rule command:
    • Alt + R
  • If a rule does not appear in the list, verify:
    • The rule’s "Scope" is compatible with the selected container.
    • The rule is stored where it is accessible for the Content Type you are reviewing.
  • If a rule never runs:
    • Confirm "Trigger" is either blank or evaluates to True for the current instance.
    • Temporarily simplify "Trigger" to True to isolate expression issues.
  • If values change but errors do not clear:
    • Data Rules re-validate fields that now have values after the True Action runs. If an error remains, check the Data Field’s "Is Valid", "Required", "Min Confidence", and other validation settings.

Data Rule properties (overview)

The key Data Rule properties are:

  • "Scope"
  • "Trigger"
    • A Boolean expression. If blank, the rule always executes the True Action.
  • "True Action"
    • The Data Action that runs when the Trigger evaluates to True (or always, if Trigger is blank).
  • "False Action"
    • The Data Action that runs when the Trigger evaluates to False. This is only shown when "Trigger" is not blank.
  • "Required Elements" (obsolete)
    • A legacy way to enforce conditional required elements. Prefer the Require Value Data Action.
  • "Shortcut Keys"
    • Optional shortcut for use in the Data Viewer tab on the Review Page.

Data Actions

Data Actions are the building blocks used in "True Action" and "False Action". Actions can be combined and nested to create multi-step logic.

Below are the main Data Actions, in the order commonly presented.

Calculate Value

Purpose: Calculates a value using an expression and applies it to a target field. Use this to standardize computed results (totals, derived values, formatted values) after extraction.

How to configure (step-by-step)

  1. Create/select a Data Rule and set its "Scope" to the container where the target field exists (Data Model, Data Section, or Data Table row context).
  2. For the True Action property, choose Calculate Value.
  3. Set the target field (where the result should be written).
  4. Configure the Calculate Value expression to produce the desired output type (string/number/date depending on the Data Field's "Value Type").

Validation usage: Use Calculate Value when you want to enforce a known result:

  • Calculate and set the value (then make the field read-only in the Data Field if it should never be edited), or
  • Calculate and compare to an existing value (for example, flagging mismatches via Raise Issue).
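
The two usages above (set versus compare) can be contrasted in a short Python sketch. It does not use Grooper's expression syntax; `calculate_and_set`, `calculate_and_compare`, and the field names are hypothetical, and the returned message only stands in for what a Raise Issue action might report.

```python
from decimal import Decimal


def line_total(row):
    """Illustrative 'expression': quantity times unit price."""
    return row["Qty"] * row["Unit Price"]


def calculate_and_set(row, target, expression):
    """Overwrite the target field with the computed value."""
    row[target] = expression(row)
    return row[target]


def calculate_and_compare(row, target, expression):
    """Leave the extracted value alone; return a message on mismatch, else None."""
    expected = expression(row)
    found = row.get(target)
    if found != expected:
        return f"{target}: expected {expected}, found {found}"
    return None


row = {
    "Qty": Decimal("3"),
    "Unit Price": Decimal("4.50"),
    "Line Total": Decimal("13.00"),  # extracted value that disagrees with the math
}
issue = calculate_and_compare(row, "Line Total", line_total)
```

Comparing first keeps the extracted value visible to a reviewer, while setting replaces it outright; which is appropriate depends on whether the computed result or the document is the source of truth.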


Clear Item

Clear Item removes or resets the value of a specified Data Element. Use this action to clear the value of a field, remove all rows from a table, or empty a multi-instance section. For containers, all child elements are cleared; for fields, the value is set to an empty string.

Example usage: Clear the "Comments" field or remove all rows from a table.

Concat

Concat merges or combines items in a collection based on a trigger condition and parameter. This action is typically used to concatenate or aggregate data from multiple items in a collection, such as combining rows in a table or merging section instances, according to a specified rule.

Example usage: Combine adjacent table rows when a certain field matches a condition.

Copy

Copy copies data from a source element to a target element. You can copy values, containers, or collections, with options to prevent overwriting, ignore empty values, or specify a source index. This action is useful for duplicating or transferring data between fields, sections, or tables.

Example usage: Copy the value from "Address Line 1" to "Mailing Address" if the latter is empty.
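
A minimal Python sketch of the overwrite and empty-value options follows. The function `copy_value` and its parameter names are assumptions chosen to mirror the options described above; they are not Grooper property names or API calls.

```python
def copy_value(record, source, target, prevent_overwrite=False, ignore_empty=False):
    """Copy record[source] into record[target]; return True if a copy happened."""
    value = record.get(source, "")
    if ignore_empty and value == "":
        return False  # nothing worth copying
    if prevent_overwrite and record.get(target, "") != "":
        return False  # target already has a value, leave it alone
    record[target] = value
    return True


empty_target = {"Address Line 1": "12 Main St", "Mailing Address": ""}
filled_target = {"Address Line 1": "99 Oak Ave", "Mailing Address": "PO Box 7"}

copied = copy_value(empty_target, "Address Line 1", "Mailing Address",
                    prevent_overwrite=True)
skipped = copy_value(filled_target, "Address Line 1", "Mailing Address",
                     prevent_overwrite=True)
```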

Append

Append adds one or more elements to a collection, such as a table, multi-instance section, or multi-cardinality field. For each instance of the source element, an entry is added to the target collection. Optional actions can be defined to populate or copy child elements.

Example usage: Append all rows from one table to another, or add values from a multi-cardinality field to a list.
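
As a rough illustration of "one entry per source instance", the Python sketch below appends rows from one list to another. The helper `append_entries` and its `populate` callback are hypothetical; the callback stands in for the optional actions that populate child elements.

```python
def append_entries(source_items, target_collection, populate=None):
    """Add one entry to the target for each source item; return the count added."""
    added = 0
    for item in source_items:
        # Without a populate step the new entry starts out empty.
        entry = {} if populate is None else populate(item)
        target_collection.append(entry)
        added += 1
    return added


freight_rows = [{"Description": "Freight", "Amount": "15.00"}]
line_items = [{"Description": "Widget", "Amount": "40.00"}]

# Populate each new row by copying the source row's values.
added = append_entries(freight_rows, line_items, populate=lambda src: dict(src))
```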

Data Lookup

Purpose: Runs a Lookup to populate additional values based on existing criteria. This is typically used to enrich data (e.g., vendor info, customer IDs) after extraction.

Example use cases

  • Look up Vendor ID based on Vendor Name.
  • Look up Address details from a Customer Number.

How to configure (step-by-step)

  1. Create/select a Data Rule and set its "Scope" to the container where you will have access to the target field (Data Model, Data Section, or Data Table row context).
  2. For the True Action property, choose Data Lookup.
  3. Expand the sub properties and set the Lookup property for the type of Lookup you will be performing.
  4. Configure the Lookup sub properties the same way you would for a normal lookup.
  5. Add a Trigger and False Action to your Data Rule as desired.
  6. Save your changes and test the Data Rule.

Important behavior note: Lookups run by the Data Lookup Data Action can populate Data Elements, but they do not flag individual Data Fields or Data Columns as errors based on lookup miss/conflict settings. If you need “lookup-based validation”, the typical approach is:

  • Use Data Lookup to populate, then
  • Use Require Value and/or Raise Issue to enforce the expected result.
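
The populate-then-enforce pattern can be sketched in Python as follows. The in-memory `VENDORS` dictionary is a hypothetical stand-in for whatever lookup source is configured, and `lookup_then_require` is an illustrative name, not a Grooper function.

```python
# Hypothetical lookup source: vendor name -> vendor ID.
VENDORS = {"Acme Supply": "V-1001"}


def lookup_then_require(record, vendors):
    """Populate 'Vendor ID' from the lookup, then enforce that it has a value."""
    issues = []
    # Step 1 (models Data Lookup): populate on a hit, stay silent on a miss.
    vendor_id = vendors.get(record.get("Vendor Name", ""))
    if vendor_id is not None:
        record["Vendor ID"] = vendor_id
    # Step 2 (models Require Value / Raise Issue): the miss is caught here.
    if not record.get("Vendor ID"):
        issues.append("Vendor ID is required but was not populated by the lookup")
    return issues


hit = {"Vendor Name": "Acme Supply"}
miss = {"Vendor Name": "Unknown Co"}
hit_issues = lookup_then_require(hit, VENDORS)
miss_issues = lookup_then_require(miss, VENDORS)
```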



Extract From

Extract From sets a field value by running an extractor on another field. Configure the "Source Field", "Target Field", and "Extractor" properties to define the extraction logic. Use options like "Prevent Overwrite" and "Ignore Miss" to control execution and handling of missing results.

Example usage: Extract the last name from a "Full Name" field and assign it to a "Last Name" field.

Parse Value

Parse Value parses a field value into substrings using a regular expression and assigns them to sibling fields. The "Source Field" is parsed using the "Pattern" property, which should contain named groups matching the target fields.

Example usage: Split "DOE, JOHN Q" in "Full Name" into "Last Name", "First Name", and "MI" fields using a regex pattern.
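
A named-group pattern for the "DOE, JOHN Q" example might look like the Python sketch below. The pattern, the `parse_value` helper, and the underscore-to-space mapping between group names and field names are all assumptions for illustration. Python writes named groups as `(?P<name>...)`, while other regex engines (.NET among them) write `(?<name>...)`, so the pattern would need adjusting for the engine actually in use.

```python
import re

# Group names cannot contain spaces, so underscores stand in for them here.
PATTERN = re.compile(
    r"^(?P<Last_Name>[^,]+),\s*(?P<First_Name>\S+)(?:\s+(?P<MI>\S))?$"
)


def parse_value(record, source, pattern):
    """Split record[source] into sibling fields named after the regex groups."""
    match = pattern.match(record.get(source, ""))
    if match is None:
        return False  # leave the record untouched when the pattern misses
    for group, value in match.groupdict().items():
        if value is not None:  # optional groups (such as MI) may not participate
            record[group.replace("_", " ")] = value
    return True


record = {"Full Name": "DOE, JOHN Q"}
parsed = parse_value(record, "Full Name", PATTERN)
```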

Raise Issue

Purpose: Creates an issue (a validation/problem report) when a condition is met. Use this to alert users during review or to record a problem for downstream handling.

Example use cases

  • Amount is negative.
  • A required business field is missing under certain conditions.
  • A calculated total does not match the sum of line items.

How to configure (step-by-step)

  1. Set an appropriate "Scope" so the rule runs in the correct context (Data Model, Data Table, or Data Section).
  2. Set "Trigger" to the condition that indicates a problem.
    • Not all Data Actions require a Trigger to work, but Raise Issue does. The rule must know when to raise the issue.
  3. Set the "True Action" property to Raise Issue.
  4. Set the message details (issue category/source text if available) on the Log Message property.
  5. Save your changes to the Data Rule and test.

The Raise Issue action does not highlight the Data Element that has the issue. Make your Log Message detailed enough to indicate where the problem lies.

Validation usage: Raise Issue is primarily a validation action. Use it when you need a clear, user-facing indication that something is wrong, without necessarily changing values.


Remove

Purpose: Removes data from the target context. Depending on the container, this may remove an item from a collection (such as an instance in a Data Section or a Data Table row) or remove an element instance.

Example use cases

  • Remove table rows that are blank or are header/footer noise.
  • Remove repeated section records that fail a quality threshold.

How to configure (step-by-step)

  1. Set the Data Rule "Scope" to a Data Element that will give you access to the parent Data Element that contains the records or rows that you want removed.
  2. For the "True Action", select Remove.
  3. Expand the sub properties for the True Action.
  4. Set the Collection property to the parent of the records/rows you want removed.
  5. Use a "Trigger" expression to determine when a row/record should be removed (for example, if a key field is blank).
    • The Remove Trigger property is different from the Data Rule Trigger property. The Data Rule Trigger property determines whether the Data Rule runs at all. The Remove Trigger property determines which records are removed when the Data Rule runs.
  6. Save your changes to the Data Rule and test.

Validation usage: Remove is not a validation action by itself. It is commonly paired with Trigger logic to remove invalid/empty items so they do not cause downstream validation failures.
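
The difference between the two triggers can be made concrete with a Python sketch. Both functions are hypothetical models: `rule_trigger` plays the role of the Data Rule's Trigger (does the rule run at all?), and `remove_trigger` plays the role of the Remove action's Trigger (which items are removed?).

```python
def remove_items(collection, remove_trigger):
    """Remove every item for which remove_trigger is True; return the count removed."""
    kept = [item for item in collection if not remove_trigger(item)]
    removed = len(collection) - len(kept)
    collection[:] = kept  # mutate in place so the parent keeps the same list
    return removed


def run_remove_rule(document, collection_name, rule_trigger, remove_trigger):
    """Gate the whole rule on rule_trigger, then filter items with remove_trigger."""
    if rule_trigger is not None and not rule_trigger(document):
        return 0  # the rule did not run, so nothing is removed
    return remove_items(document[collection_name], remove_trigger)


document = {
    "Document Type": "Invoice",
    "Line Items": [
        {"Item": "Widget", "Amount": "40.00"},
        {"Item": "", "Amount": ""},      # blank row
        {"Item": "  ", "Amount": ""},    # whitespace-only key field
    ],
}
removed = run_remove_rule(
    document,
    "Line Items",
    rule_trigger=lambda doc: doc["Document Type"] == "Invoice",
    remove_trigger=lambda row: not row["Item"].strip(),
)
```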



Require Value

Purpose: Enforces that a Data Field (or Data Column) has a value. If it is missing, it is marked as an error and/or an issue is raised depending on the configured behavior.

Example use cases

  • Require “Invoice Number” when “Document Type” is Invoice.
  • Require “Amount” on each line item row.
  • Require a lookup-populated field after a Data Lookup action.

How to configure (step-by-step)

  1. Set "Scope" so the required field exists within the active instance (document/record/row).
  2. In "True Action", select Require Value.
  3. Expand the True Action sub properties and use the Required Elements property to choose the field(s) to require.
  4. (Optional) Provide a rule-specific message context (for example, include the rule name or field path) in the Log Message property.
  5. Save your changes and test your Data Rule.



Run Command

Purpose: Runs a command as part of rule execution. Use this when you need to run an existing Grooper command against the current context as part of normalization.

Example use cases

  • Run a utility command that updates values or performs a specialized operation already available as a command.

How to configure (step-by-step)

  1. Choose Run Command as the Data Rule's "True Action".
  2. Expand the True Action sub properties.
  3. Select the Data Element where you want to run the command in the Element property.
  4. Set the Command property to the command you want to run.
  5. Save your changes to the Data Rule.
  6. Test your Data Rule.

The commands you can run depend on your Scope and Element selections. You can run some commands on a Data Model that you can't run on a Data Section or Data Table.


Action List

Action List executes a sequence of Data Actions in a defined order. Use this to group several actions together as a single step in a Data Rule. Each action in the list is executed in sequence, allowing for complex, multi-step data processing.

Example usage: Normalize a value, validate it, and then copy it to another field, all in one step.
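
The normalize, validate, copy sequence from the example can be modeled in Python as an ordered list of callables. Every name here (`run_action_list` and the three step functions) is hypothetical; the point of the sketch is only that each step sees the result of the one before it.

```python
def run_action_list(record, actions):
    """Run each action in order against the same record."""
    for action in actions:
        action(record)
    return record


def normalize_amount(record):
    # Strip currency formatting so later steps see a plain number string.
    record["Amount"] = record["Amount"].replace("$", "").replace(",", "").strip()


def validate_amount(record):
    try:
        float(record["Amount"])
    except ValueError:
        record.setdefault("Issues", []).append("Amount is not numeric")


def copy_amount(record):
    record["Total Due"] = record["Amount"]


record = run_action_list(
    {"Amount": " $1,250.00 "},
    [normalize_amount, validate_amount, copy_amount],
)
```

Order matters in this model: running the validation step before normalization would flag " $1,250.00 " as non-numeric even though the cleaned value is valid.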

Execute Rule

Execute Rule invokes a Data Rule. Use this to modularize and reuse rule logic. The "Rule" property specifies which Data Rule to execute. The selected rule must apply to the same scope or a descendant scope.

Example usage: Invoke a Data Rule for validation or normalization.