Data Validation in Grooper

From Grooper Wiki
Revision as of 14:25, 6 April 2026 by Rpatton (talk | contribs)

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.


This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025


Introduction

Data Validation in Grooper ensures values extracted into a Data Field are correct, complete, and consistent before moving downstream. Validation can:

  • Prevent invalid formats (dates, numbers, emails)
  • Enforce required values
  • Check values against business rules
  • Verify against external systems (e.g., databases)

Grooper validates using properties on the Data Field, expressions, external Lookups, and rule-driven actions. Validation occurs when fields are set, formatted, re-validated in design/test, and during Batch processing.

The different methods of data validation in Grooper discussed in this article include:

  • Value Types
  • Required Fields
  • Is Valid expressions (WIP)
  • Calculated Value expressions (WIP)
  • Database Lookups (WIP)
  • Data Rules and Data Actions (WIP)

Value Types

The "Value Type" determines how a Data Field's value is parsed, formatted, and validated. It controls the primitive type (text, number, date/time, boolean), array support, and formatting rules. Invalid values will fail parsing/validation and can be flagged by the field.

Examples:

  • Validate a "Invoice Date" as a date only. Non-date text will be considered invalid.
  • Validate a "Unit Price" as numeric. Non-numeric text will be considered invalid.

How to set the Value Type

  1. On the Grooper Design Page, navigate to a Data Field or Data Column.
  2. Set "Value Type" property to the desired type (e.g., Date/Time, Numeric, String).
  3. Expand out the "Value Type" sub properties and edit the Format Specifier if desired.
    • The format specifier will change the extracted value to a specified format for that specific Value Type. For example, making sure all dates are returned in the MM/DD/YYYY format.
  4. (Optional) Set "Processing Mode" to Validate for validation-only or Both to format and validate.
  5. (Optional) For multi-values, set "Value Delimiter" (e.g., ;) to validate each item.
  6. Save and test. Enter sample values; invalid inputs will not parse or will show error state.


Value Types properties overview

This section lists the core properties that define a Value Type (Storage Type) used by a Data Field. Each item includes its name, definition, remarks, and typical purpose/use case.

"Processing Mode"

  • Definition: Controls how values are processed by the Value Type (Format, Filter, Both, Validate).
  • Remarks: Determines whether values are reformatted, filtered, both, or only validated. Use Format to reformat without filtering, Filter to filter without formatting, Both to apply both operations, and Validate for validation-only scenarios.
  • Purpose/Use case: Choose the level of processing needed for your field. For example, use Validate when you only need to confirm values without altering their presentation; use Both to standardize display while ensuring values meet constraints.

"Value Delimiter"

  • Definition: The delimiter used to separate multiple values stored in a single field.
  • Remarks: Enables array-style storage when non-empty. Common choices include , or ;, or custom strings such as |. Grooper splits and joins using this delimiter and will parse/format each item independently.
  • Purpose/Use case: Allow multi-value lists (e.g., multiple dates, codes, IDs) in one Data Field while retaining strict per-item validation and formatting.

"Windows Type"

  • Definition: The underlying .NET type (e.g., string, decimal, DateTime) used at runtime by the Value Type.
  • Remarks: Controls parsing and formatting behavior. For arrays, the runtime element type is the item type within the collection.
  • Purpose/Use case: Ensure the Data Field uses the correct primitive type for both validation and formatting (e.g., numbers vs. dates).

"CMIS Type"

  • Definition: The Value Type’s CMIS-compatible property type.
  • Remarks: Used where CMIS integration or property typing is required.
  • Purpose/Use case: Maintain compatibility with CMIS repositories when exporting or syncing metadata.

"SQL Type"

  • Definition: The database type string associated with this Value Type.
  • Remarks: Useful for mapping to relational storage or validating schema alignment when persisting Data Field values.
  • Purpose/Use case: Provide appropriate database typing for storage or integration.

"Name"

  • Definition: The display name of the Value Type.
  • Remarks: Reflects the Storage Type’s friendly name and indicates array status in UI (e.g., "Date Array" when multi-value).
  • Purpose/Use case: Inform users which type (and array status) the field is using during configuration and review.

"Default Value"

  • Definition: The default, typed value for the Value Type.
  • Remarks: Used for testing, formatting demonstrators, or to initialize parsing/format checks (e.g., format validation on edit).
  • Purpose/Use case: Provide a baseline value for format preview or validation diagnostics.

"Is Quoted"

  • Definition: Indicates whether values should be quoted when substituted (e.g., in SQL or external queries).
  • Remarks: Typically false for numeric types, true for string-like types when embedded in queries.
  • Purpose/Use case: Prevent syntax errors or injection issues when building dynamic queries (e.g., Database Lookups).

"Is Formattable"

  • Definition: Specifies if the Value Type supports custom formatting.
  • Remarks: When true, format strings and culture are applied to produce display-appropriate values.
  • Purpose/Use case: Standardize how values appear to users (e.g., currency, dates, number precision).

"Format String"

  • Definition: The format specifier applied when formatting values.
  • Remarks: Validated against the type; invalid specifiers are flagged during property validation. Used only for types that support formatting.
  • Purpose/Use case: Control precision, style, or presentation (e.g., c for currency, date masks).

"Choice Values"

  • Definition: The list of allowed or suggested values exposed by the Value Type (if it is a choice type).
  • Remarks: Used when the type itself provides a choice list (e.g., Booleans, predefined enumerations). Each value is validated against the type.
  • Purpose/Use case: Limit input to a defined set of values and enable dropdowns/autocomplete consistent with the type’s constraints.

"Is Choice"

  • Definition: Indicates whether the Value Type enforces a discrete set of allowed values.
  • Remarks: If true, input is restricted to the provided choices. Otherwise, a field’s Lexicon or list settings may supply choices separately.
  • Purpose/Use case: Enforce controlled vocabularies or simple yes/no style inputs at the type level.

"Validation Notes"

  • Definition: Optional notes shown when validation fails, describing format/rules.
  • Remarks: Included with error messages to guide users when a typed value is invalid (e.g., acceptable range or format).
  • Purpose/Use case: Provide clear guidance during Review when correcting invalid entries.

"Runtime Type"

  • Definition: The effective CLR type used at runtime; array-aware when "Value Delimiter" is set.
  • Remarks: Equals "Array Type" for multi-value fields, otherwise equals "Windows Type".
  • Purpose/Use case: Helps ensure expression engines, lookups, and formatters treat values with the correct runtime type.

"Array Type"

  • Definition: The array type corresponding to the Value Type (e.g., string[], DateTime[]).
  • Remarks: Created by WindowsType.MakeArrayType() when "Value Delimiter" is set.
  • Purpose/Use case: Provide strongly typed multi-value support across parsing/formatting pipelines.

"Is Array"

  • Definition: Indicates whether the Value Type treats field content as an array.
  • Remarks: True when "Value Delimiter" is non-empty. Parsing/formatting operate per item; validation iterates over each value.
  • Purpose/Use case: Enable multi-entry fields (e.g., multiple codes) with true typed validation for each element.

"Join Delimiter"

  • Definition: The delimiter used when joining formatted items back into a single string.
  • Remarks: Derived from "Value Delimiter"; supports escaped control characters (e.g., \r, \n, \t).
  • Purpose/Use case: Control the final presentation of multi-value text after formatting.

"Split Delimiter"

  • Definition: The delimiter used to split incoming text into separate items.
  • Remarks: Trims spaces around the delimiter and supports escaped control characters. Newlines are also treated as separators when splitting.
  • Purpose/Use case: Ensure reliable parsing of multi-item inputs, even with varied line endings or spaced delimiters.

Practical guidance

  • Use "Processing Mode" = Validate when you only want to test correctness without changing user-entered formatting; use Both for consistent display and enforced rules.
  • For multi-value fields, set "Value Delimiter" and rely on "Is Array" behavior to validate each entry independently.
  • For database queries in Lookups, ensure "Is Quoted" matches the underlying type to prevent SQL issues.
  • Use "Format String" with "Is Formattable" types (e.g., Date/Time, Numeric) to present values clearly in Review and exports.
  • When publishing choice-driven fields, prefer "Is Choice" with "Choice Values" or use a Lexicon on the Data Field if choices are field-specific rather than type-level.

Required fields

The "Required" property enforces that a Data Field must contain a value by the end of extraction/validation. Empty values will be flagged for review.

When to use:

  • Key identifiers (e.g., "Invoice Number")
  • Compliance-critical fields (e.g., "Tax ID")
  • Downstream-required fields (e.g., "Customer ID")

Example

  • Require "Vendor ID" on invoices. If empty after extraction, the field is flagged to ensure correction during Review.

Steps

  1. Navigate to the target Data Field or Data Column.
  2. Locate and enable the "Required" property in the property grid.
  3. Save and test extraction. Empty fields will show an error state until a value is assigned.

Calculated Value expressions

A "Calculated Value" expression validates, derives, or normalizes a value. Use it to compute results or to re-check consistency with "Calculate Mode"/tolerance. Calculation can be paired with validation to detect mismatches.

Calculated Value can be used to:

  • Normalize presentation (e.g., format phone numbers)
  • Derive values (e.g., "Amount" = "Quantity" × "Unit Price")
  • Cross-field consistency checks

Example

  • Validate line item totals:

@Calculated_Total == (@Quantity * @Unit_Price)

  • Auto-calculate "Calculated Total":

@Quantity * @Unit_Price

In our case, we are specifically looking at using Calculated Value for validation.

How to configure

  1. Navigate to to the target Data Field (e.g., "Subtotal").
  2. Make sure the Value Type property is set to a valid Value Type, such as Decimal. You cannot calculate string values.
  3. Locate the Calculated Value property in the property grid in the Expressions category.
  4. Write an expression referencing other fields or Data Columns (e.g., Line_Items.SumOf("Extended Price")).
  5. Make sure the Calculate Mode property is set to "Validate".
  6. You can adjust the Calculate Tolerance property for numeric rounding differences.
  7. Save and test. Mismatches will flag the field.

Is valid expressions

An Is Valid expression is a custom validation rule you can add to a Data Field to decide whether the field's value is acceptable. The expression is written as a Visual Basic (VB.NET) expression and must evaluate to True or False. If it evaluates to False, the field is marked invalid and will show an error during review.

Is Valid expressions are useful when simple validation (such as "Value Type" conversion or "Required") is not enough. They let you enforce business rules, compare values across fields, and validate values against document context.

When to use Is Valid

Use an "Is Valid" expression when you need validation that depends on logic rather than formatting. Common scenarios include:

  • Range checks (values must be within a minimum/maximum).
  • Cross-field validation (one field must match or relate to another field).
  • Date rules (dates cannot be in the future, or must be within a specific window).
  • Pattern or character rules beyond basic parsing (for example, disallow certain characters).
  • Consistency checks against calculated totals (without automatically setting the value).

If your goal is to compute an expected value and optionally populate the field, consider using "Calculated Value" instead of (or in addition to) "Is Valid".

How it works

  1. A Data Field gets a value (typically from a Value Extractor, or entered by a user).
  2. Grooper validates the value type (for example, whether it can be parsed as a date or number based on the field's "Value Type").
  3. If the field has an "Is Valid" expression, Grooper evaluates it.
    • If the expression returns True, validation passes.
    • If the expression returns False, the field is set to an error state.
    • If "Validate Message" is configured, Grooper displays that message to the user when validation fails.

Tip: Use "Validate Message" to provide clear instructions to operators about what needs to be corrected.


Configure Is Valid on a Data Field

  1. In Grooper Design, select the Data Model (or a Data Section) containing your Data Field.
  2. Click the Data Field you want to validate.
  3. In the property grid, locate the "Is Valid" property in the Expressions category.
  4. Set the "Is Valid" property to a VB.NET expression that returns a Boolean (True/False).
  5. (Optional) Set the "Validate Message" property to a VB.NET expression that returns a string. This message is shown when "Is Valid" evaluates to False.
  6. Save your changes.
  7. Test the validation by testing extraction on a Test Batch and reviewing the field's status.

Examples

Example 1: equality of two Fields

The goal in this first example is to determine if the value in one field is the same as the value in another field. So, in our case, we are going to make sure that the total on an invoice below the line items table matches the total quoted at the top of the Invoice.

  1. Navigate to the Data Field you want to validate. In our example we are going to set our Is Valid expression on the Please Pay Data Field.
  2. Scroll through the property grid and locate the Is Valid property.
  3. Click the "..." to the right of the property to open the editor.
  4. Type in your VB.NET code for your Is Valid expression. In this example our expression is: Please_Pay = Total_Due
  5. Click "OK" on the Is Valid editor.
  6. (Optional) Insert an expression into the Validate Message property if you want an error message to appear if deemed invalid. In our example we used the following: "Please Pay does not equal Total Due"
  7. Save your changes to the Data Field and test your extraction at the Data Model level.

Example 2: sum of a Data Column

While the general instructions remain the same for every Is Valid use, the expression you choose to use changes how the validation works. Let's look at a second example. In this example we are going to tell Grooper to sum up the Extended Price column in our Line Items table and make sure that total is equal to the extracted value in the Subtotal Data Field.

  1. Navigate to the Data Field you want to validate. In our example we are going to set our Is Valid expression on the Subtotal Data Field.
  2. Scroll through the property grid and locate the Is Valid property.
  3. Click the "..." to the right of the property to open the editor.
  4. Type in your VB.NET code for your Is Valid expression. In this example our expression is: Subtotal = SumFieldInstance("Line Items\Extended Price")
  5. Click "OK" on the Is Valid editor.
  6. (Optional) Insert an expression into the Validate Message property if you want an error message to appear if deemed invalid. In our example we used the following: "Subtotal does not equal sum of Extended Price"
  7. Save your changes to the Data Field and test your extraction at the Data Model level.

Example 3: Date Validation

Is Valid expressions can also do more than just compare two fields to see if they are equal. In this example we're going to use an expression to validate whether or not a date is valid. A date will be considered valid if it takes place in the past. If you receive an invoice with a date in the future, that Invoice would not be valid.

  1. Navigate to the Data Field you want to validate. In our example we are going to set our Is Valid expression on the Invoice Date Data Field.
  2. Scroll through the property grid and locate the Is Valid property.
  3. Click the "..." to the right of the property to open the editor.
  4. Type in your VB.NET code for your Is Valid expression. In this example our expression is: Invoice_Date < Now
  5. Click "OK" on the Is Valid editor.
  6. (Optional) Insert an expression into the Validate Message property if you want an error message to appear if deemed invalid. In our example we used the following: "Date is in the future"
  7. Save your changes to the Data Field and test your extraction at the Data Model level.

How Is Required differs from "Required" (Required Fields)

The "Required" property is a static rule:

  • If "Required" is enabled, the Data Field must always have a value.
  • If the field is empty, the field is set to an error state with a required-value message.

The "Is Required" property is a dynamic rule:

  • If "Is Required" is set, Grooper evaluates the expression for each field instance.
  • If the expression evaluates to True and the field is empty, the field is treated as required and becomes invalid until a value is provided.
  • If the expression evaluates to False, the field is allowed to remain empty without error.

How Is Required differs from "Is Valid" and "Calculated Value"

"Is Required" answers: Must this field have a value?

  • It is primarily an “empty value” rule (required vs optional).
  • It is used when the main question is whether blank is acceptable.

"Is Valid" answers: Is the current value acceptable?

  • It is evaluated when there is a value present.
  • It is used for business-rule validation such as ranges, formats beyond "Value Type", or cross-field checks.

"Calculated Value" answers: What should the value be?

  • It computes an expected value (and can optionally set the field's value depending on "Calculate Mode").
  • It is best for totals, derived values, and consistency checks where Grooper can calculate what the value should be.

In practice:

  • Use "Is Required" when the field may be blank sometimes.
  • Use "Is Valid" when the field has a value but you need to validate correctness.
  • Use "Calculated Value" when the value should be derived from other data (and possibly auto-filled).


How Is Required is used during validation

During validation, Grooper checks whether a Data Field is empty:

  • If the field is empty and "Required" is enabled, the field is invalid.
  • If the field is empty and an "Is Required" expression exists, Grooper evaluates the expression:
    • If it returns True, the field becomes invalid until a value is entered.
    • If it returns False, the field is allowed to remain blank.

This allows the Data Model to enforce mandatory data only when it applies, reducing unnecessary errors in Review while still ensuring required information is captured when needed.

How To

  1. Navigate a Data Field and locate the Is Required property in the Field's property grid.
  2. Click the "..." to the right of the Is Required property.
  3. When the Is Required editor pops up, begin typing in your Is Required expression in the text box.
    1. In our example we are using the following expression: PO_Number <> ""
  4. Click "OK" on the editor when finished.
  5. Save your changes to the Data Field.
  6. Test your extraction to make sure the Is Require expression is having the desired outcome.