Data Validation in Grooper: Difference between revisions

From Grooper Wiki
(No difference)

Revision as of 08:55, 3 February 2026

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.


This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025


Introduction

Data Validation in Grooper ensures values extracted into a Data Field are correct, complete, and consistent before moving downstream. Validation can:

  • Prevent invalid formats (dates, numbers, emails)
  • Enforce required values
  • Check values against business rules
  • Verify against external systems (e.g., databases)

Grooper validates using properties on the Data Field, expressions, external Lookups, and rule-driven actions. Validation occurs when fields are set, formatted, re-validated in design/test, and during Batch processing.

The different methods of data validation in Grooper discussed in this article include:

  • Value Types
  • Required Fields
  • Is Valid expressions (WIP)
  • Calculated Value expressions (WIP)
  • Database Lookups (WIP)
  • Data Rules and Data Actions (WIP)

Value Types

The "Value Type" determines how a Data Field's value is parsed, formatted, and validated. It controls the primitive type (text, number, date/time, boolean), array support, and formatting rules. Invalid values will fail parsing/validation and can be flagged by the field. Helpful concepts:

  • "Processing Mode": Format, Filter, Both, Validate — controls whether values are reformatted, filtered, and/or validated.
  • "Value Delimiter": enables multi-value entries; each entry is parsed/validated individually.

Example

  • Validate a "Invoice Date" as a date only. Non-date text will be considered invalid.
  • Validate a "Unit Price" as numeric. Non-numeric text will be considered invalid.

How to set the Value Type

  1. On the Grooper Design Page, navigate to a Data Field or Data Column.
  2. Set "Value Type" property to the desired type (e.g., Date/Time, Numeric, String).
  3. Expand out the "Value Type" sub properties and edit the Format Specifier if desired.
    • The format specifier will change the extracted value to a specified format for that specific Value Type. For example, making sure all dates are returned in the MM/DD/YYYY format.
  4. (Optional) Set "Processing Mode" to Validate for validation-only or Both to format and validate.
  5. (Optional) For multi-values, set "Value Delimiter" (e.g., ;) to validate each item.
  6. Save and test. Enter sample values; invalid inputs will not parse or will show error state.


Value Types properties overview

This section lists the core properties that define a Value Type (Storage Type) used by a Data Field. Each item includes its name, definition, remarks, and typical purpose/use case.

"Processing Mode"

  • Definition: Controls how values are processed by the Value Type (Format, Filter, Both, Validate).
  • Remarks: Determines whether values are reformatted, filtered, both, or only validated. Use Format to reformat without filtering, Filter to filter without formatting, Both to apply both operations, and Validate for validation-only scenarios.
  • Purpose/Use case: Choose the level of processing needed for your field. For example, use Validate when you only need to confirm values without altering their presentation; use Both to standardize display while ensuring values meet constraints.

"Value Delimiter"

  • Definition: The delimiter used to separate multiple values stored in a single field.
  • Remarks: Enables array-style storage when non-empty. Common choices include , or ;, or custom strings such as |. Grooper splits and joins using this delimiter and will parse/format each item independently.
  • Purpose/Use case: Allow multi-value lists (e.g., multiple dates, codes, IDs) in one Data Field while retaining strict per-item validation and formatting.

"Windows Type"

  • Definition: The underlying .NET type (e.g., string, decimal, DateTime) used at runtime by the Value Type.
  • Remarks: Controls parsing and formatting behavior. For arrays, the runtime element type is the item type within the collection.
  • Purpose/Use case: Ensure the Data Field uses the correct primitive type for both validation and formatting (e.g., numbers vs. dates).

"CMIS Type"

  • Definition: The Value Type’s CMIS-compatible property type.
  • Remarks: Used where CMIS integration or property typing is required.
  • Purpose/Use case: Maintain compatibility with CMIS repositories when exporting or syncing metadata.

"SQL Type"

  • Definition: The database type string associated with this Value Type.
  • Remarks: Useful for mapping to relational storage or validating schema alignment when persisting Data Field values.
  • Purpose/Use case: Provide appropriate database typing for storage or integration.

"Name"

  • Definition: The display name of the Value Type.
  • Remarks: Reflects the Storage Type’s friendly name and indicates array status in UI (e.g., "Date Array" when multi-value).
  • Purpose/Use case: Inform users which type (and array status) the field is using during configuration and review.

"Default Value"

  • Definition: The default, typed value for the Value Type.
  • Remarks: Used for testing, formatting demonstrators, or to initialize parsing/format checks (e.g., format validation on edit).
  • Purpose/Use case: Provide a baseline value for format preview or validation diagnostics.

"Is Quoted"

  • Definition: Indicates whether values should be quoted when substituted (e.g., in SQL or external queries).
  • Remarks: Typically false for numeric types, true for string-like types when embedded in queries.
  • Purpose/Use case: Prevent syntax errors or injection issues when building dynamic queries (e.g., Database Lookups).

"Is Formattable"

  • Definition: Specifies if the Value Type supports custom formatting.
  • Remarks: When true, format strings and culture are applied to produce display-appropriate values.
  • Purpose/Use case: Standardize how values appear to users (e.g., currency, dates, number precision).

"Format String"

  • Definition: The format specifier applied when formatting values.
  • Remarks: Validated against the type; invalid specifiers are flagged during property validation. Used only for types that support formatting.
  • Purpose/Use case: Control precision, style, or presentation (e.g., c for currency, date masks).

"Choice Values"

  • Definition: The list of allowed or suggested values exposed by the Value Type (if it is a choice type).
  • Remarks: Used when the type itself provides a choice list (e.g., Booleans, predefined enumerations). Each value is validated against the type.
  • Purpose/Use case: Limit input to a defined set of values and enable dropdowns/autocomplete consistent with the type’s constraints.

"Is Choice"

  • Definition: Indicates whether the Value Type enforces a discrete set of allowed values.
  • Remarks: If true, input is restricted to the provided choices. Otherwise, a field’s Lexicon or list settings may supply choices separately.
  • Purpose/Use case: Enforce controlled vocabularies or simple yes/no style inputs at the type level.

"Validation Notes"

  • Definition: Optional notes shown when validation fails, describing format/rules.
  • Remarks: Included with error messages to guide users when a typed value is invalid (e.g., acceptable range or format).
  • Purpose/Use case: Provide clear guidance during Review when correcting invalid entries.

"Runtime Type"

  • Definition: The effective CLR type used at runtime; array-aware when "Value Delimiter" is set.
  • Remarks: Equals "Array Type" for multi-value fields, otherwise equals "Windows Type".
  • Purpose/Use case: Helps ensure expression engines, lookups, and formatters treat values with the correct runtime type.

"Array Type"

  • Definition: The array type corresponding to the Value Type (e.g., string[], DateTime[]).
  • Remarks: Created by WindowsType.MakeArrayType() when "Value Delimiter" is set.
  • Purpose/Use case: Provide strongly typed multi-value support across parsing/formatting pipelines.

"Is Array"

  • Definition: Indicates whether the Value Type treats field content as an array.
  • Remarks: True when "Value Delimiter" is non-empty. Parsing/formatting operate per item; validation iterates over each value.
  • Purpose/Use case: Enable multi-entry fields (e.g., multiple codes) with true typed validation for each element.

"Join Delimiter"

  • Definition: The delimiter used when joining formatted items back into a single string.
  • Remarks: Derived from "Value Delimiter"; supports escaped control characters (e.g., \r, \n, \t).
  • Purpose/Use case: Control the final presentation of multi-value text after formatting.

"Split Delimiter"

  • Definition: The delimiter used to split incoming text into separate items.
  • Remarks: Trims spaces around the delimiter and supports escaped control characters. Newlines are also treated as separators when splitting.
  • Purpose/Use case: Ensure reliable parsing of multi-item inputs, even with varied line endings or spaced delimiters.

Practical guidance

  • Use "Processing Mode" = Validate when you only want to test correctness without changing user-entered formatting; use Both for consistent display and enforced rules.
  • For multi-value fields, set "Value Delimiter" and rely on "Is Array" behavior to validate each entry independently.
  • For database queries in Lookups, ensure "Is Quoted" matches the underlying type to prevent SQL issues.
  • Use "Format String" with "Is Formattable" types (e.g., Date/Time, Numeric) to present values clearly in Review and exports.
  • When publishing choice-driven fields, prefer "Is Choice" with "Choice Values" or use a Lexicon on the Data Field if choices are field-specific rather than type-level.

Required fields

The "Required" property enforces that a Data Field must contain a value by the end of extraction/validation. Empty values will be flagged for review.

When to use:

  • Key identifiers (e.g., "Invoice Number")
  • Compliance-critical fields (e.g., "Tax ID")
  • Downstream-required fields (e.g., "Customer ID")

Example

  • Require "Vendor ID" on invoices. If empty after extraction, the field is flagged to ensure correction during Review.

Steps

  1. Navigate to the target Data Field.
  2. Locate and enable the "Required" property in the property grid.
  3. Save and test extraction. Empty fields will show an error state until a value is assigned.