Data Model Expressions: Difference between revisions

From Grooper Wiki
No edit summary
Line 145: Line 145:
'''''Calculated Value Expressions''''' must return a result that is compatible with the '''Data Field/Data Column's''' data type (set using their '''''Data Type''''' property).
'''''Calculated Value Expressions''''' must return a result that is compatible with the '''Data Field/Data Column's''' data type (set using their '''''Data Type''''' property).
* Ex: If a '''Data Field's''' data type is set to ''Integer'', its '''''Calculated Value Expression''''' must evaluate to an integer, or else an error will occur.
* Ex: If a '''Data Field's''' data type is set to ''Integer'', its '''''Calculated Value Expression''''' must evaluate to an integer, or else an error will occur.
==== Referencing Data Fields and Data Columns by Name ====
Peer fields ('''Data Fields''' at the same level of the '''Data Model''') and peer columns (sibling '''Data Columns''' of a '''Data Table''') can be referenced by name in '''''Calculated Value Expressions'''''.
* Ex: A '''Data Field's''' expression could reference a peer '''Data Field''' named "Subtotal" simply by writing <code>Subtotal</code> into the expression.
{|class="attn-box"
|
&#9888;
|
You must substitute underscores for spaces and special characters in '''Data Column''' and '''Data Field''' names. 
* Ex:  A '''Data Field''' named "Social Security Number" would need to be typed out <code>Social_Security_Number</code>
Multiple spaces or special characters in a row must be substituted with a single underscore.
* Ex:  A '''Data Column''' named "Expenses - Sch. A" would need to be typed out <code>Expenses_Sch_A</code>
If a '''Data Field''' or '''Data Column's''' name begins with a number, you must prepend an underscore to the name.
* Ex: A '''Data Field''' named "911" would need to be typed out <code>_911</code>
If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.
|}
'''Data Fields''' in peer ''single instance'' '''Data Sections''' may be referenced, using dot notation.
* Ex: You could access the "DOB" '''Data Field''' inside a "Employee" '''Data Section''' with the expression <code>Employee.DOB</code>
* Be aware, this expression will fail if the '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in non-peer ''single instance'' '''Data Sections''' can also be referenced using dot notation.  However, you must reference the '''Data Model's''' parent '''Content Type''' ''first''.
* Ex: Given a '''Data Model''' for a '''Document Type''' named "Student Records":
**  A '''Data Field''' named "Average" in a '''Data Section''' named "Math" could access the "GPA" '''Data Field''' inside a "Totals" '''Data Section''' with the expression <code>Student_Records.Totals.GPA</code>
* Be aware, the expression will fail if the referenced '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in multi-instance '''Data Sections''' ''cannot'' be referenced whatsoever. 
* If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.


==== Calculate Modes ====
==== Calculate Modes ====
Line 304: Line 268:
** You may write a custom error message by configuring the '''''Validate Message''''' property.
** You may write a custom error message by configuring the '''''Validate Message''''' property.


==== Referencing Data Fields and Data Columns by Name ====


Peer fields ('''Data Fields''' at the same level of the '''Data Model''') and peer columns (sibling '''Data Columns''' of a '''Data Table''') can be referenced by name in '''''Is Valid Expressions'''''.
* Ex: A '''Data Field's''' expression could reference a peer '''Data Field''' named "Subtotal" simply by writing <code>Subtotal</code> into the expression.
{|class="attn-box"
|
&#9888;
|
You must substitute underscores for spaces and special characters in '''Data Column''' and '''Data Field''' names. 
* Ex:  A '''Data Field''' named "Social Security Number" would need to be typed out <code>Social_Security_Number</code>
Multiple spaces or special characters in a row must be substituted with a single underscore.
* Ex:  A '''Data Column''' named "Expenses - Sch. A" would need to be typed out <code>Expenses_Sch_A</code>
If a '''Data Field''' or '''Data Column's''' name begins with a number, you must prepend an underscore to the name.
* Ex: A '''Data Field''' named "911" would need to be typed out <code>_911</code>
If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.
|}
'''Data Fields''' in peer ''single instance'' '''Data Sections''' may be referenced, using dot notation.
* Ex: You could access the "DOB" '''Data Field''' inside a "Employee" '''Data Section''' with the expression <code>Employee.DOB</code>
* Be aware, this expression will fail if the '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in non-peer ''single instance'' '''Data Sections''' can also be referenced using dot notation.  However, you must reference the '''Data Model's''' parent '''Content Type''' ''first''.
* Ex: Given a '''Data Model''' for a '''Document Type''' named "Student Records":
**  A '''Data Field''' named "Average" in a '''Data Section''' named "Math" could access the "GPA" '''Data Field''' inside a "Totals" '''Data Section''' with the expression <code>Student_Records.Totals.GPA</code>
* Be aware, the expression will fail if the referenced '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in multi-instance '''Data Sections''' ''cannot'' be referenced whatsoever. 
* If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.


==== Example Is Valid Expression ====
==== Example Is Valid Expression ====
Line 366: Line 296:
* If the expression returns “True,” the field becomes required.  
* If the expression returns “True,” the field becomes required.  
* If the expression returns “False,” it remains optional.
* If the expression returns “False,” it remains optional.
==== Referencing Data Fields and Data Columns by Name ====
Peer fields ('''Data Fields''' at the same level of the '''Data Model''') and peer columns (sibling '''Data Columns''' of a '''Data Table''') can be referenced by name in '''''Is Required Expressions'''''.
* Ex: A '''Data Field's''' expression could reference a peer '''Data Field''' named "Subtotal" simply by writing <code>Subtotal</code> into the expression.
{|class="attn-box"
|
&#9888;
|
You must substitute underscores for spaces and special characters in '''Data Column''' and '''Data Field''' names. 
* Ex:  A '''Data Field''' named "Social Security Number" would need to be typed out <code>Social_Security_Number</code>
Multiple spaces or special characters in a row must be substituted with a single underscore.
* Ex:  A '''Data Column''' named "Expenses - Sch. A" would need to be typed out <code>Expenses_Sch_A</code>
If a '''Data Field''' or '''Data Column's''' name begins with a number, you must prepend an underscore to the name.
* Ex: A '''Data Field''' named "911" would need to be typed out <code>_911</code>
If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.
|}
'''Data Fields''' in peer ''single instance'' '''Data Sections''' may be referenced, using dot notation.
* Ex: You could access the "DOB" '''Data Field''' inside a "Employee" '''Data Section''' with the expression <code>Employee.DOB</code>
* Be aware, this expression will fail if the '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in non-peer ''single instance'' '''Data Sections''' can also be referenced using dot notation.  However, you must reference the '''Data Model's''' parent '''Content Type''' ''first''.
* Ex: Given a '''Data Model''' for a '''Document Type''' named "Student Records":
**  A '''Data Field''' named "Average" in a '''Data Section''' named "Math" could access the "GPA" '''Data Field''' inside a "Totals" '''Data Section''' with the expression <code>Student_Records.Totals.GPA</code>
* Be aware, the expression will fail if the referenced '''Data Section's''' '''''Miss Disposition''''' property is set to ''No Instance'' and no section instance is extracted.
'''Data Fields''' in multi-instance '''Data Sections''' ''cannot'' be referenced whatsoever. 
* If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.


==== Example Is Required Expression ====
==== Example Is Required Expression ====

Revision as of 11:50, 9 October 2025

"Data Model Expressions" are a subset of code expressions in Grooper.

  • Code expressions are snippets of .NET code that are evaluated at runtime to compute values, determine workflow logic, or validate data.
  • Data Model expressions refer to expressions added to Data Fields and Data Columns that are used to calculate and validate field values.

FYI

For brevity's sake, we will refer to "Data Fields" and "Data Column" cells as "fields" throughout this article.

Data Model expressions are configured using the following Data Field and Data Column properties:

  • "Default Value" - Generates a default value for the field.
  • "Calculated Value" - Calculates an expected value for the field based on other field values. This can be used to validate or automatically populate the field.
  • "Is Valid" - A Boolean expression that determines if the field's value is valid.
  • "Is Required" - A Boolean expression that determines if the field is required (meaning it cannot be empty).


Default Value Expressions

"Default Value" expressions determine the default value for a Data Field or Data Column cell.

  • Default values will be overridden by any subsequent data collection event: Value Extractor results, Fill Method (AI Extract) results, or Lookup Specification results
  • You cannot reference Variable Definitions or other Data Field or Data Column values in a Default Value expression. Use "Calculated Value" expressions instead.

Default Value expressions are useful for pre-populating fields with:

  • Static values
  • System-derived values
  • Document metadata

Return Type

Default Value expressions must produce a value compatible with the Data Field/Data Column's "Value Type". Otherwise, the field will throw an error or produce no value on extraction.

  • Ex: The Data Field's "Value Type" is set to String. The expression must produce a string value.
  • Ex: The Data Field's "Value Type" is set to DateTime. The expression must produce a datetime value.

Examples

Purpose Expression
Default to a literal string value. Value must be enclosed in quotes.
"Literal value"
Default to a literal numeric value.
25.00
Global/System Variables
Default to the current date and time
Now
Default to the current date and time, formatted.
Now.ToString("yyyy-MM-dd")
Default to date and time 30 days from now.
DateAdd("d", 30, Now)
Default to date and time 30 days from now, formatted.
DateAdd("d", 30, Now).ToString("yyyy-MM-dd")
Default to the name of the current user.
My.User.Name
Generate a unique identifier
Guid.NewGuid
Document metadata - links
Returns the name of the link (e.g. "Import" or "Export") from a Batch Folder's content link.
Link.LinkName
Returns the attached file's path, including the file's name from a Batch Folder's content link.
Link.FullPath
Returns the attached file's path, not including the file's name from a Batch Folder's content link.
Link.Path

Returns the first segment in a file path from a Batch Folder's content link.

For example, "servername" in "servername\folder\subfolder\file.pdf"
Link.PathSegments(0)
Returns the linked file's full filename, extension included (the "linked object") from a Batch Folder's content link.
Link.ObjectName
Returns the linked file's filename without the extension from a Batch Folder's content link.
IO.Path.GetFileNameWithoutExtension(Link.ObjectName)

Return metadata associated with a known Content Link type (FileSystemLink, MailLink, SftpLink, Cmis.CmisLink, etc).

Replace "FileSystemLink" with the Content Link type
DirectCast(Link,FileSystemLink).Filename
DirectCast(Link,FileSystemLink).CreatedBy
DirectCast(Link,FileSystemLink).CreatedTime
DirectCast(Link,FileSystemLink).LastModifiedTime
Document metadata - Batch Folder attachment
Returns the attachment's filename, with extension.
Folder.AttachmentFileName
Returns the attachment's filename, without extension.
IO.Path.GetFileNameWithoutExtension(Folder.AttachmentFileName)
Returns the attachment's MIME type
Folder.AttachmentMimeType
Returns the attachment's file extension
Folder.AttachmentFileExtension
Document metadata - MIME type specific data

Return metadata associated with email messages

DirectCast(Handler,MailMimeTypeHandler).Subject
DirectCast(Handler,MailMimeTypeHandler).To
DirectCast(Handler,MailMimeTypeHandler).From
DirectCast(Handler,MailMimeTypeHandler).Date

Return metadata associated with PDF files

DirectCast(Handler, PdfMimeTypeHandler).Author
DirectCast(Handler, PdfMimeTypeHandler).Creator
DirectCast(Handler, PdfMimeTypeHandler).CreationDate
DirectCast(Handler, PdfMimeTypeHandler).Title
DirectCast(Handler, PdfMimeTypeHandler).Subject
Document metadata - more Batch Folder info
Default to the Content Type (Document Type) assigned to the document (Batch Folder)
ContentTypeName

Default to the parent Content Type of the Content Type (Document Type) assigned to the document (Batch Folder).

This is helpful for users trying to populate a value for a Document Type's parent Content Category.
Folder.ContentType.ParentNode.DisplayName
Default to the current document's Batch Folder ID (GUID).
Folder.Id
Default to the current document's Batch ID (GUID).
Folder.Batch.Id


Calculated Value Expressions

A Calculated Value Expression is a VB.Net code snippet that calculates a value of a field based on the values of other fields, much like how a formula defines a relationship between various cells in a spreadsheet. In addition to mathematical operations and text string manipulation, these expressions can inspect the Grooper node tree, batch, and environment variables or file paths to calculate the desired value.

These expressions can be used in two ways:

  • To populate empty fields with calculated (or manipulated) values
  • To validate existing field values

Return Type

Calculated Value Expressions must return a result that is compatible with the Data Field/Data Column's data type (set using their Data Type property).

  • Ex: If a Data Field's data type is set to Integer, its Calculated Value Expression must evaluate to an integer, or else an error will occur.

Calculate Modes

Calculated Value Expressions can execute using one of three modes, set by configuring the Calculate Mode property:

  • Validate
  • Set If Empty
  • Always Set

Mode

Description

Validate

Validate will check that the field's value mathematically satisfies the Calculated Value Expression.

  • If it does not, puts the field in an error state. The error message on the field will show the difference between the field’s value and the expected result of the expression.
  • As this mode pertains to mathematical validation only, it only works with numerical data types (Int, Decimal, Double).
    • For non-mathematical validation, use an Is Valid Expression.

Set If Empty

Set If Empty will only populate the field with the Calculated Value Expression's result if no value was collected during the Extract step of a Batch Process.

  • In other words, if the field is still blank after extraction, the expression will run and fill the field with its result. If there's anything at all in the field, the expression does nothing.
  • Some Grooper users think of a Calculated Value Expression in Set If Empty mode as a more robust version of a Default Value Expression.
    • Calculated Value Expressions can reference Data Elements (whereas Default Value Expressions cannot) and more methods than Default Value Expressions.

Always Set

Always Set will always populate the field with the Calculated Value Expression's result (unless the expression fails to produce a result).

  • When you use these types of expressions, you may not even configure the Data Field or Data Column with an extractor, instead generating the calculated field's value using using other field values in the Data Model, system or document metadata or a combination of thereof.
    • If an extractor is configured you may be using the expression to manipulate the extracted value to get a desired result (such as performing substring matching or some kind of mathematical operation).
  • If any of the component values that make up the expression are modified, the Calculated Value Expression will update automatically.
    • Ex: Imagine a calculated field's result is populated using a Calculated Value Expression that simply adds the values of two other fields together. If one of those referenced field's value is changed manually during user review, the calculated field's value will automatically be updated by the Calculated Value Expression.

Examples

Math related calculations
Addition of multiple fields IntegerField1 + IntegerField2
DecimalField1 + DecimalField2 + DecimalField3
Rounding Math.Round(DecimalField1, 4)
Non-integer addition (e.g. of date values) DateAdd("d", 30, DateField1)
DateAdd("yyyy", 1, DateField1)
DateAdd("m", -3, DateField1)
String concatenation and manipulation
Concatenation of multiple fields String.Concat(StringField1, StringField2)
String.Concat(StringField1, StringField2, StringField3)
String.Concat(StringField1, StringField2, StringField3, StringField4)
Reformatting / Normalization of values StringField1.Replace("\", "_")
StringField1.Replace("\", "")
Substring calculation

Given the string ABC123456XXXX654321YYY:

StringField1.Substring(0, 3) returns ABC
StringField1.Substring(3, 6) returns 123456
StringField1.Substring(9, 4) returns XXXX
StringField1.Substring(StringField1.Length - 3) returns YYY
CMIS Content Links

Get properties of a CMIS Content Link.

  • Use this to return property values for a document linked in a CMIS Repository.
  • Replace propertyName with the property's name. Example: The Subject property for an email in an Exchange CMIS Repository.
CurrentDocument.ContentLink.GetCustomValue("propertyName").ToString
Misc expressions

Getting the location coordinates of a field on the document

  • This could be used to determine the coordinates and size of an extracted value on a document.
  • Note: This returns a logical rectangle's location in inches.
GetFieldInstance("Field Name").Location.ToString
Example Calculated Value expressions used in Validate mode
Date in past / future DateField1 < Now
DateField1 >= DateAdd("d", 30, Now)
Equality / inequality of two fields StringField1 = StringField2
IntegerField1.Equals(IntegerField2)
IntegerField1 <> DecimalField1
Not DecimalField1.Equals(DecimalField2)
Summing fields and comparing to another field IntegerField1 + IntegerField2 = IntegerField3
DecimalField1 + DecimalField2 = DecimalField3
DecimalField1 = SumFieldInstance("Table1\AmountColumn")
Running regular expression against field Text.RegularExpressions.Regex.IsMatch(StringField1, "[0-9]{6}")
Inspecting field-level confidence scores Instance.Confidence > 0.8


Is Valid Expression

An Is Valid Expression is a snippet of VB.Net code that determines whether a Data Field or Data Column’s value is valid.

Return Type

Is Valid Expressions must return a Boolean (True/False) value:

  • The expression must evaluate to “True” to be considered valid.
  • If the expression returns “False,” an error is thrown stating "Validation Expression failed". The field's background color also changes to red to visually indicate the error.
    • You may write a custom error message by configuring the Validate Message property.


Example Is Valid Expression

!!SCREENSHOTS FORTHCOMING!!

For this example, we’ll be using the Regex.IsMatch method. This method compares a string input to a regular expression, returning “True” if they match.

The field “OKDL” is meant to capture Oklahoma Driver’s License Numbers.

  • OK driver license have a strict pattern, consisting of one capital letter followed by 9 numerical digits.
  • We could use the following Is Valid Expression to verify the result matches the driver license pattern (or in other words, is valid data).
    • Regex.IsMatch(OKDL, "[A-Z][0-9]{9}")

If the field’s value does not match the regular expression, Grooper will mark it as invalid and the field turns red:

Change the value to a valid ID number, and viola! The field is no longer in error:

Is Required Expressions

You can set a field to be required in Grooper by setting the Required property to true. If the field is not extracted during the Extract step (in other words, "blank"), the field will be flagged as invalid. However, what if whether or not a field is required is based on the value of some other field? You would want that field to be conditionally required, based on some set criteria or parameter.

That's what Is Required Expressions are for. An Is Required Expression is a snippet of VB.Net code that sets a field’s “Required” status conditionally.

Return Type

Is Required Expressions must return a Boolean (True/False) value:

  • If the expression returns “True,” the field becomes required.
  • If the expression returns “False,” it remains optional.

Example Is Required Expression

In the following example, there are two fields.

  • “Marital Status” is a Boolean field with a true value of “Married” and a false value of “Single.”
  • “Spouse Name” is a string field. It has the the following Is Required Expression:
    • Marital_Status = True

When the "Marital Status" field is set to "Single" (therefore "False"), the "Spouse Name" field is not required.

  • The condition set by the Is Required Expression Marital_Status = True is met. The "Marital Status" field's value equates to "False". So, the expression returns "False"
  • The field remains an optional field. No value is required to be entered.

If “Marital Status” is “Married” (therefore "True"), the “Spouse Name” field then becomes required.

  • The condition set by the Is Required Expression Marital_Status = True is met. The "Marital Status" field's value equates to "True". So, the expression returns "True"
  • The field will be in error and turn red until a value is entered.