2023:Code Expressions (Concept)

From Grooper Wiki
Revision as of 09:23, 26 August 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

WIP

This article is a work-in-progress. This article is subject to change and/or expansion. It may be incomplete or stop abruptly. Notably, this article is missing screenshots.

This tag will be removed upon draft completion.

Code Expressions (not to be confused with regular expressions) are snippets of VB.NET code that expand Grooper's core functionality.


There are four types of code expressions (or just "expressions" for short):

  • Data Model Expressions modify the behavior of Data Fields and Data Columns and their values.
  • Batch Process Expressions modify the behavior of Batch Process steps.
  • IP Profile Expressions modify the behavior of image processing steps (IP Steps) in an IP Profile.
  • Mapping Expressions modify the behavior of Import/Export Mappings.

FYI

This article has general information on what the various Grooper expressions are and how they work. If you are looking for more specific examples of expressions, check out our Expressions Cookbook and LINQ articles.

😎

Special thanks to BIS team member Dave Hanon for contributing this article!


Data Model Expressions

Data Model Expressions are VB.Net code snippets that modify the behavior of Data Fields and Data Columns and their values. These expressions are commonly used to validate and/or manipulate extracted data, populate fields with system data, and sometimes even document metadata.

There are four types of data model expressions:

  • Default Value Expressions
  • Calculated Value Expressions
  • Is Valid Expressions
  • Is Required Expressions

FYI

Data Model Expressions are configurable for Data Field and Data Column objects in a Data Model. However, for the purposes of brevity, we will often use the generic term "field" throughout this article, referring both to Data Fields and Data Column cells.

Default Value Expressions

A Default Value Expression is a VB.Net code snippet that determines the default value for a Data Field or Data Column. This can be as simple as a default text string, or it could be an expression that returns system variables or Batch object information.

  • If the Data Field or Data Column is configured with an extractor, and that extractor returns a result during extraction, the default value will be overwritten.

You cannot reference Data Fields or Data Columns by name in a Default Value Expression.

If you do need to reference Data Fields or Data Columns in some kind of expression that populates a field with a default value, you need to use a Calculated Value Expression set in Set If Empty mode.

Return Type

Default Value Expressions must return a result that is compatible with the data type of the Data Field or Data Column on which it is configured (set using their Data Type property).

  • Ex: If a Data Field's type is set to String, the expression must return a string value, or the field will throw an error.

Example Default Value Expression

The following expression would generate a random globally unique identifier (GUID):

  • Guid.NewGuid

Calculated Value Expressions

A Calculated Value Expression is a VB.Net code snippet that calculates the value of a field based on the values of other fields, much like how a formula defines a relationship between various cells in a spreadsheet. In addition to mathematical operations and text string manipulation, these expressions can inspect the Grooper node tree, batch, and environment variables or file paths to calculate the desired value.

These expressions can be used in two ways:

  • To populate empty fields with calculated (or manipulated) values
  • To validate existing field values

Return Type

Calculated Value Expressions must return a result that is compatible with the Data Field/Data Column's data type (set using their Data Type property).

  • Ex: If a Data Field's data type is set to Integer, its Calculated Value Expression must evaluate to an integer, or else an error will occur.

Referencing Data Fields and Data Columns by Name

Peer fields (Data Fields at the same level of the Data Model) and peer columns (sibling Data Columns of a Data Table) can be referenced by name in Calculated Value Expressions.

  • Ex: A Data Field's expression could reference a peer Data Field named "Subtotal" simply by writing Subtotal into the expression.

You must substitute underscores for spaces and special characters in Data Column and Data Field names.

  • Ex: A Data Field named "Social Security Number" would need to be typed out Social_Security_Number

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A Data Column named "Expenses - Sch. A" would need to be typed out Expenses_Sch_A

If a Data Field or Data Column's name begins with a number, you must prepend an underscore to the name.

  • Ex: A Data Field named "911" would need to be typed out _911


If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.
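
The naming rules above amount to a small normalization routine. The following is a minimal sketch of those rules, written in Python purely as a model (it is not VB.Net and not Grooper's own code). The function name to_expression_name is hypothetical, and the sketch assumes that any character other than an ASCII letter or digit counts as a "special character". IntelliSense remains the authority on the exact name Grooper expects.

```python
import re


def to_expression_name(display_name: str) -> str:
    """Approximate how a field or step name is typed inside an expression."""
    # Replace each run of spaces/special characters with a single underscore.
    # Assumption: anything that is not an ASCII letter or digit is "special".
    name = re.sub(r"[^A-Za-z0-9]+", "_", display_name)
    # Names that begin with a number get a leading underscore.
    if name[:1].isdigit():
        name = "_" + name
    return name


for example in ("Social Security Number", "Expenses - Sch. A", "911"):
    print(example, "->", to_expression_name(example))
```

The three sample names are the ones used in the rules above, so the printed results can be compared directly against the examples in the text.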

Data Fields in peer single instance Data Sections may be referenced using dot notation.

  • Ex: You could access the "DOB" Data Field inside an "Employee" Data Section with the expression Employee.DOB
  • Be aware, this expression will fail if the Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in non-peer single instance Data Sections can also be referenced using dot notation. However, you must reference the Data Model's parent Content Type first.

  • Ex: Given a Data Model for a Document Type named "Student Records":
    • A Data Field named "Average" in a Data Section named "Math" could access the "GPA" Data Field inside a "Totals" Data Section with the expression Student_Records.Totals.GPA
  • Be aware, the expression will fail if the referenced Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in multi-instance Data Sections cannot be referenced whatsoever.

  • If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.

Calculate Modes

Calculated Value Expressions can execute using one of three modes, set by configuring the Calculate Mode property:

  • Validate
  • Set If Empty
  • Always Set

Mode

Description

Validate

Validate will check that the field's value mathematically satisfies the Calculated Value Expression.

  • If it does not, the field is put in an error state. The error message on the field will show the difference between the field’s value and the expected result of the expression.
  • As this mode pertains to mathematical validation only, it only works with numerical data types (Int, Decimal, Double).
    • For non-mathematical validation, use an Is Valid Expression.

Set If Empty

Set If Empty will only populate the field with the Calculated Value Expression's result if no value was collected during the Extract step of a Batch Process.

  • In other words, if the field is still blank after extraction, the expression will run and fill the field with its result. If there's anything at all in the field, the expression does nothing.
  • Some Grooper users think of a Calculated Value Expression in Set If Empty mode as a more robust version of a Default Value Expression.
    • Calculated Value Expressions can reference Data Elements (whereas Default Value Expressions cannot) and have access to more methods than Default Value Expressions.

Always Set

Always Set will always populate the field with the Calculated Value Expression's result (unless the expression fails to produce a result).

  • When you use this mode, you may not configure the Data Field or Data Column with an extractor at all, instead generating the calculated field's value from other field values in the Data Model, system or document metadata, or a combination thereof.
    • If an extractor is configured, you may be using the expression to manipulate the extracted value to get a desired result (such as performing substring matching or some kind of mathematical operation).
  • If any of the component values that make up the expression are modified, the Calculated Value Expression will update automatically.
    • Ex: Imagine a calculated field's result is populated using a Calculated Value Expression that simply adds the values of two other fields together. If one of those referenced fields' values is changed manually during user review, the calculated field's value will automatically be updated by the Calculated Value Expression.
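
The difference between the three modes can be easier to see side by side. The following Python sketch models the behavior described above. It is not Grooper's implementation: the function name apply_calculate_mode, the wording of the error message, and the choice to report no error when a blank field is validated are all assumptions made for illustration.

```python
from decimal import Decimal
from typing import Optional, Tuple


def apply_calculate_mode(
    mode: str,
    current: Optional[Decimal],
    calculated: Optional[Decimal],
) -> Tuple[Optional[Decimal], Optional[str]]:
    """Return (field value, error message) after the expression has run."""
    if mode == "Validate":
        # The field's value is never changed; a mismatch only flags an error.
        # Assumption: a blank field or a failed expression reports no error here.
        if current is not None and calculated is not None and current != calculated:
            return current, f"Expected {calculated}; difference is {current - calculated}"
        return current, None
    if mode == "Set If Empty":
        # Fill the field only when extraction left it blank.
        return (calculated if current is None else current), None
    if mode == "Always Set":
        # Overwrite the field unless the expression produced no result.
        return (current if calculated is None else calculated), None
    raise ValueError(f"Unknown Calculate Mode: {mode}")


# Hypothetical field holding 110 while the expression evaluates to 100.
print(apply_calculate_mode("Validate", Decimal("110"), Decimal("100")))
print(apply_calculate_mode("Set If Empty", None, Decimal("100")))
print(apply_calculate_mode("Always Set", Decimal("110"), Decimal("100")))
```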

Example Calculated Value Expression

!!EXAMPLE FORTHCOMING!!

Is Valid Expressions

An Is Valid Expression is a snippet of VB.Net code that determines whether a Data Field or Data Column’s value is valid.

Return Type

Is Valid Expressions must return a Boolean (True/False) value:

  • The expression must evaluate to “True” to be considered valid.
  • If the expression returns “False,” an error is thrown stating "Validation Expression failed". The field's background color also changes to red to visually indicate the error.
    • You may write a custom error message by configuring the Validate Message property.

Referencing Data Fields and Data Columns by Name

Peer fields (Data Fields at the same level of the Data Model) and peer columns (sibling Data Columns of a Data Table) can be referenced by name in Is Valid Expressions.

  • Ex: A Data Field's expression could reference a peer Data Field named "Subtotal" simply by writing Subtotal into the expression.

You must substitute underscores for spaces and special characters in Data Column and Data Field names.

  • Ex: A Data Field named "Social Security Number" would need to be typed out Social_Security_Number

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A Data Column named "Expenses - Sch. A" would need to be typed out Expenses_Sch_A

If a Data Field or Data Column's name begins with a number, you must prepend an underscore to the name.

  • Ex: A Data Field named "911" would need to be typed out _911


If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.

Data Fields in peer single instance Data Sections may be referenced using dot notation.

  • Ex: You could access the "DOB" Data Field inside an "Employee" Data Section with the expression Employee.DOB
  • Be aware, this expression will fail if the Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in non-peer single instance Data Sections can also be referenced using dot notation. However, you must reference the Data Model's parent Content Type first.

  • Ex: Given a Data Model for a Document Type named "Student Records":
    • A Data Field named "Average" in a Data Section named "Math" could access the "GPA" Data Field inside a "Totals" Data Section with the expression Student_Records.Totals.GPA
  • Be aware, the expression will fail if the referenced Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in multi-instance Data Sections cannot be referenced whatsoever.

  • If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.

Example Is Valid Expression

!!SCREENSHOTS FORTHCOMING!!

For this example, we’ll be using the Regex.IsMatch method. This method tests a string input against a regular expression, returning “True” if the pattern is found anywhere in the input.

The field “OKDL” is meant to capture Oklahoma driver’s license numbers.

  • Oklahoma driver's license numbers have a strict pattern: one capital letter followed by nine digits.
  • We could use the following Is Valid Expression to verify the result matches the driver's license pattern (or in other words, is valid data).
    • Regex.IsMatch(OKDL, "^[A-Z][0-9]{9}$")
    • The ^ and $ anchors require the whole value to match. Without them, a longer value that merely contains the pattern would also pass.
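
Whether the pattern is anchored changes which values pass. The sketch below uses Python's re.search as a stand-in for Regex.IsMatch (both succeed when the pattern matches anywhere in the input) to compare an unanchored and an anchored version of the driver's license pattern. The sample values are made up for illustration and are not real license numbers.

```python
import re

UNANCHORED = r"[A-Z][0-9]{9}"
ANCHORED = r"^[A-Z][0-9]{9}$"


def is_match(value: str, pattern: str) -> bool:
    # Like .NET's Regex.IsMatch, re.search succeeds if the pattern
    # matches anywhere in the input string.
    return re.search(pattern, value) is not None


for candidate in ("A123456789", "A12345678", "XXA1234567890"):
    print(candidate, is_match(candidate, UNANCHORED), is_match(candidate, ANCHORED))
```

The last sample contains a valid-looking number inside a longer string, so only the anchored pattern rejects it.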

If the field’s value does not match the regular expression, Grooper will mark it as invalid and the field turns red:

Change the value to a valid ID number, and voilà! The field is no longer in error:

Is Required Expressions

You can set a field to be required in Grooper by setting the Required property to true. If the field is not extracted during the Extract step (in other words, it is "blank"), the field will be flagged as invalid. However, what if a field should only be required depending on the value of some other field? You would want that field to be conditionally required, based on some set criteria or parameter.

That's what Is Required Expressions are for. An Is Required Expression is a snippet of VB.Net code that sets a field’s “Required” status conditionally.

Return Type

Is Required Expressions must return a Boolean (True/False) value:

  • If the expression returns “True,” the field becomes required.
  • If the expression returns “False,” it remains optional.

Referencing Data Fields and Data Columns by Name

Peer fields (Data Fields at the same level of the Data Model) and peer columns (sibling Data Columns of a Data Table) can be referenced by name in Is Required Expressions.

  • Ex: A Data Field's expression could reference a peer Data Field named "Subtotal" simply by writing Subtotal into the expression.

You must substitute underscores for spaces and special characters in Data Column and Data Field names.

  • Ex: A Data Field named "Social Security Number" would need to be typed out Social_Security_Number

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A Data Column named "Expenses - Sch. A" would need to be typed out Expenses_Sch_A

If a Data Field or Data Column's name begins with a number, you must prepend an underscore to the name.

  • Ex: A Data Field named "911" would need to be typed out _911


If you’re unsure how to format a field’s name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.

Data Fields in peer single instance Data Sections may be referenced using dot notation.

  • Ex: You could access the "DOB" Data Field inside an "Employee" Data Section with the expression Employee.DOB
  • Be aware, this expression will fail if the Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in non-peer single instance Data Sections can also be referenced using dot notation. However, you must reference the Data Model's parent Content Type first.

  • Ex: Given a Data Model for a Document Type named "Student Records":
    • A Data Field named "Average" in a Data Section named "Math" could access the "GPA" Data Field inside a "Totals" Data Section with the expression Student_Records.Totals.GPA
  • Be aware, the expression will fail if the referenced Data Section's Miss Disposition property is set to No Instance and no section instance is extracted.


Data Fields in multi-instance Data Sections cannot be referenced whatsoever.

  • If you try to reference fields in multi-instance sections, you will find Grooper's IntelliSense will not show the field's name and the expression will throw an error if you manually type in the name.

Example Is Required Expression

!!SCREENSHOTS FORTHCOMING!!

In the following example, there are two fields.

  • “Marital Status” is a Boolean field with a true value of “Married” and a false value of “Single.”
  • “Spouse Name” is a string field. It has the following Is Required Expression:
    • Marital_Status = True

When the "Marital Status" field is set to "Single" (therefore "False"), the "Spouse Name" field is not required.

  • The condition set by the Is Required Expression Marital_Status = True is not met. The "Marital Status" field's value equates to "False". So, the expression returns "False".
  • The field remains an optional field. No value is required to be entered.

If “Marital Status” is “Married” (therefore "True"), the “Spouse Name” field then becomes required.

  • The condition set by the Is Required Expression Marital_Status = True is met. The "Marital Status" field's value equates to "True". So, the expression returns "True"
  • The field will be in error and turn red until a value is entered.
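
The two cases above can be modeled in a few lines. This Python sketch is only an illustration of the logic, not Grooper code. The function names are hypothetical, and it assumes a field counts as blank when it holds no value or an empty string.

```python
from typing import Optional


def spouse_name_is_required(marital_status: bool) -> bool:
    # Mirrors the Is Required Expression: Marital_Status = True
    return marital_status is True


def spouse_name_in_error(marital_status: bool, spouse_name: Optional[str]) -> bool:
    # A required field is in error only while it is blank.
    is_blank = not spouse_name
    return spouse_name_is_required(marital_status) and is_blank


print(spouse_name_in_error(False, None))        # "Single", no spouse entered
print(spouse_name_in_error(True, None))         # "Married", no spouse entered
print(spouse_name_in_error(True, "Pat Smith"))  # "Married", spouse entered
```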

Batch Process Expressions

Batch Process Expressions are snippets of VB.Net code that alter the behavior of Batch Process Steps. Normally, steps in a Batch Process follow a sequential order. One step finishes, the next starts up, and so on until all steps are completed. With Batch Process Expressions, you can conditionally apply steps or route Batch content to specific steps based on certain processing criteria.

There are two types of Batch Process Expressions:

  • Should Submit Expressions
    • A Batch Process Step with a Should Submit Expression will only execute if the expression evaluates to “True” for the current Batch object, skipping documents or pages that do not satisfy the expression.
  • Next Step Expressions
    • Next Step Expressions determine which step of the Batch Process to advance to once the current step is completed.

Should Submit Expressions

A Should Submit Expression is a snippet of VB.Net code that determines whether a Batch Process Step should execute. These expressions can inspect image properties, Batch object attributes, or the Grooper Node Tree, and use those values as the conditions that decide whether the step runs.

  • Should Submit Expressions are a great solution for applications where only some--but not all--documents in a Batch need to be processed by a given Batch Process Step.

Return Type

Should Submit Expressions must return a Boolean (True/False) value.

  • If the expression evaluates to “True,” the Batch Process Step will execute.
  • If it evaluates to “False,” the step will be skipped.
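
Conceptually, a Should Submit Expression acts as a filter in front of the step. The Python sketch below models that idea only. The Document class and its page_count attribute are hypothetical stand-ins for Batch objects, not Grooper's object model, and the condition shown is an invented example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    # Hypothetical stand-in for a Batch object (not a Grooper type).
    name: str
    page_count: int


def run_step(documents: List[Document],
             should_submit: Callable[[Document], bool]) -> List[str]:
    """Return the names of the documents the step actually processed."""
    processed = []
    for doc in documents:
        # The step executes only where the expression evaluates to True.
        if should_submit(doc):
            processed.append(doc.name)
    return processed


batch = [Document("Invoice", 3), Document("Cover Sheet", 1)]
# Invented condition: only submit documents with more than one page.
print(run_step(batch, lambda doc: doc.page_count > 1))
```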

Example Should Submit Expression

!!EXAMPLE FORTHCOMING!!

Next Step Expressions

A Next Step Expression is a snippet of VB.Net code that determines which step in the Batch Process to advance the Batch to once the current step is completed.

  • Next Step Expressions can be a conditional workflow tool, routing Batches to different steps based on certain processing parameters.
  • Ex: If a step flagged any of the documents with an error, you might send that Batch to a Review step. If no documents are flagged, you could bypass the Review step, instead sending it to the next automated step.

Return Type

Next Step Expressions must evaluate to the name of a Batch Process Step or "Nothing".

  • Commonly, If statements are used to return the name of one step if the expression is true, and another step if it is false.
  • If an expression returns a value of Nothing, no further steps are processed and the Batch will complete.

Using If Statements to Route Steps

If statements take the following structure:

  • If(Expression, TrueValue, FalseValue)
    • Expression is the expression to be evaluated.
    • TrueValue is the name of the Batch Process Step to advance to if the expression returns "True".
    • FalseValue is the name of the Batch Process Step to advance to if the expression returns "False".


The names of all steps in the Batch Process will be accessible in the Next Step Expression IntelliSense menu.

You must substitute underscores for spaces and special characters in Batch Process Step names.

  • Ex: A step named "Data Review" would need to be typed out Data_Review

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A step named "Recognize - OCR" would need to be typed out Recognize_OCR

If a Batch Process Step's name begins with a number, you must prepend an underscore to the name.

  • Ex: A step named "2nd OCR Pass" would need to be typed out _2nd_OCR_Pass


If you’re unsure how to format a step's name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.

Example Next Step Expression

!!SCREENSHOTS FORTHCOMING!!

Picture an Extract step in a Batch Process. One of two things is going to happen at the end of that step:

  • Either all documents in the Batch will be extracted successfully, with no errors.
  • Or, one or more fields in one or more of the documents will throw an error and need to be reviewed.

For Batches without any errors, you may just want to export all their documents in an Export step. For Batches that had extraction errors, you may want a user to review them in a Review step. This is exactly what Next Step Expressions are for. The following expression handles both cases:

  • If(Batch.HasDataErrors, Review, Export)


Batch.HasDataErrors is a Boolean attribute of a Batch. It returns "True" if the Batch contains invalid index data. So in this case, if the Batch contains invalid index data, it moves the batch to the Review step. If the batch does not contain data errors, it moves ahead to the Export step.
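
The If(Expression, TrueValue, FalseValue) form behaves like a conditional expression in most languages. As a rough Python model of the routing decision described above (the function name is hypothetical, and step names are shown as plain strings):

```python
def next_step_after_extract(has_data_errors: bool) -> str:
    # Same shape as If(Batch.HasDataErrors, Review, Export):
    # the first branch is chosen when the condition is True.
    return "Review" if has_data_errors else "Export"


print(next_step_after_extract(True))
print(next_step_after_extract(False))
```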

IP Profile Expressions

IP Profile Expressions allow for conditional handling of image processing, so a single IP Profile can accommodate multiple contingencies without redundant processing or the need to create multiple IP Profiles. These expressions are configured on the IP Steps (or IP Groups) in an IP Profile.


IP Profiles can employ two types of expressions (similar to Batch Processes and their expressions):

  • Should Execute Expressions
    • These expressions determine if an IP Step should be executed on an image at all.
  • Next Step Expressions
    • These expressions determine which IP Step in the IP Profile to advance the image to after the current step is completed.

Should Execute Expressions

An IP Step's (or IP Group's) Should Execute Expression determines whether or not the step should be executed on the input image (or whether the group of steps should be executed in the case of IP Groups).

Return Type

Should Execute Expressions must return a Boolean (True/False) value.

  • If the expression evaluates to "True", the step processes the image.
  • If it evaluates to "False", no processing occurs and the image is advanced to the following step.

Referencing Prior IP Steps By Name

Names of all IP Steps that occur before the step in question will be accessible in the Should Execute Expression IntelliSense menu.

  • Commonly, you will conditionally execute a step based on the results of a previous step. You'd use the Results method to do this, using Results.IPStepName as part of the expression.
  • FYI: Technically speaking IntelliSense will show you steps after the step whose Should Execute Expression you are configuring. However, it won't make any sense to use them in the expression if they have not been applied in the IP Profile up to that point.

You must substitute underscores for spaces and special characters in IP Step names.

  • Ex: A step named "Line Removal" would need to be typed out Line_Removal

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A step named "Line Removal - Vert. Only" would need to be typed out Line_Removal_Vert_Only

If an IP Step's name begins with a number, you must prepend an underscore to the name.

  • Ex: A step named "2nd Pass Line Removal" would need to be typed out _2nd_Pass_Line_Removal


If you’re unsure how to format a step's name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.

Example Should Execute Expression

!!SCREENSHOTS FORTHCOMING!!

For this example, I’ve set up a Binarize step in my IP Profile, which will convert color images to black and white. I will use a Should Execute Expression to only execute the Binarize step on non-binary images (color, greyscale, anything not a true bitonal black and white image).

We can use the Image.IsBinary method to inspect the image’s color scheme, negating it with Not to prevent the step from firing on already-binarized images:

  • Not Image.IsBinary

When the IP Step encounters a color image, the expression evaluates to "True" and the step executes.

When the IP Step encounters a black and white image, the expression evaluates to "False" and the step does not execute.

Next Step Expressions

An IP Step’s Next Step Expression determines which IP Step (or IP Group) to advance to upon completion of the current step. These expressions commonly consist of an If statement, which takes an expression to evaluate, one IP Step to advance to when that expression returns true, and another IP Step to advance to when it returns false.

Return Type

Next Step Expressions must evaluate to the name of an IP Step, IP Group or "Nothing".

  • Commonly, If statements are used to return the name of one step if an expression is true, and another if it is false.
  • If the expression returns a value of Nothing, the IP Profile will cease execution entirely.

Using If Statements to Route IP Steps

If statements take the following structure:

  • If(Expression, TrueValue, FalseValue)
    • Expression is the expression to be evaluated.
    • TrueValue is the name of the IP Step to advance to if the expression returns "True".
    • FalseValue is the name of the IP Step to advance to if the expression returns "False".

Referencing IP Steps By Name

The names of IP Steps and IP Groups in the IP Profile will be accessible in the Next Step Expression IntelliSense menu.

  • You can reference other IP Steps via the Steps namespace.
    • Ex: Steps.Speck_Removal

You must substitute underscores for spaces and special characters in IP Step names.

  • Ex: A step named "Line Removal" would need to be typed out Line_Removal

Multiple spaces or special characters in a row must be substituted with a single underscore.

  • Ex: A step named "Line Removal - Vert. Only" would need to be typed out Line_Removal_Vert_Only

If an IP Step's name begins with a number, you must prepend an underscore to the name.

  • Ex: A step named "2nd Pass Line Removal" would need to be typed out _2nd_Pass_Line_Removal


If you’re unsure how to format a step's name, simply begin typing it and Grooper’s built-in IntelliSense menu will show you the correct way.

Example Next Step Expression

!!SCREENSHOTS FORTHCOMING!!

Here, we have an IP Profile with three steps:

  • Auto Border Crop
  • Binarize
  • Speck Removal

I've entered the following Next Step Expression for the first step (Auto Border Crop):

  • If(Image.IsBinary, Steps.Speck_Removal, Steps.Binarize)

Once the Auto Border Crop step finishes, it will do the following according to the Next Step Expression.

  1. Evaluate the expression.
    • Here, it will inspect the image properties to see if the image is binary (black and white) using the Image.IsBinary method.
  2. If the expression evaluates to true, move the image to the Speck Removal step.
    • So, if the image is already binary (in other words, if Image.IsBinary = True), then the image will be advanced to the Speck Removal step (skipping the Binarize step entirely).
  3. If the expression evaluates to false, move the image to the Binarize step.
    • If the image is not binary (color, greyscale, etc.), it will be advanced to the Binarize step first.
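The routing in this walkthrough can be simulated with a short Python sketch. It is a simplified model rather than Grooper code: the names STEP_ORDER, NEXT_STEP and run_profile are hypothetical, the image is reduced to a dictionary with a single is_binary flag, and None stands in for VB.NET's Nothing. The sketch also assumes that a step without a Next Step Expression simply advances to the following step in the profile.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical model of the three-step IP Profile from the example.
STEP_ORDER: List[str] = ["Auto Border Crop", "Binarize", "Speck Removal"]

# Next Step Expressions keyed by step name. Returning None models "Nothing".
NEXT_STEP: Dict[str, Callable[[dict], Optional[str]]] = {
    # Models: If(Image.IsBinary, Steps.Speck_Removal, Steps.Binarize)
    "Auto Border Crop": lambda image: "Speck Removal" if image["is_binary"] else "Binarize",
}


def run_profile(image: dict) -> List[str]:
    """Return the names of the steps that run for the given image."""
    executed: List[str] = []
    current: Optional[str] = STEP_ORDER[0]
    while current is not None:
        executed.append(current)
        if current == "Binarize":
            image["is_binary"] = True  # Binarize produces a black and white image
        if current in NEXT_STEP:
            # The step has a Next Step Expression: it decides where to go.
            current = NEXT_STEP[current](image)
        else:
            # Assumption: with no expression, fall through to the next step in order.
            position = STEP_ORDER.index(current) + 1
            current = STEP_ORDER[position] if position < len(STEP_ORDER) else None
    return executed


print(run_profile({"is_binary": True}))
print(run_profile({"is_binary": False}))
```

In this model, a binary image skips Binarize, while a color or greyscale image passes through all three steps.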

Mapping Expressions

Mapping Expressions are snippets of VB.NET code that calculate, format or generate field values as they are imported into Grooper or exported to a database.

Import Mapping Expressions

Import Mapping Expressions are snippets of VB.NET code that live on Import Behavior Definitions. These expressions can calculate, format, or generate field values as the document is imported into Grooper, using document metadata from the import source.

  • Ex: You can use Import Mapping Expressions to parse the username from an email's "Sender" property directly to a field before extraction ever takes place.
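To make the username example concrete, here is a minimal Python sketch of the parsing logic only. It is not an Import Mapping Expression and uses no Grooper API: the function name sender_username and the sample address are hypothetical, and it assumes the "Sender" value arrives as a standard email address string, with or without a display name.

```python
from email.utils import parseaddr


def sender_username(sender: str) -> str:
    """Return the part of the sender's address before the "@" (hypothetical helper)."""
    # parseaddr splits 'Display Name <user@host>' into (display name, address).
    _display_name, address = parseaddr(sender)
    # Keep only the local part of the address.
    return address.split("@", 1)[0]


print(sender_username("Jane Doe <jdoe@example.com>"))
```

An actual Import Mapping Expression would apply the same idea in VB.NET to whatever sender property the import source exposes.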

Return Type

Import Mapping Expressions must return a value of the same data type as the Data Field or Data Column they are populating.

Example Import Mapping Expression

!!EXAMPLE FORTHCOMING!!

Export Mapping Expressions

Export Mapping Expressions are snippets of VB.NET code that live on Export Behavior Definitions and can calculate, format, or generate field values at the time of the document’s export.

Previously in Grooper, placeholder fields had to be created in a Data Model, and Data Model Expressions calculated and/or generated their values at the time of data extraction. This created unnecessary overhead, made the Data Model more difficult to manage, and could increase the time the Extract step took to process.

Export Mapping Expressions give users a simple way to manipulate Grooper-extracted data upon export, before it lands in a database table column or a content management system (CMS) field or metadata property. No need to bloat your Data Model with unnecessary placeholder fields!

Export Mapping Expressions must evaluate to the same data type as the table column (in the case of database exports) or field or metadata property (in the case of exports to content management systems) they are being exported to.

Return Type

Export Mapping Expressions must return a value compatible with the data type of the export destination.

  • In the case of database exports (using a Data Export), the expression's data type must match the table column's data type.
    • Ex: If a SQL column has a data type of DATE, the expression must return a date value in an acceptable format.
  • In the case of content management system exports (using a CMIS Export), the expression's data type must match the field or metadata property's data type.
    • Ex: If a SharePoint Document Library column has a column type of "Number", the expression must return a decimal or integer value.
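The type-matching requirement can be illustrated with a small Python sketch. It models the idea only and is not an Export Mapping Expression: the helper names to_date and to_number are hypothetical, and the input formats (a US-style "MM/DD/YYYY" date string and a currency string with "$" and thousands separators) are assumptions chosen for illustration.

```python
from datetime import date, datetime
from decimal import Decimal


def to_date(text: str, fmt: str = "%m/%d/%Y") -> date:
    """Convert extracted text into a real date value (assumed MM/DD/YYYY input)."""
    return datetime.strptime(text, fmt).date()


def to_number(text: str) -> Decimal:
    """Convert extracted currency text into a decimal value."""
    # Assumption: only "$" and "," need to be stripped from the extracted text.
    return Decimal(text.replace("$", "").replace(",", ""))


print(to_date("03/15/2023").isoformat())
print(to_number("$1,234.50"))
```

The point is that the destination receives a date or a number rather than the raw extracted text, which is what a DATE column or a "Number" column expects.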

Example Export Mapping Expression

!!EXAMPLE FORTHCOMING!!

Glossary

Batch Process Step: Batch Process Steps are specific actions within a Batch Process sequence. Each Batch Process Step performs an "Activity" specific to some document processing task. These Activities are either "Code Activities" or "Review" activities. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Process Steps are frequently referred to as simply "steps".
  • Because a single Batch Process Step executes a single Activity configuration, they are often referred to by their referenced Activity as well. For example, a "Recognize step".

Batch Process: Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the set of step-by-step processing instructions given to a Batch. Each step consists of a "Code Activity" or a "Review" activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
  • A Batch Process is often referred to as simply a "process".

Batch: Batch nodes are fundamental in Grooper's architecture. They are containers of documents that are moved through workflow mechanisms called Batch Processes. Documents and their pages are represented in Batches by a hierarchy of Batch Folders and Batch Pages.

Behavior: A "Behavior" is one of several features applied to a Content Type (such as a Document Type). Behaviors affect how certain Activities and Commands are executed, based on how a document (Batch Folder) is classified. Documents behave differently according to their Document Type. This includes how they are exported (how Export behaves), if and how they are added to a document search index (how the various indexing commands behave), and if and how Label Sets are used (how Classify and Extract behave in the presence of Label Sets).

  • Each Behavior is enabled by adding it to a Content Type. They are configured in the Behaviors editor.
  • Behaviors extend to descendant Content Types, if the descendant Content Type has no Behavior configuration of its own.
    • For example, all Document Types will inherit their parent Content Model's Behaviors.
    • However, if a Document Type has its own Behavior configuration, it will be used instead.

Binarize: Binarize is an IP Command that converts a color or grayscale image to a bi-tonal (black and white) image using various thresholding methods.

CMIS Export: CMIS Export is an Export Definition available when configuring an Export Behavior. It exports content over a CMIS Connection, allowing users to export documents and their metadata to various on-premise and cloud-based storage platforms.

CMIS: CMIS (Content Management Interoperability Services) is an open standard allowing different content management systems to "interoperate", sharing files, folders and their metadata as well as programmatic control of the platform over the internet.

Code Expressions: Code Expressions (not to be confused with regular expressions) are snippets of VB.NET code that expand Grooper's core functionality.

Content Type: Content Types are a class of node types used to classify Batch Folders. They represent categories of documents (Content Models and Content Categories) or distinct types of documents (Document Types). Content Types serve an important role in defining Data Elements and Behaviors that apply to a document.

Data Column: Data Columns represent columns in a table extracted from a document. They are added as child nodes of a Data Table. They define the type of data each column holds along with its data extraction properties.

  • Data Columns are frequently referred to simply as "columns".
  • In the context of reviewing data in a Data Viewer, a single Data Column instance in a single Data Table row is most frequently called a "cell".

Data Element: Data Elements are a class of node types used to collect data from a document. These include: Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.

Data Export: Data Export is an Export Definition available when configuring an Export Behavior. It exports extracted document data over a Data Connection, allowing users to export data to a Microsoft SQL Server or ODBC compliant database.

Data Field: Data Fields represent a single value targeted for data extraction on a document. Data Fields are created as child nodes of a Data Model and/or Data Sections.

  • Data Fields are frequently referred to simply as "fields".

Data Model: Data Models are leveraged during the Extract activity to collect data from documents (Batch Folders). Data Models are the root of a Data Element hierarchy. The Data Model and its child Data Elements define a schema for data present on a document. The Data Model's configuration (and its child Data Elements' configuration) define data extraction logic and settings for how data is reviewed in a Data Viewer.

Data Section: A Data Section is a container for Data Elements in a Data Model. They can contain Data Fields, Data Tables, and even Data Sections as child nodes and add hierarchy to a Data Model. They serve two main purposes:

  1. They can simply act as organizational buckets for Data Elements in larger Data Models.
  2. By configuring its "Extract Method", a Data Section can subdivide larger and more complex documents into smaller parts to assist in extraction.
    • "Single Instance" sections define a division (or "record") that appears only once on a document.
    • "Multi-Instance" sections define a collection of repeating divisions (or "records").

Data Table: A Data Table is a Data Element specialized in extracting tabular data from documents (i.e. data formatted in rows and columns).

  • The Data Table itself defines the "Table Extract Method". This is configured to determine the logic used to locate and return the table's rows.
  • The table's columns are defined by adding Data Column nodes to the Data Table (as its children).

Data Type: Data Types are nodes used to extract text data from a document. Data Types have more capabilities than Value Readers. Data Types can collect results from multiple extractor sources, including a locally defined extractor, child extractor nodes, and referenced extractor nodes. Data Types can also collate results using Collation Providers to combine, sift and manipulate results further.

Document Type: Document Type nodes represent a distinct type of document, such as an invoice or a contract. Document Types are created as child nodes of a Content Model or a Content Category. They serve three primary purposes:

  1. They are used to classify documents. Documents are considered "classified" when the Batch Folder is assigned a Content Type (most typically, a Document Type).
  2. The Document Type's Data Model defines the Data Elements extracted by the Extract activity (including any Data Elements inherited from parent Content Types).
  3. The Document Type defines all "Behaviors" that apply (whether from the Document Type's Behavior settings or those inherited from a parent Content Type).

Execute: Execute is an Activity that runs one or more specified object commands. This gives access to a variety of Grooper commands in a Batch Process for which there is no Activity, such as the "Sort Children" command for Batch Folders or the "Expand Attachments" command for email attachments.

Export Behavior: An Export Behavior defines the parameters for exporting classified Batch Folder content from Grooper to other systems. This includes where they are exported to (what content management system, file system, database etc), what content is exported (attached files, images, and/or data), how it is formatted (PDF, CSV, XML etc), folder pathing, file naming and data mappings (for Data Export and CMIS Export).

Export: Export is an Activity that transfers documents and extracted information to external file systems and content management systems, completing the data processing workflow.

Expressions Cookbook: The "Expressions Cookbook" is a reference list for commonly used Code Expressions in Grooper.

Expressions: Expressions (not to be confused with regular expressions) are snippets of VB.NET code that expand Grooper's core functionality.

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

IP Group: IP Groups are containers of IP Steps and/or IP Groups that can be added to IP Profiles. IP Groups add hierarchy to IP Profiles. They serve two primary purposes:

  1. They can be used simply to organize IP Steps for IP Profiles with large numbers of steps.
  2. They are often used with "Should Execute Expressions" and "Next Step Expressions" to conditionally execute a sequence of IP Steps.

IP Profile: IP Profiles are step-by-step lists of image processing operations (IP Commands). They are used for several image processing related operations, but primarily for:

  1. Permanently enhancing an image during the Image Processing activity (usually to get rid of defects in a scanned image, such as skewing or borders).
  2. Cleaning up an image in-memory during the Recognize activity without altering the image to improve OCR accuracy.
  3. Computer vision operations that collect layout data (table line locations, OMR checkboxes, barcode values and more) utilized in data extraction.

IP Step: IP Steps are the basic units of an IP Profile. They define a single image processing operation, called an IP Command in Grooper.

Line Removal: Line Removal is an IP Command that locates and removes horizontal and vertical lines from documents. The detected line locations are stored as part of the page's layout data.

LINQ to Grooper Objects: LINQ is a Microsoft .NET component that provides data querying capabilities to the .NET framework. In Grooper, you can use the LINQ syntax in Code Expressions to "LINQ to Grooper Objects". This allows expressions to access information from collections of data, such as from multi-instance Data Sections or Data Tables.

Node Tree: The Node Tree is the hierarchical list of Grooper node objects found in the left panel in the Design Page. It is the basis for navigation and creation in the Design Page.

OCR: OCR stands for Optical Character Recognition. It allows text on paper documents to be digitized, in order to be searched or edited by other software applications. OCR converts typed or printed text from digital images of physical documents into machine readable, encoded text.

Recognize: Recognize is an Activity that obtains machine-readable text from Batch Pages and Batch Folders. When properly configured with an OCR Profile, Recognize will selectively perform OCR for images and native-text extraction for digital text in PDFs. Recognize can also reference an IP Profile to collect "layout data" like lines, checkboxes, and barcodes. Other Activities then use this machine-readable text and layout data for document analysis and data extraction.

Review: Review is an Activity that allows user-attended review of Grooper's results. This allows human operators to validate processed Batch Page and Batch Folder content using specialized user interfaces called "Viewers". Different kinds of Viewers assist users in reviewing Grooper's image processing, document classification, data extraction and operating document scanners.

SharePoint: SharePoint is a connection option for CMIS Connections. It connects Grooper to Microsoft SharePoint, providing access to content stored in "document libraries" and "picture libraries" for import and export operations.

Visual: "Visual" is a Classify Method that uses image analysis instead of text data to determine the Document Type assigned to a Batch Folder during classification. Instead of using text-based extractors, an "Extract Features" IP Command in an IP Profile is used to collect image-based data from a Batch Folder's image(s). This image-based data is compared against that of previously trained document examples of each Document Type to classify the Batch Folder.