Expressions Cookbook (Concept)

From Grooper Wiki

This article is a running list of code expressions used in Grooper, such as Default Value Expressions, Calculated Value Expressions, Should Submit Expressions and more.

Expressions are snippets of .NET code (the examples here use Visual Basic syntax) that let Grooper do things outside its "normal" parameters. This includes calculating or validating extracted Data Field values in a Data Model, conditionally executing steps of a Batch Process or IP Profile, and more. This article collects examples of common (and maybe not so common) uses of expressions in Grooper.

Special thanks to BIS team member Brian Godwin (and others on the Professional Services team) for contributing this article!

Data Model Expressions

Default Value Expressions

Purpose Expression
Default to a literal string value. Value must be enclosed in quotes.
"Literal value"
Default to a literal numeric value.
25.00
Global/System Variables
Default to the current date and time
Now
Default to the current date and time, formatted.
Now.ToString("yyyy-MM-dd")
Default to date and time 30 days from now.
DateAdd("d", 30, Now)
Default to date and time 30 days from now, formatted.
DateAdd("d", 30, Now).ToString("yyyy-MM-dd")
Default to the name of the current user.
My.User.Name
Generate a unique identifier
Guid.NewGuid
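
The date defaults above can be tried outside Grooper. The following is a minimal console sketch, not Grooper code: DueDate is a hypothetical helper name, a fixed date stands in for Now so the result is repeatable, and CultureInfo.InvariantCulture is passed (the cookbook expressions omit it) so the output does not depend on the machine's regional settings.

```vbnet
Imports System
Imports System.Globalization
Imports Microsoft.VisualBasic

Module DefaultDateSketch
    ' Hypothetical helper: adds 'days' to 'start' and formats the result as year-month-day.
    Function DueDate(start As Date, days As Integer) As String
        Return DateAdd("d", days, start).ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' A fixed date stands in for Now.
        Console.WriteLine(DueDate(New Date(2024, 1, 15), 30))   ' 2024-02-14
        ' A negative number subtracts days; 2024 is a leap year.
        Console.WriteLine(DueDate(New Date(2024, 3, 1), -1))    ' 2024-02-29
    End Sub
End Module
```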
Document metadata - links
Returns the name of the link (e.g. "Import" or "Export") from a Batch Folder's content link.
Link.LinkName
Returns the attached file's path, including the file's name from a Batch Folder's content link.
Link.FullPath
Returns the attached file's path, not including the file's name from a Batch Folder's content link.
Link.Path

Returns the first segment in a file path from a Batch Folder's content link.

For example, "servername" in "servername\folder\subfolder\file.pdf"
Link.PathSegments(0)
Returns the linked file's full filename, extension included (the "linked object") from a Batch Folder's content link.
Link.ObjectName
Returns the linked file's filename without the extension from a Batch Folder's content link.
IO.Path.GetFileNameWithoutExtension(Link.ObjectName)

Return metadata associated with a known Content Link type (FileSystemLink, MailLink, SftpLink, Cmis.CmisLink, etc).

Replace "FileSystemLink" with the Content Link type
DirectCast(Link,FileSystemLink).Filename
DirectCast(Link,FileSystemLink).CreatedBy
DirectCast(Link,FileSystemLink).CreatedTime
DirectCast(Link,FileSystemLink).LastModifiedTime
Document metadata - Batch Folder attachment
Returns the attachment's filename, with extension.
Folder.AttachmentFileName
Returns the attachment's filename, without extension.
IO.Path.GetFileNameWithoutExtension(Folder.AttachmentFileName)
Returns the attachment's MIME type
Folder.AttachmentMimeType
Returns the attachment's file extension
Folder.AttachmentFileExtension
Document metadata - MIME type specific data

Return metadata associated with email messages

DirectCast(Handler,MailMimeTypeHandler).Subject
DirectCast(Handler,MailMimeTypeHandler).To
DirectCast(Handler,MailMimeTypeHandler).From
DirectCast(Handler,MailMimeTypeHandler).Date

Return metadata associated with PDF files

DirectCast(Handler, PdfMimeTypeHandler).Author
DirectCast(Handler, PdfMimeTypeHandler).Creator
DirectCast(Handler, PdfMimeTypeHandler).CreationDate
DirectCast(Handler, PdfMimeTypeHandler).Title
DirectCast(Handler, PdfMimeTypeHandler).Subject
Document metadata - more Batch Folder info
Default to the Content Type (Document Type) assigned to the document (Batch Folder)
ContentTypeName

Default to the parent Content Type of the Content Type (Document Type) assigned to the document (Batch Folder).

This is helpful for users trying to populate a value for a Document Type's parent Content Category.
Folder.ContentType.ParentNode.DisplayName
Default to the current document's Batch Folder ID (GUID).
Folder.Id
Default to the current document's Batch ID (GUID).
Folder.Batch.Id

Calculated Value Expressions

Math related calculations
Addition of multiple fields
IntegerField1 + IntegerField2
DecimalField1 + DecimalField2 + DecimalField3
Rounding
Math.Round(DecimalField1, 4)
Math.Round(DecimalField1 * DecimalField2, 2)
Non-integer addition (e.g. of date values)
DateAdd("d", 30, DateField1)
DateAdd("yyyy", 1, DateField1)
DateAdd("m", -3, DateField1)
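
One detail worth knowing about the rounding expressions above: with no rounding mode given, .NET's Math.Round rounds midpoint values to the nearest even digit rather than always rounding up. The sketch below is a standalone console example, not Grooper code, and the values are made up for illustration. It prints comparisons instead of raw numbers so the output does not depend on regional number formatting.

```vbnet
Imports System

Module RoundingSketch
    Sub Main()
        Dim value As Decimal = 0.125D

        ' Default mode: a midpoint rounds to the nearest even digit.
        Console.WriteLine(Math.Round(value, 2) = 0.12D)    ' True
        Console.WriteLine(Math.Round(0.135D, 2) = 0.14D)   ' True

        ' Passing MidpointRounding.AwayFromZero rounds the midpoint up instead.
        Console.WriteLine(Math.Round(value, 2, MidpointRounding.AwayFromZero) = 0.13D)   ' True
    End Sub
End Module
```

If a calculated total is expected to match a figure rounded "half up" on the document, the three-argument form may be the safer choice.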
String concatenation and manipulation
Concatenation of multiple fields
String.Concat(StringField1, StringField2)
String.Concat(StringField1, StringField2, StringField3)
String.Concat(StringField1, StringField2, StringField3, StringField4)
Reformatting / Normalization of values
StringField1.Replace("\", "_")
StringField1.Replace("\", "")
Substring calculation

Given the string ABC123456XXXX654321YYY:

StringField1.Substring(0, 3) returns ABC
StringField1.Substring(3, 6) returns 123456
StringField1.Substring(9, 4) returns XXXX
StringField1.Substring(StringField1.Length - 3) returns YYY
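
The Substring calls above can be checked with a small console sketch (not Grooper code). Substring throws an ArgumentOutOfRangeException when the requested range runs past the end of the string, so the sketch also includes LastChars, a hypothetical guard helper that is not part of Grooper or .NET, for values that may be shorter than expected.

```vbnet
Imports System

Module SubstringSketch
    ' Hypothetical helper: returns the last 'count' characters,
    ' or the whole value when it is shorter than 'count'.
    Function LastChars(value As String, count As Integer) As String
        If value Is Nothing Then Return ""
        If value.Length <= count Then Return value
        Return value.Substring(value.Length - count)
    End Function

    Sub Main()
        Dim s As String = "ABC123456XXXX654321YYY"
        Console.WriteLine(s.Substring(0, 3))   ' ABC    (start index 0, length 3)
        Console.WriteLine(s.Substring(3, 6))   ' 123456 (start index 3, length 6)
        Console.WriteLine(s.Substring(9, 4))   ' XXXX
        Console.WriteLine(LastChars(s, 3))     ' YYY
        Console.WriteLine(LastChars("AB", 3))  ' AB (no exception on a short value)
    End Sub
End Module
```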
CMIS Content Links

Get properties of a CMIS Content Link.

  • Use this to return property values for a document linked in a CMIS Repository.
  • Replace propertyName with the property's name. Example: The Subject property for an email in an Exchange CMIS Repository.
CurrentDocument.ContentLink.GetCustomValue("propertyName").ToString
Misc expressions

Getting the location coordinates of a field on the document

  • This could be used to determine the coordinates and size of an extracted value on a document.
  • Note: This returns a logical rectangle's location in inches.
GetFieldInstance("Field Name").Location.ToString
Examples that could be used in Validate mode
Verify a "Total" field adds up to the sum of the "Subtotal" and "Sales Tax" fields.
Subtotal + Sales_Tax
Verify a "Total" field adds up to the sum of all "Line Total" cell values in a "Line Items" Data Table.
Line_Items.SumOf("Line Total")
Verify a "Total Hours" field adds up to all the "Earned Hours" values in a multi-instance "Semester" Data Section.
Semester.SumOf("Earned Hours")

Is Valid Expressions

Purpose Expression
Date in past / future
DateField1 < Now
DateField1 >= DateAdd("d", 30, Now)
Equality / inequality of two fields
StringField1 = StringField2
IntegerField1.Equals(IntegerField2)
IntegerField1 <> DecimalField1
Not DecimalField1.Equals(DecimalField2)
Summing fields and comparing to another field
IntegerField1 + IntegerField2 = IntegerField3
DecimalField1 + DecimalField2 = DecimalField3
DecimalField1 = SumFieldInstance("Table1\AmountColumn")
Running regular expression against field
Regex.IsMatch(StringField1, "[0-9]{6}")
Inspecting field-level confidence scores
Instance.Confidence > 0.8
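
A note on the regular expression example above: Regex.IsMatch returns True when the pattern matches anywhere in the value, so "[0-9]{6}" also accepts longer values that merely contain six digits in a row. If the intent is "exactly six digits", anchoring the pattern with ^ and $ is one option. The console sketch below (not Grooper code, with made-up sample values and a hypothetical helper name) shows the difference.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    ' Hypothetical helper: True only when the whole value is six digits.
    Function IsExactlySixDigits(value As String) As Boolean
        Return Regex.IsMatch(value, "^[0-9]{6}$")
    End Function

    Sub Main()
        ' Unanchored: matches because six digits appear somewhere in the value.
        Console.WriteLine(Regex.IsMatch("AB1234567", "[0-9]{6}"))   ' True

        ' Anchored: the entire value must be six digits.
        Console.WriteLine(IsExactlySixDigits("AB1234567"))          ' False
        Console.WriteLine(IsExactlySixDigits("123456"))             ' True
    End Sub
End Module
```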


Batch Processing Expressions

Should Submit Expression

Inspecting flagged status

  • The first example submits the task when the object (i.e. a folder or page) is flagged; the second submits it when the object is not flagged.
    • Item.Flagged
    • Not Item.Flagged
    • BE AWARE: You cannot use these expressions, or any others that inspect flag status, if the Batch Process Step's "Error Disposition" (found in the Activity's properties) has the "Flag" disposition enabled. Doing so clears the flag set by the previous step before the Should Submit Expression evaluates, rendering the expression useless.
  • This example would submit the task when the object (folder) contains one or more flagged pages
    • DirectCast(Item, BatchFolder).FlaggedPages.Any()


Inspecting flagged message

  • Item.FlagReason = "Needs classification"
  • Item.FlagReason <> "Bypass review"


Inspecting presence of local copy in Grooper

  • DirectCast(Item, BatchFolder).HasLocalCopy


Inspecting existence of native version

  • DirectCast(Item, BatchFolder).HasAttachment


Inspecting MIME type

  • The first example submits the task when the object (folder) is a native PDF; the second submits it when the folder's MIME type is PDF.
    • DirectCast(Item, BatchFolder).IsNativePDF
    • DirectCast(Item, BatchFolder).AttachmentMimeType = "application/pdf"


Inspecting content type / parent content category

  • DirectCast(Item, BatchFolder).ContentTypeName = "MyContentType"
  • DirectCast(DirectCast(Item, BatchFolder).ContentType.ParentNode, ContentCategory).Name = "MyContentCategory"


Inspecting if a field is blank / populated

  • DirectCast(Item, BatchFolder).IndexData.Fields("StringField1").Value <> ""
  • Not String.IsNullOrEmpty(DirectCast(Item, BatchFolder).IndexData.Fields("StringField1").Value)


Inspecting image properties (resolution, color mode, aspect ratio, size (in bytes), pixel count, etc.)

  • DirectCast(Item, BatchPage).PrimaryImage.ResolutionX < 240
  • DirectCast(Item, BatchPage).PrimaryImage.IsBinary
  • DirectCast(Item, BatchPage).PrimaryImage.IsColor
  • DirectCast(Item, BatchPage).PrimaryImage.IsLandscape
  • DirectCast(Item, BatchPage).PrimaryImage.AspectRatio > 1.25
  • DirectCast(Item, BatchPage).PrimaryImage.Size > 40960
  • DirectCast(Item, BatchPage).PrimaryImage.PixelCount > 3500000


Inspecting presence of layout data (of a certain type: lines, OMR boxes, etc.)

  • DirectCast(Item, BatchFolder).HasLayoutData


Does page / document have OCR text?

  • DirectCast(Item, BatchFolder).HasRuntimeOCR
  • DirectCast(Item, BatchPage).HasRuntimeOCR


Inspecting classification candidates and classification scores, incl. alternate candidate scores

  • DirectCast(Item, BatchFolder).ContentTypeName = "Document Type Name"

Functions and Should Submits

Grooper can now use lambda functions in expressions (in all expressions, not just Should Submit Expressions). This opens up some really advanced capabilities if you have more advanced .NET programming skills.

This example determines whether a page-scoped task, like Recognize or Execute > Rasterize, should be submitted depending on how many text segments are present on a PDF page. If the PDF page has fewer than 15 text segments, the task is submitted; otherwise the PDF page is not processed.

  • This is useful when dealing with poorly formed PDFs that must be forced to be treated like an image when Grooper otherwise thinks they are a native text document.
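Function() As Boolean
  ' Only PDF pages are inspected; any other page falls through to Return False.
  If DirectCast(Item, BatchPage).IsPDF Then
    Dim doc As Grooper.PDF.PdfDoc = New Grooper.PDF.PdfDoc(DirectCast(Item, BatchPage).GetImageVersion, True)
    Dim info As Grooper.PDF.PdfPageInfo = doc.Sharp.GetPageInfo(0)
    ' Submit the task when the page has fewer than 15 text segments (DrawTextOps).
    Return (info.DrawTextOps.Count < 15)
  End If
  Return False
End Function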

You could change what property values determine if the task is submitted by changing the Return statement in the function. Here are some examples:

  • Return info.PageType = PDF.PdfPageInfo.PageTypes.Mixed - Tasks would submit if the PDF's page type is "Mixed"
  • Return info.RenderResolution = "Color @ 300 DPI" - Tasks would submit if the PDF's render format is Color @ 300 DPI.
  • Return info.PageSize = "8.50"" x 11.00""" - Tasks would submit if the PDF's page size is 8.5 x 11.
  • Return info.Images.Count = 4 - Tasks would submit if PDF has exactly 4 images embedded in it.
  • Return info.PathSegments.Count > 257 - Tasks would submit if the PDF has more than 257 vector drawing paths.

Next Step Expressions

Inspecting batch creator

  • If(Batch.CreatedBy.ToLower() = "domain\jusername", TrueStepName, FalseStepName)
  • If(Batch.CreatedByDisplayName = "Joe Username", TrueStepName, FalseStepName)


Inspecting creation time (range, day of week)

  • If(DatePart(DateInterval.Month, Batch.Created) = 6, TrueStepName, FalseStepName)
  • If(DatePart(DateInterval.Day, Batch.Created) > 15, TrueStepName, FalseStepName)
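
The heading above mentions ranges and day of week, while the two examples cover month and day of month. As a sketch of the remaining cases, the console example below (not Grooper code) uses a fixed date as a stand-in for Batch.Created. It assumes Batch.Created is a standard .NET Date, which the DatePart calls above suggest but which should be confirmed in your Grooper version. IsWeekend and IsBusinessHours are hypothetical helper names, and the 8:00 to 17:00 window is an arbitrary example.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module CreationTimeSketch
    ' Hypothetical helper: True when the date falls on Saturday or Sunday.
    Function IsWeekend(created As Date) As Boolean
        Return created.DayOfWeek = DayOfWeek.Saturday OrElse created.DayOfWeek = DayOfWeek.Sunday
    End Function

    ' Hypothetical helper: True when the time of day is from 8:00 up to (not including) 17:00.
    Function IsBusinessHours(created As Date) As Boolean
        Return created.Hour >= 8 AndAlso created.Hour < 17
    End Function

    Sub Main()
        Dim created As New Date(2024, 6, 17, 9, 30, 0)   ' a Monday morning
        Console.WriteLine(DatePart(DateInterval.Month, created))   ' 6
        Console.WriteLine(DatePart(DateInterval.Day, created))     ' 17
        Console.WriteLine(IsWeekend(created))                      ' False
        Console.WriteLine(IsBusinessHours(created))                ' True
        Console.WriteLine(IsWeekend(New Date(2024, 6, 15)))        ' True (a Saturday)
    End Sub
End Module
```

Under the same assumption, a day-of-week check in a Next Step Expression might look like If(Batch.Created.DayOfWeek = DayOfWeek.Saturday, TrueStepName, FalseStepName).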

IP Profile Expressions

IP Command Should Execute Expressions

Inspecting image properties (resolution, color mode, aspect ratio, size, pixel count, etc.)

  • Image.ResolutionX < 240
  • Image.IsBinary
  • Image.IsColor
  • Image.IsLandscape
  • Image.AspectRatio > 1.25
  • Image.Size > 40960
  • Image.PixelCount > 3500000


Inspecting presence of layout data (of a certain type: lines, OMR boxes, etc.)

  • Results.Line_Detection.HorizontalLines.Any()
  • Results.Line_Detection.VerticalLines.Any()
  • Results.Box_Detection.Boxes.Any()
  • Results.Patch_Code_Detection.PatchCodes.Any()


Making decisions based on image classification (properties under Results.Classify_Image)

  • Results.Classify_Image.ClassName = "Sample 1"


Accessing and inspecting results log of prior IP commands

  • Results.Measure_Entropy.Entropy > 0.85


Inspecting whether prior commands modified image(s)

  • ResultList.IsImageSourceImage

Mapping Expressions

Import Mapping Expressions

Value concatenation

  • String.Concat(field1, field2)
  • String.Concat(field1, " ", field2)


Value padding (adding or removing)

  • These examples show how to left-pad a value with zeroes to a total width of 20 characters, right-pad a value with spaces to a total width of 40 characters, and trim leading and trailing spaces from a padded value.
    • field1.PadLeft(20, "0"c)
    • field2.PadRight(40)
    • field3.Trim()
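
To see what the padding calls above produce, here is a small console sketch (not Grooper code) with made-up values. Shorter widths are used so the results are easy to read, and brackets mark where spaces begin and end. Note that PadLeft and PadRight never shorten a value that is already longer than the requested width. The TrimStart call is an extra, showing one way to strip leading zeroes again.

```vbnet
Imports System

Module PaddingSketch
    Sub Main()
        ' Left-pad with zeroes to a total width of 6.
        Console.WriteLine("42".PadLeft(6, "0"c))            ' 000042

        ' Right-pad with spaces to a total width of 5.
        Console.WriteLine("[" & "AB".PadRight(5) & "]")     ' [AB   ]

        ' Trim removes leading and trailing whitespace.
        Console.WriteLine("[" & "  AB  ".Trim() & "]")      ' [AB]

        ' A value longer than the width is returned unchanged.
        Console.WriteLine("1234567".PadLeft(6, "0"c))       ' 1234567

        ' TrimStart with a character removes leading zeroes.
        Console.WriteLine("000042".TrimStart("0"c))         ' 42
    End Sub
End Module
```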


Adding environment variables (date, user, etc.)

  • Now
  • Environment.MachineName
  • Environment.UserName
  • Environment.UserDomainName
  • Environment.OSVersion
  • Environment.ProcessorCount

Export Mapping Expressions

Addition of multiple fields

  • IntegerField1 + IntegerField2
  • DecimalField1 + DecimalField2 + DecimalField3


Concatenation of multiple fields

  • String.Concat(StringField1, StringField2)
  • String.Concat(StringField2, ", ", StringField1, ": ", StringField3)


How to access Grooper attributes (content type name, GUID, index data, etc.)

  • CurrentDocument.ContentTypeName
  • CurrentDocument.Id
  • CurrentDocument.IndexData.Sections("Section1").Fields("Field1").Value
  • CurrentDocument.IndexData.Sections("Section1").Sections("SectionA").Fields("Field1A").Value
  • CurrentDocument.IndexData.Tables("Table1").Rows.First().Cells("Column1").Value


Naming based on original file name

  • IO.Path.GetFileNameWithoutExtension(CurrentDocument.ContentLink.Name)


Converting a date field to a string in a "year-month-day" format

  • DateField.ToString("yyyy-MM-dd")

Misc Expression Snippets

These expressions may or may not be useful by themselves. They are most likely to be used as part of a larger expression. They are documented here to keep track of previously requested solutions.

Count the number of children at a certain level. This would count the number of Batch Folders that are direct children of a Batch Folder being processed.

  • ChildrenAtLevel(1).Count

Count the number of children at a certain level of a parent folder. This would count the number of Batch Folders that are direct children of the parent Batch Folder relative to the Batch Folder being processed.

  • ParentFolder.ChildrenAtLevel(1).Count

General

WIP

This section is a work-in-progress. It needs to be expanded for completeness.

Understanding how to traverse hierarchy of, e.g. batch or content model

Understanding how to parse tables by row & column

Identifying Sections by instance number

How to inspect properties of node

Dynamic referencing vs. GUID referencing

Conditional expressions with IIF / IF

Using LINQ in Expressions

Direct Casting: when to (Cast)