Expressions Cookbook (Concept)

From Grooper Wiki

Revision as of 10:43, 31 July 2020

Expressions are snippets of .NET code, allowing Grooper to do various things outside its "normal" parameters. This includes calculating or validating extracted Data Field values in a Data Model, applying conditional execution of a Batch Process or IP Profile, and more! This article collects examples of common (and maybe not so common) uses of expressions in Grooper.

Data Model Expressions

Default Value Expressions

Current date

  • Now

Inspecting portions of link path for original file (path, filename, metadata)

  • These examples extract information (full path & filename, filename, path, extension) from a batch folder's content link
    • Link.FullPath
    • Link.Name
    • Link.Path
    • IO.Path.GetFileNameWithoutExtension(Link.Name)
    • IO.Path.GetExtension(Link.Name)
  • These examples extract specific path segments (drive letter, first folder name) from a batch folder's content link
    • Link.PathSegments(0)
    • Link.PathSegments(1)
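
The IO.Path calls above are standard .NET and can be tried outside Grooper. This is a minimal sketch in which linkName is a hypothetical stand-in for Link.Name (the sample file name is invented for illustration):

```vbnet
Imports System
Imports System.IO

Module LinkNameDemo
    Sub Main()
        ' Hypothetical stand-in for Link.Name; not read from a real content link.
        Dim linkName As String = "invoice_0042.pdf"

        ' File name without its extension.
        Console.WriteLine(Path.GetFileNameWithoutExtension(linkName)) ' invoice_0042

        ' Extension, including the leading dot.
        Console.WriteLine(Path.GetExtension(linkName))                ' .pdf
    End Sub
End Module
```

Note that GetExtension keeps the leading dot, so a comparison would be against ".pdf" rather than "pdf".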

Populating fields with specific values (i.e. strings, numbers, dates)

  • "Hello world!"
  • 123.45
  • DateAdd("d", 30, Now)
  • Now.ToString("yyyyMMddHHmmss")
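
When formatting a timestamp, the case of the hour specifier matters: in .NET custom format strings "HH" is the 24-hour clock and "hh" is the 12-hour clock. The sketch below uses a fixed sample time instead of Now so the output is repeatable (the sample value is an assumption for illustration):

```vbnet
Imports System
Imports System.Globalization

Module TimestampDemo
    Sub Main()
        ' Fixed sample time (2:05:09 PM); an expression would normally use Now.
        Dim sample As New DateTime(2020, 7, 31, 14, 5, 9)

        ' 24-hour clock: afternoon hours stay distinct from morning hours.
        Console.WriteLine(sample.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)) ' 20200731140509

        ' 12-hour clock: 2 PM and 2 AM produce the same digits.
        Console.WriteLine(sample.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture)) ' 20200731020509
    End Sub
End Module
```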

Calculate Expressions

Addition of multiple fields

  • IntegerField1 + IntegerField2
  • DecimalField1 + DecimalField2 + DecimalField3

Concatenation of multiple fields

  • String.Concat(StringField1, StringField2)
  • String.Concat(StringField1, StringField2, StringField3)
  • String.Concat(StringField1, StringField2, StringField3, StringField4)

Rounding

  • This example rounds a decimal value to a precision of 4 digits (e.g. 2.34567891 to 2.3457)
    • Math.Round(DecimalField1, 4)
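
One detail worth knowing: Math.Round rounds exact midpoints to the nearest even digit by default. The sketch below uses literal Decimal values in place of DecimalField1 (an assumption for illustration) and shows the optional MidpointRounding argument:

```vbnet
Imports System
Imports System.Globalization

Module RoundingDemo
    Sub Main()
        Dim inv As CultureInfo = CultureInfo.InvariantCulture

        ' The example from the text: not a midpoint, so it simply rounds up.
        Console.WriteLine(Math.Round(2.34567891D, 4).ToString(inv)) ' 2.3457

        ' Exact midpoint: the default rounds to the even digit (6).
        Console.WriteLine(Math.Round(2.34565D, 4).ToString(inv)) ' 2.3456

        ' Same midpoint, rounding away from zero instead.
        Console.WriteLine(Math.Round(2.34565D, 4, MidpointRounding.AwayFromZero).ToString(inv)) ' 2.3457
    End Sub
End Module
```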

Non-integer addition (e.g. of date values)

  • The first two examples add 30 days ("d") and 1 year ("yyyy") to a date; the third subtracts 3 months ("m")
    • DateAdd("d", 30, DateField1)
    • DateAdd("yyyy", 1, DateField1)
    • DateAdd("m", -3, DateField1)
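
The DateAdd calls above can be tried with a fixed date standing in for DateField1 (the sample date is an assumption for illustration). Month arithmetic falls back to the last valid day of the target month, which the third call shows:

```vbnet
Imports System
Imports System.Globalization
Imports Microsoft.VisualBasic

Module DateAddDemo
    Sub Main()
        Dim inv As CultureInfo = CultureInfo.InvariantCulture

        ' Hypothetical stand-in for DateField1.
        Dim sample As New DateTime(2020, 5, 31)

        ' Add 30 days.
        Console.WriteLine(DateAdd("d", 30, sample).ToString("yyyy-MM-dd", inv))   ' 2020-06-30

        ' Add 1 year.
        Console.WriteLine(DateAdd("yyyy", 1, sample).ToString("yyyy-MM-dd", inv)) ' 2021-05-31

        ' Subtract 3 months: February has no 31st, so the last day of February is returned.
        Console.WriteLine(DateAdd("m", -3, sample).ToString("yyyy-MM-dd", inv))   ' 2020-02-29
    End Sub
End Module
```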

Reformatting / Normalization of values

  • This example replaces any backslashes with underscores
    • StringField1.Replace("\", "_")
  • This example removes any backslashes
    • StringField1.Replace("\", "")

Substring calculation

  • These examples extract information contained within a string "ABC123456XXXX654321YYY" by designating the 0-based starting index and desired number of characters
    • ABC (first 3 characters): StringField1.Substring(0, 3)
    • 123456 (6 characters within the string): StringField1.Substring(3, 6)
    • XXXX (4 characters within the string): StringField1.Substring(9, 4)
    • YYY (last 3 characters): StringField1.Substring(StringField1.Length - 3)
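
The Substring calls above can be checked against the sample string directly. The sketch also shows one possible length guard, since Substring throws an ArgumentOutOfRangeException when the value is shorter than the requested range (the guard and the short sample value are illustrative additions):

```vbnet
Imports System

Module SubstringDemo
    Sub Main()
        Dim value As String = "ABC123456XXXX654321YYY"

        Console.WriteLine(value.Substring(0, 3))            ' ABC
        Console.WriteLine(value.Substring(3, 6))            ' 123456
        Console.WriteLine(value.Substring(9, 4))            ' XXXX
        Console.WriteLine(value.Substring(value.Length - 3)) ' YYY

        ' Guard against values shorter than the requested length.
        Dim shortValue As String = "AB"
        Console.WriteLine(If(shortValue.Length >= 3, shortValue.Substring(0, 3), shortValue)) ' AB
    End Sub
End Module
```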

Validate Expressions

Date in past / future

  • This example ensures the date value is a past date
    • DateField1 < Now
  • This example ensures the date value is at least 30 days in the future
    • DateField1 >= DateAdd("d", 30, Now)

Equality / inequality of two fields (multiple options)

  • StringField1 = StringField2
  • IntegerField1.Equals(IntegerField2)
  • IntegerField1 <> DecimalField1
  • Not DecimalField1.Equals(DecimalField2)

Summing fields and comparing to another field

  • IntegerField1 + IntegerField2 = IntegerField3
  • DecimalField1 + DecimalField2 = DecimalField3
  • DecimalField1 = SumFieldInstance("Table1\AmountColumn")

Running regular expression against field

  • Text.RegularExpressions.Regex.IsMatch(StringField1, "[0-9]{6}")
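
Regex.IsMatch returns True when the pattern matches anywhere in the value, so the pattern above also accepts longer strings that merely contain six digits. If the whole value should be exactly six digits, anchoring the pattern with ^ and $ is one option. A minimal sketch with literal strings in place of StringField1 (sample values are assumptions):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexDemo
    Sub Main()
        ' Unanchored: six consecutive digits anywhere in the value are enough.
        Console.WriteLine(Regex.IsMatch("ABC1234567", "[0-9]{6}"))   ' True

        ' Anchored: the entire value must be exactly six digits.
        Console.WriteLine(Regex.IsMatch("ABC1234567", "^[0-9]{6}$")) ' False
        Console.WriteLine(Regex.IsMatch("123456", "^[0-9]{6}$"))     ' True
    End Sub
End Module
```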

Inspecting field-level confidence scores

  • Instance.Confidence > 0.8

Batch Processing Expressions

Should Submit Expression

Inspecting flagged status

  • The first example submits the task when the object (e.g. folder, page) is flagged; the second submits it when the object is not flagged
    • Item.Flagged
    • Not Item.Flagged
  • This example would submit the task when the object (folder) contains one or more flagged pages
    • DirectCast(Item, BatchFolder).FlaggedPages.Any()

Inspecting flagged message

  • Item.FlagReason = "Needs classification"
  • Item.FlagReason <> "Bypass review"

Inspecting presence of local copy in Grooper

  • DirectCast(Item, BatchFolder).HasLocalCopy

Inspecting existence of native version

  • DirectCast(Item, BatchFolder).HasAttachment

Inspecting MIME type

  • The first example submits the task when the object (folder) represents a native PDF; the second submits it when the attachment's MIME type is PDF
    • DirectCast(Item, BatchFolder).IsNativePDF
    • DirectCast(Item, BatchFolder).AttachmentMimeType = "application/pdf"

Inspecting content type / parent content category

  • DirectCast(Item, BatchFolder).ContentTypeName = "MyContentType"
  • DirectCast(DirectCast(Item, BatchFolder).ContentType.ParentNode, ContentCategory).Name = "MyContentCategory"

Inspecting if a field is blank / populated

  • DirectCast(Item, BatchFolder).IndexData.Fields("StringField1").Value <> ""
  • Not String.IsNullOrEmpty(DirectCast(Item, BatchFolder).IndexData.Fields("StringField1").Value)
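
For String values, the two checks above behave the same way in VB.NET, because the string comparison operators treat Nothing as an empty string. Neither treats a whitespace-only value as blank; String.IsNullOrWhiteSpace is a standard .NET alternative for that case. The sketch uses local variables in place of the field value (an assumption; the type Grooper returns from .Value is not shown in this article):

```vbnet
Imports System

Module BlankCheckDemo
    Sub Main()
        Dim missing As String = Nothing
        Dim spaces As String = "   "

        ' Nothing counts as blank for both styles of check.
        Console.WriteLine(missing <> "")                     ' False
        Console.WriteLine(Not String.IsNullOrEmpty(missing)) ' False

        ' A whitespace-only value counts as populated for both...
        Console.WriteLine(spaces <> "")                      ' True
        Console.WriteLine(Not String.IsNullOrEmpty(spaces))  ' True

        ' ...unless IsNullOrWhiteSpace is used instead.
        Console.WriteLine(Not String.IsNullOrWhiteSpace(spaces)) ' False
    End Sub
End Module
```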

Inspecting image properties (resolution, color mode, aspect ratio, size (in bytes), pixel count, etc.)

  • DirectCast(Item, BatchPage).PrimaryImage.ResolutionX < 240
  • DirectCast(Item, BatchPage).PrimaryImage.IsBinary
  • DirectCast(Item, BatchPage).PrimaryImage.IsColor
  • DirectCast(Item, BatchPage).PrimaryImage.IsLandscape
  • DirectCast(Item, BatchPage).PrimaryImage.AspectRatio > 1.25
  • DirectCast(Item, BatchPage).PrimaryImage.Size > 40960
  • DirectCast(Item, BatchPage).PrimaryImage.PixelCount > 3500000

Inspecting presence of layout data (of a certain type: lines, OMR boxes, etc.)

  • DirectCast(Item, BatchFolder).HasLayoutData

Does page / document have OCR text?

  • DirectCast(Item, BatchFolder).HasRuntimeOCR
  • DirectCast(Item, BatchPage).HasRuntimeOCR

Inspecting classification candidates and classification scores, incl. alternate candidate scores

Next Step Expressions

Inspecting batch creator

  • If(Batch.CreatedBy.ToLower() = "domain\jusername", TrueStepName, FalseStepName)
  • If(Batch.CreatedByDisplayName = "Joe Username", TrueStepName, FalseStepName)

Inspecting creation time (range, day of week)

  • If(DatePart(DateInterval.Month, Batch.Created) = 6, TrueStepName, FalseStepName)
  • If(DatePart(DateInterval.Day, Batch.Created) > 15, TrueStepName, FalseStepName)
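
A day-of-week check can follow the same If(condition, stepA, stepB) pattern. In this sketch the created parameter stands in for Batch.Created (assumed here to be a DateTime, as the DatePart calls above suggest), and the step names are hypothetical:

```vbnet
Imports System

Module NextStepDemo
    ' Hypothetical step names; a real expression would return steps from the Batch Process.
    Const WeekendStep As String = "Weekend Review"
    Const WeekdayStep As String = "Standard Review"

    ' Picks a step name based on the day of week of the creation time.
    Function ChooseStep(created As DateTime) As String
        Dim isWeekend As Boolean = created.DayOfWeek = DayOfWeek.Saturday OrElse
                                   created.DayOfWeek = DayOfWeek.Sunday
        Return If(isWeekend, WeekendStep, WeekdayStep)
    End Function

    Sub Main()
        Console.WriteLine(ChooseStep(New DateTime(2020, 7, 31))) ' Friday: Standard Review
        Console.WriteLine(ChooseStep(New DateTime(2020, 8, 1)))  ' Saturday: Weekend Review
    End Sub
End Module
```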

IP Profile Expressions

IP Command Should Execute Expressions

Inspecting image properties (resolution, color mode, aspect ratio, size, pixel count, etc.)

  • Image.ResolutionX < 240
  • Image.IsBinary
  • Image.IsColor
  • Image.IsLandscape
  • Image.AspectRatio > 1.25
  • Image.Size > 40960
  • Image.PixelCount > 3500000

Inspecting presence of layout data (of a certain type: lines, OMR boxes, etc.)

  • Results.Line_Detection.HorizontalLines.Any()
  • Results.Line_Detection.VerticalLines.Any()
  • Results.Box_Detection.Boxes.Any()
  • Results.Patch_Code_Detection.PatchCodes.Any()

Making decisions based on image classification results

  • Results.Classify_Image.ClassName = "Sample 1"

Accessing and inspecting results log of prior IP commands

  • Results.Measure_Entropy.Entropy > 0.85

Inspecting whether prior commands modified image(s)

  • ResultList.IsImageSourceImage

Mapping Expressions

Import Mapping Expressions

Value concatenation

  • String.Concat(field1, field2)
  • String.Concat(field1, " ", field2)

Value padding (adding or removing)

  • These examples left-pad a value with zeroes to 20 characters, right-pad a value with spaces to 40 characters, and trim leading and trailing spaces from a padded value.
    • field1.PadLeft(20, "0"c)
    • field2.PadRight(40)
    • field3.Trim()
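
The padding calls above can be seen on short literal values, which stand in for field1, field2 and field3 (shorter widths are used here purely to keep the output readable). PadLeft and PadRight pad up to a total width and never truncate a value that is already longer:

```vbnet
Imports System

Module PaddingDemo
    Sub Main()
        ' Left-pad with zeroes to a total width of 5.
        Console.WriteLine("42".PadLeft(5, "0"c))      ' 00042

        ' Right-pad with spaces to a total width of 4 (brackets make the spaces visible).
        Console.WriteLine("[" & "AB".PadRight(4) & "]") ' [AB  ]

        ' Trim removes leading and trailing whitespace.
        Console.WriteLine("[" & "  AB  ".Trim() & "]")  ' [AB]

        ' A value longer than the target width is returned unchanged.
        Console.WriteLine("1234567".PadLeft(5, "0"c)) ' 1234567
    End Sub
End Module
```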

Adding environment variables (date, user, etc.)

  • Now
  • Environment.MachineName
  • Environment.UserName
  • Environment.UserDomainName
  • Environment.OSVersion
  • Environment.ProcessorCount

Export Mapping Expressions

Addition of multiple fields

  • IntegerField1 + IntegerField2
  • DecimalField1 + DecimalField2 + DecimalField3

Concatenation of multiple fields

  • String.Concat(StringField1, StringField2)
  • String.Concat(StringField2, ", ", StringField1, ": ", StringField3)

How to access Grooper attributes (content type name, GUID, index data, etc.)

  • CurrentDocument.ContentTypeName
  • CurrentDocument.Id
  • CurrentDocument.IndexData.Sections("Section1").Fields("Field1").Value
  • CurrentDocument.IndexData.Sections("Section1").Sections("SectionA").Fields("Field1A").Value
  • CurrentDocument.IndexData.Tables("Table1").Rows.First().Cells("Column1").Value

Naming based on original file name

  • IO.Path.GetFileNameWithoutExtension(CurrentDocument.ContentLink.Name)

General

Understanding how to traverse hierarchy of, e.g. batch or content model

Understanding how to parse tables by row & column

Identifying Sections by instance number

How to inspect properties of node

Dynamic referencing vs. GUID referencing

Conditional expressions with IIF / IF

Using LINQ in Expressions

  • See the LINQ to Grooper Objects article

Direct Casting: when to (Cast)