2023.1:Apply Rules (Activity)

From Grooper Wiki
Revision as of 08:34, 18 July 2024 by Rpatton (talk | contribs) (// via Wikitext Extension for VSCode)

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


Apply Rules is an Activity that runs Data Rules on data that has previously been extracted from documents (Batch Folders).

  • The Apply Rules activity must always run after an Extract activity. That is, an Extract step must come before an Apply Rules step in the order of Batch Process Steps in a Batch Process.

You may download the ZIP files below and upload them into your own Grooper environment (version 2023). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

Glossary

Activity: Grooper Activities define specific document processing operations performed on a Batch, Batch Folder, or Batch Page. In a Batch Process, each Batch Process Step executes a single Activity (determined by the step's "Activity" property).

  • Batch Process Steps are frequently referred to by the name of their configured Activity followed by the word "step". For example: "Classify step".

Apply Rules: Apply Rules is an Activity that runs Data Rules on data that has previously been extracted from documents (Batch Folders).

  • The Apply Rules activity must always run after an Extract activity. That is, an Extract step must come before an Apply Rules step in the order of Batch Process Steps in a Batch Process.

Batch: Batch nodes are fundamental in Grooper's architecture. They are containers of documents that are moved through workflow mechanisms called Batch Processes. Documents and their pages are represented in Batches by a hierarchy of Batch Folders and Batch Pages.

Batch Process Step: Batch Process Steps are specific actions within a Batch Process sequence. Each Batch Process Step performs an "Activity" specific to some document processing task. Each Activity is either a "Code Activity" or a "Review" activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Process Steps are frequently referred to as simply "steps".
  • Because a single Batch Process Step executes a single Activity configuration, they are often referred to by their referenced Activity as well. For example, a "Recognize step".

Batch Process: Batch Process nodes are crucial components in Grooper's architecture. A Batch Process is the set of step-by-step processing instructions given to a Batch. Each step consists of a "Code Activity" or a "Review" activity. Code Activities are automated by Activity Processing services. Review activities are executed by human operators in the Grooper user interface.

  • Batch Processes by themselves do nothing. Instead, they execute Batch Process Steps, which are added as child nodes.
  • A Batch Process is often referred to as simply a "process".

Content Model: Content Model nodes define a classification taxonomy for document sets in Grooper. This taxonomy is defined by the Content Categories and Document Types they contain. Content Models serve as the root of a Content Type hierarchy, which defines Data Element inheritance and Behavior inheritance. Content Models are crucial for organizing documents for data extraction and more.

Data Rule: Data Rules are used to normalize or otherwise prepare data collected in a Data Model for downstream processes. Data Rules define data manipulation logic for data extracted from documents (Batch Folders) to ensure data conforms to expected formats or meets certain standards.

  • Each Data Rule executes a "Data Action", which does things like compute a field's value, parse a field into other fields, perform lookups, and more.
  • Data Actions can be conditionally executed based on a Data Rule's "Trigger" expression.
  • A hierarchy of Data Rules can be created to execute multiple Data Actions and perform complex data transformation tasks.
  • Data Rules can be applied by:
    • The Apply Rules activity (must be done after data is collected by the Extract activity)
    • The Extract activity (will run after the Data Model extraction)
    • The Convert Data activity, when converting a document to another Document Type
    • A user manually, in a Data Viewer with the "Run Rule" command
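
To make the Trigger, Data Action, and rule hierarchy ideas concrete, here is a minimal Python sketch. It is only a conceptual model: the `DataRule` class, its attributes, and the field names are hypothetical and are not Grooper's object model or expression syntax (Grooper rules are configured with .NET expressions). The sketch also assumes that a rule whose trigger is false skips its child rules, which may not match Grooper's actual behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]  # stand-in for the fields collected in a Data Model


@dataclass
class DataRule:
    """Hypothetical model of a Data Rule; not Grooper's actual API."""
    name: str
    action: Callable[[Record], None]                    # plays the role of the "Data Action"
    trigger: Callable[[Record], bool] = lambda r: True  # plays the role of the "Trigger"
    children: List["DataRule"] = field(default_factory=list)

    def run(self, record: Record) -> None:
        # Assumption: a false trigger skips this rule's action and its children.
        if not self.trigger(record):
            return
        self.action(record)
        for child in self.children:
            child.run(record)


def split_full_name(r: Record) -> None:
    # Parse one field into two other fields.
    first, _, last = r["Full Name"].partition(" ")
    r["First Name"], r["Last Name"] = first, last


normalize = DataRule(
    name="Normalize",
    action=lambda r: r.update({"State": r["State"].strip().upper()}),
    children=[
        DataRule("Split name", split_full_name,
                 trigger=lambda r: " " in r["Full Name"]),
    ],
)

record = {"State": " ok ", "Full Name": "Ada Lovelace"}
normalize.run(record)
```

After `normalize.run(record)`, the parent rule has normalized `State` and the child rule, whose trigger was satisfied, has populated `First Name` and `Last Name`.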

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

Scope: The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.

About

Once you have extracted data from a Batch, you might want to manipulate that data. Data Rules allow you to do so automatically through the use of .NET, LINQ, and/or lambda expressions. For information on how to set these up, please visit our Data Rules wiki article.

However, just having Data Rule objects set up in your Content Model will not actually run those rules on your extracted data. To automatically apply these Data Rules, you will need to add an Apply Rules Batch Process Step to your Batch Process after your Extract Batch Process Step.

How To

Creating and configuring the Apply Rules Activity is fairly simple. It only requires that you add an Apply Rules Step to your Batch Process and configure it.

Data Rules can only be applied on data that has already been extracted, so the Apply Rules Step must ALWAYS come after an Extract Step within a Batch Process.
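
The ordering requirement can be expressed as a simple check. The following Python sketch is purely illustrative: it treats a Batch Process as a list of Activity names, and the function name `apply_rules_is_ordered` is hypothetical, not something Grooper provides.

```python
from typing import List


def apply_rules_is_ordered(step_activities: List[str]) -> bool:
    """Return True if every "Apply Rules" step has an "Extract" step before it."""
    extract_seen = False
    for activity in step_activities:
        if activity == "Extract":
            extract_seen = True
        elif activity == "Apply Rules" and not extract_seen:
            # Apply Rules would run on data that has not been extracted yet.
            return False
    return True


good_process = ["Recognize", "Classify", "Extract", "Apply Rules"]
bad_process = ["Recognize", "Classify", "Apply Rules", "Extract"]
```

Here `apply_rules_is_ordered(good_process)` is true, while `apply_rules_is_ordered(bad_process)` is false because the Apply Rules step precedes the Extract step.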

Adding the Apply Rules Step

The first thing we need to do is actually add an Apply Rules Step to our Batch Process and assign the Scope to the Step.

  1. Right-click on the Batch Process.
  2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Apply Rules..."
  3. When the "Add Activity" window pops up, you can change the name in the Step Name if you like, but in this tutorial we're going to keep the default of "Apply Rules".
  4. Click "EXECUTE" located in the top right corner of the pop up window.


  1. Now you should have an Apply Rules Batch Process Step in your Batch Process.
  2. You will need to set your Scope to the folder level at which your documents reside. For our example, our Scope is set to Folder and the Folder Level is 1. You will never need to set this property to a Page Folder Level.
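
As a rough illustration of what a Folder Scope with a Folder Level of 1 selects, the Python sketch below models a Batch as a small tree. The `Node` class and `folders_at_level` function are hypothetical, and the sketch assumes that level 0 is the Batch root and level 1 is the folders directly beneath it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a Batch, Batch Folder, or Batch Page."""
    name: str
    kind: str  # "batch", "folder", or "page"
    children: List["Node"] = field(default_factory=list)


def folders_at_level(root: Node, level: int) -> List[Node]:
    """Collect the folder nodes exactly `level` steps below `root`."""
    if level == 0:
        return [root]
    found: List[Node] = []
    for child in root.children:
        if child.kind == "folder":  # pages are never returned or descended into
            found.extend(folders_at_level(child, level - 1))
    return found


batch = Node("Batch", "batch", [
    Node("Doc 1", "folder", [Node("Page 1", "page"), Node("Page 2", "page")]),
    Node("Doc 2", "folder", [Node("Page 1", "page")]),
])

names = [n.name for n in folders_at_level(batch, 1)]
```

In this model, `names` holds the two document folders, "Doc 1" and "Doc 2", which is where the extracted data that the rules operate on would live.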


Configuring the Apply Rules Activity

Now that we have added our Apply Rules Step and set the Scope, we need to tell Grooper what to actually do in this step. We will be configuring the Batch Process Step in the right-most property grid. We need to tell Grooper which Data Rules to apply and whether we want issues to raise a flag on the document.

  1. Click on the ellipsis icon to the right of the Rules property.


  1. When the "Rules" window pops up, in the right panel of the pop up, navigate to and click the check boxes next to the Data Rule objects you want to apply to the extracted data.
  2. The selected Data Rules will show up in a list on the left side of the window. They will be displayed in the order they will be applied.
  3. Click the up and down icons located above the Rules list to change the order of the selected rule in the list.
  4. When you are finished adding and reordering your Data Rules, click "OK" located in the top right of the pop up window.
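
Because the selected rules are applied in list order, reordering them can change the result. The Python sketch below illustrates this with two hypothetical rules written as plain functions; the field name `Code` and the helper `apply_in_order` are assumptions made for the example, not part of Grooper.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], None]


def strip_whitespace(r: Record) -> None:
    r["Code"] = r["Code"].strip()


def add_prefix(r: Record) -> None:
    r["Code"] = "INV-" + r["Code"]


def apply_in_order(rules: List[Rule], record: Record) -> Record:
    # Run each rule against the same record, top of the list first.
    for rule in rules:
        rule(record)
    return record


strip_first = apply_in_order([strip_whitespace, add_prefix], {"Code": " 42 "})
prefix_first = apply_in_order([add_prefix, strip_whitespace], {"Code": " 42 "})
```

With the strip rule first, `Code` ends up as `"INV-42"`. With the prefix rule first, the leading space is already inside the value when it is stripped, so `Code` ends up as `"INV- 42"`.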


  1. Check the box next to the Flag Issues property if you want the folder to be flagged when the Raise Issue action is fired. This will make it easier to find the issues during review.
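
The effect of Flag Issues can be pictured with the following Python sketch. Everything in it is hypothetical: the `Folder` class, the `require_invoice_number` rule, and the `apply_rules` function only model the idea that a rule can raise an issue and that the folder is flagged only when flagging is turned on. It also assumes issue messages are collected either way, which is not confirmed for Grooper itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch Folder holding extracted fields."""
    fields: Dict[str, str]
    flagged: bool = False
    issues: List[str] = field(default_factory=list)


IssueRule = Callable[[Dict[str, str]], Optional[str]]


def require_invoice_number(fields: Dict[str, str]) -> Optional[str]:
    # Returning a message stands in for a rule firing a Raise Issue action.
    if not fields.get("Invoice Number"):
        return "Invoice Number is missing"
    return None


def apply_rules(folder: Folder, rules: List[IssueRule], flag_issues: bool) -> None:
    for rule in rules:
        message = rule(folder.fields)
        if message is not None:
            folder.issues.append(message)  # assumption: issues are recorded either way
            if flag_issues:
                folder.flagged = True      # only flag when Flag Issues is enabled


with_flag = Folder({"Invoice Number": ""})
without_flag = Folder({"Invoice Number": ""})
apply_rules(with_flag, [require_invoice_number], flag_issues=True)
apply_rules(without_flag, [require_invoice_number], flag_issues=False)
```

In this model only `with_flag` ends up flagged, which is what makes the problem folder easy to find during review.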