2023.1:Control Sheet Separation (Separation Provider)

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

Control Sheet Separation is a Separation Provider that uses Grooper Control Sheets to separate documents.

Grooper Control Sheets are special pages that can be printed out and placed in between documents before scanning. These sheets use patch code barcodes to direct Grooper to perform certain actions, such as creating new Batch Folders in a Batch.

The Control Sheet Separation provider will then create a new Batch Folder (and thus new document) every time it encounters a Control Sheet (if it is configured to do so). All subsequent Batch Pages are placed in that folder until it encounters a new Control Sheet, at which point a new folder is created. The process repeats until the end of the Batch.
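
The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Grooper's implementation: the `Page` and `Folder` classes and the `separate` function are hypothetical names, and the sketch assumes that the Control Sheet itself is not kept in the new folder and that pages scanned before the first Control Sheet stay loose.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    name: str
    is_control_sheet: bool = False  # hypothetical flag standing in for patch code detection


@dataclass
class Folder:
    pages: list = field(default_factory=list)


def separate(pages):
    """Group a flat list of pages into folders, opening a new folder at each Control Sheet."""
    folders = []
    loose = []      # assumption: pages before the first Control Sheet stay loose
    current = None  # the folder currently receiving pages
    for page in pages:
        if page.is_control_sheet:
            # A Control Sheet starts a new folder; the sheet itself is dropped (assumption).
            current = Folder()
            folders.append(current)
        elif current is None:
            loose.append(page)
        else:
            current.pages.append(page)
    return folders, loose
```

Calling `separate` on a list such as sheet, page, page, sheet, page would yield two folders holding two pages and one page respectively.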


You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). The first contains a Project with resources used in examples throughout this article. The second contains one or more Batches of sample documents.

About

Control Sheet Separation utilizes Control Sheets to determine how folders are created during document separation.

Control Sheets are printable pages used to automate document separation. These pages are printed and placed at the beginning of each new batch or document in a stack of loose pages before they are scanned into Grooper. They contain specialized barcodes called "patch codes", which Grooper reads to determine what instructions to carry out.


Below is an example of a Grooper Control Sheet:


Control Sheets can perform three main actions:

  1. Create a Batch Folder in a Batch and add subsequent Batch Pages to that folder.
  2. Assign that Batch Folder's folder level in the Batch's folder hierarchy.
  3. Assign that Batch Folder's Content Type property.
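
Of the three actions, the folder level is perhaps the least intuitive, so here is a rough Python sketch of how a level number could translate into a nested hierarchy. It rests on an assumption that is not spelled out in this article: a sheet at level N closes any open folder at level N or deeper and opens a new folder under the nearest shallower one. The names `ControlSheet`, `Node`, and `build_hierarchy` are hypothetical and do not correspond to Grooper objects.

```python
from dataclasses import dataclass, field


@dataclass
class ControlSheet:
    folder_level: int = 1  # 1 = top level of the Batch, 2 = one level down, and so on


@dataclass
class Node:
    level: int
    children: list = field(default_factory=list)  # sub-folders (Node) and page names (str)


def build_hierarchy(items):
    """items is a scan-ordered sequence of ControlSheet objects and page-name strings."""
    root = Node(level=0)  # stands in for the Batch itself
    stack = [root]        # stack[-1] is the folder currently receiving pages
    for item in items:
        if isinstance(item, ControlSheet):
            if item.folder_level < 1:
                raise ValueError("folder_level must be 1 or greater")
            # Close every open folder at the same level or deeper (assumed behavior).
            while stack[-1].level >= item.folder_level:
                stack.pop()
            folder = Node(level=item.folder_level)
            stack[-1].children.append(folder)
            stack.append(folder)
        else:
            stack[-1].children.append(item)
    return root
```

Under this assumption, a level 2 sheet placed after a level 1 sheet produces a sub-folder inside the level 1 folder, and the next level 1 sheet starts a fresh top-level folder.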

How it works: Basic Separation

So, how does Control Sheet Separation actually work?

The Control Sheet Separation provider and Control Sheets in general are designed to be used while scanning physical paper documents. These sheets are printed from Grooper and placed before the first page of each new document.

In the screenshot below, you can see a Batch that was scanned into Grooper and processed no further. There are just loose pages in the Batch. Document separation has not occurred yet.

Imagine the Batch as a physical stack of papers. We have placed a Control Sheet at the beginning of each new document in the stack. This will tell Grooper where to separate.


After separation, Grooper will have created a folder at each point where it found a Control Sheet (see the screenshot below).



How it works: Separation and Classification

You can also Classify your documents while you are Separating them when using Control Sheet Separation. All it takes is configuring a few settings, and then making sure your printed Control Sheets are placed in the correct order in your stack of documents before scanning.

  1. When setting up Control Sheets to Separate and Classify documents, we need to create multiple Control Sheets: one for each Document Type in our Content Model.
  2. In addition to the Create Folder and the Folder Level properties, we need to set the Content Type property. Click the hamburger icon to open the drop-down menu.
  3. In the drop-down menu, navigate through the Content Models and select the Document Type you want to apply to the Control Sheet.

  1. In the screenshot below, we have placed a Control Sheet before the start of each document in the Batch according to that document's intended Document Type. You will need to print out your Control Sheets after they have been created and place them in your physical stack of documents in a similar fashion.
  2. The Document Type should appear on the Control Sheet under the folder level indicating where Grooper will create the new folder.

  1. Now when the documents are separated, not only will they be separated into the different folders, but those folders will also automatically be classified according to the Control Sheets that were placed before each document.
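
As a minimal sketch of the combined behavior, the Python below shows each new folder inheriting the Content Type carried by the Control Sheet that opened it. It is an illustration under stated assumptions rather than Grooper's actual logic: `ControlSheet`, `Document`, and `separate_and_classify` are hypothetical names, the Document Type names "Invoice" and "Contract" are examples only, and pages ahead of the first sheet are simply not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ControlSheet:
    content_type: Optional[str] = None  # Document Type name, or None if left unset


@dataclass
class Document:
    content_type: Optional[str]
    pages: list = field(default_factory=list)


def separate_and_classify(stream):
    """stream is a scan-ordered sequence of ControlSheet objects and page-name strings."""
    documents = []
    for item in stream:
        if isinstance(item, ControlSheet):
            # The new folder takes its classification from the sheet that created it.
            documents.append(Document(content_type=item.content_type))
        elif documents:
            documents[-1].pages.append(item)
        # Pages before the first sheet would stay loose; they are not modeled here.
    return documents


# One printed sheet per Document Type can be reused throughout the stack.
invoice_sheet = ControlSheet("Invoice")
contract_sheet = ControlSheet("Contract")
docs = separate_and_classify(
    [invoice_sheet, "p1", "p2", contract_sheet, "p3", invoice_sheet, "p4"]
)
```

The point of the sketch is that no separate classification pass is needed for these folders: the order of the sheets in the physical stack decides both where documents begin and what type each one receives.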

How To

Creating a New Control Sheet

  1. In the node tree in your Project, right-click the folder where you want to add a Control Sheet.
  2. Hover over "Add" in the menu that pops up, then click on "Control Sheet...".

  1. In the "Add" window that pops up, enter a name for your Control Sheet.
  2. Click the "EXECUTE" button at the top of the "Add" window to create the Control Sheet.

  1. You should now see the new Control Sheet as an object in the node tree.
  2. With the Control Sheet object selected, you can see a preview of the sheet on the right.

  1. You can edit the Control Sheet properties based on your needs. In this example, we have set the Create Folder property to "True" and the Folder Level to 1.
  2. On the Control Sheet preview, we can see there is now an icon with the words "new folder Level 1" to tell us (the human) what Grooper will do when it encounters this Control Sheet.
  3. At the bottom of the Control Sheet preview, the barcode has also changed, although the difference may be hard to spot. The barcode is what Grooper actually reads to understand what to do when it encounters the Control Sheet.

  1. Once you are satisfied with your Control Sheet, make sure to save your properties by clicking the save icon at the top of the "Properties" section.
  2. Now you can click on the printer icon located in the top right of the preview panel to print the Control Sheet. You can then place the paper copies of the Control Sheets within the batch of documents you will be scanning into Grooper based on where you want Separation to occur.

Glossary

Batch Folder: Batch Folder objects are defined as container objects within a Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.

  • Batch Folders are frequently referred to simply as "documents".

Batch Page: Batch Page objects represent individual pages within a Batch. The Batch Page object is the most granular unit in the hierarchy of Batch Objects in Grooper.

  • Batch Pages are frequently referred to simply as "pages".

Batch Process: Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each Batch of documents is managed systematically and efficiently.

  • Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child Batch Process Steps.
  • A Batch Process is often referred to as simply a "process".

Batch: Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms known as Batch Processes.

Classification: Classification is the process of identifying and organizing documents into categorical types based on their content or layout. Classification is key for efficient document management and data extraction workflows. Grooper has different methods for classifying documents. These include methods that use machine learning and text pattern recognition. In a Grooper Batch Process, the Classify Activity will assign a Content Type to a Batch Folder.

Classify: Classify is an Activity that "classifies" Batch Folders in a Batch by assigning them a Content Type (e.g. a Document Type) using patterns, lexical understanding, or rules as defined by a Content Model.

Content Model: Content Model node objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.

Content Type: Content Type refers to objects in Grooper used to classify Batch Folders. These include: Content Models, Content Categories, and Document Types.

Control Sheet Separation: Control Sheet Separation is a Separation Provider that uses Grooper Control Sheets to separate documents.

Document Type: Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.

Project: Project node objects are the primary containers for configuration nodes within Grooper. The Project is where various processing objects such as Content Models, Batch Processes, profile objects, and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.

Separate: Separate is an Activity that sorts Batch Pages into individual Batch Folders. This distinguishes "loose pages" from the documents formed by those pages. Once loose pages are separated into Batch Folder documents, they can be further processed by Classify, Extract, Export and other Activities that need to run on the folder (i.e. document) level.

Separation Provider: The Provider property of the Separate Activity defines the type of separation to be performed at the designated Scope.

Separation: Separation is the process of taking an unorganized Batch of loose Batch Pages and organizing them into documents represented by Batch Folders in Grooper. This is done so Grooper can later assign a Document Type to each document folder in a process known as "classification".