Grooper Basics - Overview


This article was migrated from an older version and has not been updated for the current version of Grooper.

This tag will be removed upon article review and update.

This article is about the current version of Grooper.

Note that some content may still need to be updated.


This article serves as a beginner's guide to Grooper. In this article we will go over the most basic concepts and activities within the Grooper software so you can start using it for your company's needs.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article. The third contains PDFs to be used when generating a new Project (do not upload these into Grooper; just unzip them and use the PDFs per the tutorial instructions).

Introduction

Grooper is a powerful document processing software that can be tailored to your company's individual needs.

When first diving into Grooper, the software can be a bit overwhelming, especially if you have not worked with document processing software before. The purpose of this tutorial is to familiarize users with the fundamentals of Grooper. By the end of this tutorial you should understand how to take a set of documents through a very simple process from beginning to end and export them out of Grooper.

The 5 Phases of Grooper

From the time documents are pulled into Grooper to the time they are exported, they must go through what we call the "5 Phases of Grooper".

The 5 Phases of Grooper are as follows:

  1. Acquire
  2. Condition
  3. Organize
  4. Collect
  5. Deliver

This tutorial is structured to go through each of the phases one by one. Other tutorials and courses will build on what we learn here. For now, let's break down each of these phases and see what we can expect as we build out our project.


Phase 1: Acquire
The "Acquire" phase involves bringing the documents to be processed into Grooper.


Phase 2: Condition
The "Condition" phase is where we take the documents we have brought into Grooper and edit them until we have something Grooper can work with. Before Grooper can do anything with a document, it needs to be able to Recognize the text on the document. That is one of the most important parts of the Condition phase.


Phase 3: Organize
Once Grooper can recognize text it then needs to be able to tell what the document is. Without our input, Grooper just sees the document as pages in a folder. We need to give the document a name or a "classification". We classify documents in the "Organize" phase.


Phase 4: Collect
In the "Collect" phase, Grooper can finally collect or "extract" information from the documents.


Phase 5: Deliver
The final phase of Grooper is the "Deliver" phase where we export the documents and metadata out of the Grooper software to wherever you wish to store the files.


These five phases encompass a full project in Grooper. In the next sections, we are going to put together a project step-by-step, going through each phase.

Navigating the Web Client

FYI

At the top of this page you will find some downloadable zip files that contain a Project, a Batch, and some PDFs. If you would like to follow along in this section of the tutorial, download these zip files and upload the Project and Batch to your Grooper environment. For instructions on how to do this, visit our Download or Upload Grooper Objects wiki page. The PDFs will be used in a later section when we start to build out a Project from scratch.

Before we continue, let's take a moment to look at Grooper's interface. For now we are only going to take a look at the Home page and the Design page as that is where we will be spending most of our time in this tutorial.

The Home Page

  1. Context Toolbar: At the top of the screen you'll find a toolbar with several icons to the left, the page name, repository name, and Licensee information in the middle, and a few extra icons to the right. The leftmost icons will take you to different Grooper pages, with the first icon taking you to the Grooper home page. The rest of the icons are described more below as they are also found under "Navigation Links". The icons on the right allow you to change the Grooper Repository you are working in, look at your user information, and bring up the in-app Grooper help.
  2. Navigation Links: In the first panel in Grooper at the top left you will find the Navigation Links. These icons will take you to the different Grooper pages:
    • Design: This is where we will be working for the majority of this tutorial. The Design page is where you configure your Grooper Project.
    • Batches: On the Batches page you can view all Batches that are currently in production. You can also add new Batches from this page.
    • Tasks: Here you can view any production Batch that is pending review.
    • Imports: You can view a list of recent imports on the Imports page.
    • Jobs: Here you can see what jobs have been submitted and what stage each job is in.
    • Stats: The Stats page will take you to a place where you can view various Grooper statistics.
    • Learn: This icon will take you to the home page of our Grooper University courses at learn.grooper.com.
    • Wiki: This icon will take you to the home page of our Grooper Wiki website at wiki.grooper.com.
  3. Repository Info: In the second panel on the home page you can get a lot of information about what is currently contained within and is being processed in the repository.
  4. Recent Events: The final panel of the home page will show information regarding recent processing events and errors.


The Design Page

For now, we are going to focus on the Design page, as that is where we will be doing most of our work in this tutorial. In this section, we will look at the UI of the Design page and explain how things are organized to make it easier to navigate and follow along with the rest of this tutorial.

  1. To navigate to the Design page, click on the hammer and wrench icon in the Context Toolbar or you can click on the same icon located on the home screen in the "Navigation Links" panel.
  2. At the top-middle of the window, the page you are currently on will be displayed.
  3. The first panel on the left side of the screen is the "Node Tree". This is where you will navigate through the different elements in your repository.
  4. To the right of the Node Tree you will find different panels of information and configuration options for the element you have selected in the Node Tree. These panels will be different depending on the type of object you have selected.


  1. The first element at the top of the Node Tree is called the "Root Node". Here you can find various information about your repository, including your licensing information.
  2. For each element in the Node Tree selected, there will be multiple tabs you can access with different information and options. These tabs are located at the top of the configuration panels.

The Batches Folder

  1. The Batches folder in the Node Tree is where we will find all documents brought into Grooper as a Batch.
  2. Inside of the Batches folder are two folders: Production and Test. We will be working in the Test folder.


  1. Inside of the Test folder is where all Batches of documents you are using to test your project will be kept. You can create subfolders inside the Test folder to keep Batches organized. For now we are going to look at our first object in Grooper: a Batch object.
  2. With the Batch object selected, click on the "Viewer" tab.


FYI

You might notice that the icon to the right of "2023.1_Grooper-Basics_Batch" looks different from the other folder icons. Different elements in the Node Tree have different icons depending on what type they are. Folders have a folder icon, whereas objects have an icon that represents the object itself. This icon represents a Batch object.

It is recommended that you familiarize yourself with the different icons and what objects they represent as you learn Grooper. It will make navigating the Node Tree much easier and will help the object hierarchy make sense.


  1. The first panel to the right of the Node Tree in the Viewer tab is the Batch Contents panel. Here you can see the various folders and pages of your Batch.
  2. To the right of the Batch Contents panel is the Document Viewer. Here you can see a preview of the selected page.


  1. If we open up the various folders in the Node Tree under the Batch object, we can see the same folders and pages shown in the Batch Contents panel.

The Projects Folder

  1. The Projects folder will contain all of your various Grooper projects.
  2. You can create multiple projects at once and make folders to keep things organized.


  1. Inside of your project object, you will add various objects and folders that will build out your project.
  2. Just like in the Batches folder, when you select an object, you will have different tabs you can access.
  3. You will have a properties panel and various other panels depending on what tab you are accessing on which object.
  4. At the bottom of the screen you will see a panel that contains the Grooper in-app help. It will display definitions, tips, and explanations of whatever object or property you have selected.

Building the Project

Now we are going to actually build out a project from start to finish showing the most basic functionality of Grooper. This will give you a basic understanding of Grooper's fundamentals and you can build out your knowledge from there using other tutorials and courses.

FYI

If you wish to follow along with this portion of the course, you will need the downloadable PDFs from the top of this article. Those are the documents we will be processing in this tutorial.

Before we start building, let's take a look at what we hope to accomplish in our project. For this project we have a few invoices we want to bring into Grooper. We want Grooper to take these documents and move them to a folder, but name them in such a way that they are organized and easier to search through. We want Grooper to create a folder named after the company listed on the invoice and then name the file based on the Invoice Number.
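The naming goal described above can be sketched in a few lines of Python. This is only an illustration of the target layout (a folder per company, a file per invoice number), not how Grooper performs its export. The function name `export_path`, the root folder `exports`, and the sample values are all hypothetical.

```python
from pathlib import PurePosixPath


def export_path(root: str, company_name: str, invoice_number: str) -> PurePosixPath:
    """Build <root>/<company name>/<invoice number>.pdf (illustrative layout only)."""
    return PurePosixPath(root) / company_name / f"{invoice_number}.pdf"


# Hypothetical values; the real ones come from the extracted invoice data.
path = export_path("exports", "Example Company", "12-3456")
print(path)  # exports/Example Company/12-3456.pdf
```

Both parts of the path depend on data found on the invoice itself, which is why the documents must be recognized, classified, and extracted before they can be exported.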


We want Grooper to do all the work for us in an automated fashion, so we are going to put together a Batch Process, which essentially tells Grooper how to process the document.

  1. Within the Batch Process we will have various Batch Process Steps, which are the individual ordered instructions for what to do with a Batch.

According to our Batch Process, Grooper will perform the following in order:

  1. Split Pages: Grooper will take the documents in the Batch and expose, or "split" out, the individual pages of those documents as child Batch Page objects.
  2. Recognize: For Grooper to be able to do anything with the documents, it first needs to understand and be able to "recognize" the text on the document. That happens in this step.
  3. Classify: While we as humans might easily understand a document is an "invoice", Grooper must have explicit instructions that allow it to identify a type of document. In Grooper, the act of identifying and assigning a "type" to a document, like "invoice", is known as classification.
  4. Extract: Grooper will then collect information from the document. For our needs we will be extracting the company name and invoice number.
  5. Export: Finally, the documents will be exported into a new folder named after the company and the pdf file names will reflect the invoice number.
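One way to picture a Batch Process is as an ordered list of steps, each applied to the Batch in turn. The Python sketch below models only that ordering. The step names come from the list above, but `make_step`, `run_batch_process`, and the `log` key are hypothetical stand-ins, and each step here just records that it ran rather than doing any real processing.

```python
from typing import Callable, Dict, List, Tuple

Batch = Dict[str, List[str]]
Step = Tuple[str, Callable[[Batch], Batch]]


def make_step(name: str) -> Step:
    """Create a placeholder step that only records its own name."""
    def run(batch: Batch) -> Batch:
        batch["log"].append(name)
        return batch
    return name, run


# Steps run strictly in the order they are listed.
STEPS: List[Step] = [
    make_step(name)
    for name in ["Split Pages", "Recognize", "Classify", "Extract", "Export"]
]


def run_batch_process(batch: Batch) -> Batch:
    """Apply every step to the batch, one after another."""
    for _name, run in STEPS:
        batch = run(batch)
    return batch


result = run_batch_process({"log": []})
print(result["log"])
# ['Split Pages', 'Recognize', 'Classify', 'Extract', 'Export']
```

The order matters because each step depends on the output of the one before it: text cannot be extracted before it has been recognized, for example.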

The file set below is our expected outcome:

Phase One: Acquire

Let's get started building our Project.

We need to start with the first phase of Grooper: Acquire.

Before we can do anything with Grooper, we first need to give Grooper something to work with: we need to bring documents into Grooper. Documents are brought into Grooper inside a Batch object, so the first thing we need to do is create a Batch object to hold our documents.

  1. In the Batches folder, right-click on the Test subfolder.
  2. Hover over "Add" and then click on "Batch...".


  1. When the "Add" window pops up, type in a name for your Batch.
  2. Click "EXECUTE" to create the Batch object.


  1. Now we should have a new Batch object in our Node Tree.
  2. With the Batch object selected, click on the "Viewer" tab.


  1. Now we have an empty Batch. In the viewer tab we can see we have a Batch Folder, but nothing inside.


  1. Drag and drop the PDF documents from a folder on our computer or server into the Batch Folder.


  1. Now our documents are copied into Grooper. This completes the "Acquire" phase of Grooper for this project.

Folder Levels

Before we go any further, we need to talk about a concept called "Folder Levels" in a Batch.

When you first create a Batch, there is only a single empty folder inside. When you drag and drop documents into that folder, more folders are created with those documents inside. These newly created folders are considered to be "inside" of the original Batch Folder.

These first folders inside of the root folder are considered to be at a Folder Level of 1. A folder that exists inside of a folder at Level 1 would be considered Level 2. A folder inside of a Level 2 folder would be considered at Level 3, and so on.

Inside these folders, there can also be a new type of object called a Page object. A Page can exist at any level, but will always be referred to as being at a "Page Level". You do not need to keep track of what level the pages are at.

This concept can be confusing to grasp at first, so below we have some graphics to try and make this more clear.





In the Batch we just created, the original folder that was part of the Batch when it was created is at the "Batch Level" and the folders that were created when we added the documents to the Batch are at "Folder Level 1".

This will become important as we continue through the rest of the Grooper phases.
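As a rough mental model, a Batch can be treated as a tree in which a folder's level is its depth below the root Batch Folder, while pages are simply labeled "Page Level" wherever they sit. The Python sketch below illustrates that idea only. The `Node` class, the `describe_levels` function, and the folder names are hypothetical and say nothing about how Grooper stores a Batch internally.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    kind: str  # "folder" or "page"
    children: List["Node"] = field(default_factory=list)


def describe_levels(
    node: Node, level: int = 0, out: Optional[List[Tuple[str, str]]] = None
) -> List[Tuple[str, str]]:
    """Walk the tree and label each node with its level."""
    if out is None:
        out = []
    if node.kind == "page":
        out.append((node.name, "Page Level"))  # pages are not numbered by depth
    elif level == 0:
        out.append((node.name, "Batch Level"))  # the root Batch Folder
    else:
        out.append((node.name, f"Folder Level {level}"))
    for child in node.children:
        describe_levels(child, level + 1, out)
    return out


# Hypothetical Batch: two documents, each already split into one page.
batch = Node("Batch Folder", "folder", [
    Node("Document 1", "folder", [Node("Page 1", "page")]),
    Node("Document 2", "folder", [Node("Page 1", "page")]),
])

for name, label in describe_levels(batch):
    print(f"{name}: {label}")
# Batch Folder: Batch Level
# Document 1: Folder Level 1
# Page 1: Page Level
# Document 2: Folder Level 1
# Page 1: Page Level
```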

Phase Two: Condition

We have completed the first phase of Grooper. Now it is time to move on to the second phase: Condition.

Conditioning documents can involve a lot of different things. The point of conditioning the documents is to turn the documents into something that Grooper can better understand and work with. In this tutorial we are going to do two things with our documents:

  1. We are going to split out the pages of our documents so that Grooper will understand what a page is. After we split the pages, we will have a "page object" in Grooper.
  2. Once we have page objects in Grooper, we are going to run an activity called "Recognize" on the page objects. Once we have run that activity, Grooper will be able to understand or "recognize" the text present on the document.

Creating a Project

Now that we have a Batch to process, we must start building our Project. The first step is, of course, creating a new Project in Grooper.

  1. In the Node Tree, right-click on the "Projects" folder.
  2. In the menu that pops up, hover over "Add" and then select "Project...".


  1. When the "Add" window pops up on the screen, enter in a name for your project.
  2. When finished, click "EXECUTE" in the top right corner of the pop-up window.


  1. Now you should have a new Project in your Node Tree.


The Batch Process

What is a Batch Process?

One of the main advantages to using Grooper is being able to automate your document processing. The Batch Process is how your Grooper project is automated. The Batch Process is the set of instructions that tells Grooper what to do with the documents we give it. We will tell Grooper step-by-step what to do.

Throughout the rest of the tutorial, we will be adding these individual Steps to our Batch Process. The first Batch Process Step we will be adding is called Split Pages.

FYI

The Batch Process Steps are "objects" in Grooper. The Batch Process Steps apply what is called an "activity" to the Batch. So, the Split Pages Step is applying the "split pages activity" to the Batch. This distinction will become more important as you learn more about Grooper.

Creating a Batch Process

Let's go ahead and add the Batch Process object to our Grooper Project.

  1. Right-click on your project.
  2. Hover over "Add" and then click on "Batch Process..."


  1. When the "Add" window pops up, type in a name for your Batch Process.
  2. Click "EXECUTE" in the top right-hand corner of the pop-up window.


  1. Now you should have a Batch Process object in your node tree under your Project.


Split Pages

Let's now add our first Batch Process Step: "Split Pages".

  1. Right-click on your Batch Process object.
  2. Hover over "Add Activity", then hover over "Transform". Finally, click on "Split Pages...".


  1. When the "Add Activity" window pops up, you can change the name of the Batch Process Step if you like. We are going to leave it as the default "Split Pages".
  2. Click "EXECUTE" located in the top right-hand corner of the "Add Activity" window.


  1. Now we have a Split Pages Batch Process Step in our node tree.
  2. We can see in the property grid (the properties panel) that by default the Scope of the Batch Process Step is set to Folder.
    • This is where our earlier conversation about folder levels becomes important.
  3. Also by default, the Folder Level property is set to 1. Since our documents are at Folder Level 1, this will work for us nicely.
  4. With everything properly configured, let's click over to the "Activity Tester" tab.


Selecting Your Batch

  1. Right now we do not have a Batch selected so our Batch Viewer (the middle panel labeled "TEST BATCH") is blank.
  2. Click on the "Browse Batches" icon.
    • This is the left-most icon at the top-right of the Batch Viewer.


  1. Navigate to and select the Batch you want to apply the Batch Process Step to.
  2. Click "OK" at the top right of the pop-up window.


  • Now we have a Batch we can navigate through in our Batch Viewer.
    1. On the right, in the Document Viewer, we can see a preview of the selected document.


    Testing the Split Pages Activity

    1. Click the first document in the Batch, hold down shift, then click the last document in the Batch to select all of the documents.
    2. With all documents selected, click the Test icon that looks like a play button inside of a circle in the top right of the Batch Viewer. This will run the Split Pages Step on the documents selected.


    1. The documents have now been split out into individual pages. Each Document Folder now has a Page object.

    FYI

    Page objects are important as certain activities can only be run on a page and not a folder. Likewise, there are certain activities that can only be run on a folder and not a page.


    Recognize

    The next step is for Grooper to recognize the text on the documents. It is recommended to run the Recognize Activity on a page object rather than a folder. That is why we ran the Split Pages Activity before we run Recognize. Now that we have pages in our batch, let's add a Recognize Batch Process Step.

    Add the Recognize Step

    1. Right-click on the Batch Process object in the node tree.
    2. Hover over "Add Activity", then hover over "Cleanup & Recognition". Finally, click on "Recognize".


    1. When the "Add Activity" window pops up, you can change the name of the step if you like. However, we are going to leave it as the default "Recognize".
    2. Click "EXECUTE" in the top right corner of the pop-up window.

    1. Select the Recognize Batch Process Step in the node tree.
    2. If we look at the Property Grid, we can see that the Scope is set to Page by default. Since we want to run Recognize at a page level, we will leave this property as is.
    3. With the Batch Process Step properly configured, click on the "Activity Tester" tab.

    FYI

    The documents we are working with are PDFs with native text embedded in the document. If we were working with scanned documents or images with no native text, we would have to configure the Recognize Batch Process Step to run OCR on the documents. With native text, there is nothing more we need to configure.


    1. If you select the folder at "Folder Level 1"...
    2. ... notice that the play button at the top right of the Batch Viewer is grayed out. This is because the Batch Process Step is set to a "Page Level" Scope.


    1. If we select one of the pages in the batch...
    2. ... the play button will no longer be grayed out and we can run the recognize activity.
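The way the play button is enabled or grayed out can be thought of as a simple check of the selected item against the step's Scope. The Python sketch below models that check under stated assumptions: the function `can_run` and its parameters are hypothetical, and the rule that a Folder-scoped step requires an exact match on Folder Level is an assumption drawn from the behavior described in this tutorial, not a description of Grooper's internals.

```python
from typing import Optional


def can_run(
    scope: str,
    item_kind: str,
    item_level: Optional[int] = None,
    folder_level: int = 1,
) -> bool:
    """Return True if a step with the given Scope may run on the selected item.

    scope        -- "Page" or "Folder" (the step's Scope property)
    item_kind    -- "page" or "folder" (what is selected in the Batch Viewer)
    item_level   -- folder level of the selected item (ignored for pages)
    folder_level -- the step's Folder Level property (only used for Folder scope)
    """
    if scope == "Page":
        return item_kind == "page"
    if scope == "Folder":
        # Assumption: the folder must sit at exactly the configured level.
        return item_kind == "folder" and item_level == folder_level
    raise ValueError(f"Unknown scope: {scope!r}")


# Recognize is Page-scoped: grayed out on a Level 1 folder, enabled on a page.
print(can_run("Page", "folder", item_level=1))  # False
print(can_run("Page", "page"))                  # True

# Split Pages is Folder-scoped at Folder Level 1 by default.
print(can_run("Folder", "folder", item_level=1))  # True
```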


    Check the Text

    1. After running Recognize on the page, if we click on the PDF icon in the top right of the Document Viewer, we can now see we have a "Text" option in the drop-down.
    2. Click "Text" from the drop-down.


    1. Now we can see the full text that was recognized by Grooper in the Document Viewer.


    1. To get back to the original page, click on the text icon to access the drop-down.
    2. Click on "Page" in the drop-down.


    Repeat the Steps

    1. Repeat the previous steps for each page object in the batch to recognize the text on every page.

    Phase Three: Organize

    With phases one and two complete, it is time to move on to phase three: Organize.

    The most important part of the Organize phase is Classification. Before Grooper can do anything more with a document, we first must tell Grooper what the document is. Without explicit instructions to define a "type" of document, Grooper cannot tell the difference between documents such as an invoice or a college transcript. These two documents would need to be processed very differently.

    In our case, our batch only has one type of document so that makes our process of classification simple. We just need to tell Grooper that all of the documents are invoices.

    Adding a Content Model

    Before we can classify a document, we need to first introduce a new object in Grooper. This object is called a Content Model.

    A Content Model is a Grooper object that houses other objects like Data Elements (these are used in the Collect Phase of Grooper) and Document Types. A Document Type is another object we will be using that essentially gives a document a classification or "name", but we will get to that later in this tutorial.

    For now what you need to know is that in order to classify a document, we first need a Content Model in our Project. So, let's add one.

    1. Right-click on your Project.
    2. Hover over "Add" and then click on "Content Model..."


    1. When the "Add" window pops up, enter in a name for your Content Model. Here we have named it "Grooper Basics Content Model".
    2. Click "EXECUTE" in the top right hand corner of the pop-up window.


    1. Now you should have a Content Model object in your node tree.


    Adding a Document Type

    Now we're going to add a Document Type to our Content Model. This Document Type will eventually be assigned to the documents in our Batch to tell Grooper what type of document we are working with.

    1. Right-click on the Content Model.
    2. Hover over "Add" then click on "Document Type...".


    1. Give a name to your Document Type. It should reflect the type of document you will be processing through Grooper. Here we have named our Document Type "Generic Invoice".
    2. Click "EXECUTE" in the top right hand corner of the pop-up window.


    1. Now we should have a Document Type in our Content Model.


    Setting the Default Content Type

    Now we need to tell Grooper how to classify a document. Since all of the documents in our batch are going to be assigned the same Document Type, we can simply tell Grooper that all of the documents need to be classified with the "Generic Invoice" Document Type.

    1. Select the Content Model in your node tree.
    2. Click on the hamburger icon to the right of the Default Content Type property.
    3. Navigate to and select the Document Type we just created.


    1. Now we should have Generic Invoice showing as the Default Content Type. This means that all documents referencing this Content Model will be classified as a "Generic Invoice".
    2. Any time you make a change to the properties of an object, you must either save or cancel your changes before you can navigate to another object in the node tree. There is a save icon and a cancel icon located in the top right corner of the property grid. Here we are going to click the save icon to save our changes.


    The Classify Step

    OK, we have now done all the setup needed to classify our documents. Let's set up our Classify Batch Process Step in our Batch Process.

    1. Right-click on your Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Classify..."


    1. When the "Add Activity" window pops up, you can change the name of the step if you like. We are going to leave it as the default of "Classify".
    2. Click "EXECUTE" located in the top right of the pop-up window.


    1. Now you should have a Classify Batch Process Step in your node tree.
    2. By default, the Scope is set to a Folder Level of 1. Classification can only be run on a folder level. Page objects cannot be classified. We are going to leave these properties as the default settings.


    1. Click on the hamburger icon to the right of the Content Model Scope property.
    2. Navigate to and select the Content Model we will be using for classification from the drop down.


    1. Click the save icon in the top right of the property grid.
    2. Click on the "Activity Tester" tab.


    Testing the Classify Step

    1. Notice that on the "Activity Tester" tab, we currently have a page in our batch selected in the Batch Viewer.
    2. The play button is grayed out because our Classify Step is set to Folder Level 1. The step cannot be tested on a Page level.


    1. Instead we are going to select the folder at level 1. Classification must be run on a folder level.
    2. Now the play button is no longer grayed out.


    1. Click on the first level 1 folder, hold the shift key, then click on the last level 1 folder to select all of the level 1 folders.
    2. Click the play button to run the classify activity on the batch.


    1. Now all of the documents are classified. The word "Document" has been replaced with "Generic Invoice".

    Phase Four: Collect

    Now it's time to enter into Phase Four: the Collect phase. Here we're actually going to extract some information from the document. All the prior phases have been preparing the documents for extraction.

    Let's remember what our overall goal is for this project. When Grooper exports our documents, we want them to be filed into a folder named for the company name on the invoice and then we want each file to be named after the invoice number. That way, the documents will be well organized in our file system.

    So, let's collect the company name and invoice numbers from these documents.

    The Data Model

    A Data Model is an object in Grooper that acts as the container for holding all other objects that extract the data from the documents. Before we can start collecting information from the documents, we need to add a Data Model to our project.

    1. Right-click on the Content Model.
    2. Click on "Create Data Model".


    1. Now you should have a Data Model object in your node tree.


    The Data Field

    There are multiple extraction objects you can use to collect data from the documents, but we are going to focus on just one: the Data Field. Let's go ahead and add our first Data Field to collect the company name from these documents.

    1. Right-click on your Data Model in the node tree.
    2. Hover over "Add" and then click on "Data Field".


    1. Enter in a name for your Data Field. We want to collect the company name with this Data Field so we have named it appropriately as "Company Name".
    2. Click "EXECUTE" located in the top right corner of the pop-up window.


    1. Now you should have a Data Field in your node tree.
    2. There are various properties you can configure for a Data Field located in the Property Grid.
    3. In the right panel, you can see a preview of what the name and field will look like.


    The List Match Extractor

    There are many different Value Extractors we can use to collect data off of a document. To collect the company name we are going to use a List Match.

    1. Click the hamburger icon next to the Value Extractor property.
    2. Select List Match from the drop-down menu.


    1. Click on the ellipsis icon to the right of the Value Extractor property.


    The Value Extractor Window

    No matter what Value Extractor you decide to use, you will be configuring the extractor using the Value Extractor window. There are several parts to the Value Extractor window.

    1. In the box under "LOCAL ENTRIES" is where you will type out what you want to extract.
    2. In the Batch Viewer you can select the folder or page you want to look at.
    3. In the Document Viewer, you can see a preview of the document you selected in the Batch Viewer.
    4. In the right hand corner we have the Results Viewer or Results List panel. You will be able to see a list of what Grooper returns from your extractor.


    Configuring the Extractor

    You need to make sure you have the correct batch pulled up in your Batch Viewer/Document Viewer so you can see what is being extracted.

    1. If you do not have the right batch selected, click on the browse button in the top right of the Batch Viewer.
    2. When the window pops up, navigate to and select the batch you want to extract from.
    3. Click "OK" in the top right of the pop-up window.


    1. Type what you want returned from the document into the "LOCAL ENTRIES" field.
    2. Anything that is being returned by our List Match will be highlighted in green in the Document Viewer.
    3. In the Results List we are getting two text segments returned by our extractor.
    4. Click "OK" in the top right corner of the "Value Extractor" window.


    1. Click the save icon in the top right of the property grid to save the changes made to the Data Field object.
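Conceptually, a List Match looks for each entry in the list as literal text and returns every place it appears. The Python sketch below illustrates that idea with plain string searching. It is a simplified stand-in, not Grooper's extractor: the function `list_match`, the sample text, and the company name "Example Company" are hypothetical, and options such as case sensitivity are not modeled.

```python
from typing import List, Tuple


def list_match(text: str, entries: List[str]) -> List[Tuple[int, str]]:
    """Return (offset, entry) for every literal occurrence of each entry."""
    hits: List[Tuple[int, str]] = []
    for entry in entries:
        start = text.find(entry)
        while start != -1:
            hits.append((start, entry))
            start = text.find(entry, start + 1)
    return sorted(hits)


# Hypothetical recognized text in which the company name appears twice.
sample_text = (
    "Example Company\n"
    "Invoice 12-3456\n"
    "Remit to: Example Company"
)

hits = list_match(sample_text, ["Example Company"])
print(len(hits))  # 2
```

As in the Results List above, a single entry can return more than one text segment when the same name appears in several places on the document.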


    Testing the Extraction

    1. Click over to the "Tester" tab.
    2. Select the document you want to test.
    3. Click on the test icon. This should look like a play button in the top right hand corner of the Batch Viewer.


    1. Now we can see what is being extracted. The text is difficult to see though.


    1. Click over to the "Data Field" tab.
    2. The Display Width property refers to the width of the Data Field in the extraction preview.


    1. Let's change the value of the Display Width to 175.
    2. Click the save icon to save the changes to the Data Field.


    1. If we go to the "Tester" tab and test the extraction again...
    2. Now we can read the full name of the company being extracted.


    Pattern Match

    Now that we have a List Match extractor collecting the company name from the documents, let's collect the invoice number.

    The company name is static. We can extract the exact name. An invoice number is highly variable. It is a sequence of numbers that is different on every document. We can use something called a Pattern Match to extract these invoice numbers.

    1. Add a new Data Field to your Data Model to collect the invoice number on our invoices.


    1. Click on the hamburger icon to the right of the Value Extractor property.
    2. Select Pattern Match from the drop down menu.


    1. Click the ellipsis icon to the right of the Value Extractor property.


    1. When the "Value Extractor" window pops up, you may notice that it looks very similar to using a List Match. For a Pattern Match, we can use regex to match a value on the document.
      • The invoice numbers on these invoices have a consistent format: two digits, followed by a hyphen, followed by four digits.
      • We can write the regex pattern \d{2}-\d{4} to collect the invoice numbers. We have entered that as the "Value Pattern".
    2. In the Document Viewer we can see that the invoice number is being returned.
    3. In the Results List we can see Grooper found two matches on the document. The values are the same, so it doesn't matter which one is returned.
    4. Click "OK".
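
    To see what the pattern above matches outside of Grooper, here is a minimal sketch in Python. The sample text is hypothetical and only stands in for the text of one invoice. Grooper's regex engine may differ from Python's in some details, but a pattern this simple should behave the same way in most common flavors.

```python
import re

# Same pattern as the "Value Pattern" in the tutorial:
# two digits, a hyphen, then four digits.
INVOICE_NUMBER = re.compile(r"\d{2}-\d{4}")

# Hypothetical document text; the real invoices will differ.
sample_text = "Invoice No: 12-3456\nPlease reference 12-3456 on your payment."

# findall returns every non-overlapping match, in order of appearance.
matches = INVOICE_NUMBER.findall(sample_text)
print(matches)
```

    As in the Results List, the same value is found twice. Note that this pattern will also match inside longer runs of digits (for example, part of a phone number). If that becomes a problem on your documents, adding word boundaries (\b) around the pattern is one common way to tighten it.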


    1. Click the save icon in the top right of the property grid.


    Testing on the Data Model

    1. Click on the Data Model.
    2. In the preview we can see both Data Fields that are child objects of the Data Model.
    3. Click over to the "Tester" tab.


    1. Select a document from the batch in the Batch Viewer.
    2. Click on the test icon at the top right of the Batch Viewer.
    3. We can see what is being returned for all of the Data Fields that are child objects of the Data Model in the Data Element Tester panel located above the Document Viewer panel.


    The Extract Step

    Now that we have our extraction configured, let's go ahead and add an Extract Batch Process Step to our Batch Process.

    1. Right-click on the Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Extract..."


    1. When the "Add Activity" window pops up, feel free to change the Step Name if you like. We are going to leave it as Extract.
    2. Click "EXECUTE".


    1. Now you should have an Extract Batch Process Step in your node tree.
    2. The scope for this step is set to a Folder level by default.
    3. The Folder Level property is set to 1 by default. Since our documents are at a Folder Level 1, this will work for us.


    1. Click over to the "Activity Tester" tab.
    2. Select all of the documents at Folder Level 1 in your Batch Viewer.
    3. Click on the test icon to run the Extract Activity on your Batch.


    1. Click over to the Data Model in your node tree.
    2. In the "Tester" tab, click on one of your documents at Folder Level 1.
    3. The extracted information should now appear in your Data Fields without clicking the test icon, because the Extract Activity in our Batch Process has already extracted it.

    Phase Five: Deliver

    Ok, we have "Acquired" our documents by bringing them into Grooper. We then "Conditioned" the documents by splitting out pages and running the Recognize Activity on the page objects. After that we "Organized" the documents by assigning a Document Type to each document in a process called "classification". Once all that was done, we "Collected" the data from the documents by setting up Data Fields to extract the data we wanted.

    Now in our fifth and final phase of Grooper, it's time to take all the data we collected and use that to "Deliver" or export the documents into organized folders.

    Adding an Export Behavior

    First, we need to add something called an "Export Behavior". This is where we give Grooper the instructions on how we want it to export our documents such as where to export to, what to name the folders and files, and what file format to export.

    1. Select the Content Model from the node tree.
    2. Click on the ellipsis icon to the right of the Behaviors property.


    1. When the "Behaviors" window pops up, click on the "+" icon located at the top of the pop-up window.
    2. Select "Export Behavior" from the drop down menu.


    Now you should see "Export Behavior" listed on the left under the list of behaviors. With that selected you should see a property called Export Definitions show up in the right side of the "Behaviors" window.

    1. Click on the ellipsis icon to the right of the Export Definitions property.


    1. When the "Export Definitions" window pops up, click on the "+" icon at the top to access the drop down menu.
    2. Select "File Export" from the drop down menu.


    Setting a Target Folder

    Now that we have added the Export Behavior, we can start configuring the behavior with instructions on what to do with our documents. Let's start by telling Grooper where to put the documents.

    1. In the Target Folder property, we need to put a UNC path to where we want Grooper to send the exported documents.


    1. In Windows Explorer, navigate to the folder where you want Grooper to export your documents. Right-click the folder and select "Properties" from the pop-up menu to access the folder's properties.
    2. Click on the "Sharing" tab in the "Properties" pop-up window.
    3. Under "Network Path:" you will find the UNC path. Copy this to your clipboard.

    FYI

    If the folder you wish to export to is not located on the server Grooper is running on, it must be shared with that server. If Grooper cannot access that folder, it cannot export to it.


    1. Back in Grooper, paste the UNC path into the Target Folder property.
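
    Before pasting, it can help to confirm that what you copied really is a UNC path (\\server\share\...) and not a local drive path. The following is a small sketch in Python, not part of Grooper. The function name and the example paths are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    r"""Return True if the path has the UNC form \\server\share\..."""
    # For a UNC path, pathlib reports "\\server\share" as the drive.
    # For a local path, the drive is something like "C:".
    drive = PureWindowsPath(path).drive
    return drive.startswith("\\\\")


# Hypothetical example paths.
print(is_unc_path(r"\\fileserver\GrooperExports\Invoices"))
print(is_unc_path(r"C:\Exports\Invoices"))
```

    This only checks the shape of the path. It does not check that the share exists or that the account Grooper runs under can write to it.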


    Defining a Relative Path

    When we export our documents, we don't want to just export all of them into the folder we selected. We want to export them in an organized fashion. For this example we want to do the following:

    • Create a folder titled "Invoices" for all of our documents.
    • Inside it, create a subfolder named after the company the specific invoice is from.
    • Save the document named after the extracted invoice number.

    Using the Relative Path property, we can tell Grooper to create numerous folders and subfolders and then select a name for our documents. The folders and subfolders can be named based on text that you specify, or it can be named based on data extracted from the documents.

    We do this by writing an expression using a method called "string interpolation".

    1. Click on the ellipsis icon to the right of the Relative Path property.


    We are going to write our expression in the Relative Path window that pops up.

    1. Start with a dollar sign ($). Then add open quotation marks (").


    The first folder we want to make is simply titled "Invoices", so we can type that name directly into the expression.

    1. Type "Invoices" and follow it with a backslash (\).


    The next folder we want to create is going to be using extracted data to name the folder. We have to include the placeholder for the extracted name inside of curly brackets or {}.

    1. Next, type an open curly bracket ({).
      • This will bring up a drop down menu called "intellisense". The drop down menu will give you options for things you can use to name folders and files.
    2. Since we want to name the folder after the company name extracted from the document, select "Company_Name" from the intellisense drop down.


    1. Close the curly bracket at the end of "Company_Name" and then type in a backslash (\) to indicate a new folder level or naming convention. Type another open curly bracket ({).
    2. Select "Invoice_Number" from the intellisense drop down.


    1. Close the curly bracket at the end of "Invoice_Number" and then finish with an end quotation mark ("). Now your Relative Path expression is complete.
      • Your final expression should be:
        $"Invoices\{Company_Name}\{Invoice_Number}"
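
    If string interpolation is new to you, the same idea can be sketched with a Python f-string: literal text is kept as typed, and each {placeholder} is replaced with a value. This is only an analogy to the Relative Path expression, not Grooper's own expression engine. The target folder, company name, and invoice number below are hypothetical.

```python
from pathlib import PureWindowsPath


def relative_path(company_name: str, invoice_number: str) -> str:
    # Mirrors $"Invoices\{Company_Name}\{Invoice_Number}":
    # a fixed "Invoices" folder, then a folder per company,
    # then the invoice number as the document name.
    return f"Invoices\\{company_name}\\{invoice_number}"


# Hypothetical values standing in for the Target Folder and extracted data.
target_folder = PureWindowsPath(r"\\fileserver\GrooperExports")
rel = relative_path("Acme Corp", "12-3456")
full = target_folder / rel

print(rel)
print(full)
```

    Each backslash in the expression starts a new level under the Target Folder, so the document above would land in an "Acme Corp" folder inside "Invoices".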


    Export Formats

    Next, we need to tell Grooper what sort of files to export. We have a number of options from PDFs and Text formats to XML and ZIP formats. For our purposes we only want to export PDFs.

    1. Click on the ellipsis icon to the right of the Export Formats property.


    1. Click on the + icon located at the top of the "Export Formats" pop-up window to access the drop down to add an export format.
    2. Click on the "PDF Format" in the drop down menu.


    There are a number of properties we can configure for the export formats we choose to apply. For our purposes, we are going to just use all of the default settings, so no changes need to be made.

    1. Click "OK" in the top right of the pop-up window to apply the changes.


    Finish the Export Behavior

    We have now configured all properties needed for our Export Behavior. Now all we need to do is apply changes and save to our Content Model.

    1. Click "OK".


    1. We don't need to add any other behaviors to our list, so go ahead and click "OK" on this window as well.


    1. Finally, click the save icon at the top of the property grid to save the changes to the Content Model.

    Add the Export Step

    Now that we have our Export Behavior configured, we need to add an Export Batch Process Step to our Batch Process.

    1. Right-click on the Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Export..."
    3. Change the "Step Name" if you like. We will be leaving it as the default "Export".
    4. Click "EXECUTE" at the top right of the "Add Activity" window.


    1. By default, our Export Batch Process Step is set to a Scope of Folder Level 1. Since our documents are at Folder Level 1, this will work for us and no changes need to be made.
    2. You might notice that in the "ACTIVITY PROPERTIES" panel on the right we have an Export Behavior property. Since we already added an Export Behavior on our Content Model, we don't need to do anything here.

    Finished Batch Process

    We have now finished designing our project going through the 5 phases of Grooper! Now it's time to see it in action.

    There are a few things we need to do before we can start processing documents:

    • Publish the Batch Process
    • Add a CMIS Connection
    • Start the Activity Processing and Import Watcher services

    After these steps are completed, then we can import a Batch and let the Batch Process automatically process the documents through the 5 phases of Grooper according to the project we designed.

    Publish the Batch Process

    Before a Batch Process can be used, we have to "publish" it. This tells Grooper that we are done making changes to the process and it is ready to be used.

    1. Right-click on the Batch Process.
    2. Click on "Publish".
    3. When the "Publish" window pops up, click "EXECUTE".


    1. A copy of your published Batch Process can be found in the Processes folder in your node tree.

    FYI

    You can still make changes to your Batch Process in your Project after it has been published. However, to apply your changes to your working Batch Process you will need to republish your Batch Process.


    Adding a CMIS Connection

    In order to import a Batch into Grooper automatically from a file on a computer, we first need to establish a connection from Grooper to the folder where that file resides. The first step to establishing this connection is adding a CMIS Connection object.

    Creating a Folder

    We are first going to create a new folder in our Project that will house all CMIS Connection objects for this project. Folders are handy for keeping your Grooper repository organized as you build it out.

    1. Right-click on your Project in the node tree.
    2. Hover over "Add", and then click on "Folder..."
    3. When the "Add" window pops up, name the folder. We have named it "Connections".
    4. Click "EXECUTE" located in the top right of the pop up window.


    Creating the CMIS Object

    Now we can add the CMIS Connection object.

    1. Right-click on the folder you just created.
    2. Hover over "Add" and then click on "CMIS Connection..."
    3. When the "Add" window pops up, enter in a name for your CMIS Connection object. We have named it "Grooper Basics CMIS Connection" in this tutorial.
    4. Click "EXECUTE" in the top right of the pop up window.


    Configuring the CMIS Connection

    1. Now you should have a CMIS Connection object in your node tree. Select the object.
    2. Click on the hamburger icon to the right of the Connection Settings property.
    3. Select NTFS from the drop down menu.


    1. Click on the ellipsis icon to the right of the Connection Settings property.
    2. When the "Connection Settings" window pops up, click on the ellipsis icon to the right of the Repositories property.


    1. When the "Repositories" window pops up, click on the + icon at the top of the pop up window.
    2. An entry will pop up on your list located in the left of the pop up window. Make sure this entry is selected.
    3. You will need to paste the UNC path that leads to the folder you want to import your Batches from into the Base Path property.


    1. After pasting the UNC path into the Base Path property, the Repository Name property should autopopulate.
    2. Click "OK" located in the top right of the pop up window.


    Import the Repository

    Now that we have established a connection to the Windows file system and pointed it at a folder, we need to actually access the files. To do that we need to import the repository.

    1. Click on the List Repositories icon located in the top right corner of the "REPOSITORIES" panel.


    A list of possible repositories should show up in the "REPOSITORIES" panel. Since the folder we pointed the CMIS Connection to only has one subfolder, only one repository is showing in our example. You may have more in your list depending on where you set up your folder on your system.

    1. Click on the repository you want to import documents from.
    2. Click the Import Repository icon located in the top right next to the List Repositories icon.
    3. When the "Import Repository" window pops up, verify that the listed repository is correct.
    4. Click the "EXECUTE" button at the top of the pop up window.


    1. Now you should have a CMIS Repository object in your node tree under your CMIS Connection object.


    1. Click over to the "Browse" tab.
    2. We can see the contents of the folder we selected for our CMIS connection in the middle panel.


    The Activity Processing and Import Watcher Services

    In order to import Batches into Grooper, you need to have a couple of services installed and running. In this course we are assuming you already have these services installed. If not, please check out the Activity Processing and the Import Watcher articles for installation instructions.

    With an Activity Processing service and Import Watcher service installed, we need to make sure they are turned on. If they are already running, skip to the "Import the Batch" section.

    1. Click on the Machines folder in the node tree.
    2. At the top of the middle panel, you should see a list of servers whose services you can access. Select the server from the list if not already selected.
    3. Under the list of servers, you should have a list of services that are installed on the selected server.
    4. To the right there is a "SERVICE PROPERTIES" panel. There are a number of service properties you can configure, but for our purposes, we can stick with the defaults.


    Turning on the Services

    1. Select the Activity Processing service from the list.
    2. Click the Start Service icon located in the top right of the panel. It should look like a play button.


    1. Select the Import Watcher service from the list.
    2. Click the Start Service icon located in the top right of the panel.


    1. Now we have an Activity Processor and Import Watcher running on our repository.


    Import the Batch

    Ok, now we have all that we need to import and start processing a Batch through our Batch Process. We are going to need to import a Batch from the Imports page in the Grooper web client.

    1. Click over to the Imports Page by clicking the Imports icon in the Context Toolbar at the top of the screen.
    2. Click the + icon located on the far right of the Context Toolbar to start a new import job.


    1. When the "Submit Import Job" window pops up, enter in a short Description for the Import Job in the Description property.


    1. Click on the hamburger icon to the right of the Provider property.
    2. Select Import Descendants from the drop-down menu.


    1. Under the Provider property, click on the hamburger icon to the right of the Repository property.


    1. Navigate to the imported repository under your CMIS connection.


    1. The Base Folder and Import Filter properties will automatically populate with information that will tell Grooper to import all documents from the repository we selected.


    1. Scroll down in the window and open the Batch Creation property.


    1. Click the hamburger icon to the right of the Starting Step property.
    2. Navigate to and select the step in your Batch Process where you want Grooper to start processing the imported documents. We want to start with the Split Pages Batch Process Step.


    1. Uncheck the Start Paused property if you want Grooper to start processing as soon as the Batch is imported.
    2. Click the "SUBMIT" button at the top of the pop up window to import the Batch and start the Batch Process.


    Running Through the Batch Process

    1. After submitting the job, you'll see the Import Job in the list of Imports. It should have a Status of "Working" if you unchecked the Start Paused property.


    1. Click on the Batches icon in the Context Toolbar to go to the Batches page.


    1. You will see the Batch that is currently being processed in the list of Batches. Select the Batch if not already selected.
    2. At the bottom you should see what looks a bit like a bar graph. This will show you what Grooper is currently processing.


    1. When the Batch Process is complete, you should see all blue bars for every Batch Process Step at the bottom of the screen.

    While Grooper is running through the Batch Process Steps you may see the bars temporarily have a green color.

    If you find that one or more of your bars turn a red color, then there was an error while processing that step. You will need to take a look at those steps in your Batch Process and see if there is something configured incorrectly.


    1. Now if you look in the folder where you set your Export Behavior, your PDFs should be properly named and organized.
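
    With a large Batch, checking the export folder by eye gets tedious. The sketch below walks an export folder and lists any file that does not follow the Invoices\<company>\<invoice number> layout we configured. It is a standalone Python script, not a Grooper feature. It assumes the exported files carry a .pdf extension, and the demo builds a temporary folder with made-up file names instead of reading a real export.

```python
import re
import tempfile
from pathlib import Path

# Same shape as the invoice number pattern used for extraction.
INVOICE_STEM = re.compile(r"\d{2}-\d{4}")


def unexpected_files(export_root: Path) -> list[Path]:
    """Return files under export_root/Invoices that are not <company>/<invoice number>.pdf."""
    invoices = export_root / "Invoices"
    problems = []
    for path in invoices.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(invoices)
        ok = (
            len(rel.parts) == 2  # company folder + file
            and path.suffix.lower() == ".pdf"  # assumed extension
            and INVOICE_STEM.fullmatch(path.stem) is not None
        )
        if not ok:
            problems.append(path)
    return sorted(problems)


# Demo on a temporary folder with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    good = root / "Invoices" / "Acme Corp" / "12-3456.pdf"
    bad = root / "Invoices" / "stray.txt"
    good.parent.mkdir(parents=True)
    good.touch()
    bad.touch()
    problems = [p.name for p in unexpected_files(root)]
    print(problems)
```

    To check a real export, call unexpected_files with the path you entered in the Target Folder property. An empty list means every file matched the expected layout.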

    BONUS: Ad-Hoc Jobs

    As a bonus, we are going to share a slightly more advanced tip with you. When testing your Batch Process Steps, you can actually do something called Ad-Hoc processing. This allows you to test the whole Batch all at once without having to select the individual folders or page objects from the Batch Viewer first. This can come in handy when you can't select all of the folders and pages all at once.

    An Ad-Hoc job can only be performed if you have an Activity Processing service running on the Grooper server. Make sure you install and start the service before continuing.

    1. In our example, we want to test the Recognize step in the Batch Process, and we have different pages inside different folders. Since they are in different folders, we cannot select all of the page objects at once. Running Recognize on each one individually can take a while.
    2. Instead of clicking on the Test icon, with an Activity Processing service running, we can submit an Ad-Hoc job. Click on the "Submit a Job" icon located to the right of the Test icon. This will run the activity at the folder level as configured on the "Batch Process Step" tab.


    1. When the "Submit Job" window pops up, click "OK" at the top of the pop up window.


    1. You should automatically be taken to the Jobs page. You will see your Ad-Hoc job in the list on the page and can see when it is complete.