2023.1:Grooper Basics - Overview
{{AutoVersion}}

<blockquote>This article serves as a beginner's guide to Grooper. In this article we will go over the most basic concepts and activities within the Grooper software so you can start using it for your company's needs.</blockquote>

{|class="download-box"
|
[[File:Asset 22@4x.png]]
|
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). The first contains one or more '''Batches''' of sample documents. The second contains one or more '''Projects''' with resources used in examples throughout this article. The third contains PDFs to be used when generating a new project (do not try to upload these into Grooper; just unzip and use the PDFs per the tutorial instructions).

* [[Media:2023.1 Grooper-Basics-Overview Batch.zip]]
* [[Media:2023.1 Grooper-Basics-Overview Project.zip]]
* [[Media:2023.1 Grooper-Basics-Overview PDFs.zip]]
|}


== Introduction ==
When first diving into Grooper, the software can be a bit overwhelming, especially if you have not worked with document processing software before. The purpose of this tutorial is to familiarize users with the fundamentals of Grooper. By the end of this tutorial, you should understand how to take a set of documents through a very simple process from beginning to end and export them out of Grooper.


<div style="padding-left: 1.5em";>
=== The 5 Phases of Grooper ===


These five phases encompass a full project in Grooper. In the next sections, we are going to put together a project step-by-step, going through each phase.  
</div>


== Navigating the Web Client ==
{|class="fyi-box"
|-
|
'''FYI'''
|
At the top of this page you will find some downloadable zip files that contain a '''Project''', a '''Batch''', and some PDFs. If you would like to follow along in this section of the tutorial, download these zip files and upload the '''Project''' and '''Batch''' to your Grooper environment. For instructions on how to do this, visit our [[Download or Upload Grooper Objects]] wiki page. The PDFs will be used in a later section when we start to build out a '''Project''' from scratch.
|}


Before we continue, let's take a moment to look at Grooper's interface. For now we are only going to take a look at the Home page and the Design page as that is where we will be spending most of our time in this tutorial.  
 
<div style="padding-left: 1.5em";>
=== The Home Page ===

#* '''Design''': This is where we will be working for the majority of this tutorial. The Design page is where you configure your Grooper Project.
#* '''Batches''': On the Batches page you can view all Batches that are currently in production. You can also add new Batches from this page.
#* '''Tasks''': Here you can view any production Batch that is pending review.
#* '''Imports''': You can view a list of recent imports on the Imports page.
#* '''Jobs''': Here you can see what jobs have been submitted and what stage each job is in.


[[File:2023.1 Grooper-Basics - Overview 02 Navigating 03.png]]
 
<div style="padding-left: 1.5em";>
==== The Batches Folder ====


# The Projects folder will contain all of your various Grooper projects.
# You can create multiple '''projects''' at once and make folders to keep things organized.


[[File:2023.1 Grooper-Basics - Overview 02 Navigating 09.png]]

[[File:2023.1 Grooper-Basics - Overview 02 Navigating 10.png]]
 
</div>
</div>


== Building the Project ==

Now we are going to actually build out a project from start to finish showing the most basic functionality of Grooper. This will give you a basic understanding of Grooper's fundamentals and you can build out your knowledge from there using other tutorials and courses.
{|class="fyi-box"
|-
|
'''FYI'''
|
If you wish to follow along with this portion of the course, you will need the downloadable PDFs from the top of this article. Those are the documents we will be processing in this tutorial.
|}


Before we start building, let's take a look at what we hope to accomplish in our project. For this project we have a few invoices we want to bring into Grooper. We want Grooper to take these documents and move them to a folder, but name them in such a way that they are organized and easier to search through. We want Grooper to create a folder named after the company listed on the invoice and then name the file based on the Invoice Number.

[[File:2023.1 Grooper-Basics - Overview 03 Building 01 Goal 02.png]]

According to our '''Batch Process''', Grooper will perform the following in order:


# '''Split Pages''': Grooper will take the documents in the '''Batch''' and expose, or "split" out, the individual pages of those documents as child '''Batch Page''' objects.
# '''Recognize''': For Grooper to be able to do anything with the documents, it first needs to understand and be able to "recognize" the text on the document. That happens in this step.
# '''Classify''': While we as humans might easily understand a document is an "invoice", Grooper must have explicit instructions that allow it to identify a type of document. In Grooper, the act of identifying and assigning a "type" to a document, like "invoice", is known as classification.
# '''Extract''': Grooper will then collect information from the document. For our needs we will be extracting the company name and invoice number.
# '''Export''': Finally, the documents will be exported into a new folder named after the company, and the PDF file names will reflect the invoice number.


[[File:2023.1 Grooper-Basics - Overview 03 Building 01 Goal 03.png]]
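
The end goal described above (one folder per company, one file per invoice number) can be sketched outside of Grooper in a few lines of Python. This is only an illustration of the naming scheme, not how Grooper builds paths internally. The function name <code>build_export_path</code>, the target folder <code>C:\Exports</code>, and the sample values are all assumptions made for this example.

```python
from pathlib import PureWindowsPath


def build_export_path(target_folder: str, company_name: str, invoice_number: str) -> PureWindowsPath:
    """Combine the target folder, company name, and invoice number into one file path."""
    # One folder per company, one PDF per invoice number.
    return PureWindowsPath(target_folder) / company_name / f"{invoice_number}.pdf"


# Hypothetical values, not taken from the sample documents.
example_path = build_export_path(r"C:\Exports", "Example Company", "12-3456")
print(example_path)
```

The two extracted values are the only inputs that change from document to document, which is why the ''Extract'' step has to run before the ''Export'' step.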
 
<div style="padding-left: 1.5em";>
=== Phase One: Acquire ===


[[File:2023.1 Grooper-Basics - Overview 03 Building 02 Acquire 06.png]]
 
<div style="padding-left: 1.5em";>
==== Folder Levels ====


[[File:2023.1 Grooper-Basics - Overview 03 Building 02 Acquire 07.png]]
 
</div>
=== Phase Two: Condition ===

# We are going to split out the pages of our documents so that Grooper will understand what a page is. After we split the pages, we will have a "'''page''' object" in Grooper.
# Once we have '''page''' objects in Grooper, we are going to run an activity called "''Recognize''" on the page objects. Once we have run that activity, Grooper will be able to understand or "recognize" the text present on the document.
 
<div style="padding-left: 1.5em";>
==== Creating a Project ====


[[File:2023.1 Grooper-Basics - Overview 03 Building 03 Condition 02 Batch-Process 03.png]]

==== Split Pages ====


[[File:2023.1 Grooper-Basics - Overview 03 Building 03 Condition 03 Split-Pages 06.png]]
<br>
<br>
<b><big>Testing the Split Pages Activity</big></b>

# Click the first document in the '''Batch''', hold down shift, then click the last document in the '''Batch''' to select all of the documents.
# With all documents selected, click the Test icon that looks like a play button inside of a circle in the top right of the Batch Viewer. This will run the '''Split Pages Step''' on the documents selected.


[[File:2023.1 Grooper-Basics - Overview 03 Building 03 Condition 03 Split-Pages 07.png]]
<br>
<br>
#<li value=3> The documents have now been split out into individual pages. Each '''Document Folder''' now has a '''Page''' object.

[[File:2023.1 Grooper-Basics - Overview 03 Building 03 Condition 03 Split-Pages 08.png]]


==== Recognize ====

The next step is for Grooper to recognize the text on the documents. It is recommended to run the ''Recognize Activity'' on a '''page''' object rather than a folder. That is why we ran the ''Split Pages Activity'' before running ''Recognize''. Now that we have '''pages''' in our '''batch''', let's add a ''Recognize'' '''Batch Process Step'''.

<b><big>Add the Recognize Step</big></b>

#<li value=5> Select the ''Recognize'' '''Batch Process Step''' in the node tree.
# If we look at the Property Grid, we can see that the '''''Scope''''' is set to ''Page'' by default. Since we want to run ''Recognize'' at a page level, we will leave this property as is.
# With the '''Batch Process Step''' properly configured, click on the "Activity Tester" tab.

<b><big>Check the Text</big></b>


# After running ''Recognize'' on the page, if we click on the PDF icon in the top right of the Document Viewer, we can now see we have a "Text" option in the drop-down.
# Click "Text" from the drop-down.


[[File:2023.1 Grooper-Basics - Overview 03 Building 03 Condition 04 Recognize 09.png]]
 
</div>
=== Phase Three: Organize ===

With phases one and two complete, it is time to move on to phase three: Organize.

The most important part of the Organize phase is Classification. Before Grooper can do anything more with a document, we first must tell Grooper what the document is. Without explicit instructions to define a "type" of document, Grooper cannot tell the difference between documents such as an invoice and a college transcript. These two documents would need to be processed very differently.

In our case, our '''batch''' only has one type of document so that makes our process of classification simple. We just need to tell Grooper that all of the documents are invoices.
 
<div style="padding-left: 1.5em";>
==== Adding a Content Model ====

Before we can ''classify'' a document, we need to first introduce a new object in Grooper. This object is called a '''Content Model'''.

A '''Content Model''' is a Grooper object that houses other objects like [[Data Element]]s (these are used in the Collect Phase of Grooper) and '''Document Types'''. A '''Document Type''' is another object we will be using that essentially gives a document a classification or "name", but we will get to that later in this tutorial.

For now what you need to know is that in order to ''classify'' a document, we first need a '''Content Model''' in our '''Project'''. So, let's add one.

#<li value=3> Give a name to your '''Document Type'''. It should reflect the type of document you will be processing through Grooper. Here we have named our '''Document Type''' "Generic Invoice".
# Click "EXECUTE" in the top right hand corner of the pop-up window.


[[File:2023.1 Grooper-Basics - Overview 03 Building 04 Organize 01 Classify 13.png]]

<b><big>Testing the Classify Step</big></b>

# Notice that on the "Activity Tester" tab, we currently have a page in our '''batch''' selected in the Batch Viewer.
# The play button is grayed out because our ''Classify'' '''Step''' is set to Folder Level 1. The '''step''' cannot be tested on a '''Page''' level.


[[File:2023.1 Grooper-Basics - Overview 03 Building 04 Organize 01 Classify 17.png]]
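
The idea that a step runs only on the objects matching its scope (pages for ''Recognize'', folders at Folder Level 1 for ''Classify'') can be pictured with a small tree model. The sketch below is a conceptual illustration only and does not reflect Grooper's internal object model. The <code>Node</code> class, the <code>nodes_in_scope</code> function, and the sample names are hypothetical, and it assumes the '''Batch''' itself sits at level 0 with documents one level below it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "batch", "folder", or "page"
    children: list["Node"] = field(default_factory=list)


def nodes_in_scope(root, scope, folder_level=None):
    """Yield the nodes a step with the given scope would run on."""
    def walk(node, depth):
        if scope == "page" and node.kind == "page":
            yield node
        elif scope == "folder" and node.kind == "folder" and depth == folder_level:
            yield node
        for child in node.children:
            yield from walk(child, depth + 1)

    # The batch root is treated as level 0 (an assumption of this sketch).
    yield from walk(root, 0)


# A hypothetical batch: two document folders, each holding one page.
batch = Node("Batch", "batch", [
    Node("Invoice 1", "folder", [Node("Invoice 1, Page 1", "page")]),
    Node("Invoice 2", "folder", [Node("Invoice 2, Page 1", "page")]),
])

print([n.name for n in nodes_in_scope(batch, "folder", folder_level=1)])
print([n.name for n in nodes_in_scope(batch, "page")])
```

In this model a selected page is never returned by a folder-scoped query, which mirrors why the test button is unavailable when a page is selected for a folder-level step.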
</div>
=== Phase Four: Collect ===
Now it's time to enter into Phase Four: the Collect phase. Here we're actually going to extract some information from the document. All the prior phases have been preparing the documents for extraction.

Let's remember what our overall goal is for this project. When Grooper exports our documents, we want them to be filed into a folder named for the company name on the invoice and then we want each file to be named after the invoice number. That way, the documents will be well organized in our file system.

So, let's collect the company name and invoice numbers from these documents.
<div style="padding-left: 1.5em";>
==== The Data Model ====
A '''Data Model''' is an object in Grooper that acts as the container for holding all other objects that extract the data from the documents. Before we can start collecting information from the documents, we need to add a '''Data Model''' to our project.
# Right-click on the '''Content Model'''.
# Click on "Create Data Model".
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 01 Data Model 01.png]]
#<li value=3> Now you should have a '''Data Model''' object in your node tree.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 01 Data Model 02.png]]
==== The Data Field ====
There are multiple extraction objects you can use to collect data from the documents, but we are going to focus on just one: the '''Data Field'''. Let's go ahead and add our first '''Data Field''' to collect the company name from these documents.
# Right-click on your '''Data Model''' in the node tree.
# Hover over "Add" and then click on "Data Field".
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 02 Data Field 01.png]]
#<li value=3> Enter in a name for your '''Data Field'''. We want to collect the company name with this '''Data Field''' so we have named it appropriately as "Company Name".
# Click "EXECUTE" located in the top right corner of the pop-up window.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 02 Data Field 02.png]]
#<li value=5> Now you should have a '''Data Field''' in your node tree.
# There are various properties you can configure for a '''Data Field''' located in the Property Grid.
# In the right panel, you can see a preview of what the name and field will look like.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 02 Data Field 03.png]]
==== The List Match Extractor ====
There are many different '''''Value Extractors''''' we can use to collect data off of a document. To collect the company name we are going to use a ''List Match''.
# Click the hamburger icon next to the '''''Value Extractor''''' property.
# Select ''List Match'' from the drop-down menu.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 01.png]]
#<li value=3> Click on the ellipsis icon to the right of the '''''Value Extractor''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 02.png]]
<b><big>The Value Extractor Window</big></b>
No matter what '''''Value Extractor''''' you decide to use, you will be configuring the extractor using the '''''Value Extractor''''' window. There are several parts to the '''''Value Extractor''''' window.
# The box under "LOCAL ENTRIES" is where you will type out what you want to extract.
# In the Batch Viewer you can select the folder or page you want to look at.
# In the Document Viewer, you can see a preview of the document you selected in the Batch Viewer.
# In the right hand corner we have the Results Viewer or Results List panel. You will be able to see a list of what Grooper returns from your extractor.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 03.png]]
<b><big>Configuring the Extractor</big></b>
You need to make sure you have the correct '''batch''' pulled up in your Batch Viewer/Document Viewer so you can see what is being extracted.
# If you do not have the right '''batch''' selected, click on the browse button in the top right of the Batch Viewer.
# When the window pops up, navigate to and select the '''batch''' you want to extract from.
# Click "OK" in the top right of the pop-up window.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 04.png]]
#<li value=4> Type what you want returned from the document into the "LOCAL ENTRIES" field.
# Anything that is being returned by our ''List Match'' will be highlighted in <span style="color:green">green</span> in the Document Viewer.
# In the Results List we are getting two text segments returned by our extractor.
# Click "OK" in the top right corner of the "Value Extractor" window.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 05.png]]
#<li value=8> Click the save icon in the top right of the property grid to save the changes made to the '''Data Field''' object.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 03 List-Match 06.png]]
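
Conceptually, the ''List Match'' configured above looks for each entry typed into "LOCAL ENTRIES" wherever it appears in the recognized text. A rough Python sketch of that idea follows. It only demonstrates literal, exact matching and makes no claim about how Grooper implements ''List Match'' or what other options it offers. The function name <code>list_match</code>, the company name, and the sample text are made up for illustration.

```python
import re


def list_match(text, entries):
    """Return (matched_text, start_index) for every literal occurrence of any entry."""
    results = []
    for entry in entries:
        # re.escape makes sure the entry is treated as plain text, not as a pattern.
        for match in re.finditer(re.escape(entry), text):
            results.append((match.group(), match.start()))
    return sorted(results, key=lambda result: result[1])


# Hypothetical recognized text in which the company name appears twice.
sample_text = "Example Company\nInvoice Number: 12-3456\nRemit to: Example Company"
hits = list_match(sample_text, ["Example Company"])
print(hits)
```

Because the company name can appear more than once on a page, a list-based lookup like this can return several results for a single entry, just as the Results List above shows two text segments.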
<b><big>Testing the Extraction</big></b>
# Click over to the "Tester" tab.
# Select the document you want to test.
# Click on the test icon. This should look like a play button in the top right hand corner of the Batch Viewer.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 04 Test List Match 01.png]]
#<li value=4> Now we can see what is being extracted. The text is difficult to see though.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 04 Test List Match 02.png]]
#<li value=5> Click over to the "Data Field" tab.
# The '''''Display Width''''' property refers to the width of the '''Data Field''' in the extraction preview.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 04 Test List Match 03.png]]
#<li value=7> Let's change the value of the '''''Display Width''''' to ''175''.
# Click the save icon to save the changes to the '''Data Field'''.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 04 Test List Match 04.png]]
#<li value=9> If we go to the "Tester" tab and test the extraction again...
# Now we can read the full name of the company being extracted.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 04 Test List Match 05.png]]
==== Pattern Match ====
Now that we have a ''List Match'' extractor collecting the company name from the documents, let's collect the invoice number.

The company name is static. We can extract the exact name. An invoice number is highly variable. It is a sequence of numbers that is different on every document. We can use something called a ''Pattern Match'' to extract these invoice numbers.
# Add a new '''Data Field''' to your '''Data Model''' to collect the invoice number on our invoices.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 05 Pattern-Match 01.png]]
#<li value=2> Click on the hamburger icon to the right of the '''''Value Extractor''''' property.
# Select '''''Pattern Match''''' from the drop down menu.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 05 Pattern-Match 02.png]]
#<li value=4> Click the ellipsis icon to the right of the '''''Value Extractor''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 05 Pattern-Match 03.png]]
#<li value=5> When the "Value Extractor" window pops up, you may notice that it looks very similar to using a ''List Match''. For a ''Pattern Match'', we can use regex to match a value on the document.
#* The invoice numbers on these invoices have a consistent format: two digits, followed by a hyphen, followed by four digits.
#* We can write the regex pattern <code>\d{2}-\d{4}</code> to collect the invoice numbers. We have entered that as the "Value Pattern".
# In the Document Viewer we can see that the invoice number is being returned.
# In the Results List we can see Grooper found two matches on the document. The values are the same, so it doesn't matter which one is returned.
# Click "OK".


[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 05 Pattern-Match 04.png]]
 
 
#<li value=9> Click the save icon in the top right of the property grid.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 05 Pattern-Match 05.png]]
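
To see what the Value Pattern <code>\d{2}-\d{4}</code> matches, it can be tried against some text outside of Grooper. The sketch below uses Python's <code>re</code> module purely for illustration. The sample text and invoice number are invented, and the stricter variant with <code>\b</code> word boundaries is an optional refinement that is not part of the tutorial's configuration.

```python
import re

VALUE_PATTERN = r"\d{2}-\d{4}"          # the pattern used in this tutorial
STRICT_PATTERN = r"\b\d{2}-\d{4}\b"     # optional variant: refuses partial matches

# Hypothetical recognized text containing the same invoice number twice.
sample_text = "Invoice Number: 12-3456\nPlease reference 12-3456 on your payment."

matches = re.findall(VALUE_PATTERN, sample_text)
print(matches)

# Without boundaries, the pattern can also match inside a longer run of digits.
loose = re.findall(VALUE_PATTERN, "Phone ext. 123-45678")
strict = re.findall(STRICT_PATTERN, "Phone ext. 123-45678")
print(loose, strict)
```

On the sample text the pattern finds the invoice number twice, which matches the two results seen in the Results List. If a document contains other hyphenated numbers, a tighter pattern such as the boundary variant may help avoid partial matches.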
 
 
<b><big>Testing on the Data Model</big></b>
 
# Click on the Data Model.
# In the preview we can see both '''Data Fields''' that are child objects of the '''Data Model'''.
# Click over to the "Tester" tab.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 06 Test-Pattern-Match 01.png]]
 
 
#<li value=4> Select a document from the '''batch''' in the Batch Viewer.
# Click on the test icon at the top right of the Batch Viewer.
# We can see what is being returned for all of the '''Data Fields''' that are child objects of the '''Data Model''' in the Data Element Tester panel located above the Document Viewer panel.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 06 Test-Pattern-Match 02.png]]
 
 
==== The Extract Step ====
 
Now that we have our extraction configured, let's go ahead and add an ''Extract'' '''Batch Process Step''' to our '''Batch Process'''.
 
# Right-click on the '''Batch Process'''.
# Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Extract..."
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 07 Extract-Step 01.png]]
 
 
#<li value=3> When the "Add Activity" window pops up, feel free to change the '''''Step Name''''' if you like. We are going to leave it as ''Extract''.
# Click "EXECUTE".
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 07 Extract-Step 02.png]]
 
 
#<li value=5> Now you should have an ''Extract'' '''Batch Process Step''' in your node tree.
# The '''''Scope''''' for this step is set to ''Folder'' by default.
# The '''''Folder Level''''' property is set to ''1'' by default. Since our documents are at a Folder Level 1, this will work for us.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 07 Extract-Step 03.png]]
 
 
#<li value=8> Click over to the "Activity Tester" tab.
# Select all of the documents at Folder Level 1 in your Batch Viewer.
# Click on the test icon to run the Extract Activity on your '''Batch'''.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 07 Extract-Step 04.png]]
 
 
#<li value=11> Click over to the '''Data Model''' in your node tree.
# In the "Tester" tab, click on one of your documents at Folder Level 1.
# The extracted information should now appear in your Data Fields without clicking the test icon, because the Extract Activity in our '''Batch Process''' has already extracted it.
 
[[File:2023.1 Grooper-Basics - Overview 03 Building 05 Collect 07 Extract-Step 05.png]]


</div>
=== Phase Five: Deliver ===
OK, we have "Acquired" our documents by bringing them into Grooper. We then "Conditioned" the documents by splitting out pages and running the ''Recognize Activity'' on the '''page''' objects. After that we "Organized" the documents by assigning a '''Document Type''' to each document in a process called "classification". Once all that was done, we "Collected" the data from the documents by setting up '''Data Fields''' to extract the data we wanted.

Now, in the fifth and final phase of Grooper, it's time to take all the data we collected and use it to "Deliver", or export, the documents into organized folders.
<div style="padding-left: 1.5em";>
==== Adding an Export Behavior ====
First, we need to add something called an "Export Behavior". This is where we give Grooper the instructions on how we want it to export our documents such as where to export to, what to name the folders and files, and what file format to export.
# Select the '''Content Model''' from the node tree.
# Click on the ellipsis icon to the right of the '''''Behaviors''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 01.png]]
#<li value=3> When the "Behaviors" window pops up, click on the "+" icon located at the top of the pop-up window.
# Select "Export Behavior" from the drop down menu.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 02.png]]
Now you should see "Export Behavior" listed on the left under the list of behaviors. With that selected you should see a property called '''''Export Definitions''''' show up in the right side of the "Behaviors" window.
#<li value=5> Click on the ellipsis icon to the right of the '''''Export Definitions''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 03.png]]
#<li value=6> When the "Export Definitions" window pops up, click on the "+" icon at the top to access the drop down menu.
# Select "File Export" from the drop down menu.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 04.png]]
==== Setting a Target Folder ====
Now that we have added the Export Behavior, we can start configuring the behavior with instructions on what to do with our documents. Let's start by telling Grooper where to put the documents.
# In the '''''Target Folder''''' property, we need to put a UNC path to where we want Grooper to send the exported documents.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 05.png]]
#<li value=2> In Windows Explorer, navigate to the folder where you want Grooper to export your documents. Right-click the folder and select "Properties" from the pop-up menu to access the folder's properties.
# Click on the "Sharing" tab in the "Properties" pop-up window.
# Under "Network Path:" you will find the UNC path. Copy this to your clipboard.
{|class="fyi-box"
|-
|
'''FYI'''
|
The folder you wish to export to must be accessible from the server Grooper is running on. If the folder is located on a different machine, it must be shared with the Grooper server. If Grooper cannot access the folder, it cannot export to it.
|}
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 06.png]]
#<li value=5> Back in Grooper, paste the UNC path into the '''''Target Folder''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 07.png]]
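If you are unsure whether the text you copied is a UNC path or an ordinary drive-letter path, the short Python sketch below shows one way to tell them apart. It is a convenience check run outside of Grooper, and the server and share names in it are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True when the path starts with a server and share, not a drive letter."""
    # For a UNC path, pathlib reports the server and share together as the "drive".
    drive = PureWindowsPath(path).drive
    return drive.startswith("\\\\")


# Hypothetical paths used only for illustration.
print(is_unc_path(r"\\fileserver\GrooperExports\Invoices"))
print(is_unc_path(r"C:\GrooperExports\Invoices"))
```

The first call reports a UNC path and the second does not, because the second path begins with a drive letter.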
==== Defining a Relative Path ====
When we export our documents, we don't want to just export all of them into the folder we selected. We want to export them in an organized fashion. For this example we want to do the following:
* Create a folder titled "Invoices" for all of our documents.
* Create a subfolder inside "Invoices" named after the company the specific invoice is from.
* Save the document named after the extracted invoice number.
Using the '''''Relative Path''''' property, we can tell Grooper to create numerous folders and subfolders and then select a name for our documents. The folders and subfolders can be named based on text that you specify, or it can be named based on data extracted from the documents.
We do this by writing an expression using a method called "string interpolation".
# Click on the ellipsis icon to the right of the '''''Relative Path''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 08.png]]
We are going to write our expression in the '''''Relative Path''''' window that pops up.
#<li value=2> Start with a dollar sign ($). Then add open quotation marks (").
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 09.png]]
The first folder we want to make is going to simply be titled "Invoices". So we can just type the name we want into the expression.
#<li value=3> Type "Invoices" and follow it with a backslash (\). The backslash marks the end of the folder name, so anything typed after it belongs to the next level down.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 10.png]]
The next folder we want to create is going to be using extracted data to name the folder. We have to include the placeholder for the extracted name inside of curly brackets or {}.
#<li value=4> Next, type an open curly bracket ({).
#* This will bring up a drop down menu called "intellisense". The drop down menu will give you options for things you can use to name folders and files.
# Since we want to name the folder after the company name extracted from the document, select "Company_Name" from the intellisense drop down.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 11.png]]
#<li value=6> Close the curly bracket at the end of "Company_Name" and then type a backslash (\) to start the next level of the path. Type another open curly bracket ({).
# Select "Invoice_Number" from the intellisense drop down.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 12.png]]
#<li value=8> Close the curly bracket at the end of "Invoice_Number" and then finish with a closing quotation mark ("). Now your '''''Relative Path''''' expression is complete.
#* Your final expression should be: <pre>$"Invoices\{Company_Name}\{Invoice_Number}"</pre>
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 13.png]]
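To get a feel for what the finished expression produces, the Python sketch below fills the same placeholders with sample values and joins the result onto a target folder. It only imitates the idea of string interpolation and is not how Grooper evaluates the expression. The field values, the UNC path, and the ".pdf" extension added at the end are all assumptions made for illustration.

```python
from pathlib import PureWindowsPath


def resolve_relative_path(template: str, fields: dict) -> str:
    """Fill each {Field_Name} placeholder with its value (illustration only)."""
    return template.format(**fields)


# Hypothetical values standing in for data extracted from one invoice.
fields = {"Company_Name": "Acme Corp", "Invoice_Number": "INV-1001"}

relative = resolve_relative_path(r"Invoices\{Company_Name}\{Invoice_Number}", fields)
print(relative)

# Hypothetical Target Folder; the extension is assumed to come from the PDF export format.
target_folder = PureWindowsPath(r"\\fileserver\GrooperExports")
full_path = target_folder / (relative + ".pdf")
print(full_path)
```

Each backslash in the expression becomes a folder boundary, so the company name ends up as a folder and the invoice number as the file name.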
==== Export Formats ====
Next, we need to tell Grooper what sort of files to export. We have a number of options from PDFs and Text formats to XML and ZIP formats. For our purposes we only want to export PDFs.
# Click on the ellipsis icon to the right of the '''''Export Formats''''' property.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 14.png]]
#<li value=2> Click on the + icon located at the top of the "Export Formats" pop-up window to access the drop down to add an export format.
# Click on the "PDF Format" in the drop down menu.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 15.png]]
There are a number of properties we can configure for the export formats we choose to apply. For our purposes, we are going to just use all of the default settings, so no changes need to be made.
#<li value=4> Click "OK" in the top right of the pop-up window to apply the changes.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 16.png]]
==== Finish the Export Behavior ====
We have now configured all properties needed for our Export Behavior. Now all we need to do is apply changes and save to our '''Content Model'''.
# Click "OK".
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 17.png]]
#<li value=2> We don't need to add any other behaviors to our list, so go ahead and click "OK" on this window as well.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 18.png]]
#<li value=3> Finally, click the save icon at the top of the property grid to save the changes to the Content Model.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 19.png]]
==== Add the Export Step ====
Now that we have our '''''Export Behavior''''' configured, we need to add an ''Export'' '''Batch Process Step''' to our '''Batch Process'''.
# Right-click on the '''Batch Process'''.
# Hover over "Add Activity", then hover over "Document Processing." Finally, click on "Export..."
# Change the "Step Name" if you like. We will be leaving it as the default "Export".
# Click "EXECUTE" at the top right of the "Add Activity" window.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 20.png]]
#<li value=5> By default, our ''Export'' '''Batch Process Step''' is set to a '''''Scope''''' of ''Folder Level 1''. Since our documents are at Folder Level 1, this will work for us and no changes need to be made.
# You might notice that in the "ACTIVITY PROPERTIES" panel on the right we have an '''''Export Behavior''''' property. Since we already added an Export Behavior on our '''Content Model''', we don't need to do anything here.
[[File:2023.1 Grooper-Basics - Overview 03 Building 06 Deliver Export-Behavior 21.png]]
</div>
</div>
== Finished Batch Process ==
We have now finished designing our project going through the 5 phases of Grooper! Now it's time to see it in action.
There are a few things we need to do before we can start processing documents:
* Publish the Batch Process
* Add a CMIS Connection
* Start the Activity Processing and Import Watcher services
After these steps are completed, then we can import a '''Batch''' and let the '''Batch Process''' automatically process the documents through the 5 phases of Grooper according to the project we designed.
<div style="padding-left: 1.5em;">
=== Publish the Batch Process ===
Before a '''Batch Process''' can be used, we have to "publish" it. This tells Grooper that we are done making changes to the process and it is ready to be used.
# Right-click on the '''Batch Process'''.
# Click on "Publish".
# When the "Publish" window pops up, click "EXECUTE".
[[File:2023.1 Grooper-Basics - Overview 04 Finished 01 Batch-Process 01.png]]
#<li value=4> A copy of your published '''Batch Process''' can be found in the '''Processes''' folder in your node tree.
{|class="fyi-box"
|-
|
'''FYI'''
|
You can still make changes to your '''Batch Process''' in your '''Project''' after it has been published. However, to apply your changes to your working '''Batch Process''' you will need to republish your '''Batch Process'''.
|}
[[File:2023.1 Grooper-Basics - Overview 04 Finished 01 Batch-Process 02.png]]
=== Adding a CMIS Connection ===
In order to import a '''Batch''' into Grooper automatically from a file on a computer, we first need to establish a connection from Grooper to the folder where that file resides. The first step to establishing this connection is adding a '''CMIS Connection''' object.
<b><big>Creating a Folder</big></b>
We are first going to create a new folder in our '''Project''' that will house all '''CMIS Connection''' objects for this project. Folders are handy for keeping your Grooper repository organized as you build it out.
# Right-click on your '''Project''' in the node tree.
# Hover over "Add", and then click on "Folder..."
# When the "Add" window pops up, name the folder. We have named it "Connections".
# Click "EXECUTE" located in the top right of the pop up window.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 01.png]]
<b><big>Creating the CMIS Object</big></b>
Now we can add the '''CMIS Connection''' object.
# Right-click on the folder you just created.
# Hover over "Add" and then click on "CMIS Connection..."
# When the "Add" window pops up, enter in a name for your '''CMIS Connection''' object. We have named it "Grooper Basics CMIS Connection" in this tutorial.
# Click "EXECUTE" in the top right of the pop up window.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 02.png]]
<b><big>Configuring the CMIS Connection</big></b>
# Now you should have a '''CMIS Connection''' object in your node tree. Select the object.
# Click on the hamburger icon to the right of the '''''Connection Settings''''' property.
# Select ''NTFS'' from the drop down menu.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 03.png]]
#<li value=4> Click on the ellipsis icon to the right of the '''''Connection Settings''''' property.
# When the "Connection Settings" window pops up, click on the ellipsis icon to the right of the '''''Repositories''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 04.png]]
#<li value=6> When the "Repositories" window pops up, click on the + icon at the top of the pop up window.
# An entry will pop up on your list located in the left of the pop up window. Make sure this entry is selected.
# You will need to paste the UNC path that leads to the folder you want to import your '''Batches''' from into the '''''Base Path''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 05.png]]
#<li value=9> After pasting the UNC path into the '''''Base Path''''' property, the '''''Repository Name''''' property should autopopulate.
# Click "OK" located in the top right of the pop up window.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 06.png]]
<b><big>Import the Repository</big></b>
Now that we have established a connection to the Windows file system and pointed it at a folder, we need to actually access the files. To do that we need to import the repository.
# Click on the List Repositories icon located in the top right corner of the "REPOSITORIES" panel.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 07.png]]
A list of possible repositories should show up in the "REPOSITORIES" panel. Since the folder we pointed the '''CMIS Connection''' to only has one subfolder, only one repository is showing in the screenshot below. You may have more in your list depending on where you set up your folder on your system.
#<li value=2> Click on the repository you want to import documents from.
# Click the Import Repository icon located in the top right next to the List Repositories icon.
# When the "Import Repository" window pops up, verify that the listed repository is correct.
# Click the "EXECUTE" button at the top of the pop up window.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 08.png]]
#<li value=6> Now you should have a CMIS Repository object in your node tree under your CMIS Connection object.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 09.png]]
#<li value=7> Click over to the "Browse" tab.
# We can see the contents of the folder we selected for our CMIS connection in the middle panel.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 02 CMIS 10.png]]
=== The Activity Processing and Import Watcher Services ===
In order to import '''Batches''' into Grooper, you need to have a couple of services installed and running. In this course we are assuming you already have these services running. If not, please check out the [[Activity Processing]] and the [[Import Watcher]] articles for installation instructions.
With an Activity Processor service and Import Watcher service installed, we need to make sure they are turned on. If they are already running, skip to the [[#Import the Batch]] section.
# Click on the '''Machines''' folder in the node tree.
# You should see a list of servers where you can access services from in the middle panel at the top. Select the server from the list if not already selected.
# Under the list of servers, you should have a list of services that are installed on the selected server.
# To the right there is a "SERVICE PROPERTIES" panel. There are a number of service properties you can configure, but for our purposes, we can stick with the defaults.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 03 Services 01.png]]
<b><big>Turning on the Services</big></b>
# Select the Activity Processing service from the list.
# Click the Start Service icon located in the top right of the panel. It should look like a play button.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 03 Services 02.png]]
#<li value=3> Select the Import Watcher service from the list.
# Click the Start Service icon located in the top right of the panel.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 03 Services 03.png]]
#<li value=5> Now we have an Activity Processor and Import Watcher running on our repository.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 03 Services 04.png]]
=== Import the Batch ===
Ok, now we have all that we need to import and start processing a '''Batch''' through our '''Batch Process'''. We are going to need to import a '''Batch''' from the Imports page in the Grooper web client.
# Click over to the Imports Page by clicking the Imports icon in the Context Toolbar at the top of the screen.
# Click the + icon located on the far right of the Context Toolbar to start a new import job.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 01.png]]
#<li value=3> When the "Submit Import Job" window pops up, enter a short description of the Import Job in the '''''Description''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 02.png]]
#<li value=4> Click on the hamburger icon to the right of the '''''Provider''''' property.
# Select ''Import Descendants'' from the drop-down menu.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 03.png]]
#<li value=6> Under the '''''Provider''''' property, click on the hamburger icon to the right of the '''''Repository''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 04.png]]
#<li value=7> Navigate to the imported repository under your CMIS connection.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 05.png]]
#<li value=8> The '''''Base Folder''''' and '''''Import Filter''''' properties will automatically populate with information that will tell Grooper to import all documents from the repository we selected.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 06.png]]
#<li value=9> Scroll down in the window and open the '''''Batch Creation''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 07.png]]
#<li value=10> Click the hamburger icon to the right of the '''''Starting Step''''' property.
# Navigate to and select the step in your '''Batch Process''' where you want Grooper to start processing the imported documents. We want to start with the ''Split Pages'' '''Batch Process Step'''.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 08.png]]
#<li value=12> Uncheck the '''''Start Paused''''' property if you want Grooper to start processing as soon as the '''Batch''' is imported.
# Click the "SUBMIT" button at the top of the pop up window to import the '''Batch''' and start the '''Batch Process'''.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 09.png]]
<b><big>Running Through the Batch Process</big></b>
# After submitting the job, you'll see the Import Job in the list of Imports. It should have a Status of "Working" if you unchecked the '''''Start Paused''''' property.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 10.png]]
#<li value=2> Click on the Batches icon in the Context Toolbar to go to the Batches page.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 11.png]]
#<li value=3> You will see the '''Batch''' that is currently being processed in the list of '''Batches'''. Select the '''Batch''' if not already selected.
# At the bottom you should see what looks a bit like a bar graph. This will show you what Grooper is currently processing.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 12.png]]
#<li value=5> When the '''Batch Process''' is complete, you should see all <span style="color:blue">blue</span> bars for every '''Batch Process Step''' at the bottom of the screen.
{|class="attn-box"
|
&#9888;
|
While Grooper is running through the '''Batch Process Steps''' you may see the bars temporarily have a <span style="color:green">green</span> color.
If you find that one or more of your bars turn a <span style="color:red">red</span> color, then there was an error while processing that step. You will need to take a look at those '''steps''' in your '''Batch Process''' and see if there is something configured incorrectly.
|}
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 13.png]]
#<li value=6> Now if you look in the folder where you set your '''''Export Behavior''''', your PDFs should be properly named and organized.
[[File:2023.1 Grooper-Basics - Overview 04 Finished 04 Import-Job 14.png]]
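If you would rather check the result with a script than by browsing folders, a minimal Python sketch like the one below can list the exported PDFs grouped by company folder. It assumes the folder layout produced by the '''''Relative Path''''' expression from this tutorial (an "Invoices" folder, then one folder per company). The demo builds a temporary folder tree with made-up names so it can run anywhere.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def summarize_export(invoices_root: Path) -> dict:
    """Map each company subfolder to the sorted PDF file names inside it."""
    summary = defaultdict(list)
    for pdf in invoices_root.glob("*/*.pdf"):
        summary[pdf.parent.name].append(pdf.name)
    return {company: sorted(names) for company, names in sorted(summary.items())}


# Build a throwaway folder tree with made-up names so the sketch runs anywhere.
# To check a real export, point invoices_root at the "Invoices" folder inside
# your own Target Folder instead.
with tempfile.TemporaryDirectory() as tmp:
    invoices_root = Path(tmp) / "Invoices"
    samples = [("Acme Corp", "INV-1001"), ("Acme Corp", "INV-1002"), ("Globex", "7734")]
    for company, number in samples:
        folder = invoices_root / company
        folder.mkdir(parents=True, exist_ok=True)
        (folder / f"{number}.pdf").touch()
    demo_summary = summarize_export(invoices_root)

for company, files in demo_summary.items():
    print(company, files)
```

A company folder with no PDFs, or PDFs sitting directly in "Invoices", would suggest the extracted company name or invoice number came back empty for some documents.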
== BONUS: Ad-Hoc Jobs ==
As a bonus, we are going to share a slightly more advanced tip with you. When testing your '''Batch Process Steps''', you can actually do something called Ad-Hoc processing. This allows you to test the whole '''Batch''' all at once without having to select the individual folders or '''page''' objects from the Batch Viewer first. This can come in handy when you can't select all of the folders and '''pages''' all at once.
{|class="attn-box"
|
&#9888;
|
An Ad-Hoc job can only be performed if you have an Activity Processing service running on the Grooper server. Make sure you install and start the service before continuing.
|}
# In our example below, we want to test the ''Recognize'' step in the '''Batch Process''' and we have different pages inside different folders. Since they are in different folders, we cannot select all of the '''page''' objects at once. Running ''Recognize'' on each one individually can take a while.
# Instead of clicking on the Test icon, with an Activity Processing service running, we can submit an Ad-Hoc job. Click on the "Submit a Job" icon located to the right of the Test icon. This will run the activity at the folder level as configured on the "Batch Process Step" tab.
[[File:2023.1 Grooper-Basics - Overview 05 Bonus 01 Ad-Hoc 01.png]]
#<li value=3> When the "Submit Job" window pops up, click "OK" at the top of the pop up window.
[[File:2023.1 Grooper-Basics - Overview 05 Bonus 01 Ad-Hoc 02.png]]
#<li value=4> You should automatically be taken to the Jobs page. You will see your Ad-Hoc job in the list on the page and can see when it is complete.
[[File:2023.1 Grooper-Basics - Overview 05 Bonus 01 Ad-Hoc 03.png]]
</div>
</div>

Latest revision as of 15:10, 30 July 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


This article serves as a beginner's guide to Grooper. In this article we will go over the most basic concepts and activities within the Grooper software so you can start using it for your company's needs.

You may download the ZIP(s) below and upload it into your own Grooper environment (version 2023.1). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article. The third contains PDFs to be used when generating a new project (do not try to upload into Grooper; just unzip and use the PDFs per the tutorial instructions).

Introduction

Grooper is a powerful document processing software that can be tailored to your company's individual needs.

When first diving into Grooper, the software can be a bit overwhelming, especially if you have not worked with document processing software before. The purpose of this tutorial is to familiarize users with the fundamental basics of Grooper. By the end of this tutorial you should understand how to take a set of documents from beginning to end through a very simple process and export out of Grooper.

The 5 Phases of Grooper

From the time that documents are pulled into Grooper to when those same documents are exported from Grooper, all those documents must go through what we call the "5 Phases of Grooper".

The 5 Phases of Grooper are as follows:

  1. Acquire
  2. Condition
  3. Organize
  4. Collect
  5. Deliver

This tutorial is structured to go through each of the phases one by one. Other tutorials and courses will build on what we learn here. For now, let's break down each of these phases and see what we can expect as we build out our project.


Phase 1: Acquire
The "Acquire" phase involves bringing the documents to be processed into Grooper.


Phase 2: Condition
The "Condition" phase is where we take the documents we have brought into Grooper and edit them until we have something Grooper can work with. Before Grooper can do anything with a document, it needs to be able to Recognize the text on the document. That is one of the most important parts of the Condition phase.


Phase 3: Organize
Once Grooper can recognize text it then needs to be able to tell what the document is. Without our input, Grooper just sees the document as pages in a folder. We need to give the document a name or a "classification". We classify documents in the "Organize" phase.


Phase 4: Collect
In the "Collect" phase, Grooper can finally collect or "extract" information from the documents.


Phase 5: Deliver
The final phase of Grooper is the "Deliver" phase where we export the documents and metadata out of the Grooper software to wherever you wish to store the files.


These five phases encompass a full project in Grooper. In the next sections, we are going to put together a project step-by-step, going through each phase.

Navigating the Web Client

FYI

At the top of this page you will find some downloadable zip files that contain a Project, a Batch, and some PDFs. If you would like to follow along in this section of the tutorial, download these zip files and upload the Project and Batch to your Grooper environment. For instructions on how to do this, visit our Download or Upload Grooper Objects wiki page. The PDFs will be used in a later section when we start to build out a Project from scratch.

Before we continue, let's take a moment to look at Grooper's interface. For now we are only going to take a look at the Home page and the Design page as that is where we will be spending most of our time in this tutorial.

The Home Page

  1. Context Toolbar: At the top of the screen you'll find a toolbar with several icons to the left, the page name, repository name, and Licensee information in the middle, and a few extra icons to the right. The leftmost icons will take you to different Grooper pages, with the first icon taking you to the Grooper home page. The rest of the icons are described more below as they are also found under "Navigation Links". The icons on the right allow you to change the Grooper Repository you are working in, look at your user information, and bring up the in-app Grooper help.
  2. Navigation Links: In the first panel in Grooper at the top left you will find the Navigation Links. These icons will take you to the different Grooper pages:
    • Design: This is where we will be working for the majority of this tutorial. The Design page is where you configure your Grooper Project.
    • Batches: On the Batches page you can view all Batches that are currently in production. You can also add new Batches from this page.
    • Tasks: Here you can view any production Batch that is pending review.
    • Imports: You can view a list of recent imports on the Imports page.
    • Jobs: Here you can see what jobs have been submitted and what stage each job is in.
    • Stats: The Stats page will take you to a place where you can view various Grooper statistics.
    • Learn: This icon will take you to the home page of our Grooper University courses at learn.grooper.com.
    • Wiki: This icon will take you to the home page of our Grooper Wiki website at wiki.grooper.com.
  3. Repository Info: In the second panel on the home page you can get a lot of information about what is currently contained within and is being processed in the repository.
  4. Recent Events: The final panel of the home page will show information regarding recent processing events and errors.


The Design Page

For now, we are going to focus on the Design page, as that is where we will be doing most of our work in this tutorial. In this section, we will look at the UI of the Design page and explain how things are organized to make it easier to navigate and follow along with the rest of this tutorial.

  1. To navigate to the Design page, click on the hammer and wrench icon in the Context Toolbar or you can click on the same icon located on the home screen in the "Navigation Links" panel.
  2. At the top-middle of the window, the page you are currently on will be displayed.
  3. The first panel on the left side of the screen is the "Node Tree". This is where you will navigate through the different elements in your repository.
  4. To the right of the Node Tree you will find different panels of information and configuration options for the element you have selected in the Node Tree. These panels will be different depending on the type of object you have selected.


  1. The first element at the top of the Node Tree is called the "Root Node". Here you can find various information about your repository, including your licensing information.
  2. For each element in the Node Tree selected, there will be multiple tabs you can access with different information and options. These tabs are located at the top of the configuration panels.

The Batches Folder

  1. The Batches folder in the Node Tree is where we will find all documents brought into Grooper as a Batch.
  2. Inside of the Batches folder are two folders: Production and Test. We will be working in the Test folder.


  1. Inside of the Test folder is where all Batches of documents you are using to test your project will be kept. You can create subfolders inside the Test folder to keep Batches organized. For now we are going to look at our first object in Grooper: a Batch object.
  2. With the Batch object selected, click on the "Viewer" tab.


FYI

You might notice that the icon to the right of "2023.1_Grooper-Basics_Batch" looks different than the other folder icons. Different elements in the Node Tree have different icons depending upon what type they are. Folders have a folder icon, whereas objects have an icon that represents the object itself. This icon represents a Batch object.

It is recommended that you familiarize yourself with the different icons and what objects they represent as you learn Grooper. It will make navigating the Node Tree much easier and will help the object hierarchy make sense.


  1. The first panel to the right of the Node Tree in the Viewer tab is the Batch Contents panel. Here you can see the various folders and pages of your Batch.
  2. To the right of the Batch Contents panel is the Document Viewer. Here you can see a preview of the selected page.


  1. If we open up the various folders in the Node Tree under the Batch object, we can see the same folders and pages shown in the Batch Contents panel.

The Projects Folder

  1. The Projects folder will contain all of your various Grooper projects.
  2. You can create multiple projects at once and make folders to keep things organized.


  1. Inside of your project object, you will add various objects and folders that will build out your project.
  2. Just like in the Batches folder, when you select an object, you will have different tabs you can access.
  3. You will have a properties panel and various other panels depending on what tab you are accessing on which object.
  4. At the bottom of the screen you will see a panel that contains the Grooper in-app help. It will display definitions, tips, and explanations of whatever object or property you have selected.

Building the Project

Now we are going to actually build out a project from start to finish showing the most basic functionality of Grooper. This will give you a basic understanding of Grooper's fundamentals and you can build out your knowledge from there using other tutorials and courses.

FYI

If you wish to follow along with this portion of the course, you will need the downloadable PDFs from the top of this article. Those are the documents we will be processing in this tutorial.

Before we start building, let's take a look at what we hope to accomplish in our project. For this project we have a few invoices we want to bring into Grooper. We want Grooper to take these documents and move them to a folder, but name them in such a way that they are organized and easier to search through. We want Grooper to create a folder named after the company listed on the invoice and then name the file based on the Invoice Number.


We want Grooper to do all the work for us in an automated fashion, so we are going to put together a Batch Process, which essentially tells Grooper how to process the document.

  1. Within the Batch Process we will have various Batch Process Steps, which are the individual ordered instructions for what to do with a Batch.

According to our Batch Process, Grooper will perform the following in order:

  1. Split Pages: Grooper will take the documents in the Batch and expose, or "split" out, the individual pages of those documents as child Batch Page objects.
  2. Recognize: For Grooper to be able to do anything with the documents, it first needs to understand and be able to "recognize" the text on the document. That happens in this step.
  3. Classify: While we as humans might easily understand a document is an "invoice", Grooper must have explicit instructions that allow it to identify a type of document. In Grooper, the act of identifying and assigning a "type" to a document, like "invoice", is known as classification.
  4. Extract: Grooper will then collect information from the document. For our needs we will be extracting the company name and invoice number.
  5. Export: Finally, the documents will be exported into a new folder named after the company and the PDF file names will reflect the invoice number.
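
To keep the order of operations straight, it can help to picture the Batch Process as a plain ordered list. The Python sketch below is only a mental model, not Grooper code or a Grooper API. The `Step` type and the `describe` function are hypothetical names, and the scopes shown are the default values that appear on each step later in this tutorial.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    """Hypothetical stand-in for a Batch Process Step (not a Grooper type)."""
    name: str
    scope: str                   # "Folder" or "Page"
    folder_level: Optional[int]  # only used when scope is "Folder"


# The five steps of this tutorial, in the order they run.
BATCH_PROCESS = [
    Step("Split Pages", "Folder", 1),
    Step("Recognize", "Page", None),
    Step("Classify", "Folder", 1),
    Step("Extract", "Folder", 1),
    Step("Export", "Folder", 1),
]


def describe(process: List[Step]) -> List[str]:
    """Return one readable line per step, numbered in run order."""
    lines = []
    for number, step in enumerate(process, start=1):
        if step.scope == "Folder":
            target = f"Folder level {step.folder_level}"
        else:
            target = "Page"
        lines.append(f"{number}. {step.name} -> {target}")
    return lines


for line in describe(BATCH_PROCESS):
    print(line)
```

Only Recognize works on pages. Every other step in this tutorial works on the document folders at Folder Level 1.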

The file set below is our expected outcome:

Phase One: Acquire

Let's get started building our Project.

We need to start with the first phase of Grooper: Acquire.

Before we can do anything with Grooper, we first need to give Grooper something to work with. We need to bring documents into Grooper. The documents are brought into Grooper into a Batch object. So the first thing we need to do is create a Batch object where we can keep our documents.

  1. In the Batches folder, right-click on the Test subfolder.
  2. Hover over "Add" and then click on "Batch...".


  1. When the "Add" window pops up, type in a name for your Batch.
  2. Click "EXECUTE" to create the Batch object.


  1. Now we should have a new Batch object in our Node Tree.
  2. With the Batch object selected, click on the "Viewer" tab.


  1. Now we have an empty Batch. In the viewer tab we can see we have a Batch Folder, but nothing inside.


  1. We want to drag and drop the PDF documents from a file on our computer or server to the Batch Folder.


  1. Now our documents are copied into Grooper. This completes the "Acquire" phase of Grooper for this project.

Folder Levels

Before we go any further, we need to talk about a concept called "Folder Levels" in a Batch.

When you first create a Batch, there's only a single empty folder inside. When you drag and drop documents into that folder, more folders are created with those documents inside. These newly created folders are considered to be "inside" of the original Batch Folder.

These first folders inside of the root folder are considered to be at a Folder Level of 1. A folder that exists inside of a folder at Level 1 would be considered Level 2. A folder inside of a Level 2 folder would be considered at Level 3, and so on.

Inside these folders, there can also be a new type of object called a Page object. A Page can exist at any level, but will always be referred to as being at a "Page Level". You do not need to keep track of what level the pages are at.

This concept can be confusing to grasp at first, so below we have some graphics to try and make this more clear.





In the Batch we just created, the original folder that was part of the Batch when it was created is at the "Batch Level" and the folders that were created when we added the documents to the Batch are at "Folder Level 1".
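
If you think in code, the same idea can be expressed as a small tree. The Python sketch below is an illustration only and does not use any Grooper API. The `Node` class, the `describe_levels` function, and the document names are all hypothetical. It labels the root folder as "Batch Level", each nested folder by its depth, and every page as "Page Level" regardless of depth.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical model of a Batch object: either a folder or a page."""
    name: str
    kind: str  # "folder" or "page"
    children: List["Node"] = field(default_factory=list)


def describe_levels(node: Node, depth: int = 0) -> List[str]:
    """Walk the tree and label each node with its level."""
    if node.kind == "page":
        label = "Page Level"          # pages are never numbered by depth
    elif depth == 0:
        label = "Batch Level"         # the root Batch Folder
    else:
        label = f"Folder Level {depth}"
    lines = [f"{'  ' * depth}{node.name}: {label}"]
    for child in node.children:
        lines.extend(describe_levels(child, depth + 1))
    return lines


# A Batch like ours after pages have been split out of two documents.
batch = Node("Batch Folder", "folder", [
    Node("Document 1", "folder", [Node("Page 1", "page")]),
    Node("Document 2", "folder", [Node("Page 1", "page")]),
])

for line in describe_levels(batch):
    print(line)
```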

This will become important as we continue through the rest of the Grooper phases.

Phase Two: Condition

We have completed the first phase of Grooper. Now it is time to move on to the second phase: Condition.

Conditioning documents can involve a lot of different things. The point of conditioning the documents is to turn the documents into something that Grooper can better understand and work with. In this tutorial we are going to do two things with our documents:

  1. We are going to split out the pages of our documents so that Grooper will understand what a page is. After we split the pages, we will have a "page object" in Grooper.
  2. Once we have page objects in Grooper, we are going to run an activity called "Recognize" on the page objects. Once we have run that activity, Grooper will be able to understand or "recognize" the text present on the document.

Creating a Project

Now that we have a Batch to process, we must start building our Project. The first step is, of course, creating a new Project in Grooper.

  1. In the Node Tree, right-click on the "Projects" folder.
  2. In the menu that pops up, hover over "Add" and then select "Project...".


  1. When the "Add" window pops up on the screen, enter in a name for your project.
  2. When finished, click "EXECUTE" in the top right corner of the pop-up window.


  1. Now you should have a new Project in your Node Tree.


The Batch Process

What is a Batch Process?

One of the main advantages to using Grooper is being able to automate your document processing. The Batch Process is how your Grooper project is automated. The Batch Process is the set of instructions that tells Grooper what to do with the documents we give it. We will tell Grooper step-by-step what to do.

Throughout the rest of the tutorial, we will be adding these individual Steps to our Batch Process. The first Batch Process Step we will be adding is called Split Pages.

FYI

The Batch Process Steps are "objects" in Grooper. The Batch Process Steps apply what is called an "activity" to the Batch. So, the Split Pages Step is applying the "split pages activity" to the Batch. This distinction will become more important as you learn more about Grooper.

Creating a Batch Process

Let's go ahead and add the Batch Process object to our Grooper Project.

  1. Right-click on your project.
  2. Hover over "Add" and then click on "Batch Process..."


  1. When the "Add" window pops up, type in a name for your Batch Process.
  2. Click "EXECUTE" in the top right-hand corner of the pop-up window.


  1. Now you should have a Batch Process object in your node tree under your Project.


Split Pages

Let's now add our first Batch Process Step: "Split Pages".

  1. Right-click on your Batch Process object.
  2. Hover over "Add Activity", then hover over "Transform". Finally, click on "Split Pages...".


  1. When the "Add Activity" window pops up, you can change the name of the Batch Process Step if you like. We are going to leave it as the default "Split Pages".
  2. Click "EXECUTE" located in the top right-hand corner of the "Add Activity" window.


  1. Now we have a Split Pages Batch Process Step in our node tree.
  2. We can see in the property grid (the properties panel) that by default the Scope of the Batch Process Step is set to Folder.
    • This is where our earlier conversation about folder levels becomes important.
  3. Also by default, the Folder Level property is set to 1. Since our documents are at Folder Level 1, this will work for us nicely.
  4. With everything properly configured, let's click over to the "Activity Tester" tab.


Selecting Your Batch

  1. Right now we do not have a Batch selected so our Batch Viewer (the middle panel labeled "TEST BATCH") is blank.
  2. Click on the "Browse Batches" icon.
    • This is the left-most icon at the top-right of the Batch Viewer.


  1. Navigate to and select the Batch you want to apply the Batch Process Step to.
  2. Click "OK" at the top right of the pop-up window.


  • Now we have a Batch we can navigate through in our Batch Viewer.
    1. On the right, the Document Viewer shows a preview of the selected document.


    Testing the Split Pages Activity
    1. Click the first document in the Batch, hold down shift, then click the last document in the Batch to select all of the documents.
    2. With all documents selected, click the Test icon that looks like a play button inside of a circle in the top right of the Batch Viewer. This will run the Split Pages Step on the documents selected.


    1. The documents have now been split out into individual pages. Each Document Folder now has a Page object.

    FYI

    Page objects are important as certain activities can only be run on a page and not a folder. Likewise, there are certain activities that can only be run on a folder and not a page.


    Recognize

    The next step is for Grooper to recognize the text on the documents. It is recommended to run the Recognize Activity on a page object rather than a folder. That is why we ran the Split Pages Activity before running Recognize. Now that we have pages in our batch, let's add a Recognize Batch Process Step.

    Add the Recognize Step

    1. Right-click on the Batch Process object in the node tree.
    2. Hover over "Add Activity", then hover over "Cleanup & Recognition". Finally, click on "Recognize".


    1. When the "Add Activity" window pops up, you can change the name of the step if you like. However, we are going to leave it as the default "Recognize".
    2. Click "EXECUTE" in the top right corner of the pop-up window.

    1. Select the Recognize Batch Process Step in the node tree.
    2. If we look at the Property Grid, we can see that the Scope is set to Page by default. Since we want to run Recognize at a page level, we will leave this property as is.
    3. With the Batch Process Step properly configured, click on the "Activity Tester" tab.

    FYI

    The documents we are working with are PDFs with native text embedded in the document. If we were working with scanned documents or images with no native text, we would have to configure the Recognize Batch Process Step to run OCR on the documents. With native text, there is nothing more we need to configure.


    1. If you select the folder at "Folder Level 1"...
    2. ... notice that the play button at the top right of the Batch Viewer is grayed out. This is because the Batch Process Step is set to a "Page Level" Scope.


    1. If we select one of the pages in the batch...
    2. ... the play button will no longer be grayed out and we can run the recognize activity.


    Check the Text

    1. After running Recognize on the page, if we click on the PDF icon in the top right of the Document Viewer, we can now see we have a "Text" option in the drop-down.
    2. Click "Text" from the drop-down.


    1. Now we can see the full text that was recognized by Grooper in the Document Viewer.


    1. To get back to the original page, click on the text icon to access the drop-down.
    2. Click on "Page" in the drop-down.


    Repeat the Steps

    1. Repeat the previous steps for each page object in the batch to recognize the text on every page.

    Phase Three: Organize

    With phases one and two complete, it is time to move on to phase three: Organize.

    The most important part of the Organize phase is Classification. Before Grooper can do anything more with a document, we first must tell Grooper what the document is. Without explicit instructions to define a "type" of document, Grooper cannot tell the difference between documents such as an invoice or a college transcript. These two documents would need to be processed very differently.

    In our case, our batch only has one type of document so that makes our process of classification simple. We just need to tell Grooper that all of the documents are invoices.

    Adding a Content Model

    Before we can classify a document, we need to first introduce a new object in Grooper. This object is called a Content Model.

    A Content Model is a Grooper object that houses other objects like Data Elements (these are used in the Collect Phase of Grooper) and Document Types. A Document Type is another object we will be using that essentially gives a document a classification or "name", but we will get to that later in this tutorial.

    For now what you need to know is that in order to classify a document, we first need a Content Model in our Project. So, let's add one.

    1. Right-click on your Project.
    2. Hover over "Add" and then click on "Content Model..."


    1. When the "Add" window pops up, enter in a name for your Content Model. Here we have named it "Grooper Basics Content Model".
    2. Click "EXECUTE" in the top right hand corner of the pop-up window.


    1. Now you should have a Content Model object in your node tree.


    Adding a Document Type

    Now we're going to add a Document Type to our Content Model. This Document Type will eventually be assigned to the documents in our Batch to tell Grooper what type of document we are working with.

    1. Right-click on the Content Model.
    2. Hover over "Add" then click on "Document Type...".


    1. Give a name to your Document Type. It should reflect the type of document you will be processing through Grooper. Here we have named our Document Type "Generic Invoice".
    2. Click "EXECUTE" in the top right hand corner of the pop-up window.


    1. Now we should have a Document Type in our Content Model.


    Setting the Default Content Type

    Now we need to tell Grooper how to classify a document. Since all of the documents in our batch are going to be assigned the same Document Type, we can simply tell Grooper that all of the documents need to be classified with the "Generic Invoice" Document Type.

    1. Select the Content Model in your node tree.
    2. Click on the hamburger icon to the right of the Default Content Type property.
    3. Navigate to and select the Document Type we just created.


    1. Now we should have the Generic Invoice showing as the Default Content Type. This means that all documents referencing this Content Model will be classified as a "Generic Invoice".
    2. Any time you make a change to the properties of an object, you must either save or cancel your changes before you can navigate to another object in the node tree. There is a save icon and a cancel icon located in the top right corner of the property grid. Here we are going to click the save icon to save our changes.


    The Classify Step

    Ok, we've now done all the set up needed to classify our documents. Let's set up our Classify Batch Process Step in our Batch Process.

    1. Right-click on your Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Classify..."


    1. When the "Add Activity" window pops up, you can change the name of the step if you like. We are going to leave it as the default of "Classify".
    2. Click "EXECUTE" located in the top right of the pop-up window.


    1. Now you should have a Classify Batch Process Step in your node tree.
    2. By default, the Scope is set to a Folder Level of 1. Classification can only be run on a folder level. Page objects cannot be classified. We are going to leave these properties as the default settings.


    1. Click on the hamburger icon to the right of the Content Model Scope property.
    2. Navigate to and select the Content Model we will be using for classification from the drop down.


    1. Click the save icon in the top right of the property grid.
    2. Click on the "Activity Tester" tab.


    Testing the Classify Step

    1. Notice that on the "Activity Tester" tab, we currently have a page in our batch selected in the Batch Viewer.
    2. The play button is grayed out because our Classify Step is set to Folder Level 1. The step cannot be tested on a Page level.


    1. Instead we are going to select the folder at level 1. Classification must be run on a folder level.
    2. Now the play button is no longer grayed out.
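
The grayed-out play button follows a simple rule that can be sketched in Python. This is an illustration of the behavior we have seen in the Activity Tester so far, not Grooper's actual logic. The `can_test` function is a hypothetical name, and the assumption that a Folder-scoped step only accepts folders at its exact Folder Level goes beyond what this tutorial demonstrates.

```python
from typing import Optional


def can_test(step_scope: str,
             selected_kind: str,
             step_folder_level: Optional[int] = None,
             selected_folder_level: Optional[int] = None) -> bool:
    """Return True when the play button would be enabled in this model."""
    if step_scope == "Page":
        # Page-scoped steps (like Recognize) need a page selected.
        return selected_kind == "page"
    if step_scope == "Folder":
        # Folder-scoped steps (like Classify) need a folder selected.
        # Assumption: the folder must also sit at the step's Folder Level.
        return (selected_kind == "folder"
                and selected_folder_level == step_folder_level)
    return False


# Recognize (Page scope) with a Level 1 folder selected: grayed out.
print(can_test("Page", "folder", selected_folder_level=1))
# Classify (Folder Level 1) with a Level 1 folder selected: enabled.
print(can_test("Folder", "folder", 1, 1))
```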


    1. Click on the first level 1 folder, hold the shift key, then click on the last level 1 folder to select all of the level 1 folders.
    2. Click the play button to run the classify activity on the batch.


    1. Now all of the documents are classified. The word "Document" has been replaced with "Generic Invoice".

    Phase Four: Collect

    Now it's time to enter into Phase Four: the Collect phase. Here we're actually going to extract some information from the document. All the prior phases have been preparing the documents for extraction.

    Let's remember what our overall goal is for this project. When Grooper exports our documents, we want them to be filed into a folder named for the company name on the invoice and then we want each file to be named after the invoice number. That way, the documents will be well organized in our file system.

    So, let's collect the company name and invoice numbers from these documents.

    The Data Model

    A Data Model is an object in Grooper that acts as the container for holding all other objects that extract the data from the documents. Before we can start collecting information from the documents, we need to add a Data Model to our project.

    1. Right-click on the Content Model.
    2. Click on "Create Data Model".


    1. Now you should have a Data Model object in your node tree.


    The Data Field

    There are multiple extraction objects you can use to collect data from the documents, but we are going to focus on just one: the Data Field. Let's go ahead and add our first Data Field to collect the company name from these documents.

    1. Right-click on your Data Model in the node tree.
    2. Hover over "Add" and then click on "Data Field".


    1. Enter in a name for your Data Field. We want to collect the company name with this Data Field so we have named it appropriately as "Company Name".
    2. Click "EXECUTE" located in the top right corner of the pop-up window.


    1. Now you should have a Data Field in your node tree.
    2. There are various properties you can configure for a Data Field located in the Property Grid.
    3. In the right panel, you can see a preview of what the name and field will look like.


    The List Match Extractor

    There are many different Value Extractors we can use to collect data off of a document. To collect the company name we are going to use a List Match.

    1. Click the hamburger icon next to the Value Extractor property.
    2. Select List Match from the drop-down menu.


    1. Click on the ellipsis icon to the right of the Value Extractor property.


    The Value Extractor Window

    No matter what Value Extractor you decide to use, you will be configuring the extractor using the Value Extractor window. There are several parts to the Value Extractor window.

    1. In the box under "LOCAL ENTRIES" is where you will type out what you want to extract.
    2. In the Batch Viewer you can select the folder or page you want to look at.
    3. In the Document Viewer, you can see a preview of the document you selected in the Batch Viewer.
    4. In the right hand corner we have the Results Viewer or Results List panel. You will be able to see a list of what Grooper returns from your extractor.


    Configuring the Extractor

    You need to make sure you have the correct batch pulled up in your Batch Viewer/Document Viewer so you can see what is being extracted.

    1. If you do not have the right batch selected, click on the browse button in the top right of the Batch Viewer.
    2. When the window pops up, navigate to and select the batch you want to extract from.
    3. Click "OK" in the top right of the pop-up window.


    1. Type what you want returned from the document into the "LOCAL ENTRIES" field.
    2. Anything that is being returned by our List Match will be highlighted in green in the Document Viewer.
    3. In the Results List we are getting two text segments returned by our extractor.
    4. Click "OK" in the top right corner of the "Value Extractor" window.
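
Conceptually, a List Match looks for each entry in the list as literal text and returns every place it finds one. The Python sketch below mimics that idea so you can see why a single entry can produce two results. It is not how Grooper implements List Match. The `list_match` function, the sample text, and the company name "Acme Supply Co." are all made up for illustration, and the sketch is case-sensitive as a simplification.

```python
import re
from typing import List, Tuple


def list_match(text: str, entries: List[str]) -> List[Tuple[int, str]]:
    """Return (offset, matched text) for every literal occurrence of an entry."""
    hits = []
    for entry in entries:
        # re.escape makes characters like "." match literally.
        for match in re.finditer(re.escape(entry), text):
            hits.append((match.start(), match.group()))
    return sorted(hits)


# Hypothetical page text where the company name appears twice.
page_text = (
    "Acme Supply Co.\n"
    "Invoice\n"
    "Remit payment to: Acme Supply Co.\n"
)

results = list_match(page_text, ["Acme Supply Co."])
for offset, value in results:
    print(offset, value)
```

Both hits carry the same value, which is why it does not matter here which of the two segments ends up in the Data Field.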


    1. Click the save icon in the top right of the property grid to save the changes made to the Data Field object.


    Testing the Extraction

    1. Click over to the "Tester" tab.
    2. Select the document you want to test.
    3. Click on the test icon. This should look like a play button in the top right hand corner of the Batch Viewer.


    1. Now we can see what is being extracted. The text is difficult to see though.


    1. Click over to the "Data Field" tab.
    2. The Display Width property refers to the width of the Data Field in the extraction preview.


    1. Let's change the value of the Display Width to 175.
    2. Click the save icon to save the changes to the Data Field.


    1. If we go to the "Tester" tab and test the extraction again...
    2. Now we can read the full name of the company being extracted.


    Pattern Match

    Now that we have a List Match extractor collecting the company name from the documents, let's collect the invoice number.

    The company name is static. We can extract the exact name. An invoice number is highly variable. It is a sequence of numbers that is different on every document. We can use something called a Pattern Match to extract these invoice numbers.

    1. Add a new Data Field to your Data Model to collect the invoice number on our invoices.


    1. Click on the hamburger icon to the right of the Value Extractor property.
    2. Select Pattern Match from the drop down menu.


    1. Click the ellipsis icon to the right of the Value Extractor property.


    1. When the "Value Extractor" window pops up, you may notice that it looks very similar to using a List Match. For a Pattern Match, we can use regex to match a value on the document.
      • The invoice numbers on these invoices follow a consistent format: two digits, followed by a hyphen, followed by four digits.
      • We can write the regex pattern \d{2}-\d{4} to collect the invoice numbers. We have entered that as the "Value Pattern".
    2. In the Document Viewer we can see that the invoice number is being returned.
    3. In the Results List we can see Grooper found two matches on the document. The values are the same, so it doesn't matter which one is returned.
    4. Click "OK".
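
If you want to experiment with the pattern outside of Grooper, the short Python sketch below applies the same regular expression to some made-up text. The sample strings and the invoice number "12-3456" are hypothetical, and Python's regex engine is used here only for convenience, so small differences from Grooper's pattern matching are possible.

```python
import re

# Same pattern as the Value Pattern above: two digits, a hyphen, four digits.
INVOICE_PATTERN = re.compile(r"\d{2}-\d{4}")

# Hypothetical page text where the invoice number appears twice.
sample = "Invoice No: 12-3456\nPlease reference invoice 12-3456 when paying."
print(INVOICE_PATTERN.findall(sample))

# The pattern is not anchored, so it can also match inside a longer number.
print(INVOICE_PATTERN.findall("Phone: 555-1234"))

# Adding word boundaries is one way to require a standalone match.
STRICT_PATTERN = re.compile(r"\b\d{2}-\d{4}\b")
print(STRICT_PATTERN.findall("Phone: 555-1234"))
print(STRICT_PATTERN.findall(sample))
```

For the sample invoices in this tutorial the simple pattern is enough. On documents that also contain phone numbers or similar values, a stricter pattern like the second one may be worth considering.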


    1. Click the save icon in the top right of the property grid.


    Testing on the Data Model

    1. Click on the Data Model.
    2. In the preview we can see both Data Fields that are child objects of the Data Model.
    3. Click over to the "Tester" tab.


    1. Select a document from the batch in the Batch Viewer.
    2. Click on the test icon at the top right of the Batch Viewer.
    3. We can see what is being returned for all of the Data Fields that are child objects of the Data Model in the Data Element Tester panel located above the Document Viewer panel.


    The Extract Step

    Now that we have our extraction configured, let's go ahead and add an Extract Batch Process Step to our Batch Process.

    1. Right-click on the Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Extract..."


    1. When the "Add Activity" window pops up, feel free to change the Step Name if you like. We are going to leave it as Extract.
    2. Click "EXECUTE".


    1. Now you should have an Extract Batch Process Step in your node tree.
    2. The scope for this step is set to a Folder level by default.
    3. The Folder Level property is set to 1 by default. Since our documents are at a Folder Level 1, this will work for us.


    1. Click over to the "Activity Tester" tab.
    2. Select all of the documents at Folder Level 1 in your Batch Viewer.
    3. Click on the test icon to run the Extract Activity on your Batch.


    1. Click over to the Data Model in your node tree.
    2. In the "Tester" tab, click on one of your documents at Folder Level 1.
    3. The extracted information should appear in your Data Fields without needing to click on the test icon now because we have already extracted the information through the Extract Activity in our Batch Process.

    Phase Five: Deliver

    Ok, we have "Acquired" our documents by bringing them into Grooper. We then "Conditioned" the documents by splitting out pages and running the Recognize Activity on the page objects. After that we "Organized" the documents by assigning a Document Type to each document in a process called "classification". Once all that was done, we "Collected" the data from the documents by setting up Data Fields to extract the data we wanted.

    Now in our fifth and final phase of Grooper, it's time to take all the data we collected and use that to "Deliver" or export the documents into organized folders.

    Adding an Export Behavior

    First, we need to add something called an "Export Behavior". This is where we give Grooper the instructions on how we want it to export our documents such as where to export to, what to name the folders and files, and what file format to export.

    1. Select the Content Model from the node tree.
    2. Click on the ellipsis icon to the right of the Behaviors property.


    1. When the "Behaviors" window pops up, click on the "+" icon located at the top of the pop-up window.
    2. Select "Export Behavior" from the drop down menu.


    Now you should see "Export Behavior" listed on the left under the list of behaviors. With that selected you should see a property called Export Definitions show up in the right side of the "Behaviors" window.

    1. Click on the ellipsis icon to the right of the Export Definitions property.


    1. When the "Export Definitions" window pops up, click on the "+" icon at the top to access the drop down menu.
    2. Select "File Export" from the drop down menu.


    Setting a Target Folder

    Now that we have added the Export Behavior, we can start configuring the behavior with instructions on what to do with our documents. Let's start by telling Grooper where to put the documents.

    1. In the Target Folder property, we need to put a UNC path to where we want Grooper to send the exported documents.


    1. In Windows Explorer, navigate to the folder where you want Grooper to export your documents. Right-click the folder and select "Properties" from the pop-up menu to access the folder's properties.
    2. Click on the "Sharing" tab in the "Properties" pop-up window.
    3. Under "Network Path:" you will find the UNC path. Copy this to your clipboard.

    FYI

    If the folder you wish to export to is not located on the server Grooper is running on, it must be shared with that server. If Grooper cannot access that folder, it cannot export to it.


    1. Back in Grooper, paste the UNC path into the Target Folder property.
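
A UNC path begins with two backslashes followed by a server name and a share name, rather than a drive letter. If you are unsure whether the path you copied has that form, the Python sketch below shows one way to check. The `is_unc_path` function and the server, share, and folder names are hypothetical examples, and the check only looks at the shape of the path, not whether Grooper can actually reach it.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    # For a UNC path, pathlib reports the server and share as the "drive",
    # and that drive starts with two backslashes. A local path reports
    # something like "C:" instead.
    return PureWindowsPath(path).drive.startswith("\\\\")


# Hypothetical shared folder: server "fileserver", share "exports".
print(is_unc_path(r"\\fileserver\exports\Grooper"))
# A local drive-letter path is not a UNC path.
print(is_unc_path(r"C:\Exports\Grooper"))
```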


    Defining a Relative Path

    When we export our documents, we don't want to just export all of them into the folder we selected. We want to export them in an organized fashion. For this example we want to do the following:

    • Create a folder titled "Invoices" for all of our documents.
    • Create a second subfolder named after the company the specific invoice is from.
    • Save the document named after the extracted invoice number.

    Using the Relative Path property, we can tell Grooper to create numerous folders and subfolders and then select a name for our documents. The folders and subfolders can be named based on text that you specify, or it can be named based on data extracted from the documents.

    We do this by writing an expression using a method called "string interpolation".

    1. Click on the ellipsis icon to the right of the Relative Path property.


    We are going to write our expression in the Relative Path window that pops up.

    1. Start with a dollar sign ($). Then add open quotation marks (").


    The first folder we want to make is going to simply be titled "Invoices". So we can just type the name we want into the expression.

    1. Type the word Invoices (without quotation marks) and follow it with a backslash (\). The backslash marks the end of this folder's name and the start of the next level in the path.


    The next folder we want to create is going to be using extracted data to name the folder. We have to include the placeholder for the extracted name inside of curly brackets or {}.

    1. Next, type an open curly bracket ({).
      • This will bring up a drop down menu called "intellisense". The drop down menu will give you options for things you can use to name folders and files.
    2. Since we want to name the folder after the company name extracted from the document, select "Company_Name" from the intellisense drop down.


    1. Close the curly bracket at the end of "Company Name" and then type in a backslash (\) to indicate a new folder level or naming convention. Type another open curly bracket ({).
    2. Select "Invoice_Number" from the intellisense drop down.


    1. Close the curly bracket at the end of "Invoice Number" and then finish with an end quotation mark ("). Now your Relative Path expression is complete.
      • Your final expression should be:
        $"Invoices\{Company_Name}\{Invoice_Number}"
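
If you have used f-strings in Python, string interpolation will feel familiar: the text outside the curly brackets is copied as written, and each placeholder inside curly brackets is replaced with a value. The sketch below is a Python analogy only, since Grooper evaluates its own expression and not Python. The `relative_path` function, the company name, the invoice number, and the target folder are hypothetical.

```python
from pathlib import PureWindowsPath


def relative_path(company_name: str, invoice_number: str) -> str:
    # Python analogue of $"Invoices\{Company_Name}\{Invoice_Number}".
    # A literal backslash is written as "\\" inside a Python string.
    return f"Invoices\\{company_name}\\{invoice_number}"


# Hypothetical extracted values for one document.
rel = relative_path("Acme Supply Co", "12-3456")
print(rel)

# Each backslash separates one level of the path.
print(PureWindowsPath(rel).parts)

# The relative path is appended to the Target Folder (hypothetical UNC path).
full = PureWindowsPath(r"\\fileserver\exports") / rel
print(full)
```

Because the folder and file names come straight from extracted data, each document lands in a different place depending on what was collected from it. This sketch does not check for characters that are not allowed in Windows file names.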


    Export Formats

    Next, we need to tell Grooper what sort of files to export. We have a number of options from PDFs and Text formats to XML and ZIP formats. For our purposes we only want to export PDFs.

    1. Click on the ellipsis icon to the right of the Export Formats property.


    1. Click on the + icon located at the top of the "Export Formats" pop-up window to access the drop down to add an export format.
    2. Click on the "PDF Format" in the drop down menu.


    There are a number of properties we can configure for the export formats we choose to apply. For our purposes, we are going to just use all of the default settings, so no changes need to be made.

    1. Click "OK" in the top right of the pop-up window to apply the changes.


    Finish the Export Behavior

    We have now configured all properties needed for our Export Behavior. Now all we need to do is apply changes and save to our Content Model.

    1. Click "OK".


    1. We don't need to add any other behaviors to our list, so go ahead and click "OK" on this window as well.


    1. Finally, click the save icon at the top of the property grid to save the changes to the Content Model.

    Add the Export Step

    Now that we have our Export Behavior configured, we need to add an Export Batch Process Step to our Batch Process.

    1. Right-click on the Batch Process.
    2. Hover over "Add Activity", then hover over "Document Processing". Finally, click on "Export..."
    3. Change the "Step Name" if you like. We will be leaving it as the default "Export".
    4. Click "EXECUTE" at the top right of the "Add Activity" window.


    1. By default, our Export Batch Process Step is set to a Scope of Folder Level 1. Since our documents are at Folder Level 1, this will work for us and no changes need to be made.
    2. You might notice that in the "ACTIVITY PROPERTIES" panel on the right we have an Export Behavior property. Since we already added an Export Behavior on our Content Model, we don't need to do anything here.

    Finished Batch Process

    We have now finished designing our project going through the 5 phases of Grooper! Now it's time to see it in action.

    There are a few things we need to do before we can start processing documents:

    • Publish the Batch Process
    • Add a CMIS Connection
    • Start the Activity Processing and Import Watcher services

    After these steps are completed, then we can import a Batch and let the Batch Process automatically process the documents through the 5 phases of Grooper according to the project we designed.

    Publish the Batch Process

    Before a Batch Process can be used, we have to "publish" it. This tells Grooper that we are done making changes to the process and it is ready to be used.

    1. Right-click on the Batch Process.
    2. Click on "Publish".
    3. When the "Publish" window pops up, click "EXECUTE".


    1. A copy of your published Batch Process can be found in the Processes folder in your node tree.

    FYI

    You can still make changes to your Batch Process in your Project after it has been published. However, to apply your changes to your working Batch Process you will need to republish your Batch Process.


    Adding a CMIS Connection

    In order to import a Batch into Grooper automatically from a file on a computer, we first need to establish a connection from Grooper to the folder where that file resides. The first step to establishing this connection is adding a CMIS Connection object.

    Creating a Folder

    We are first going to create a new folder in our Project that will house all CMIS Connection objects for this project. Folders are handy for keeping your Grooper repository organized as you build it out.

    1. Right-click on your Project in the node tree.
    2. Hover over "Add", and then click on "Folder..."
    3. When the "Add" window pops up, name the folder. We have named it "Connections".
    4. Click "EXECUTE" located in the top right of the pop up window.


    Creating the CMIS Object

    Now we can add the CMIS Connection object.

    1. Right-click on the folder you just created.
    2. Hover over "Add" and then click on "CMIS Connection..."
    3. When the "Add" window pops up, enter in a name for your CMIS Connection object. We have named it "Grooper Basics CMIS Connection" in this tutorial.
    4. Click "EXECUTE" in the top right of the pop up window.


    Configuring the CMIS Connection

    1. Now you should have a CMIS Connection object in your node tree. Select the object.
    2. Click on the hamburger icon to the right of the Connection Settings property.
    3. Select NTFS from the drop down menu.


    1. Click on the ellipsis icon to the right of the Connection Settings property.
    2. When the "Connection Settings" window pops up, click on the ellipsis icon to the right of the Repositories property.


    1. When the "Repositories" window pops up, click on the + icon at the top of the pop up window.
    2. An entry will pop up on your list located in the left of the pop up window. Make sure this entry is selected.
    3. You will need to paste the UNC path that leads to the folder you want to import your Batches from into the Base Path property.


    1. After pasting the UNC path into the Base Path property, the Repository Name property should autopopulate.
    2. Click "OK" located in the top right of the pop up window.
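    Before pasting a path into the Base Path property, it can help to confirm that it really is a UNC path (\\server\share\folder) rather than a mapped drive letter. The following is a minimal Python sketch of such a check, run outside of Grooper. The function name and the example paths are hypothetical, and the check only looks at the format of the path; it does not confirm that the folder exists or that the Grooper services can reach it.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    # PureWindowsPath parses Windows-style paths on any operating system.
    drive = PureWindowsPath(path).drive
    if not drive.startswith("\\\\"):
        return False  # e.g. "C:" for a local or mapped drive
    # A UNC drive has the form \\server\share; both parts must be present.
    server_and_share = drive[2:].split("\\")
    return len(server_and_share) == 2 and all(server_and_share)


if __name__ == "__main__":
    # Hypothetical example paths, not taken from the tutorial.
    for candidate in (r"\\fileserver\grooper\imports", r"C:\grooper\imports"):
        print(candidate, is_unc_path(candidate))
```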


    Import the Repository

    Now that we have established a connection to the Windows file system and pointed it at a folder, we need to actually access the files. To do that we need to import the repository.

    1. Click on the List Repositories icon located in the top right corner of the "REPOSITORIES" panel.


    A list of possible repositories should show up in the "REPOSITORIES" panel. Since the folder we pointed the CMIS Connection to has only one subfolder, only one repository is listed in our example. You may have more in your list depending on where you set up your folder on your system.

    1. Click on the repository you want to import documents from.
    2. Click the Import Repository icon located in the top right next to the List Repositories icon.
    3. When the "Import Repository" window pops up, verify that the listed repository is correct.
    4. Click the "EXECUTE" button at the top of the pop up window.


    1. Now you should have a CMIS Repository object in your node tree under your CMIS Connection object.


    1. Click over to the "Browse" tab.
    2. We can see the contents of the folder we selected for our CMIS connection in the middle panel.


    The Activity Processing and Import Watcher Services

    In order to import Batches into Grooper, you need to have a couple of services installed and running. In this tutorial we are assuming you already have these services installed. If not, please check out the Activity Processing and the Import Watcher articles for installation instructions.

    With the Activity Processing and Import Watcher services installed, we need to make sure they are turned on. If they are already running, skip to the "Import the Batch" section.

    1. Click on the Machines folder in the node tree.
    2. At the top of the middle panel, you should see a list of servers whose services you can access. Select the server from the list if it is not already selected.
    3. Under the list of servers, you should have a list of services that are installed on the selected server.
    4. To the right there is a "SERVICE PROPERTIES" panel. There are a number of service properties you can configure, but for our purposes, we can stick with the defaults.


    Turning on the Services

    1. Select the Activity Processing service from the list.
    2. Click the Start Service icon located in the top right of the panel. It should look like a play button.


    1. Select the Import Watcher service from the list.
    2. Click the Start Service icon located in the top right of the panel.


    1. Now we have an Activity Processing service and an Import Watcher service running on our repository.
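    If you would rather confirm from outside of Grooper that a service is running, the sketch below shows one possible approach in Python. It assumes the Grooper services are registered as ordinary Windows services and reads their state with the standard Windows "sc query" command. The function names are hypothetical, and you would need to look up the actual service name on your server (for example in the Windows Services console); none is assumed here.

```python
import re
import subprocess

# "sc query" prints a line such as "STATE : 4  RUNNING"; capture the state word.
STATE_LINE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output):
    """Return the state word (e.g. "RUNNING") from sc query output, or None."""
    match = STATE_LINE.search(sc_output)
    return match.group(1) if match else None


def query_service_state(service_name):
    """Run "sc query <service_name>" on Windows and return the parsed state.

    service_name is whatever name the service is registered under on your
    machine; this sketch does not assume any particular name.
    """
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_service_state(result.stdout)
```

    A result of None means no STATE line was found, which usually indicates that the service name was not recognized.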


    Import the Batch

    Now we have everything we need to import a Batch and start processing it through our Batch Process. We will import the Batch from the Imports page in the Grooper web client.

    1. Click over to the Imports Page by clicking the Imports icon in the Context Toolbar at the top of the screen.
    2. Click the + icon located on the far right of the Context Toolbar to start a new import job.


    1. When the "Submit Import Job" window pops up, enter in a short Description for the Import Job in the Description property.


    1. Click on the hamburger icon to the right of the Provider property.
    2. Select Import Descendants from the drop-down menu.


    1. Under the Provider property, click on the hamburger icon to the right of the Repository property.


    1. Navigate to the imported repository under your CMIS connection.


    1. The Base Folder and Import Filter properties will automatically populate with information that will tell Grooper to import all documents from the repository we selected.


    1. Scroll down in the window and open the Batch Creation property.


    1. Click the hamburger icon to the right of the Starting Step property.
    2. Navigate to and select the step in your Batch Process at which you want Grooper to start processing the imported documents. We want to start with the Split Pages Batch Process Step.


    1. Uncheck the Start Paused property if you want Grooper to start processing as soon as the Batch is imported.
    2. Click the "SUBMIT" button at the top of the pop up window to import the Batch and start the Batch Process.


    Running Through the Batch Process

    1. After submitting the job, you'll see the Import Job in the list of Imports. It should have a Status of "Working" if you unchecked the Start Paused property.


    1. Click on the Batches icon in the Context Toolbar to go to the Batches page.


    1. You will see the Batch that is currently being processed in the list of Batches. Select the Batch if not already selected.
    2. At the bottom you should see what looks a bit like a bar graph. This will show you what Grooper is currently processing.


    1. When the Batch Process is complete, you should see all blue bars for every Batch Process Step at the bottom of the screen.

    While Grooper is running through the Batch Process Steps you may see the bars temporarily have a green color.

    If you find that one or more of your bars turn a red color, then there was an error while processing that step. You will need to take a look at those steps in your Batch Process and see if there is something configured incorrectly.


    1. Now if you look in the folder where you set your Export Behavior, your PDFs should be properly named and organized.
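    With a larger Batch, checking the export folder by eye gets tedious. As a rough illustration, the Python sketch below lists every PDF under an export folder, with paths relative to that folder, so you can compare the names and subfolders against what you expected from your Export Behavior. The function name and the example path are hypothetical; point it at whichever folder you configured.

```python
from pathlib import Path


def list_exported_pdfs(export_root):
    """Return the sorted relative paths of all PDFs under export_root."""
    root = Path(export_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        # Match the extension case-insensitively (.pdf, .PDF, ...).
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    # Hypothetical export location; replace with your own folder.
    for relative_path in list_exported_pdfs(r"\\fileserver\grooper\export"):
        print(relative_path)
```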

    BONUS: Ad-Hoc Jobs

    As a bonus, we are going to share a slightly more advanced tip with you. When testing your Batch Process Steps, you can use something called Ad-Hoc processing. This allows you to test the whole Batch at once without having to select the individual folders or page objects from the Batch Viewer first. This comes in handy when the folders and pages you want to test cannot all be selected together.

    An Ad-Hoc job can only be performed if you have an Activity Processing service running on the Grooper server. Make sure you install and start the service before continuing.

    1. In our example, we want to test the Recognize step in the Batch Process, and our pages are spread across different folders. Since they are in different folders, we cannot select all of the page objects at once. Running Recognize on each one individually can take a while.
    2. Instead of clicking on the Test icon, with an Activity Processing service running, we can submit an Ad-Hoc job. Click on the "Submit a Job" icon located to the right of the Test icon. This will run the activity at the folder level as configured on the "Batch Process Step" tab.


    1. When the "Submit Job" window pops up, click "OK" at the top of the pop up window.


    1. You should automatically be taken to the Jobs page. You will see your Ad-Hoc job in the list on the page and can see when it is complete.