2023:Review User Guide
WIP: This article is currently under construction. It was written for Grooper version 2022. Its guidance is mostly applicable to version 2023 as well.
This article is for document review users who use the Grooper Web Client to review Grooper's automated processing results. The Grooper Web Client lets users connect to a Grooper dashboard over the internet via a web server. End-users can then process review-based steps in a Batch Process in a web browser, without needing to install Grooper on their own machine.
About
Welcome to the Grooper Web Client! The Grooper Web Client allows users to process documents using a web browser.
In the following sections, we will give end-users guidance on how to navigate the Web Client user interface and use it to process Batches to review documents. We will discuss the following topics:
- #Web Client UI - How to navigate Grooper using a web browser
- #Performing Review Tasks: The Batches and Tasks Pages - How to process human-attended document review activities
- #Review Views - How to use the various review-based activities in Grooper
- #Batch Management - How to maintain document Batches in production (pausing work, updating processing instructions, and more) and access Batch statistics and the event log
Web Client UI
First, let's look at how to navigate the Grooper Web Client interface.
Access the Grooper Web Client by entering the URL provided by your Grooper administrator.
Upon entering the URL, you'll land at the Web Client's homepage. This page is divided into four main sections: Navigation Links, Repository Info, Recent Events, and the Context Toolbar.
The Navigation Links section is the main way you'll get around in the Web Client. It contains a variety of links for Grooper users, including:
- Design - Used to access and edit the repository.
- Batches - Used to access a list of all current Batches in production.
- Tasks - Used to access a list of review tasks ready for reviewers.
- Imports - Used to access a list of recent import jobs.
- Jobs - Used to access a list of recent workflow jobs.
- Stats - Used to access a variety of statistics on Grooper's workflow.
- Learn - Used to access Grooper University courses at learn.grooper.com.
- Wiki - Used to access our wiki site at wiki.grooper.com.
Repository Info
The Repository Info window provides some "at a glance" processing statistics and information about your Grooper Repository.
A Grooper Repository is the environment in which processing resources are created and executed. This includes the Batches of documents themselves, the Batch Processes used to process them, and components used in the Batch Process such as Content Models. The data displayed in the Repository Info window is subdivided into three sections:
- Totals
- Tasks
- Nodes
Recent Events
The Recent Events window is Grooper's event log.
This panel displays information about different processing events. This includes audit trails of processing events, such as Batch creation, task steps in a Batch Process submitted for processing, and Batch completion. It also includes warnings and error messages, giving you information about errors encountered while processing steps of a Batch Process. This panel can be useful for tracking down information or a sequence of events when you're troubleshooting a problem.
Context Toolbar
The Context Toolbar is a navigation bar providing various utilities in the Web Client.
Depending on the context (which page you've navigated to), this menu will change slightly. However, wherever you are in the Grooper Web Client, clicking the house icon will always take you back to this home screen. You can also use the Context Toolbar to navigate to the Design, Batches, Tasks, Imports, Jobs, and Stats pages.
Switching Grooper Repositories
Depending on the size and scope of your operation, you may be working out of multiple Grooper Repositories. If you are, you may need to switch between Grooper Repositories to access documents ready for processing in one or the other.
To do this, you'll use the Repository button on the homepage's Context Toolbar.
A new window called "Change Repository" will pop up.
Upon making your selection, you will switch to the selected Repository, granting you access to all the Batches and processing assets contained therein.
Performing Review Tasks: The Batches and Tasks Pages
Documents come into Grooper either by scanning pages or importing files into a Batch. A Batch is the fundamental container of work in Grooper. It holds your documents as they are processed through Grooper. Along with the container comes a list of processing instructions called a Batch Process.
So a Batch is really two things:
- A container of documents in various states of processing.
  - These are represented as Batch Folders and Batch Pages contained in the Batch Root Folder.
- A step-by-step list of instructions of what to do with those documents.
  - This is the Batch Process.
A Batch Process will consist of automated tasks called Unattended Activities, as well as review-based activities requiring user intervention called Attended Activities. For end-users, most of your work will be centered around document review tasks (or Attended Activities). In these activities, you will review the automated work Grooper has done previously in the Batch Process. For example, you may be reviewing the classification decisions Grooper made or reviewing Grooper's data extraction to ensure all data was captured accurately.
Different organizations will utilize human review to varying degrees. Depending on the use case, Grooper may be able to automate more work without the need for human intervention. However, as good as Grooper can be at making document processing decisions, no computer software can beat the human brain. Review tasks are well suited for situations where you need to ensure the accuracy of Grooper's results in one way or another. You play a critical role in verifying Batches are processed accurately through the steps of a Batch Process.
So, how do you get started?
There are two ways users can start processing review tasks in a Grooper Repository, either using the Batches or Tasks pages. Either is acceptable. These present two different ways of displaying available work in Grooper. We will start by reviewing the Batches page.
Batches Page
The Batches page presents a user interface for selecting Batches currently in production within the Repository. Users can see each Batch's progress and process any human-attended Review activity. To get to the Batches page, click the Batches icon on the Grooper Web Client homepage.
This will bring up the Batches interface. The first thing you'll see is a list of Batches currently in process.
You can sort the Batch List by the following properties:
If you have a particularly large number of Batches, you can narrow down what you're looking for using the search box or the filter utility.
Now that we've gotten the lay of the land, you're probably asking yourself, "How do I actually start doing work in Grooper? How do I start reviewing documents?"
The color of a step indicates its processing status.
For end-users doing review work in Grooper, you will be processing steps with the "Review" activity type that are ready for processing.
This will bring up the Review activity module to perform one kind of review or another, be it classification review, data review, image processing review or another. In Grooper, the different kinds of review applications are displayed as "Views". For example, the type of review this step is doing is classification review. The user is presented with a "Classification Viewer" in order to verify each document in the Batch is classified correctly. We will discuss how to use this "Classification Viewer" and the other "Review Views" later in the #Review Views section of this article. For now, we're going to simply exit the review module.
Tasks Page
The Tasks page is different from the Batches page in that it only presents users with Batches that have Review steps currently ready for processing. Users can pick and choose which Batch they want to review, or they can set up a task filter and start processing all the Batches it returns in order of the Batches' age. To get to the Tasks page, click the Tasks icon on the Grooper Web Client homepage.
This will bring up the Tasks interface. The first thing you'll see is a list of Batches with Review steps ready for processing.
To start reviewing Batches, you have two options.
Just as we saw using the Batches page, this will bring up the Review activity module to perform one kind of review or another, be it classification review, data review, image processing review or another. For example, this is the exact same "Classification Viewer" module for the exact same Batch we saw earlier. The document review is identical whether you open the Review step using the Batches page or the Tasks page. The only difference is how you get there. The individual "Review Views" will be discussed in the #Review Views section of this article. For now, we're going to simply exit the review module.
What is a Document?
Before continuing, let's take some time to cement some Grooper terminology we've been using, as well as some of the icons you'll be seeing through the rest of this article.
As we've mentioned previously, a Batch is the fundamental collection of work in Grooper's document processing. It is essentially two things:
- A container of documents in various states of processing.
- A step-by-step list of instructions of what to do with those documents, or its Batch Process.
We often use the term "document" loosely. It can be an overly generic term for whatever is in the Batch that Grooper is processing. However, from Grooper's perspective, a "document" is a very specific thing represented in a specific way in a Batch. So what is a document really?
Grooper has two objects to represent items in a Batch:
- Batch Folders
- Batch Pages
So, anything in a Batch is either a folder or a page.
A "document" is just a special kind of folder. In the most basic sense, a "document" is a folder with content. That content can be child Batch Pages or a digital file (like a PDF) attached to the folder.
This is Grooper's normal representation of a Batch as a hierarchy of Batch Folders and Batch Pages.
There's a big difference between "Folder (1)" and "Folder (2)".
Why? "Folder (1)" has content. It contains two Batch Pages, "Page 1" and "Page 2". We can expand the folder's contents using the arrow button to the left of the folder icon. "Folder (2)" has no content, making it a regular old folder.
Simple enough, right? Next, let's talk about classification. A classified document is a document folder that has been assigned a Document Type from a Content Model. Grooper architects design Content Models to determine what makes one kind of document distinct from another and how to get information from them. These "different types of documents" are distinguished as Document Types created in the Content Model. By assigning a document folder a Document Type, Grooper can then use the logic defined in the Content Model to extract data from it. Proper document classification is often critical to the process downstream. So, it's paramount to make sure Grooper assigned a document the right Document Type. One of the things you may be doing in Grooper is executing a classification review module to do just that.
However, be aware that once a document is classified, the items in your Batch are going to look a little different. Here, "Folder (1)" has been classified. Its folder name has changed to "Federal W-4 (1)". Why? It was assigned a Document Type named "Federal W-4".
The two main ways to get content into Grooper are scanning pages directly into a Batch and importing files (such as PDF or TIF documents) from a file system. If you are importing document files, Grooper will create a Batch Folder for every file imported and attach that file to the folder. Things will look a little different than what we've described so far.
Here we have three Batch Folders created for three PDF files imported into a new Batch. Absolutely no processing steps have been executed for this Batch. However, for each folder...
Are these folders documents? Yes
Are these documents classified? No
To sum up:
- All documents are folders. Not all folders are documents.
- Documents are folders with content.
  - Content can be child pages (or documents).
  - Content can be files attached to the folder.
- Classified documents are documents that have been assigned a Document Type.
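The rules summarized above can be expressed as a small data model. The following is only an illustrative sketch in Python, not Grooper's actual object model: the class names, attributes, and the `invoice.pdf` file name are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class BatchPage:
    name: str


@dataclass
class BatchFolder:
    # Hypothetical stand-in for a Batch Folder; not Grooper's real API.
    name: str
    children: List[Union["BatchFolder", BatchPage]] = field(default_factory=list)
    attachment: Optional[str] = None      # e.g. the file name of an imported PDF
    document_type: Optional[str] = None   # set once the folder is classified

    def is_document(self) -> bool:
        # A document is a folder with content: child items or an attached file.
        return bool(self.children) or self.attachment is not None

    def is_classified(self) -> bool:
        # A classified document is a document with a Document Type assigned.
        return self.is_document() and self.document_type is not None


folder_1 = BatchFolder("Folder (1)", children=[BatchPage("Page 1"), BatchPage("Page 2")])
folder_2 = BatchFolder("Folder (2)")
imported = BatchFolder("Folder (3)", attachment="invoice.pdf")
w4 = BatchFolder("Federal W-4 (1)", children=[BatchPage("Page 1")], document_type="Federal W-4")

for item in (folder_1, folder_2, imported, w4):
    print(item.name, item.is_document(), item.is_classified())
```

Here "Folder (1)" and the imported folder are documents but not classified, "Folder (2)" is a plain folder, and "Federal W-4 (1)" is a classified document.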
Review Views
In this section, we will demonstrate the various document review applications in Grooper and how to use them.
When you start processing Review steps in a Batch, you're going to see one or more different "Views" into the Batch. These Review Views present the Batch in different ways, best suited for the type of work you're doing. In these Views, you will verify Grooper's work during automated steps of a Batch Process and use the review modules to manually edit a document if Grooper made a mistake.
There are currently five Review Views available in Grooper:
- Classification Viewer
  - You will use this to verify how Grooper classified a document during the Classify step. You may also use this view to verify how pages were separated into document folders during the Separate step.
- Data Viewer
  - You will use this to verify how Grooper extracted data from a document during the Extract step.
- Thumbnail Viewer
  - You will use this to review individual page images. Most commonly, this is used to verify how pages were processed by an IP Profile (for example, during the Image Processing step) or otherwise ensure the pages are ready for OCR during the Recognize step.
- Folder Viewer
  - This is a fairly generic Batch viewer. This is most often added as a secondary Review View so that the user has an option to navigate to folders using the standard folder/page hierarchy view.
- Scan Viewer
  - If you are using Grooper to scan paper documents with a scanner hooked up to your workstation, you will use the Scan Viewer to do so.
Document Viewer Tips
The Document Viewer is a common element among all Review Views. It will always occupy the right-most panel of the Review screen. It's how you, the user, can inspect a document or page selected in a Batch. Before we get into each of the individual Review Views and how to use them, let's familiarize ourselves with the Document Viewer. This will include quality of life advice, such as how to zoom in and out of a page's image.
Zooming In and Out
By default, the image will be zoomed to a Width view. The image will fill the viewer based on the width of the document. The zoom view is indicated by the Zoom icon at the top of the Document Viewer.
1. The Zoom Icon
2. Mouse Wheel to Zoom
You can also use the mouse wheel to zoom in and out of the image.
You can zoom in up to 300% of the image's size and zoom out to as little as 5% of its size.
3. Keyboard Shortcuts
Alternatively, you can use the following keyboard shortcuts to control the zoom view:
Resizing Panels
You may also resize the Document Viewer panel. This can be particularly helpful when using the Data Viewer to review extracted data. For example, we can't see all the extracted table data here. There are columns hidden out of view.
We can resize the Document Viewer panel with the mouse to see more of the Review Viewer panel.
Rendition Views
The Rendition Views menu is found at the top right of the Document Viewer. Just click the icon to access the drop-down menu. This gives users different views of the document's or page's content. Depending on the circumstance, review users may find one Rendition View most helpful for completing their Review task.
Attachment Rendition
If you ingested documents into a Batch by importing files (such as PDFs) from a file system, you will be able to access the Attachment Rendition. When files are imported into Grooper, a document folder is created for each file, and that file is attached to the folder.
Child Rendition
The Child Rendition will display a document's content, as composed of its child objects. For example, if a folder has child pages, the document is the sum total of all its pages.
Text Rendition
The Text Rendition will display a document's OCR or extracted native text data.
Classification Viewer
The Classification Viewer allows Grooper users to review document classification. Grooper classifies documents using logic defined in a Grooper Content Model. Document Types are added to the Content Model to distinguish one type of document from another. Grooper is able to tell one Document Type from another by using trained examples of the documents, assigning rules for classification, or some combination of the two. Most typically, a document is assigned a Document Type during the Classify step of a Batch Process (although there are other ways, depending on the Batch Process and how documents are ingested into a Batch).
Starting the Review Step
In the Classification Viewer, you will visually verify the Document Type Grooper assigns is correct. You will either manually assign documents a Document Type if Grooper was unable to classify the document or change the document's Document Type if Grooper misclassified the document.
When you open the Classification Viewer module, this is what you'll see. The Batch's documents are presented in the typical folder hierarchy viewer.
Reviewing Document Classification
Grooper's calculation of these similarity scores is based on a variety of things, such as training algorithms and extraction rules. While Grooper tries to emulate what a human does when they look at a document and decide what it is, the calculation is purely mathematical in nature. From Grooper's perspective, the document belongs to whichever Document Type has the highest score. You, as a human being, are intuitive. You can make cognitive connections a computer simply can't. So, your job is to look at the document and make sure Grooper got it right.
Is this an invoice from Fairdeal Services?
If so, your job for the document is done. You've verified its Document Type is correct.
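The "highest score wins" idea described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score values are made up, the `minimum` threshold parameter is hypothetical, and nothing here reflects how Grooper actually computes or stores similarity scores.

```python
from typing import Dict, Optional


def pick_document_type(scores: Dict[str, float], minimum: float = 0.0) -> Optional[str]:
    """Return the Document Type with the highest score, or None if nothing qualifies."""
    if not scores:
        return None
    best_type = max(scores, key=scores.get)
    # 'minimum' is a hypothetical cutoff below which the document stays unclassified.
    return best_type if scores[best_type] >= minimum else None


# Hypothetical similarity scores for a single document.
scores = {"Fairdeal Services": 0.92, "Envoy": 0.41, "Ankara": 0.17}
print(pick_document_type(scores))                # Fairdeal Services
print(pick_document_type(scores, minimum=0.95))  # None
```

The second call shows why human review matters: when no score is convincing, a document may be left unclassified for a reviewer to resolve.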
Correcting Document Classification
So what happens when things go wrong?
So, we need to fix this and manually assign the Document Type. There are two ways to do this.
Option 1: Right Click and Assign Document Type
This will bring up the "Assign Document Type" window.
Upon applying your selection, the Document Type will be assigned to the document.
To remove a flag from the document:
Option 2: Use the Document Types Panel
A quicker method of manually classifying a document may be to simply select the right Document Type from the Document Types Panel. We will use the next document in our Batch to illustrate this. Another common problem is Grooper misclassifying a document.
Rather than right-clicking the document in the Batch and selecting a Document Type from a dropdown list, you can simply double-click the correct Document Type in the Document Types Panel.
Option 2.5: Better Utilizing the Document Types Panel
You should continue checking all document folders to ensure they've been classified correctly. We have one more problem in our Batch to resolve.
So, we need to manually classify the document. This gives us an opportunity to demo a handy keyboard shortcut.
This is particularly useful if you have a large Content Model with dozens or hundreds of Document Types.
Completing the Review Step
You will be presented with a Confirmation window to verify you're ready to complete the review task.
Completion Criteria
The Classification Viewer may be configured so that certain criteria must be met in order to complete the review task. If so configured, either or both of the following conditions must be satisfied:
- All document folders must be classified.
- All flags on document folders must be removed.
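As a rough illustration of the two criteria above, the following Python sketch checks a list of document folders before a review task is completed. The `DocumentFolder` class, its attributes, and the two switches are hypothetical names chosen for the example; they are not Grooper settings or API calls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DocumentFolder:
    # Hypothetical stand-in for a document folder in a Batch.
    name: str
    document_type: Optional[str] = None
    flag: Optional[str] = None  # flag message, if the folder is flagged


def completion_problems(folders: List[DocumentFolder],
                        require_classified: bool = True,
                        require_unflagged: bool = True) -> List[str]:
    """List the reasons a review task could not be completed yet."""
    problems = []
    for folder in folders:
        if require_classified and folder.document_type is None:
            problems.append(f"{folder.name}: unclassified")
        if require_unflagged and folder.flag is not None:
            problems.append(f"{folder.name}: flagged ({folder.flag})")
    return problems


batch = [
    DocumentFolder("Folder (1)", document_type="Envoy"),
    DocumentFolder("Folder (2)"),
    DocumentFolder("Folder (3)", document_type="Ankara", flag="Check vendor"),
]
print(completion_problems(batch))
```

An empty list means both criteria are satisfied; either switch can be turned off to model a configuration where only one criterion is enforced.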
If these completion criteria have been enabled and a Batch has documents that are flagged and/or unclassified, the Classification Viewer will notify you in two ways:
Shortcuts
| Shortcut | Keystrokes | Description |
|---|---|---|
| Shared Folder and Page Commands | | |
| Flag Item | Ctrl + L | Places a flag on the selected folder/page. Users may select pre-generated flag messages or enter their own custom message. |
| Clear Flag | Ctrl + Shift + L | Removes a flag on the folder/page. |
| Goto Flagged | Ctrl + G | Selects the next document/page in the Batch with a flag. If there are no subsequent documents with flags in the Batch, it will cycle back to the first document with a flag. |
| Delete | Del | This will delete the selected folder/page. CAUTION!!! There is no "undo" in Grooper. If you delete an item, it will be gone forever. |
| Rename | F2 | Renames the folder/page. Be aware, this does not classify a document folder. It only changes the folder's name. |
| Cut | Ctrl + X | Cuts a selected folder/page in the Batch. |
| Copy | Ctrl + C | Copies a selected folder/page in the Batch. |
| Paste | Ctrl + V | Pastes a copied or cut folder/page to the selected folder location in the Batch. |
| Move Down | Ctrl + Down | Moves the selected folder/page down in the Batch. |
| Move Up | Ctrl + Up | Moves the selected folder/page up in the Batch. |
| Append to Previous | Ctrl + P | For folders, this appends any of a selected folder's children (pages or folders) to the folder before it. Effectively, this will delete the selected folder and move any of its pages/folders to the bottom of the previous document/folder. For pages, this will move the selected pages to the bottom of the previous folder above. |
| Prepend to Next | Ctrl + Shift + P | For folders, this prepends any of the selected folder's children (pages or folders) to the folder after it. Effectively, this will delete the selected folder and move any of its pages/folders to the top of the next document/folder. For pages, this will move the selected pages to the top of the next folder below. |
| Merge Selected | Ctrl + M | Merges selected folders/pages into a new document. This will create a folder, prompt you to assign it a Document Type, and move the selected folders/pages into the new folder. |
| Folder Specific Commands | | |
| Assign Document Type | Ctrl + Shift + A | Opens a window to select a Document Type for the selected document. |
| Remove Level | Ctrl + U | Deletes the folder and moves any child objects (pages or folders) to the folder's level in the Batch. For example, if there was a document folder at Level 1 in the Batch with a single page in it (at Level 2), the folder would be deleted and the page would be moved to Level 1 in the Batch. |
| Create Folder | Ctrl + F | Creates a new empty folder. |
| Page Specific Commands | | |
| Rotate Left | Ctrl + Left | Rotates the page 90 degrees to the left (counter-clockwise). |
| Rotate Right | Ctrl + Right | Rotates the page 90 degrees to the right (clockwise). |
| Split Folder | Ctrl + S | Splits a document into a new folder at the selected page. This applies specifically to document folders with multiple pages. Imagine you have a five-page document folder at Level 1 in the Batch. You select page 3 and apply the "Split Folder" command. This will cut pages 3 to 5 from the document folder and place them into an unclassified folder at Level 1. You'll end up with two folders created out of the original (one containing pages 1 and 2, and one containing pages 3 to 5), both at the same level in the Batch hierarchy (Level 1). |
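
To make the "Split Folder" and "Append to Previous" descriptions more concrete, here is a minimal Python sketch that models a folder as a plain list of page names. The function names are hypothetical and this is not how Grooper implements the commands; it only mirrors the five-page example from the table.

```python
from typing import List, Tuple


def split_folder(pages: List[str], selected_index: int) -> Tuple[List[str], List[str]]:
    """Split a folder's pages at the selected page; the selected page starts the new folder."""
    return pages[:selected_index], pages[selected_index:]


def append_to_previous(previous: List[str], selected: List[str]) -> List[str]:
    """Move the selected folder's children to the bottom of the previous folder."""
    return previous + selected


pages = ["Page 1", "Page 2", "Page 3", "Page 4", "Page 5"]

# Select "Page 3" and split: pages 1-2 stay, pages 3-5 go to a new folder.
first, second = split_folder(pages, pages.index("Page 3"))
print(first)   # ['Page 1', 'Page 2']
print(second)  # ['Page 3', 'Page 4', 'Page 5']

# Appending the new folder back to the previous one restores the original order.
print(append_to_previous(first, second) == pages)  # True
```

In this simplified model the two commands are inverses of each other, which can be a handy way to think about recovering from an accidental split, given that there is no "undo".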
Data Viewer
The Data Viewer is used to review the data Grooper collects from each document during the Extract step of a Batch Process.
The Extract activity applies the logic set up in a Content Model to find and return data from a document. This extraction logic is defined by configuring Data Models. Data Elements are added to the Data Model for each piece of information you want to collect.
There are three types of Data Elements. Data can be collected as either Data Fields, Data Tables, or Data Sections (or "fields", "tables", and "sections" for short).
- Fields are for the most basic kinds of information listed on a document.
  - This is what's called "single instance" data. Think of a social security number on a W-2 form. There will be one single social security number filled in for the whole document. There is a single instance of this information (hence the term "single instance"), collected as a single value for the field.
- Tables are necessary to collect information listed in a table formed by rows and columns on a document.
- Sections can be tools to group data into a category, sub-divide a document into smaller units, or establish "multi-instance" sections (more on what this means later).
As a reviewer, it's your job to check Grooper's results for each of these Data Elements after the Extract activity collects them. This is precisely what the Data Viewer is for. There are a lot of things that can go wrong in the wide world of document processing. Optical Character Recognition (OCR) can convert a document's image to digital text. However, it's not perfect. Rarely will your OCR results be 100% accurate. If the document's underlying text data is imperfect, your data extraction will probably not be accurate. There might be other problems with the extraction logic's ability to find and return data. This is especially the case for document sets with a lot of variety. If a document has a data structure that has not been properly modeled in the Data Model's design, there's a good chance Grooper will fail to return the data at all or only return partial data. Regardless of why the error occurred, you, the reviewer, are the last line of defense to ensure accurate data is captured for each document.
Starting the Review Step
In the Data Viewer, you will verify the data Grooper extracts from each document is correct. If what Grooper extracts does not match what's on the page, you will edit the result using the text box editor.
When you open the Data Viewer module, this is what you'll see. This is a different view into a Batch than we've seen so far. It's designed specifically to give us information about the data collected for each document.
When selecting a field, you should notice a few things:
Your job as the reviewer is to look at the extracted value on the left and make sure it matches what's on the document on the right.
Most review users will use the
The next document has something wrong with its data. Grooper has several visual cues designed to indicate it's found data errors for the current document, or the Batch as a whole.
Reviewing Data Fields
We will start our journey into data review by looking at how to review fields. We will use the same set of invoice documents we reviewed for classification previously. It's fairly common in a review workflow to go from reviewing document classification to reviewing data extraction. First, you review Grooper's work to make sure the documents are classified correctly. Once Grooper knows what kind of document it's working with, it knows what data it's looking for and how to find it. Now that Grooper has extracted the data, we can use the Data Viewer to verify it collected all the data required and collected it accurately.
FYI: Data Models have a high degree of configurability. Obviously, unless you're processing invoices, the specific data elements you will be reviewing in your environment will be different. You may have hundreds of data points to review on a single document. You may have just a few. That all depends on the business requirements for your document set and what your organization deems appropriate to extract from them.
However, the basics remain the same across all use cases. Grooper will extract information from the document, populate that data into fields and tables, and you'll review the results based on what you, a human, can see on the document.
Required Fields
Commonly, an organization will deem certain data critical for document processing. Certain fields must therefore be extracted in order for the work to be considered complete. In Grooper, we satisfy this requirement by making a field "required". This will place the field (or table cell) in an error state if no value was extracted at all. In the Data Viewer, Grooper will alert you that the required value is missing, and will require you to manually enter it before review is completed.
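The "required" behavior described above amounts to a simple check: a required field with no extracted value is in error. The sketch below illustrates that check in Python. The field names and values are hypothetical, chosen only to resemble an invoice, and the function is not part of Grooper.

```python
from typing import Dict, List, Optional


def missing_required(values: Dict[str, Optional[str]], required: List[str]) -> List[str]:
    """Return the required fields that have no extracted value."""
    return [name for name in required
            if values.get(name) is None or not values[name].strip()]


# Hypothetical extraction results for one invoice.
extracted = {
    "Invoice Number": "INV-1001",
    "Invoice Date": "",        # nothing was extracted
    "Invoice Total": None,     # nothing was extracted
    "PO Number": None,         # empty, but not required
}
required_fields = ["Invoice Number", "Invoice Date", "Invoice Total"]

print(missing_required(extracted, required_fields))
# ['Invoice Date', 'Invoice Total']
```

Note that the empty "PO Number" is not reported, because only fields marked as required are placed in an error state when they are blank.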
In the case of this document's Data Model, three fields are required:
Data Model Differences
Before looking at more problems, please be aware Data Models can be (and often are) different for individual Document Types. For the most part, we're working with a "flat" Content Model. All the Document Types share the same Data Model, meaning we're looking for the same data elements for each one. However, in your environment, each Document Type may represent more diverse kinds of documents and require its own individual Data Model with its own specific fields and tables. Or, your Document Types may all share some data elements but have some additional fields unique to the individual Document Type.
This is the case with our "Envoy" Document Type. For the most part, the data we want to collect from this Document Type is the same as the rest. However, just for the "Envoy" Document Type, we want to collect the purchase order number listed on the invoice. For whatever reason, we'll pretend we have a business need for the PO number from this vendor, but none of the rest.
Data Element Overrides (and Required Validation)
Another way Data Models can differ from Document Type to Document Type is through "Data Element Overrides" (sometimes just called "overrides" for short). This allows Grooper designers to change how fields, tables and sections behave for a specific Document Type while still maintaining a parent Data Model shared by multiple Document Types.
We're going to use another common review feature to demonstrate this. There may be some data that is not only required to be present but is also extremely important for Grooper to extract accurately. Your Grooper designer may designate this as a field that requires validation. So, even if it's accurately extracted, the field will stay in an error state until the user clears it.
For the "Ankara" Document Type, we've decided the "Remit To Address" requires manual validation. We've set up an override so that just this Document Type requires validation for this field. For the rest of them, we'll just take what Grooper gives us.
So, how do we proceed? We have to get rid of the error or Grooper will consider this an "invalid" document.
To clear the error, you must "confirm" the field is valid.
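One way to picture an override is as a per-Document-Type setting layered on top of the shared Data Model's defaults. The Python sketch below illustrates that layering and the "stays in error until confirmed" behavior. The property names (`required`, `requires_validation`) and both helper functions are hypothetical and are not Grooper's configuration names.

```python
from typing import Dict

# Settings on the shared (parent) Data Model. Property names are hypothetical.
base_fields: Dict[str, Dict[str, bool]] = {
    "Remit To Address": {"required": False, "requires_validation": False},
}

# Overrides that apply to a single Document Type only.
overrides: Dict[str, Dict[str, Dict[str, bool]]] = {
    "Ankara": {"Remit To Address": {"requires_validation": True}},
}


def effective_settings(document_type: str, field_name: str) -> Dict[str, bool]:
    """Start from the shared settings, then apply any override for this Document Type."""
    settings = dict(base_fields[field_name])
    settings.update(overrides.get(document_type, {}).get(field_name, {}))
    return settings


def in_error_state(document_type: str, field_name: str, confirmed: bool) -> bool:
    """A field that requires validation stays in error until the reviewer confirms it."""
    return effective_settings(document_type, field_name)["requires_validation"] and not confirmed


print(in_error_state("Ankara", "Remit To Address", confirmed=False))  # True
print(in_error_state("Ankara", "Remit To Address", confirmed=True))   # False
print(in_error_state("Envoy", "Remit To Address", confirmed=False))   # False
```

Only the "Ankara" Document Type is affected; every other Document Type keeps the shared behavior and accepts the extracted value as is.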
Rubberband OCR
Our next document has a lot of problems with it. Grooper was not able to extract much from this document at all.
However, typing can be time-consuming. There is a handy feature that can save time called "Rubberband OCR". We're going to use Rubberband OCR to capture the next two fields.
There's also a keyboard shortcut for Rubberband OCR.
"Valid" Doesn't Mean Accurate
This may lead you to believe the rest of the documents are fine, and there is nothing wrong with their data. However, "valid" does not mean always mean accurate. |
|||
|
You should always take care when reviewing documents to touch every single one, verifying each field even if Grooper does not flag it as erroneous. "Valid" from Grooper's perspective means some very specific things. This includes:
If Grooper fails to extract a non-required field or the underlying OCR data is inaccurate, you may still need to edit the results. | |||
|
|||
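To make the distinction concrete, here is a minimal Python sketch. It is not Grooper's validation logic: the `is_valid` rule (a required field that must look like a date) and the sample values are hypothetical, chosen only to show how a value can pass a rule and still disagree with the document.

```python
import re

# Hypothetical rule: the field is required and must look like MM/DD/YYYY.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def is_valid(value: str) -> bool:
    """Return True when the value satisfies the (hypothetical) rule."""
    return bool(value) and DATE_PATTERN.fullmatch(value) is not None


printed_on_document = "03/18/2021"  # what a human reads on the page
extracted_by_ocr = "03/13/2021"     # an OCR misread of "8" as "3"

# The misread value still passes the rule, so no error would be flagged...
passes_rule = is_valid(extracted_by_ocr)
# ...but it does not match the document, so it is not accurate.
matches_document = extracted_by_ocr == printed_on_document
```

Only a reviewer comparing the value against the page would catch this kind of mistake, which is why every field deserves a look.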
Reviewing Data Tables
Tables differ from fields in that they are "multi-instance" data. You might have a single row for one table on one document. The next might have a hundred. There are multiple potential instances of a row, meaning there will be multiple instances of a column's value (one for each row).
By and large, the review experience for tables is the same as for fields. Just instead of reviewing a single field's value, you're reviewing multiple cell values for all the columns and rows of the table. The same basic principles of validating Grooper's extraction against what's on the document apply.
That said, there are a few things specific to table extraction you'll need to know.
Visualization
First, let's look at what Grooper is showing you when you enter a table cell for review.
|
Removing Table Rows
Extracting tabular data from documents can be difficult because tables can be so variable. Your Grooper designer will need to model the table's structure in one way or another for a variety of different documents. A table on one document might look totally different from another, and more often than not, there's no telling how many rows there will be.
One thing that can happen during extraction is either not enough rows are captured, or too many rows are captured. You will need to add or remove rows accordingly.
We just need to get rid of these rows entirely. |
|
|
|
|
|
|
|
Adding Table Rows
What do you do when Grooper fails to extract a table at all? You need to add some rows and fill in the cells manually.
|
|
|
|
|
|
|
Again, "Valid" Does Not Mean Accurate
Attention to detail is paramount when reviewing a document's data, regardless of whether it's field-level data or tabular data. It may be even more important when reviewing table data, as tables can be densely packed with information that is easily glossed over if you're not paying close attention.
So we're good to complete our review, right? Wrong! Always remember, a "valid" document does not necessarily mean its data is 100% accurate. |
|
|
There's actually an error in the extracted table data due to an OCR error.
|
|
So far, we have discussed how to review two of the three Data Elements Grooper uses to collect data: Data Fields, which collect field-level data, and Data Tables, which collect table data. In the next section, we're going to discuss the remaining Data Element, the Data Section, and how it can be used to subdivide a document into smaller sections. There are a couple of things you'll need to be aware of if your document set's Data Models use sections to capture document data. |
Reviewing Data Sections
Data Sections (or just "sections" for short) are a method of subdividing extraction in one way or another. They are used in three main ways in Grooper:
- As an organizational tool.
- Fields and tables can be placed in a section to group them logically as similar kinds of data.
- To restrict extraction to a smaller subsection of a document.
- Imagine you have a four-hundred-page document. What you want to extract is in one paragraph on one page somewhere in the document. If you can first narrow down extraction to just that paragraph, it often makes extraction much simpler and ultimately more accurate.
- To extract repeating data fields from multiple repeating sections.
- This pertains to "multi-instance" sections. Think about a form you've had to fill in where two or more parties have to enter in the same information, maybe their name, address, date of birth, and other information. You will know the fields for each individual, but need a method to capture the fields for multiple individuals (in other words, multiple instances of that data). In these cases, a "multi-instance" section will allow you to do this.
As a reviewer, you may see any combination of sections used in these ways.
1. Sections for Organization
Spoiler alert! You've already seen this.
|
|||
|
2. Sections for Subdividing a Document (Single Instance)
In this document, we have a few sections performing different functions. The first function is to organize the information extracted in such a way that it is easy to read.
|
|
|
These sections also have an important function in extraction. On this document we have two phone numbers that need to be extracted: 555-555-5555 and 555-555-0124. Notice that they both have the same label of "Phone #" on the document. When Grooper runs extraction, by default it returns the first item that it finds on the page from top to bottom. Sections can help us extract both the number for Document Processors, Inc and the number for Johnny B Grooper.
|
|
|
3. Sections for Subdividing a Document (Multi-Instance)
Sometimes, a document has repeating sections where the same set of information is collected over and over. In these cases, a field by itself isn't going to cut it. A field is designed to collect one piece of information. If that field is repeated multiple times, you're missing out on all the other times it's listed. Sections can be configured as "single instance" or "multi-instance" sections. So far, all we've seen are single instance sections. A "multi-instance" section divides the document into multiple sections of similar data. Then we can extract (and review) the fields for each section.
|
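One way to picture the difference between single instance and multi-instance sections is as a data structure. The Python sketch below is purely illustrative: the `Section` and `SectionRecord` classes, the field names, and the sample values are hypothetical and do not represent how Grooper stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class SectionRecord:
    """One instance of a section: a set of field name/value pairs."""
    fields: dict[str, str]


@dataclass
class Section:
    """Holds one record (single instance) or many (multi-instance)."""
    name: str
    multi_instance: bool
    records: list[SectionRecord] = field(default_factory=list)

    def add_record(self, record: SectionRecord) -> None:
        # A single-instance section may never hold more than one record.
        if not self.multi_instance and self.records:
            raise ValueError(f"{self.name} is a single-instance section")
        self.records.append(record)


# Hypothetical form where two applicants enter the same set of fields.
applicants = Section(name="Applicant", multi_instance=True)
applicants.add_record(SectionRecord({"Name": "Jane Doe", "Date of Birth": "01/02/1980"}))
applicants.add_record(SectionRecord({"Name": "John Doe", "Date of Birth": "03/04/1979"}))

# The same field name now has one value per section record.
names = [record.fields["Name"] for record in applicants.records]
```

As a reviewer, you would check the same set of fields once for each record in the section.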
Deleting Section Records
Grooper finds these section instances in a variety of ways. Just like Grooper can produce false positive field results, it can produce false positive section instances as well. Part of your job in reviewing section data will be determining whether or not a section record is valid at all and should just be deleted.
So, we've got some deleting to do. This data is "junk". We want to get rid of it before we export these documents and their data to some back-end system, like a database or content management system. |
|
|
|
|
|
|
Adding Section Records
And now for the flip side of the coin, adding section records. Just like Grooper can fail to extract a field for whatever reason, so can it fail to extract a section instance. In these cases, you're going to need to add a section record.
|
|
|
There are two ways to add a section record.
| |
|
|
|
|
|
If you happen to insert the section record in the wrong location/order, you can move a section record by using the Move Previous and Move Next commands.
|
|
|
Last but not least, any added section's field or table values will need to be entered manually to complete its review. |
Completing the Review and Completion Criteria
|
Once you are finished reviewing the document's extracted data, you will complete your review by pressing the Complete Task button. |
|||
|
Completion Criteria
The Data Viewer, just like anything else in Grooper, has some degree of configurability. This gives Grooper designers the flexibility to architect solutions to best fit the needs of your business case. Be aware your Grooper designer may configure Data Viewer in one of two major ways:
- Either you will be able to complete your review with invalid documents...
- Or, you will not.
You should be aware this is set up at the Batch Process level. You should not assume, just because you can complete a review task, the Batch has no errors that need to be manually validated. Some Batch Processes may be configured to allow you to complete the Review step with unresolved data errors. Others may not.
|
This Batch's Batch Process was configured so that Review cannot be completed while invalid documents have unresolved data errors.
|
|
|
This Batch's Batch Process was configured so that Review can be completed even while invalid documents have unresolved data errors.
Please fully review your documents' data as dictated by your organization before completing your review. |
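The two completion configurations described above boil down to a simple rule. The following Python sketch is an illustration only: the function name, the `allow_invalid_documents` parameter, and the shape of the input are hypothetical stand-ins for whatever your Grooper designer has configured.

```python
def can_complete_review(document_error_counts, allow_invalid_documents):
    """Decide whether the Review task may be completed.

    document_error_counts: unresolved data errors per document (hypothetical shape).
    allow_invalid_documents: stands in for the designer's Batch Process setting.
    """
    has_invalid_documents = any(count > 0 for count in document_error_counts)
    return allow_invalid_documents or not has_invalid_documents


errors_per_document = [0, 2, 0]  # the second document still has two errors

strict = can_complete_review(errors_per_document, allow_invalid_documents=False)
lenient = can_complete_review(errors_per_document, allow_invalid_documents=True)
```

Here `strict` is false and `lenient` is true for the same Batch. Being able to complete the task under the lenient setting says nothing about whether the errors were actually resolved.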
Shortcuts
| Shortcut | Keystrokes | Description |
| Navigation Shortcuts | ||
| Go to Next Field | Tab | Navigate from current field/table cell to the next one. If you are on a document's last field/table cell, this will save any changes and navigate to the next document in the Batch. |
| Go to Previous Field | Shift + Tab | Navigate from current field/table cell to the previous one. |
| Go to Next Error | Ctrl + N | CURRENTLY INOPERABLE IN BETA. Navigate to next field/table cell with a validation error. |
| Go to Previous Error | Ctrl + P | CURRENTLY INOPERABLE IN BETA. Navigate to previous field with a validation error. |
| Go to Next Invalid Document | Ctrl + F8 | Navigate to the next invalid document. This will cycle back to the first invalid document in the Batch if no subsequent documents are invalid. This will do nothing if there are no invalid documents left in the Batch. |
| General Validation/Review Shortcuts | ||
| Confirm | F6 | Confirm the current field/table cell is valid. |
| Rubberband OCR | F4 | Allows the user to populate a field/table cell's value by selecting a portion of text on the document with the mouse. |
| Rubberband Zone | F3 | Allows the user to place a geometric zone for a field/table cell by selecting a region on the document with the mouse. This is often used for document redaction or to highlight something on the document for a secondary review. |
| Table Shortcuts | ||
| Append Row | Ctrl + Ins | Adds a row to the end of the table. |
| Insert Row | Ctrl + Shift + Ins | Inserts a row above the currently selected row. |
| Delete Row | Ctrl + Del | Deletes the currently selected row. |
| Duplicate Row | Ctrl + D | Duplicates the currently selected row. |
| Move Row Up | Ctrl + Shift + Up | Moves the currently selected row up one row (e.g., moving the third row up makes it the second row). |
| Move Row Down | Ctrl + Shift + Down | Moves the currently selected row down one row (e.g., moving the third row down makes it the fourth row). |
| Section Shortcuts | ||
| Append Record | Ctrl + Ins | Adds a section record to the end of a multi-instance section. |
| Insert Record | Ctrl + Shift + Ins | Inserts a section record before the currently selected section record. |
| Delete Record | Ctrl + Del | Deletes the currently selected section record. |
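If it helps to think of a table (or a multi-instance section) as an ordered list, the row commands above behave like ordinary list operations. This Python sketch is an analogy rather than Grooper code. The function names are hypothetical, and treating a move at the top or bottom of the table as a no-op is an assumption.

```python
def append_row(rows, new_row):
    """Add a row to the end of the table."""
    rows.append(new_row)


def insert_row(rows, selected, new_row):
    """Insert a row above the currently selected row."""
    rows.insert(selected, new_row)


def delete_row(rows, selected):
    """Delete the currently selected row."""
    del rows[selected]


def move_row_up(rows, selected):
    """Swap the selected row with the one above it; return its new index."""
    if selected == 0:
        return selected  # assumption: already at the top, nothing happens
    rows[selected - 1], rows[selected] = rows[selected], rows[selected - 1]
    return selected - 1


def move_row_down(rows, selected):
    """Swap the selected row with the one below it; return its new index."""
    if selected == len(rows) - 1:
        return selected  # assumption: already at the bottom, nothing happens
    rows[selected + 1], rows[selected] = rows[selected], rows[selected + 1]
    return selected + 1


table = ["row A", "row B", "row C"]
new_index = move_row_up(table, 2)  # the third row becomes the second row
```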
Advanced Techniques: Validation and Calculation Expressions
| ‼ |
COMING SOON!! This portion of the article is under construction. |
Advanced Techniques: Database Lookups
| ‼ |
COMING SOON!! This portion of the article is under construction. |
Advanced Techniques: Rubberband Zone
| ‼ |
COMING SOON!! This portion of the article is under construction. |
Thumbnail Viewer
The Thumbnail Viewer allows a review interface for individual pages in Grooper. Most commonly, this is used to review the results of an IP Profile applied to scanned pages (for example, during an Image Processing step in a Batch Process).
Commonly, users will review whether pages were not appropriately oriented or de-skewed by an IP Profile, flag pages that need to be re-scanned for one reason or another, or delete superfluous pages, like blank pages.
Starting the Review Step
|
In the Thumbnail Viewer you will review individual pages. This Review View is generally used early on in a Batch Process to ensure documents "look good", in one way or another, and are ready to be processed by OCR.
|
|||
|
When you open a Thumbnail Viewer, this is what you will see. Keep in mind, this review interface is designed for reviewing pages. The Batch's folder structure is hidden. The user is simply presented a list of pages in the Batch.
|
Sorting Pages
The Thumbnail Viewer allows you to sort pages by a variety of different image qualities, such as the page's overall intensity or dimensions. This can be a useful tool to aid the reviewer in a variety of different ways.
You can sort images by the following qualities:
- Physical order (the numerical order of the pages in the Batch from first to last or last to first)
- Intensity (the overall "brightness" of the image)
- Width
- Height
- Aspect Ratio
- Flag status (whether or not a Grooper flag is present on the page)
|
|||
|
Intensity measures the "brightness" of an image. From a practical standpoint, text is represented by black (or darker) pixels contrasted against the document's background, which is made up of white (or brighter) pixels. Sorting by intensity is a quick way to group all blank (or mostly blank) pages together, as they will be the brightest (since they have more white pixels). This gives the reviewer a quick way to delete blank pages, if necessary. After sorting by intensity, the images are laid out from "least intense" to "most intense".
|
|||
|
You can also flip the sort order of the documents. For example, if we choose to Sort By Descending, the most intense (lighter) images will be the first in the Batch and the least intense (darker) images will be last.
|
|||
|
|||
|
If you need to go back to the original view of the Batch, with pages listed in sequential order, all you need to do is sort by "Physical Order".
|
|||
|
Now the pages are listed in sequence as they are ordered in the Batch itself. (You might have to change the sort order from "Descending" to "Ascending" again.)
|
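The idea behind sorting by intensity can be sketched in a few lines of Python. This is a conceptual illustration, not how Grooper computes intensity: it assumes intensity is the mean grayscale value of a page's pixels (0 = black, 255 = white), and the tiny "pages" are made-up data.

```python
def mean_intensity(page):
    """Average brightness of a page given as rows of 0-255 grayscale pixels."""
    pixels = [value for row in page for value in row]
    return sum(pixels) / len(pixels)


# Three tiny hypothetical "pages" (0 = black, 255 = white).
pages = {
    "dense text": [[0, 0, 255], [0, 255, 0]],
    "blank": [[255, 255, 255], [255, 255, 255]],
    "light text": [[255, 0, 255], [255, 255, 255]],
}

# Ascending: least intense (darkest) first. Descending: brightest first.
ascending = sorted(pages, key=lambda name: mean_intensity(pages[name]))
descending = sorted(pages, key=lambda name: mean_intensity(pages[name]), reverse=True)
```

Either way, the blank page ends up at one end of the list, which is what makes blank pages easy to find and delete as a group.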
Resolving Common Problems
The Thumbnail Viewer allows users to delete superfluous (or "junk") pages, make adjustments to individual pages, or flag them for one reason or another. We will discuss the following common Thumbnail Viewer uses:
- Deleting "junk" pages
- Manually rotating pages
- Flagging pages
- Applying an IP Profile
Deleting "Junk" Pages
Blank pages are junk pages. They contain no meaningful information. More likely than not, you'll just want to get rid of them. This is where sorting by intensity can come in rather handy.
| FYI | Grooper does have ways to automate blank page deletion. However, there may be reasons why Grooper fails to delete blank pages, for example, if the page is "nearly" blank. Furthermore, the following advice also applies to deleting any other "junk" pages you don't want as part of your final document set for one reason or another. |
|
In this case, we've already sorted by intensity. So, the brightest pages are first in the list.
|
|||
|
|||
|
|||
|
You may also be asked to evaluate the content of a page to determine whether or not it is "junk".
|
|||
|
|||
|
Manually Rotating Pages
Another common problem you may encounter is pages that are oriented incorrectly. They may be scanned in upside down, or oriented as a landscape image when they should be a portrait image. Grooper does have methods to detect a page's proper orientation, but sometimes that fails. For example, some documents have text running along both the horizontal and vertical axes of a page. Which is the "proper" orientation in these cases? You, as a reviewer, may need to make that decision.
|
In cases like this, you'll need to manually rotate a page to the correct orientation.
|
|
|
|
|
|
|
|
|
As with deleting pages, you can select multiple pages by holding down the
|
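For readers who like to see what a 90 degree rotation does to an image, here is a small Python sketch that treats a page as a grid of pixel values. It is only an illustration of the geometry. The function names mirror the Rotate Left and Rotate Right commands, but the code is hypothetical and unrelated to Grooper's implementation.

```python
def rotate_right(image):
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_left(image):
    """Rotate a row-major image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


# A tiny 2 x 3 "landscape" page.
page = [
    [1, 2, 3],
    [4, 5, 6],
]

portrait = rotate_right(page)                    # now 3 rows by 2 columns
upside_down = rotate_right(rotate_right(page))   # two right turns = 180 degrees
restored = rotate_left(rotate_right(page))       # a left turn undoes a right turn
```

A page scanned upside down needs two rotations in the same direction, while a landscape page that should be portrait needs one.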
Flagging Pages
You may need to flag a page for one reason or another. Likely, this will have to do with something specific to your business case. You may need to flag an image that needs elevated review, for example.
Commonly, you may need to flag a page that needs to be re-scanned. A Batch Process can be designed in such a way that pages without flags go on to the next logical step (probably Recognize to OCR the pages). However, any page with a flag would be submitted to a secondary Scan step so any issues scanning the page can be resolved by replacing the image with a new scan.
|
|
|
|
|
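The routing idea described above, where flagged pages go to a re-scan step and unflagged pages continue on, can be sketched as a simple partition. This Python example is hypothetical: the page dictionaries, the `route_pages` function, and the flag message are illustrative and do not reflect how a Batch Process is actually configured.

```python
def route_pages(pages):
    """Split pages into those ready for the next step and those needing a re-scan.

    Each page is a dict with a 'name' and an optional 'flag' message.
    """
    ready_for_recognize = [page["name"] for page in pages if not page.get("flag")]
    needs_rescan = [page["name"] for page in pages if page.get("flag")]
    return ready_for_recognize, needs_rescan


batch_pages = [
    {"name": "Page 1"},
    {"name": "Page 2", "flag": "Corner folded over text"},
    {"name": "Page 3"},
]
ready, rescan = route_pages(batch_pages)
```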
Applying an IP Profile
Your Grooper designer may ask you to apply what's called an IP Profile to pages in a Thumbnail Viewer. An IP Profile is a collection of image processing commands that are applied to clean up a page's image to prep it for further processing.
For example, your images may have already been processed by an IP Profile by this point. However, this kind of IP Profile should be designed to account for most problems for most images. It should be fairly generic, generally cleaning up the pages in the Batch. You might have outlier pages that need more fine tuned processing. In that case, you may need to identify those images, and manually apply a more specific profile.
|
If this is the case, you'll apply the IP Profile by right-clicking a page's thumbnail and selecting the one you want.
|
|||
|
|||
|
Undoing Changes
If you make changes to an image and decide you need to undo those changes, you can undo them with the Undo Image Cleanup command.
You can undo the following:
- Rotation applied by the Rotate Left and Rotate Right commands.
- Changes made by an IP Profile using the Apply Image Cleanup command.
- Pixel inversion applied by the Image > Invert command.
You cannot undo the following:
- Page deletion.
|
|
|
Confirming Pages - Completion Criteria
Depending on how the Thumbnail Viewer is set up, one (or both) of the following completion criteria may be necessary to complete your review:
- You may be required to "confirm" all pages.
- This is a way of ensuring a reviewer has visually inspected each page in the Batch.
- You may be required to remove any flags on a page.
- This is a way of ensuring any flag thrown by a previous step is inspected and the page is reviewed.
|
This Batch's process was configured so that the Review step could not be completed until all pages are confirmed.
|
|||
|
To confirm a page:
|
|||
|
|||
|
|||
|
Shortcuts
| Shortcut | Keystrokes | Description |
| General Commands | ||
| Confirm Page | Enter | Confirms the page is reviewed |
| Flag Item | Ctrl + L | Flags the page. The user will enter their own flag message or select one from a pre-generated list. |
| Go To Next Unconfirmed | Ctrl + Enter | Navigates to the next page that has not been confirmed yet. |
| Delete | Del | Deletes the page from the Batch. |
| Image Modification Commands | ||
| Rotate Left | Ctrl + Left | Rotates the page 90 degrees to the left (counter-clockwise) |
| Rotate Right | Ctrl + Right | Rotates the page 90 degrees to the right (clockwise) |
| Apply Image Cleanup | Alt + I | Allows the user to select an IP Profile to modify the selected page's image. |
| Invert | Ctrl + Shift + V | Inverts the image's color scheme. This can be useful when viewing scans of film negatives (such as microfilm or microfiche). |
| Undo Image Cleanup | Ctrl + Z | Undoes any image modifications. Replaces the page's image with its stored undo image. |
| Sorting Commands | ||
| Sort By Aspect Ratio | Ctrl + Shift + A | Sorts the pages by their aspect ratio. This would group all landscape oriented images together and all portrait oriented images together. |
| Sort By Batch Order | Ctrl + Shift + B | Sorts the pages by their order in the Batch. |
| Sort By Flag Status | Ctrl + Shift + F | Sorts the pages based on if they have a flag thrown on the page, or not. This will group all pages with flags together. If no pages are flagged, this will do nothing. |
| Sort By Height | Ctrl + Shift + H | Sorts the pages based on their physical height, in inches. |
| Sort By Width | Ctrl + Shift + W | Sorts the pages based on their physical width, in inches. |
| Sort By Intensity | Ctrl + Shift + I | Sorts the pages based on their intensity (or brightness). Pages will be ordered from brightest to darkest. Or put differently, least amount of text (and other marks) to most amount of text (and other marks). |
| Display Format Commands | ||
| Display as Binary | Ctrl + Shift + B | Changes the page's image format to black and white. |
| Display as Grayscale | Ctrl + Shift + G | Changes the page's image format to grayscale. |
| Display as Color | Ctrl + Shift + G | Changes the page's image format to color. |
| Reset | Ctrl + Shift + R | Clears all ScanOnce settings on the page. |
Folder Viewer
The Folder Viewer gives reviewers a fairly generic view into a Batch's folder structure. Most often, this is added as a secondary Review View to allow users a different view into a Batch.
For example, a Data Viewer is well suited to view and edit documents' extracted data. However, you don't have the same view into a Batch's folder and page structure. This can make it difficult to navigate if you need something more like a hierarchical folder view.
|
This Review step has two Review Views, a Data Viewer to review and edit the document's data, as well as a Folder Viewer.
|
|
|
The Folder Viewer's interface is fairly basic.
|
Using Review Views for Separation
Another reason you may find a Folder Viewer useful is to aid in document separation. "Separation", from Grooper's perspective, is the act of organizing pages into folders. Typically, Grooper will separate during the Separate step of a Batch Process (FYI: There are many methods to automate document separation. Your organization may even use a "real time" method that allows documents to be separated at scan-time).
Examining the Batch Before Separation
|
Next, we're going to use a Folder Viewer to see how it can fix document separation issues in Grooper. We will review a Batch using a Folder Viewer before the Separate activity runs (just to get a "before" look into the Batch). Then we will review the documents afterward to see how we can use the Folder Viewer to manually fix the separation issues.
|
|
|
In this Batch, we have a fairly common situation. We've imported a handful of PDFs that are packets of multiple documents. Each PDF has four individual documents contained within that we will need to separate into document folders. In this screenshot, we've executed the "Before Separation" Review step to see what the documents look like before the Separate step.
|
Examining the Batch After Separation
|
|
|
|
|
|
|
|
First, we're going to look at our overly separated document. Generally speaking, we need to move the pages in Folder (2) to Folder (1). Put another way, we need to append those pages to the folder before it. There are at least two ways you could do this. One will require more effort (the hard way) than the other (the easy way).
Appending Folders - The Hard Way
Let's start with the hard way. We're going to simply cut a page from a folder and paste it to another.
- The only thing that makes this "hard" is there's a different way of doing things that requires less clicking around in the UI. There's nothing wrong with this approach. It will certainly get the job done.
|
You can move pages around in a Batch by cutting and pasting them into different folders (and moving them up and down in the folder as needed).
|
|
|
|
|
Appending Folders - The Easy Way
Now, there shouldn't be anything mind-blowing about cutting and pasting. You probably do it every day, whether on your work computer or personal computer or even smartphone.
There is, however, a simpler shortcut: the Append To Previous command.
|
|||
|
The Append To Previous command does the exact same thing we did in the previous steps, with the click of a single button.
|
|||
|
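Conceptually, appending a folder to the previous one is a small list operation. The Python sketch below models a Batch as a list of folders, each holding a list of pages. It is an analogy, not Grooper's implementation: the function name is hypothetical, and removing the emptied folder afterward is an assumption.

```python
def append_to_previous(folders, index):
    """Move every page of folders[index] to the end of the folder before it,
    then remove the now-empty folder (assumed behavior)."""
    if index == 0:
        raise ValueError("the first folder has no previous folder")
    folders[index - 1].extend(folders[index])
    del folders[index]


batch = [
    ["page 1", "page 2"],  # Folder (1)
    ["page 3"],            # Folder (2): split off from Folder (1) by mistake
    ["page 4", "page 5"],  # Folder (3)
]
append_to_previous(batch, 1)
```

After the call, the first folder holds pages 1 through 3 and the folder that used to be third is now second.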
We also have a document that was not separated enough. The last document folder contains four pages. But really, this should be split into two folders, containing two pages each. Long story short, the Separate activity failed to insert a new folder at the right point.
Splitting Folders - The Hard Way
We'll start with the hard way again. In this case, we want to split the pages in a single folder into two folders. To do this, we'll manually insert a folder, then place the pages forming the document into the new folder.
- Again, there's nothing really that "hard" about this. It's just going to take more clicks than the "easy" way. That said, it still gets the job done.
|
The first thing we need to do is add a new folder. The only (mildly) tricky thing is to make sure you're selecting the right spot to ensure it's inserted where you want.
|
|
|
Whether it's the root Batch folder, a simple folder, or a document folder at any level in the Batch, inserting a folder is the same.
|
|
|
|
|
In our case, the last two pages of Folder (3) need to be moved into Folder (4). This will split out the pages that don't belong to Folder (3) and place them into the new folder.
|
|
|
|
|
Splitting Folders - The Easy Way
Again, there's nothing particularly difficult about this. You're just making folders and putting pages in them to make a new document.
However, there is a simpler shortcut, the Split Folder command.
|
|||
|
The Split Folder command does the exact same thing we did in the previous steps, with the click of a single button.
|
|||
|
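Splitting a folder is the mirror image of appending one. In the Python sketch below, a Batch is again modeled as a list of folders holding lists of pages. The function name and the choice to insert the new folder directly after the original are illustrative assumptions, not a description of Grooper's internals.

```python
def split_folder(folders, folder_index, page_index):
    """Split folders[folder_index] so the page at page_index starts a new
    folder inserted directly after the original."""
    original = folders[folder_index]
    if not 0 < page_index < len(original):
        raise ValueError("split point must fall inside the folder")
    new_folder = original[page_index:]
    del original[page_index:]
    folders.insert(folder_index + 1, new_folder)


batch = [
    ["page 1", "page 2"],
    ["page 3", "page 4", "page 5", "page 6"],  # two documents in one folder
]
split_folder(batch, 1, 2)  # "page 5" becomes the first page of a new folder
```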
Using Classification Viewer to Resolve Separation Problems
|
As mentioned before, the Classification Viewer and the Folder Viewer are extremely similar.
|
|
|
You also have access to the same kind of foldering commands in the Classification Viewer that you do in the Folder Viewer.
|
|
|
These commands manipulate the folders and pages in the Batch in the exact same manner. They even use the same keyboard shortcuts!
|
Merging Documents
One last common command is the Merge Selected command. This allows users to select one or more pages (or even one or more folders) and create a new folder at the same folder level, place the selected pages (or folders) in the new folder, and assign the folder a Document Type. Depending on the circumstance, this can be the quickest way to fix a separation issue and classify a document all at the same time.
The process looks like this:
|
Document (4) has some major separation problems. None of its pages were separated into folders.
We should have ended up with three document folders:
|
|
|
We'll use the Merge Selected command to merge selected pages into a new folder.
|
|
|
This will bring up the Merge Selected window, allowing you to choose a Document Type.
|
|
|
Upon pressing the Apply button, three things happen.
|
|
|
Shortcuts
The Folder Viewer keyboard shortcuts are identical to the Classification Viewer shortcuts.
Scan Viewer
For some organizations, your first step in document processing will be scanning paper pages into Grooper using a document scanner physically connected to your workstation. In that case, you're going to use the Scan Viewer. This interface allows users to scan paper documents into a Batch as the first step in a Batch Process. A lightweight application called Grooper Desktop, installed on your workstation, listens for scanner activity and sends images to the Grooper web server as pages are scanned.
About Grooper Desktop
If you're using the Grooper Web Client to scan paper documents into a Batch, you'll need to have the Grooper Desktop application installed on your workstation. Grooper Desktop will run as a service on your machine and integrate your scanner with the Scan Viewer.
|
When Grooper Desktop is installed and running, you can open it in the System Tray of your Windows Taskbar.
|
|||
|
After opening Grooper Desktop there are four important properties that need to be configured in order to scan documents.
|
|||
|
After enabling web scanning, selecting your scanner model and configuring any device settings, you will need to start the Grooper Desktop service.
|
|||
|
Grooper Desktop runs as a service in the background of your machine. When you scan documents using the Scan Viewer, this service will start up your scanner when you hit the Scan button in Grooper and upload the scanned images to the Batch. |
Desktop Scanning in Grooper - 2023
Batch Management
Depending on your role in your organization, you may be required to do some "Batch Management". This will require you to manipulate Batches in various ways.
- Imagine the wrong Batch Process was assigned to a Batch. You would need to stop processing that Batch and assign it the right one.
- The same would apply if your Grooper designer made changes to a Batch's Batch Process. You'd need to pause the Batch's processing, and update its processing instructions by telling it the Batch Process changed.
Then, in both cases, you'd need to re-start processing the Batch. You can do all that from the "Batches Page" in the Grooper Web Client.
In this section we will discuss how to do the following:
- Pause processing for a Batch in production.
- Resume processing for a Batch in production.
- Update a Batch's Batch Process to either reflect changes made by your Grooper designer or update it to an entirely different Batch Process.
- Reset tasks already completed in a Batch Process in order for them to be reprocessed.
We will also point out the "Statistics", "Events", and "Details" panels to glean more information about the processing results for a selected Batch.
Pausing and Resuming Batch Processing
Pausing Batches
There are a variety of reasons you may need to pause a Batch in production, but generally speaking, it's a way to momentarily halt any further tasks from processing. Batches can be paused using the "Batches Page".
|
|
|
We would need to pause the Batch.
|
|
|
|
|
Grooper will pop up the Pause window to confirm you want to pause processing on the Batch.
|
|
|
|
|
Pausing the Batch places its processing on hold, including any Review tasks.
|
Resuming Batches
Once a Batch is paused, it must be "resumed" before any of its tasks are processed. When you're ready to resume processing a paused Batch, by-and-large you will repeat the same process you did to pause it. You'll right-click the paused Batch and press the Resume button.
|
|
|
Grooper will pop up the Resume window to confirm you want to resume processing on the Batch.
|
|
|
Updating Batch Processes and Resetting Steps
The two most common reasons to pause a Batch are to:
- Update the Batch's Batch Process
- This could be because the wrong Batch Process was assigned to it.
- This could be because your Grooper designer made changes to the Batch Process and the Batch needs to be aware of the updated processing instructions.
- Reset steps already processed.
- Steps need to be reset if work needs to be "re-done".
- This often will happen out of necessity when a Grooper designer makes changes to a Batch Process or a Content Model. Imagine a Grooper designer made changes to the Document Types in a Content Model. You may need to reset the Classify step and re-process it so that those changes are reflected in the Batch.
- This can also be a way to re-enter a Review step if further changes need to be made after the fact.
Updating Batch Processes
Imagine a situation where the wrong Batch Process was assigned when the Batch was created. You'd need to tell Grooper you want the Batch to use a different Batch Process.
Imagine a situation where certain processing instructions in the Batch Process changed. Maybe a step was added. Maybe the configuration of a step was changed slightly. You'd need to inform Grooper of the changes that were made.
- In either case, you would inform Grooper of the Batch's new Batch Process or changes to its existing Batch Process by "updating" the Batch Process.
The general steps to update a Batch Process are as follows.
- Using the "Batches Page", pause the Batch.
- Right-click the Batch and select Update Process.
- Select a "target step" in a Batch Process.
- Determine which (if any) steps need to be "reset".
- Depending on the situation, you may also need to reprocess steps that were previously processed with the updated instructions. We will discuss this in the "Resetting Steps" tab of this tutorial.
- Resume the Batch.
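The order of operations in the steps above (pause, update, optionally reset, resume) can be pictured as a small state model. This Python sketch is hypothetical from top to bottom: the `Batch` class, its methods, and the process and step names are invented for illustration. In particular, raising an error when an unpaused Batch is updated is an assumption drawn from the order of the steps, not confirmed Grooper behavior.

```python
class Batch:
    """Hypothetical model of a Batch's pause/update/resume lifecycle."""

    def __init__(self, name, process):
        self.name = name
        self.process = process
        self.next_step = None
        self.paused = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def update_process(self, process, target_step):
        # Assumption: the Batch must be paused before its process changes.
        if not self.paused:
            raise RuntimeError("pause the Batch before updating its process")
        self.process = process
        self.next_step = target_step


batch = Batch("Example Batch", process="Wrong Process")
batch.pause()
batch.update_process("Invoice Process", target_step="Classify")
batch.resume()
```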
Scenario #1: Updating a Process to Change Processes
|
|
|
|
|
Now that the Batch is paused, we can update its Batch Process.
|
|
|
This will bring up the Update Process configuration window. From here you will select the Batch Process you want to update to. The Target Step property determines at which step of which Batch Process you wish to begin processing.
|
|
|
Scenario #2: Updating the Current Batch Process
In Grooper, a Batch Process is tied to a Batch when the Batch is created. If a Grooper designer wants to change a Batch Process, they aren't going to alter every single Batch's process individually. There could be a hundred Batches using the same process, and that would be a tedious headache.
Instead, they will make changes to the "working" Batch Process and "publish" those changes. Then, any user can make Batches currently in production aware of those changes by updating each Batch's process with the Update Process button.
Imagine a scenario where the Grooper designer goofed when creating our "Invoice Process" Batch Process. They added a Data Viewer to a Review step where they should have added a Classification Viewer. They would need to make changes to the "Invoice Process" Batch Process. Then, any Batches currently in production would need to have their processes updated from the "Batches Page".
The steps are just out of order.
So, our Grooper designer does their work behind the scenes in Grooper Design Studio and tells you that the Batch is ready to be updated. No problem. This is even easier than our last scenario.
Now that the Batch is paused, we can update its Batch Process.
This will bring up the Update Process configuration window. In this case, because we're just updating the Batch's current process and we don't need to reset any steps, truly all we need to do is press the Apply button.
Remember, the Target Step property determines at which step of which Batch Process you wish to begin processing.
Upon applying changes, the Batch's process will be replaced with the updated version.
Scenario #3: Updating a Process and Resetting Steps
The previous two scenarios were unusual in that they did not require us to reset any steps. In many cases, when you update a Batch's process, you'll need to reprocess one or more steps.
Resetting is a separate command that you run on the Batch with the Reset button. We will show you how to do this in the next tab.
Resetting Steps
Often when you're updating a Batch's process, you'll need to reprocess some of the steps in it. This may be because your Grooper designer changed the configuration of one or more activities and the work that's already been done needs to be re-done with the updated changes.
For example, we have a situation here where a Batch's "Export" step was misconfigured, resulting in failed document exports.
Let's imagine our Grooper designer has fixed the issue with the Batch Process. Next, we'll need to update the Batch's process with the new Export configuration.
This will bring up the Reset configuration window.
The Progress, Statistics, Events, and Details Panels
At the bottom of the "Batches Page" you'll find four tabbed panels:
- Progress
- Statistics
- Events
- Details
These panels will give you more information about the Batch and the individual steps processed.
Progress
We've seen the "Progress Panel" throughout this article. It is the user's primary "at-a-glance" view into the processing progress of each step in the selected Batch. Each block represents a step in the Batch Process assigned to the Batch. The color of each block indicates the processing status of the tasks in that step.
Statistics
The "Statistics Panel" gives you processing data for each step in the selected Batch's process. Different Grooper "activity types" report different kinds of information. For example, the Recognize activity runs OCR for image-based content, so it reports how many OCR characters were found in total for that step. Review steps report information that can be useful for metrics, such as the number of fields edited, the number of keystrokes entered, and the total time it took to complete the Review task.
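As a rough illustration of how Review statistics like these could feed a metric, the sketch below totals fields edited, keystrokes, and time across several Review tasks and derives keystrokes per minute. The `ReviewStats` record and the `summarize` function are hypothetical; the panel itself is read in the Web Client, and no particular data format or export is implied.

```python
from dataclasses import dataclass


@dataclass
class ReviewStats:
    """Hypothetical record of one Review task's statistics."""
    fields_edited: int
    keystrokes: int
    seconds: float


def summarize(tasks):
    """Total the per-task numbers and derive keystrokes per minute."""
    total_fields = sum(t.fields_edited for t in tasks)
    total_keys = sum(t.keystrokes for t in tasks)
    minutes = sum(t.seconds for t in tasks) / 60
    return {
        "fields_edited": total_fields,
        "keystrokes": total_keys,
        "minutes": minutes,
        # Avoid dividing by zero when no time was recorded.
        "keystrokes_per_minute": total_keys / minutes if minutes else 0.0,
    }


# Made-up numbers for two Review tasks.
summary = summarize([ReviewStats(3, 40, 90), ReviewStats(1, 20, 30)])
```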
Events
The "Events Panel" displays a step-by-step list of all tasks submitted for processing. It is designed to give you an audit trail of the Batch's processing history. This includes "Audit" type events logged when tasks are submitted and completed, when the Batch was created, if and when it was paused, when processing was resumed, and more. This panel is often most helpful when processing errors occur, which are logged as "Error" events. These give you additional information to help you troubleshoot the issue.
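To make the distinction between "Audit" and "Error" events concrete, here is a minimal sketch that filters a list of event records down to the errors and counts them per step. The record layout (`type`, `step`, `message`) and the sample messages are assumptions made for illustration; only the "Audit" and "Error" event type names come from this article.

```python
from collections import Counter

# Hypothetical event records; the field names are illustrative only.
events = [
    {"type": "Audit", "step": "Classify", "message": "Task submitted"},
    {"type": "Audit", "step": "Classify", "message": "Task completed"},
    {"type": "Error", "step": "Export", "message": "Export failed"},
    {"type": "Audit", "step": "Export", "message": "Batch paused"},
    {"type": "Error", "step": "Export", "message": "Export failed"},
]


def error_events(records):
    """Keep only the records logged as "Error" events."""
    return [r for r in records if r["type"] == "Error"]


def errors_by_step(records):
    """Count "Error" events per step to show where troubleshooting should start."""
    return Counter(r["step"] for r in error_events(records))
```

Grouping errors by step is one way to see at a glance which step in the Batch Process is producing the failures.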
Details
The "Details Panel" gives you the most "top level" details about items in the Batch. It gives you total counts for pages, folders, Content Types, Attachment Types, and various other information pertaining to them.