2022:Web Client: Difference between revisions
Dgreenwood (talk | contribs) |
| Line 1,485: | Line 1,485: | ||
* Sections can be tools to group data into a category, sub-divide a document into smaller units, or establish "multi-instance" sections (more on what this means later). | * Sections can be tools to group data into a category, sub-divide a document into smaller units, or establish "multi-instance" sections (more on what this means later). | ||
As a reviewer, it's your job to check Grooper's results for each of these '''Data Elements''' after the '''Extract''' activity collects them. This is precisely what the '''''Data Viewer''''' is for. | As a reviewer, it's your job to check Grooper's results for each of these '''Data Elements''' after the '''Extract''' activity collects them. This is precisely what the '''''Data Viewer''''' is for. There are a lot of things that can go wrong in the wide world of document processing. Optical Character Recognition (OCR) can convert a document's image to digital text. However, it's not perfect. Rarely will your OCR results be 100% accurate. If the document's underlying text data is imperfect, so may be your data extraction. There might be other problems with the extraction logic's ability to find and return data. This is especially the case for document sets with a lot of variety. If a document has a data structure that has not been properly modeled in the '''Data Model's''' design, there's a good chance Grooper will fail to return the data at all or only return partial data. Regardless of why the error occurred, you, the reviewer, are the last line of defense to ensure accurate data is captured for each document. | ||
<tabs style="margin:20px"> | <tabs style="margin:20px"> | ||
| Line 1,514: | Line 1,514: | ||
# Instead of using a folder hierarchy, you can navigate through the documents in the '''Batch''' using the Folder Navigator at the top of the Review Panel. | # Instead of using a folder hierarchy, you can navigate through the documents in the '''Batch''' using the Folder Navigator at the top of the Review Panel. | ||
#* There are eight document folders in this '''Batch'''. I have navigated to the | #* There are eight document folders in this '''Batch'''. I have navigated to the sixth document in the '''Batch'''. So we are at folder "6" of "8", indicated by "6 / 8". | ||
#* You may use the single arrow buttons to go to the next or previous document. | #* You may use the single arrow buttons to go to the next or previous document. | ||
#* You may use the double arrow buttons to go to the first and last document. | #* You may use the double arrow buttons to go to the first and last document. | ||
#* You can also type the number of the document you want to select in the number box. | #* You can also type the number of the document you want to select in the number box. | ||
# The document's classified '''Document Type''' and folder number is listed next. | # The document's classified '''Document Type''' and folder number is listed next. | ||
#* Pro Tip: If you need to reclassify the document at this point, you can right click this heading and choose "Assign Document Type" to change its '''Document Type'''. Be aware changing a document's '''Document Type''' will clear its extracted data. However, you can also right click this heading and select "Extract" to re-run Grooper's data extraction. | #* Pro Tip: If you need to reclassify the document at this point, you can right click this heading and choose "''Assign Document Type''" to change its '''Document Type'''. Be aware changing a document's '''Document Type''' will clear its extracted data. However, you can also right click this heading and select "''Extract''" to re-run Grooper's data extraction. | ||
# The document's extracted data occupies the rest of the Review Panel. The various fields, tables and sections established in the document's '''Data Model''' are listed here with their extraction results placed in editable text boxes. | # The document's extracted data occupies the rest of the Review Panel. The various fields, tables and sections established in the document's '''Data Model''' are listed here with their extraction results placed in editable text boxes. | ||
|valign=top| | |valign=top| | ||
| Line 1,537: | Line 1,537: | ||
|- | |- | ||
|valign=top| | |valign=top| | ||
<br> | |||
# Press the warning icon to get more information about the errors present. | # Press the warning icon to get more information about the errors present. | ||
# This will toggle a list of every field or table cell with an error and | # This will toggle a list of every field or table cell with an error and their corresponding error message. | ||
#* In this case it's telling us the "Invoice Total" field's "Value is required". | #* In this case it's telling us the "Invoice Total" field's "Value is required". | ||
# Any field or table cell in an error state will be highlighted red. | # Any field or table cell in an error state will be highlighted red. | ||
| Line 1,555: | Line 1,556: | ||
We will start our journey into data review by looking at how to review fields. We will use the same set of invoice documents we reviewed for classification previously. And this is a fairly common part of your workflow. First, you review Grooper's work to make sure the documents are classified correctly. Once Grooper knows what kind of document it's working with, it knows what data it's looking for and how to find it. Now that Grooper has extracted the data, we can use the Data Viewer to verify it collected all the data required and collected it accurately. | We will start our journey into data review by looking at how to review fields. We will use the same set of invoice documents we reviewed for classification previously. And this is a fairly common part of your workflow. First, you review Grooper's work to make sure the documents are classified correctly. Once Grooper knows what kind of document it's working with, it knows what data it's looking for and how to find it. Now that Grooper has extracted the data, we can use the Data Viewer to verify it collected all the data required and collected it accurately. | ||
{|cellpadding="10" cellspacing="5" | |||
|-style="background-color:#36b0a7; color:white" | |||
|style="font-size:14pt"|'''FYI'''||It should be noted document '''Data Models''' have a high degree of configurability. Obviously, unless you're processing invoices, the specific data elements you will be reviewing in your environment will be different. You may have hundreds of data points to review on a single document. You may have just a few. That all depends on the business requirements for your document set and what your organization deems appropriate to extract from them. | |||
However, the basics remain the same across all use cases. Grooper will extract information from the document, populate that data into fields and tables, and you'll review the results based on what you, a human, can see on the document. | |||
|} | |||
==== Required Fields ==== | |||
Commonly, an organization will deem certain data critical for document processing. Certain fields ''must'' therefore be extracted in order for the work to be considered complete. In Grooper, we satisfy this requirement by making a field "required". This will place the field (or table cell) in an error state if no value was extracted at all. In the '''''Data Viewer''''', Grooper will alert you that the required value is missing, and will require you to manually enter it before review is completed. | |||
{|cellpadding=10 cellspacing=5 | |||
|valign=top style="width:40%"| | |||
In the case of this document's '''Data Model''' three fields are required: | |||
* Invoice Number | |||
* Invoice Date | |||
* Invoice Total | |||
# We have navigated to the second document in the '''Batch'''. | |||
# The "Invoice Number" and "Invoice Date" fields extracted just fine. | |||
# The "Invoice Total" field did not. It is empty or "blank". | |||
#* Grooper will highlight any required fields that are empty in red. | |||
# When you enter that field's textbox, Grooper will pop up a message indicating the problem, "Value is required." | |||
|valign=top| | |||
[[File:Web-review-data-view-05.png]] | |||
|- | |||
|valign=top| | |||
<br> | |||
# All we need to do is enter the value for this invoice total as it appears on the document. | |||
# Type the value into the field's textbox and press <code>Enter</code> or <code>Tab</code> to move to the next field. | |||
# You will see the error warning disappear because the data error was resolved. | |||
# Press the ''Save'' button to save any changes made to the document's data. | |||
|valign=top| | |||
[[File:Web-review-data-view-06.png]] | |||
|} | |||
==== Data Model Differences ==== | |||
Before looking at more problems, please be aware '''Data Models''' can be (and often are) different for individual '''Document Types'''. For the most part, we're working with a "flat" '''Content Model'''. All the '''Document Types''' share the same '''Data Model''', meaning we're looking for the same data elements for each one. However, in your environment, each '''Document Type''' may represent more diverse kinds of documents and require its own individual '''Data Model''' with its own specific fields and tables. Or, your '''Document Types''' may all share some data elements, but have some additional fields unique to the individual '''Document Type'''. | |||
{|cellpadding=10 cellspacing=5 | |||
|valign=top style="width:40%"| | |||
<br> | |||
This is the case with our "Envoy" '''Document Type'''. For the most part, the data we want to collect from this '''Document Type''' is the same as the rest. However, ''just'' for the "Envoy" '''Document Type''' we want to collect the purchase order number listed on the invoice. For whatever reason, we'll pretend we have a business need for the PO number from this vendor, but none of the rest. | |||
# The top half of the review screen is occupied by the "parent" '''Data Model's''' fields. These are the ones shared by all '''Document Types'''. | |||
# Then, we have the additional "Envoy" '''Data Model's''' elements we can review as well. | |||
#* In our case it's a single "PO Number" field in a section named "Additional Details" | |||
# We review the field just like we would any other field, and continue to the next document. | |||
{|cellpadding="10" cellspacing="5" | |||
|-style="background-color:#36b0a7; color:white" | |||
|style="font-size:14pt"|'''FYI'''||Pro Tip! | |||
Most users find tabbing through fields with the <code>Tab</code> key is the easiest way to review a document's fields in the '''''Data Viewer'''''. | |||
If you are on the ''last'' field of a document (such as this one) and press the <code>Tab</code> key, it will save the document and take you to the next one in the '''Batch'''. | |||
|} | |||
|valign=top| | |||
[[File:Web-review-data-view-07.png]] | |||
|} | |||
==== Data Element Overrides (and Required Validation) ==== | |||
Another way '''Data Models''' can differ from '''Document Type''' to '''Document Type''' is through "Data Element Overrides" (sometimes just called "overrides" for short). This allows Grooper designers to change how fields, tables and sections behave for a specific '''Document Type''' while still maintaining a parent '''Data Model''' shared by multiple '''Document Types'''. | |||
We're going to use another common review feature to demonstrate this. There may be some data that is not only required to be present, but also extremely important for Grooper to extract accurately. Your Grooper designer may designate this as a field that requires validation. So, even if it's accurately extracted, the field will stay in an error state until the user clears it. | |||
{|cellpadding=10 cellspacing=5 | |||
|valign=top style="width:40%"| | |||
For the "Ankara" '''Document Type''', we've decided the "Remit To Address" requires manual validation. We've set up an override so that ''just'' this '''Document Type''' requires validation for this field. For the rest of them, we'll just take what Grooper gives us. | |||
# Fields requiring validation will ''always'' be in an error state until the field is reviewed. | |||
#* Grooper will give you an error message saying "This field must be reviewed" | |||
# In our case, Grooper ''did'' extract this address accurately. What's on the document is what's in the extracted field. | |||
So, how do we proceed? We have to get rid of the error or Grooper will consider this an "invalid" document. | |||
|valign=top| | |||
[[File:Web-review-data-view-08.png]] | |||
|- | |||
|valign=top| | |||
<br> | |||
To clear the error, you must "confirm" the field is valid. | |||
# Right click in the field's text box. | |||
# Select ''Confirm'' | |||
#* Or, you can use the keyboard shortcut <code>F6</code> | |||
|valign=top| | |||
[[File:Web-review-data-view-09.png]] | |||
|- | |||
|valign=top| | |||
<br> | |||
#<li value=3> This will confirm the value is correct, and the textbox's color will change to green. | |||
{|cellpadding="10" cellspacing="5" | |||
|-style="background-color:#36b0a7; color:white" | |||
|style="font-size:14pt"|'''FYI'''||You may have noticed there are still data errors present on this document. The total number of errors dropped from "4" to "3". | |||
The remaining three errors pertain to the extracted table data. We will circle back to these issues in the next section when we discuss reviewing table extraction in Grooper. | |||
|} | |||
|valign=top| | |||
[[File:Web-review-data-view-10.png]] | |||
|} | |||
==== Rubberband OCR ==== | |||
==== "Valid" Doesn't Mean Accurate ==== | |||
</tab> | </tab> | ||
Revision as of 12:21, 17 March 2022
| WIP | This article is a work-in-progress. It was written using a beta version of 2022. This article is subject to change and/or expansion as it is updated to the release version of 2022.
This tag will be removed upon draft completion. |
The Grooper Web Client allows users to connect to a Grooper dashboard over the internet via a web server. This allows end-users to process review-based steps in a Batch Process in a web browser, without the need to install Grooper on their own machine.
About
THIS SECTION TO BE COMPLETED AT A LATER DATE
| ⚠ | The Grooper Web Client DOES NOT support Internet Explorer.
The following browsers are supported:
Other modern browsers may work but have not been fully tested, such as:
|
Installation
Setting up the Grooper Web Client is done in three simple steps:
- Install the IIS components on your server.
- Install the Grooper Web Client application.
- Open the Web Client URL in a browser and start using it.
As a side note, there are some additional requirements for users scanning paper documents into Grooper with a physical scanner. These requirements will be detailed in the #Scanning with Web Review section of this article.
1. Install IIS
The first step to setting up your server for Grooper Web Review is installing the IIS (Internet Information Services) components.
| ⚠ | It's important to do this step first. IIS must be installed and set up before you install the Grooper Web Client. |
With IIS installed, our next step is to install the Grooper Web Server.
|
2. Install Grooper Web Client
Next, we will install the Grooper Web Client application.
| ⚠ | If you have not done so already, install Grooper and add repository connections before continuing.
If you need instructions on installing Grooper, please visit the Install and Setup article. |
3. Access Web Client
|
By default, the Web Client URL will be the following:
You can now start using the Grooper Web Client. We will detail the UI navigation and how to execute Review tasks in the #User Guide section of this article. |
Security
Most likely you don't want any old user to access the Grooper Web Client. If you wish to limit the users able to access Grooper by a web browser, you'll need to update the Security settings in Grooper Design Studio. This will allow you to grant users access by adding individual users or user groups using Windows ACL.
Step 1: Add a Designer (or Designers)
|
Step 2: Add Users
Now that a Designer has been added, we can add Users. Users added to the Users list will be able to work Review steps in Batch Processes. Adding them also makes it possible to use Review Queues.
| FYI | Review Queues allow further security control in Grooper. For example, if you have several Batch Processes but want to limit a user's ability to only review one particular Batch Process, you can use a Review Queue to do that.
Please note, you must add a user to the Users list before configuring a Review Queue. We will discuss Review Queues later in this article. |
Step 3: Log on to Web Client
Now, only listed Users will have access to do review work via the Grooper Web Client.
|
|
User Guide
Welcome to the Grooper Web Client! The Grooper Web Client allows users to process documents using a web browser.
In the following sections, we will give end-users guidance on how to navigate the Web Client user interface and use it to process Batches and review their documents. We will discuss the following topics:
- #Web Client UI - How to navigate Grooper using a web browser
- #Performing Review Tasks - How to process human-attended document review activities
- #Review Applications - How to use the various review-based activities in Grooper
- #Batch Management - How to maintain document Batches in production (pausing work, updating processing instructions, and more) and access Batch statistics and the event log.
Web Client UI
The first thing you're going to want to know is how to get around the Grooper Web Client interface.
|
|
|||
|
|
The Navigation Links section is the main way you'll get around in the Web Client. It contains a variety of links for Grooper users, including:
|
Tasks - Used to access a list of review tasks ready for users.
Learn - Used to access Grooper University courses at learn.grooper.com.
Connect - Used to access our Grooper x Change web forums at xchange.grooper.com.
Wiki - Used to access our wiki site at wiki.grooper.com
|
|||
|
|
Repository Info
The Repository Info window provides some "at a glance" processing statistics and information about your Grooper Repository.
|
The data displayed in the Repository Info window is subdivided into three sections:
- Totals
- Tasks
- Nodes
|
Recent Events
The Recent Events window is Grooper's event log.
|
This panel can be useful to track down information or a sequence of events if you're troubleshooting a problem. |
|||
|
|
Context Toolbar
The Context Toolbar is a navigation bar providing various utilities in the Web Client.
|
|
Switching Grooper Repositories
Depending on the size and scope of your operation, you may be working out of multiple Grooper Repositories. If you are, you may need to switch between Grooper Repositories to access documents ready for processing in one or the other.
|
A dropdown menu will appear listing available Grooper Repositories you're connected to.
|
|
|
|
Performing Review Tasks: The Batches and Tasks Pages
Documents come into Grooper either by scanning pages or importing files into a Batch. A Batch is the fundamental container of work in Grooper. It holds your documents as they are processed through Grooper. Along with the container comes a list of processing instructions called a Batch Process.
So a Batch is really two things:
- A container of documents in various states of processing.
- These are represented as Batch Folders and Batch Pages contained in the Batch Root Folder.
- A step by step list of instructions of what to do with those documents.
- This is the Batch Process.
A Batch Process will consist of automated tasks called Unattended Activities, as well as review-based activities requiring user intervention called Attended Activities. For end-users, most of your work will be centered around document review tasks (or Attended Activities). In these activities, you will review the automated work Grooper has done previously in the Batch Process. For example, you may be reviewing the classification decisions Grooper made or reviewing Grooper's data extraction to ensure all data was captured accurately.
Different organizations will utilize human review to varying degrees. Depending on the use case, Grooper may be able to automate more work without the need for human intervention. However, as good as Grooper can be at making document processing decisions, no computer software can beat the human brain. Review tasks are well suited for situations where you need to ensure the accuracy of Grooper's results in one way or another. You play a critical role in verifying Batches are processed accurately through the steps of a Batch Process.
So, how do you get started?
There are two ways users can start processing review tasks in a Grooper Repository, either using the Batches or Tasks pages. Either is acceptable. These present two different ways of displaying available work in Grooper. We will start by reviewing the Batches page.
Batches Page
|
To get to the Batches page, click the Batches icon on the Grooper Web Client homepage.
|
|||||
|
|
|||||
|
You can sort the Batch List by the following properties:
|
|||||
|
|
|||||
|
|
|||||
|
|
|||||
|
|
|||||
|
We will discuss how to use this "Classification Viewer" and the other "Review Views" later in the #Review Views section of this article. For now, we're going to simply exit the review module.
|
Tasks Page
|
To get to the Tasks page, click the Tasks icon on the Grooper Web Client homepage.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
The individual "Review Views" will be discussed in the #Review Applications section of this article. For now, we're going to simply exit the review module.
|
What is a Document?
Before continuing, let's take some time to cement some Grooper terminology we've been using as well as some of the icons you'll be seeing through the rest of this article.
As we've mentioned previously, a Batch is the fundamental collection of work in Grooper's document processing. It is essentially two things:
- A container of documents in various states of processing.
- A step by step list of instructions of what to do with those documents, or its Batch Process.
We often use the term "document" loosely. It can be an overly generic term for the stuff in the Batch that Grooper is doing stuff to. However, from Grooper's perspective a "document" is a very specific thing represented in a specific way in a Batch. So what is a document really?
Grooper has two objects to represent items in a Batch:
- Batch Folders
- Batch Pages
So, anything in a Batch is either a folder or a page.
A "document" is just a special kind of folder. In the most basic sense, a "document" is a folder with content. That content can be child Batch Pages or a digital file (like a PDF) attached to the folder.
|
|
|||
|
Why? "Folder (1)" has content. It contains two Batch Pages, "Page 1" and "Page 2". We can expand the folder's contents using the arrow button to the left of the folder icon. "Folder (2)" has no content, making it a regular old folder.
|
|||
|
Simple enough, right? Next, let's talk about classification. A classified document is a document folder that has been assigned a Document Type from a Content Model. Grooper architects design Content Models to determine what makes one kind of document distinct from another and how to get information from them. These "different types of documents" are distinguished as Document Types created in the Content Model. By assigning a document folder a Document Type, Grooper then can use the logic defined in the Content Model to extract data from it. Proper document classification is often critical to the process downstream. So, it's paramount to make sure Grooper assigned a document the right Document Type. One of the things you may be doing in Grooper is executing a classification review module to do just that. |
|||
|
Here, "Folder (1)" has been classified. Its folder name has changed to "Federal W-4 (1)". Why? It was assigned a Document Type named "Federal W-4".
|
|||
The two main ways to get content into Grooper are scanning pages directly into a Batch or importing files (such as PDF or TIF documents) from a file system. If you are importing document files, Grooper will create a Batch Folder for every file imported, and attach that file to the folder. Things will look a little different than what we've described so far. | |||
|
Here we have three Batch Folders created for three PDF files imported into a new Batch. Absolutely no processing steps have been executed for this Batch. However, for each folder...
Are these folders documents? Yes
Are these documents classified? No
|
|||
To sum up:
- All documents are folders. Not all folders are documents.
- Documents are folders with content.
- Content can be child pages (or documents).
- Content can be files attached to the folder.
- Classified documents are documents that have been assigned a Document Type.
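The summary above can be pictured as a small data model. The following is only an illustrative sketch in Python, not Grooper's actual object model or API: the class names `BatchPage` and `BatchFolder`, their attributes, and the file name `invoice.pdf` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class BatchPage:
    """Hypothetical stand-in for a Batch Page."""
    name: str


@dataclass
class BatchFolder:
    """Hypothetical stand-in for a Batch Folder."""
    name: str
    children: List[Union["BatchFolder", BatchPage]] = field(default_factory=list)
    attachment: Optional[str] = None      # e.g. the file name of an imported PDF
    document_type: Optional[str] = None   # set once the folder is classified

    def is_document(self) -> bool:
        # A document is a folder with content: child items or an attached file.
        return bool(self.children) or self.attachment is not None

    def is_classified(self) -> bool:
        # A classified document is a document with a Document Type assigned.
        return self.is_document() and self.document_type is not None


# "Folder (1)" has two child pages, so it is a document.
folder_1 = BatchFolder("Folder (1)", children=[BatchPage("Page 1"), BatchPage("Page 2")])
# "Folder (2)" has no content, so it is a regular folder.
folder_2 = BatchFolder("Folder (2)")
# An imported file attached to a folder also counts as content.
imported = BatchFolder("Folder (3)", attachment="invoice.pdf")
# After classification, the folder carries a Document Type.
w4 = BatchFolder("Federal W-4 (1)", children=[BatchPage("Page 1")], document_type="Federal W-4")
```

In this sketch, `folder_1` and `imported` are documents but are not yet classified, `folder_2` is not a document at all, and `w4` is a classified document.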
Review Views
In this section, we will demonstrate the various document review applications in Grooper and how to use them.
When you start processing Review steps in a Batch, you're going to see one or more different "Views" into the Batch. These Review Views present the Batch in different ways, best suited for the type of work you're doing. In these Views, you will verify Grooper's work during automated steps of a Batch Process and use the review modules to manually edit a document if Grooper made a mistake.
There are currently four Review Views available in Grooper:
- Classification Viewer
- You will use this to verify how Grooper classified a document during the Classify step. You may also use this view to verify how pages were separated into document folders during the Separate step.
- Data Viewer
- You will use this to verify how Grooper extracted data from a document during the Extract step.
- Thumbnail Viewer
- You will use this to review individual page images. Most commonly, this is used to verify how pages were processed by an IP Profile (for example, during the Image Processing step) or otherwise ensure the pages are ready for OCR during the Recognize step.
- Folder Viewer
- This is a fairly generic Batch viewer. This is most often added as a secondary Review View so that the user has an option to navigate to folders using the standard folder/page hierarchy view.
Document Viewer Tips
|
Before we get into each of the individual Review Views and how to use them, let's familiarize ourselves with the Document Viewer. This will include quality of life advice, such as how to zoom in and out of a page's image. |
Zooming In and Out
|
The zoom view is indicated by the Zoom setting at the top right of the image.
|
|||||||||||||||
|
1. Double Click to Zoom
|
|||||||||||||||
|
2. Mouse Wheel to Zoom
You can also use the mouse wheel to zoom in and out of the image.
You can zoom in up to 300% of the image's size and zoom out down to 5% of its size. |
|||||||||||||||
|
3. Keyboard Shortcuts
Alternatively, you can use the following keyboard shortcuts to control the zoom view: |
|
Resizing Panels
|
You may also resize the Document Viewer panel. This can be particularly helpful when using the Data Viewer to review extracted data. For example, we can't see all the extracted table data here. There's a fourth column hidden out of view. |
|
|
|
|
|
|
Rendition Views
|
The Rendition Views are found at the top right of the Document Viewer. They give users different views of the document or page's content. Depending on the circumstance, review users may find one Rendition View most helpful to complete their Review task.
|
|
|
Attachment Rendition
If you ingested documents into a Batch by importing files (such as PDFs) from a file system, you will be able to access the Attachment Rendition. When files are imported into Grooper, a document folder is created for each file, and that file is attached to the folder.
|
|
|
Child Rendition
The Child Rendition will display a document's content, as composed of its child objects. For example, if a folder has child pages, the document is the sum total of all its pages.
|
|
|
Text Rendition
The Text Rendition will display a document's OCR or extracted native text data.
|
|
|
|
Classification Viewer
The Classification Viewer allows Grooper users to review document classification. Grooper classifies documents using logic defined in a Grooper Content Model. Document Types are added to the Content Model to distinguish one type of document from another. Grooper is able to tell one Document Type from another by using trained examples of the documents, assigning rules for classification, or some combination of the two. Most typically, a document is assigned a Document Type during the Classify step of a Batch Process (although there are other ways depending on the Batch Process and how documents are ingested to a Batch).
Starting the Review Step
|
|
|||
|
|
Reviewing Document Classification
|
|
|
|
|
|
|
Grooper's calculation of these similarity scores is based on a variety of things, such as training algorithms and extraction rules. While Grooper tries to emulate what a human does when they look at a document and decide what it is, it's purely mathematical in nature. If the score is highest, it's that Document Type from Grooper's perspective. You, as a human being, are intuitive. You can make cognitive connections a computer simply can't. So, your job is to look at the document and make sure Grooper got it right. | |
|
Your job for the document is done. You've verified its Document Type is correct.
|
|
Correcting Document Classification
|
So, we need to fix this and manually assign the Document Type. There are two ways to do this. |
|||
|
Option 1: Right Click and Assign Document Type
|
|||
|
|
|||
|
|||
|
|
|||
|
To remove a flag from the document:
|
|||
|
Option 2: Use the Document Types Panel
A quicker method of manually classifying a document may be to simply select the right Document Type from the Document Types Panel. We will use the next document in our Batch to illustrate this. Another common problem that can arise is Grooper misclassifying a document.
|
|||
|
|
|||
|
Option 2.5: Better Utilizing the Document Types Panel
You should continue checking all document folders to ensure they've been classified correctly. We have one more problem in our Batch to resolve.
|
|||
|
This is particularly useful if you have a large Content Model with dozens or hundreds of Document Types. |
Completing the Review Step
|
|
|||
|
|
|||
|
|
|||
|
|
Completion Criteria
The Classification Viewer may be configured so that certain criteria must be met in order to complete the review task. If so configured, either or both of the following conditions must be satisfied:
- All document folders must be classified.
- All flags on document folders must be removed.
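To make the two conditions concrete, here is a minimal Python sketch of the kind of check they describe. It is not Grooper's implementation. The `DocumentFolder` class, the `completion_blockers` function and its parameter names are hypothetical, and the example folder names and flag message are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DocumentFolder:
    """Hypothetical stand-in for a document folder in a Batch."""
    name: str
    document_type: Optional[str] = None  # None means unclassified
    flag: Optional[str] = None           # None means no flag is set


def completion_blockers(
    folders: List[DocumentFolder],
    require_classified: bool = True,
    require_unflagged: bool = True,
) -> List[str]:
    """Return one message per folder that would block completing the review task."""
    blockers = []
    for folder in folders:
        if require_classified and folder.document_type is None:
            blockers.append(f"{folder.name}: unclassified")
        if require_unflagged and folder.flag is not None:
            blockers.append(f"{folder.name}: flagged ({folder.flag})")
    return blockers


batch = [
    DocumentFolder("Invoice (1)", document_type="Invoice"),
    DocumentFolder("Folder (2)"),  # never classified
    DocumentFolder("Invoice (3)", document_type="Invoice", flag="Check page order"),
]

blockers = completion_blockers(batch)
no_criteria = completion_blockers(batch, require_classified=False, require_unflagged=False)
```

With both criteria enabled, `blockers` lists the unclassified folder and the flagged folder. With both disabled, `no_criteria` is empty and nothing blocks completion.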
|
If these completion criteria have been enabled, and a Batch has documents that are flagged and/or unclassified, the Classification Viewer will notify you in two ways:
|
Shortcuts
| Shortcut | Keystrokes | Description |
| Shared Folder and Page Commands | ||
| Flag Item | Ctrl + L | Places a flag on the selected folder/page. Users may select pre-generated flag messages or enter their own custom message. |
| Clear Flag | Ctrl + Shift + L | Removes a flag on the folder/page. |
| Delete | Del | This will delete the selected folder/page. CAUTION!!! There is no "undo" in Grooper. If you delete an item, it will be gone forever. |
| Rename | F2 | Renames the folder/page. Be aware, this does not classify a document folder. It only changes the folder's name. |
| Cut | Ctrl + X | Cuts a selected folder/page in the Batch. |
| Copy | Ctrl + C | Copies a selected folder/page in the Batch. |
| Paste | Ctrl + V | Pastes a copied or cut folder/page to the selected folder location in the Batch. |
| Move Down | Ctrl + Down | Moves the selected folder/page down in the Batch. |
| Move Up | Ctrl + Up | Moves the selected folder/page up in the Batch. |
| Append to Previous | Ctrl + P | For folders, this appends any of a selected folder's children (pages or folders) to the folder before it. Effectively this will delete the selected folder and move any of its pages/folders to the bottom of the previous document/folder.
For pages, this will move the selected pages to the bottom of the previous folder above. |
| Prepend to Next | Ctrl + Shift + P | For folders, this prepends any of the selected folder's children (pages or folders) to the folder after it. Effectively this will delete the selected folder and move any of its pages/folders to the bottom of the next document/folder.
For pages, this will move the selected pages to the bottom of the next folder below. |
| Merge Selected | Ctrl + M | Merges selected folders/pages into a new document. This will create a folder, prompt you to assign it a Document Type, and move the selected folders/pages into the new folder. |
| Folder Specific Commands | ||
| Assign Document Type | Ctrl + Shift + A | Opens a window to select a Document Type for the selected document. |
| Goto Flagged | Ctrl + G | Selects the next document in the Batch with a flag. If there are no subsequent documents with flags in the Batch, it will cycle back to the first document with a flag. |
| Remove Level | Ctrl + U | Deletes the folder and moves any child objects (pages or folders) to the folder's level in the Batch. For example, if there were a document folder at Level 1 in the Batch with a single page in it (at Level 2), the folder would be deleted and the page would be moved to Level 1 in the Batch. |
| Insert Folder | Ins | Adds an empty folder to the selected folder. |
| Page Specific Commands | ||
| Rotate Left | Ctrl + Left | Rotates the page 90 degrees to the left (counter-clockwise). |
| Rotate Right | Ctrl + Right | Rotates the page 90 degrees to the right (clockwise). |
| Split Folder | Ctrl + S | Splits a document into a new folder at the selected page. This applies specifically to document folders with multiple pages. Imagine you have a five-page document folder at Level 1 in the Batch. You select page 3 and apply the "Split Folder" command. This will cut pages 3 to 5 from the document folder and place them into an unclassified folder at Level 1. You'll end up with two folders created out of the original (one containing pages 1 and 2, one containing pages 3 to 5), both at the same level in the Batch hierarchy (Level 1). |
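Two of the commands in the table are easier to picture as list operations. The sketch below models "Split Folder" and "Goto Flagged" on plain Python lists. The function names are hypothetical, and refusing to split at the first page is an assumption of this sketch, not documented Grooper behavior.

```python
def split_folder(pages, selected_index):
    """Split one folder's page list in two at the selected page.

    The selected page and everything after it go to the new folder.
    """
    if not 0 < selected_index < len(pages):
        # Assumption: a split needs at least one page left in the original.
        raise ValueError("select a page after the first one to split")
    return pages[:selected_index], pages[selected_index:]


def goto_flagged(flags, current_index):
    """Return the index of the next flagged document, wrapping around.

    `flags` holds one entry per document: a message, or None for no flag.
    Returns None when no document in the Batch is flagged.
    """
    count = len(flags)
    for step in range(1, count + 1):
        candidate = (current_index + step) % count
        if flags[candidate] is not None:
            return candidate
    return None


# The five-page example from the table: select page 3 and split.
original = ["page 1", "page 2", "page 3", "page 4", "page 5"]
first, second = split_folder(original, original.index("page 3"))
print(first)
print(second)
```

Here `first` holds pages 1 and 2 and `second` holds pages 3 to 5, matching the example in the "Split Folder" row. The modulo in `goto_flagged` is what produces the "cycle back to the first flagged document" behavior.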
Data Viewer
The Data Viewer is used to review the data Grooper collects from each document during the Extract step of a Batch Process.
The Extract activity applies the logic set up in a Content Model to find and return data from a document. This extraction logic is defined by configuring Data Models. Data Elements are added to the Data Model for each piece of information you want to collect.
There are three types of Data Elements. Data can be collected as either Data Fields, Data Tables or Data Sections (or "fields", "tables" and "sections" for short).
- Fields are for what's called "single instance" data.
- Think of a social security number on a W-2 form. There will be one single social security number filled in for the whole document. There is a single instance of this information (hence the term "single instance"), collected as a single value for the field.
- Tables are necessary to collect information listed in a table formed by rows and columns on a document.
- Sections can be tools to group data into a category, sub-divide a document into smaller units, or establish "multi-instance" sections (more on what this means later).
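If it helps to think of the three Data Element types as data structures, here is a rough sketch. The class names mirror Grooper's terms, but the attributes, the element names and the sample values are made up for illustration and do not reflect how Grooper stores data internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataField:
    """Single-instance data: one value for the whole document."""
    name: str
    value: str = ""


@dataclass
class DataTable:
    """Row-and-column data: each row maps a column name to a cell value."""
    name: str
    columns: List[str]
    rows: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class DataSection:
    """A named group of fields and tables."""
    name: str
    fields: List[DataField] = field(default_factory=list)
    tables: List[DataTable] = field(default_factory=list)


# Placeholder values only; the W-2 idea comes from the example above.
employee = DataSection(
    name="Employee",
    fields=[DataField("Social Security Number", "000-00-0000")],
)
wages = DataTable("State Wages", columns=["State", "Wages"])
wages.rows.append({"State": "OK", "Wages": "1000.00"})
employee.tables.append(wages)
print(employee)
```

A field carries exactly one value, a table carries any number of rows, and a section is just a container for other elements.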
As a reviewer, it's your job to check Grooper's results for each of these Data Elements after the Extract activity collects them. This is precisely what the Data Viewer is for. There are a lot of things that can go wrong in the wide world of document processing. Optical Character Recognition (OCR) can convert a document's image to digital text. However, it's not perfect. Rarely will your OCR results be 100% accurate. If the document's underlying text data is imperfect, so may be your data extraction. There might be other problems with the extraction logic's ability to find and return data. This is especially the case for document sets with a lot of variety. If a document has a data structure that has not been properly modeled in the Data Model's design, there's a good chance Grooper will fail to return the data at all or only return partial data. Regardless of why the error occurred, you, the reviewer, are the last line of defense to ensure accurate data is captured for each document.
Starting the Review Step
Reviewing Data Fields
We will start our journey into data review by looking at how to review fields. We will use the same set of invoice documents we reviewed for classification previously. This is a fairly common part of your workflow. First, you review Grooper's work to make sure the documents are classified correctly. Once Grooper knows what kind of document it's working with, it knows what data it's looking for and how to find it. Now that Grooper has extracted the data, we can use the Data Viewer to verify it collected all the data required and collected it accurately.
| FYI | It should be noted that document Data Models are highly configurable. Obviously, unless you're processing invoices, the specific data elements you will be reviewing in your environment will be different. You may have hundreds of data points to review on a single document. You may have just a few. That all depends on the business requirements for your document set and what your organization deems appropriate to extract from them.
However, the basics remain the same across all use cases. Grooper will extract information from the document, populate that data into fields and tables, and you'll review the results based on what you, a human, can see on the document. |
Required Fields
Commonly, an organization will deem certain data critical for document processing. Certain fields must therefore be extracted in order for the work to be considered complete. In Grooper, we satisfy this requirement by making a field "required". This will place the field (or table cell) in an error state if no value was extracted at all. In the Data Viewer, Grooper will alert you that the required value is missing, and will require you to manually enter it before review is completed.
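The "required" rule boils down to a check for empty values. Here is a minimal sketch, assuming extracted data arrives as a plain dictionary. The function name and the field names in the example are hypothetical and are not taken from the Data Model used in this walkthrough.

```python
def missing_required(extracted, required):
    """Return the required field names that have no extracted value.

    `extracted` maps field names to extracted text (or None).
    A value made up only of whitespace counts as missing.
    """
    return [
        name for name in required
        if not (extracted.get(name) or "").strip()
    ]


# Hypothetical extraction result: two fields came back empty or absent.
extracted = {"Invoice Number": "12345", "Invoice Date": "  ", "Total": None}
required = ["Invoice Number", "Invoice Date", "Total"]
print(missing_required(extracted, required))
```

Every name this returns corresponds to a field that would sit in an error state until a reviewer types a value in.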
In the case of this document's Data Model three fields are required:
Data Model Differences
Before looking at more problems, please be aware Data Models can be (and often are) different for individual Document Types. For the most part, we're working with a "flat" Content Model. All the Document Types share the same Data Model, meaning we're looking for the same data elements for each one. However, in your environment, each Document Type may represent more diverse kinds of documents and require their own individual Data Models with their own specific fields and tables. Or, your Document Types may all share some data elements, but have some additional fields unique to the individual Document Type.
Data Element Overrides (and Required Validation)
Another way Data Models can differ from Document Type to Document Type is through "Data Element Overrides" (sometimes just called "overrides" for short). This allows Grooper designers to change how fields, tables and sections behave for a specific Document Type while still maintaining a parent Data Model shared by multiple Document Types.
We're going to use another common review feature to demonstrate this. There may be some data that is not only required to be present, but is also extremely important for Grooper to extract accurately. Your Grooper designer may designate this as a field that requires validation. So, even if it's accurately extracted, the field will stay in an error state until the user clears it.
For the "Ankara" Document Type, we've decided the "Remit To Address" requires manual validation. We've set up an override so that just this Document Type requires validation for this field. For the rest of them, we'll just take what Grooper gives us.
So, how do we proceed? We have to get rid of the error or Grooper will consider this an "invalid" document.
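To make the idea of an override concrete, here is a small sketch of how a per-Document-Type setting could be layered over a shared Data Model. The dictionaries, setting names and functions are all hypothetical, as is the "Other Vendor" Document Type. Only "Ankara" and "Remit To Address" come from the example above. Grooper's real configuration is done in its design tools, not in code like this.

```python
# Settings shared by every Document Type (the parent Data Model).
BASE_MODEL = {
    "Remit To Address": {"required": False, "requires_validation": False},
}

# Per-Document-Type overrides: only the settings that differ are listed.
OVERRIDES = {
    "Ankara": {"Remit To Address": {"requires_validation": True}},
}


def effective_settings(document_type, field_name):
    """Merge the shared settings with any override for this Document Type."""
    settings = dict(BASE_MODEL[field_name])  # copy so the base stays intact
    settings.update(OVERRIDES.get(document_type, {}).get(field_name, {}))
    return settings


def field_in_error(settings, value, user_confirmed):
    """A field is in error if it is required but empty, or awaits validation."""
    if settings["required"] and not value.strip():
        return True
    return settings["requires_validation"] and not user_confirmed


ankara = effective_settings("Ankara", "Remit To Address")
other = effective_settings("Other Vendor", "Remit To Address")
print(field_in_error(ankara, "123 Main St", user_confirmed=False))
print(field_in_error(other, "123 Main St", user_confirmed=False))
```

With these made-up settings, the "Ankara" field stays in error until the reviewer confirms it, even though a value was extracted. The same field on any other Document Type is accepted as-is.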
Rubberband OCR
"Valid" Doesn't Mean Accurate
Reviewing Data Tables
Reviewing Data Sections
Shortcuts
Advanced Techniques: Validation and Calculation Expressions
Advanced Techniques: Database Lookups
Advanced Techniques: Rubberband Zone
- Redaction use case and/or elevation use case example
Thumbnail View
Shortcuts
Folder View
NOTES TO SELF
This is probably as good a time as any to talk about switching back and forth between views, if so enabled.
Shortcuts
Batch Management
Pausing and Resuming Batch Processing
Updating Batch Processes and Resetting Steps
Viewing Batch Statistics
Accessing the Batch Event Log
Designer Guide
Setting Up Review Views
Best practice to include a Content Scope (even if it seems redundant)
Data Model Styling for Data View
Review Queues
Review Queues allow further control of what Grooper Users have access to. Imagine a situation where you have several Grooper Batch Processes running in your Grooper environment. One or more of these processes may require elevated access for one reason or another. For example, you may have a Batch Process designed to process human resources files. These files would have personally identifiable information (PII) and should only be reviewed by users trained in PII compliance.
If you want to restrict users' ability to perform review tasks, you will need to do the following:
- Add the users to the Users list at the root node of the Grooper Repository.
- Create a new Review Queue.
- Select which Grooper Users you wish to add to the Review Queue.
- On the Review step of a Batch Process select the Review Queue.
- Then, only Grooper Users listed in the Review Queue will be able to perform that Review task in that Batch Process.
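The effect of the steps above can be sketched as a membership check. This is only an illustration: the `can_review` function, the user names and the queue are hypothetical. Treating a Review step with no Review Queue as open to every user in the repository is an assumption of this sketch.

```python
def can_review(user, repository_users, review_queue=None):
    """Decide whether a user may perform a Review task.

    `repository_users` is the Users list at the root of the repository.
    `review_queue` is the set of users assigned to the step's Review Queue,
    or None when the step has no Review Queue (assumed unrestricted here).
    """
    if user not in repository_users:
        return False
    if review_queue is None:
        return True
    return user in review_queue


# Made-up users: only "alice" is in the queue for the HR Batch Process.
repository_users = {"alice", "bob"}
hr_queue = {"alice"}
print(can_review("alice", repository_users, hr_queue))
print(can_review("bob", repository_users, hr_queue))
```

In this example "bob" is a valid Grooper User but is kept out of the HR review task because he is not listed in the queue.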
DETAILED EXAMPLE COMING SOON