2.90:Separation Review: Difference between revisions

From Grooper Wiki
(39 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Separation uses various techniques to group pages into classified documents.
__NOINDEX__
{{AutoVersion}}
<br>
[[File:separation_and_review_14.png|right|1000px|This is an example of the '''Separation Review''' '''Attended Client''' interface.]]


==About Separation and Separation Review==
<blockquote style="font-size:14pt">
'''Grooper''' uses various approaches and '''[https://en.wikipedia.org/wiki/Algorithm algorithms]''' to determine the classification of a page or folder. The various settings on a '''[[Content Model]]''' and '''[[Document Type]]''' add to the complexity of separating pages into documents. ESP Auto Separation ''reduces'' but does not ''eliminate'' the manual work needed to separate and classify documents. Separation Review is a new review module designed to make the manual work quick and easy.
'''Separation Review''' is an attended activity client allowing human review of document separation during the '''Separate''' activity ''before'' loose pages are placed in document folders.
</blockquote>


==Use Cases==
===Separation===
====Training Scope====
* Normal
** This is the classic version of capturing training features for a document type.
* FirstLast
** This is handy for training only the first and last page of a document type. It lowers the features training requirements, improves speed, and allows pages between the first and last page to magically be combined with the first and last page. This requires minimal effort for big returns.
* FirstOnly
** If someone has used Document Titles extraction for a Positive Extractor to make Separation easy in the past, consider this property an upgrade to that approach.
** Relying only on Document Titles breaks down when poor image quality causes the document titles’ extractor to miss the title.
** This setting lets Document Titles continue to serve as features alongside any other features on the trained page, so that if the document title is missed, the separation engine can immediately fall back on the features contained in the training for the document type.
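
The three Training Scope options can be pictured with a short Python sketch. This is an illustration only, not Grooper code: the `TrainingScope` enum and `pages_to_train` function are hypothetical names, and the sketch assumes that ''Normal'' captures features from every page of a trained sample.

```python
from enum import Enum
from typing import List


class TrainingScope(Enum):
    # Values mirror the option names listed above; the enum itself is hypothetical.
    NORMAL = "Normal"
    FIRST_LAST = "FirstLast"
    FIRST_ONLY = "FirstOnly"


def pages_to_train(page_count: int, scope: TrainingScope) -> List[int]:
    """Return the 1-based page numbers whose features would be captured."""
    if page_count < 1:
        return []
    if scope is TrainingScope.NORMAL:
        # Assumption: the classic behavior trains every page of the sample.
        return list(range(1, page_count + 1))
    if scope is TrainingScope.FIRST_ONLY:
        return [1]
    # FIRST_LAST: first and last page (a one-page sample yields a single page).
    return sorted({1, page_count})
```

For a five-page sample, `pages_to_train(5, TrainingScope.FIRST_LAST)` selects pages 1 and 5, which is why the pages in between need no training of their own.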


====Repeating Last Page====
==About==
* Contracts that contain signature pages being copied and distributed to involved parties and then signed and returned to be stored with the Contract document should use this feature.
'''Grooper''' uses various approaches to separate loose pages into document folders. Different '''[[Separation Provider]]s''' establish the separation points (or binding points) for '''Batch Folder''' creation differently.  
* Any document type that may or will have a duplicate last page.


====Secondary Page Extractor====
'''Separation Review''' was designed specifically with the '''''[[ESP Auto Separation]]''''' provider in mind.  "ESP" performs page-level classification (along with other methods) on a '''Batch''' of loose pages to both separate ''and'' classify the '''Batch'''.  '''''ESP Auto Separation''''' ''reduces'' but does not ''eliminate'' the manual work needed to separate and classify documents. '''Separation Review''' is a new review module designed to make the manual work quick and easy.
* False-positive classification frequently happens on any page that is not page 1 of the document type, or the last page.
* '''''Please Note:'''''  '''Separation Review''' will only be effective for documents separated by the '''''ESP Auto Separation''''' provider.  If you are using a different '''''Separation Provider''''', you should use a different '''Review''' control to verify Grooper separated documents correctly.
* Use on Document Types that are always two pages long. To take advantage of this, configure a Positive Extractor in the Classification section and configure the Secondary Page Extractor to identify the second page. Cheating’s allowed in Grooper.


====Examples====
==Use Cases==
Separation will vary wildly between document types. Here are some real-world configurations using the new separation options. Some Classification Features are configured in the images. It’s common for Classification and Separation to be configured simultaneously. The explanations for each image consider classification and separation for runtime operations.


<tabs style="margin:20px">
<tab name="Example Number One" style="margin:25px">
{| class="wikitable"
| style="padding: 10px" | [[File:separation_and_review_09.png]] || style="padding: 10px" |
'''Positive Extractor'''<br/>
A Rule-Based Classification will occur and fall back on Features-Based Classification.<br/><br/>
'''Pagination'''<br/>
It says this document can have any number of pages.<br/><br/>
'''Prioritize EPI'''<br/>
This respects EPI if it is present; otherwise it relies on EPI training from Features training.<br/><br/>
'''Secondary Page Extractor'''<br/>
It determines if the page isn’t the first or last page of the document. If this extractor has a result, Separation appends this page to the page above it and understands the next page will be another secondary page, last page, or the start of a new document.<br/><br/>
The Content Model has an EPI Extractor configured. ''not shown here''
|}
</tab>
<tab name="Example Number Two" style="margin:25px">
{| class="wikitable"
| style="padding: 10px" | [[File:separation_and_review_10.png]] || style="padding: 10px" |
'''Allow Training'''<br/>
No lexical features (as set on the Content Model) are considered for classification or separation.<br/><br/>
'''Positive Extractor'''<br/>
Separation will only occur when the Positive Extractor has a result. Rule-Based Classification only.<br/><br/>
'''Pagination'''<br/>
Automatically appends a specified number of pages using training.<br/><br/>
'''Repeating Last Page'''<br/>
With Structured Pagination and this property set to True, repeating last pages will be appended.<br/><br/>
Separation will separate using page-for-page trained samples only.<br/>
Copies of the last page, such as signed pages from multiple parties, may exist and will be appended.<br/>
|}
</tab>
<tab name="Example Number Three" style="margin:25px">
{| class="wikitable"
| style="padding: 10px" | [[File:separation_and_review_11.png]] || style="padding: 10px" |
'''Allow Training'''<br/>
Lexical features (as set on the Content Model) are considered for classification and separation.<br/><br/>
'''Positive Extractor'''<br/>
A Rule-Based Classification will occur and fall back on Features-Based Classification.<br/><br/>
'''Negative Extractor'''<br/>
A result from this extractor will exclude this Document Type as a classification option.<br/><br/>
'''Pagination'''<br/>
This document can have any number of pages.<br/><br/>
'''Training Scope'''<br/>
The features on page 1 will be the only features saved in training. If Rule-Based Classification fails, only the first page’s features of a trained sample are used. When separation occurs and detects page 1 of this document type, all subsequent pages will be appended until another recognized document type is identified.<br/><br/>
The Content Model has an EPI Extractor configured and creates a hybrid of Rule-Based and Feature-Based Classification. ''not shown here''<br/>
|}
</tab>
<tab name="Example Number Four" style="margin:25px">
{| class="wikitable"
| style="padding: 10px" | [[File:separation_and_review_12.png]] || style="padding: 10px" |
'''Allow Training'''<br/>
Lexical features (as set on the Content Model) are considered for classification and separation.<br/><br/>
'''Positive Extractor'''<br/>
A Rule-Based Classification will occur and fall back on Features-Based Classification.<br/><br/>
'''Negative Extractor'''<br/>
A result from this extractor will exclude this Document Type as a classification option.<br/><br/>
'''Pagination'''<br/>
This document can have any number of pages.<br/><br/>
'''Prioritize EPI'''<br/>
Respect EPI if it is present, otherwise rely on EPI training from Features training.<br/><br/>
'''Secondary Page Extractor'''<br/>
- It determines if the page isn’t the first or last page of the document. If this extractor has a result, Separation appends this page to the page above it and understands the next page will be another secondary page, last page, or the start of a new document.<br/><br/>
The Content Model has an EPI Extractor configured and creates a hybrid of Rule-Based and Feature-Based Classification. ''not shown here''<br/>
|}
</tab>
<tab name="Example Number Five" style="margin:25px">
{| class="wikitable"
| style="padding: 10px" | [[File:separation_and_review_13.png]] || style="padding: 10px" |
'''Allow Training'''<br/>
Lexical features (as set on the Content Model) are considered for classification and separation.<br/><br/>
'''Positive Extractor'''<br/>
A Rule-Based Classification will occur and fall back on Features-Based Classification.<br/><br/>
'''Pagination'''<br/>
This document can have any number of pages.<br/><br/>
'''Secondary Page Extractor'''<br/>
- It determines if the page isn’t the first or last page of the document. If this extractor has a result, Separation appends this page to the page above it and understands the next page will be another secondary page, last page, or the start of a new document.<br/><br/>
The Content Model has an EPI Extractor configured and creates a hybrid of Rule-Based and Feature-Based Classification. ''not shown here''<br/>
|}
</tab>
</tabs>
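
The Secondary Page Extractor behavior described in these examples can be pictured with a small Python sketch. It is only an illustration, not Grooper code: `group_pages` and its `secondary_hits` input are hypothetical names, and the sketch assumes that any page without a secondary-page result starts a new document, which leaves out the classification and EPI logic a real separation pass would also apply.

```python
from typing import List


def group_pages(secondary_hits: List[bool]) -> List[List[int]]:
    """Group 1-based page numbers into documents.

    secondary_hits[i] is True when the (hypothetical) secondary page
    extractor returned a result for page i + 1.
    """
    documents: List[List[int]] = []
    for number, is_secondary in enumerate(secondary_hits, start=1):
        if is_secondary and documents:
            # A secondary page is appended to the page above it.
            documents[-1].append(number)
        else:
            # Assumption: any other page begins a new document.
            documents.append([number])
    return documents
```

With hits of `[False, True, True, False, True]`, pages 2 and 3 are appended to page 1, and page 5 is appended to page 4.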
===Separation Review===
Use the '''Separation Review''' activity (in a '''Batch Process''') any time a user needs to verify that '''Separation''' and '''Classification''' are 100% accurate. The type of documents being processed (complexity, OCR result variances, etc.) can determine whether a user will need '''Separation Review'''.
Use the '''Separation Review''' activity (in a '''Batch Process''') any time a user needs to verify that '''Separation''' and '''Classification''' are 100% accurate. The type of documents being processed (complexity, OCR result variances, etc.) can determine whether a user will need '''Separation Review'''.
<br/><br/>
<br/><br/>
Line 116: Line 25:
<br/>
<br/>
Documents that may optionally need '''Separation Review''' will depend on the priority of desired accuracy. These documents are considered structured document types because the pages contain predictable locations of repeated fields, tables, and sections used in both simple and complex classification techniques.
Documents that may optionally need '''Separation Review''' will depend on the priority of desired accuracy. These documents are considered structured document types because the pages contain predictable locations of repeated fields, tables, and sections used in both simple and complex classification techniques.
*Accounts Recievable
*Accounts Receivable
*Accounts Payable
*Accounts Payable
*Tax Documents
*Tax Documents
Line 123: Line 32:


==How To==
==How To==
===Separation===
Separation occurs on a per Document Type basis. Separation logic will be unique to each Document Type’s property configuration. To configure and adjust separation for each Document Type, navigate to that Document Type and set the properties under the Separation section. There is not a “one and done” configuration for Separation because every Document Type is unique.


===Separation Review===
{|cellpadding="10" cellspacing="5"
Let's take a look at the interface of '''Separation Review'''.
|-style="background-color:#f89420; color:white"
<br/>
|style="font-size:14pt"|'''&#9888;'''
|
{|cellpadding=10 cellspacing=5
|style="width:50%"|
'''Separation Review''' will always take place ''after'' a '''Separate''' activity is performed in a '''Batch Process'''.
 
If you want to take advantage of '''Separation Review''', you don't actually want the '''Separate''' step to place the pages into folders.  Rather, you want to review the '''Separation Provider's''' logic before creating '''Batch Folders''' in the '''Batch'''.
 
You ''must'' set the '''''Bind Only''''' property to ''True'' for the configured '''''Separation Provider'''''.
 
This can be done on the '''Separate''' activity itself or on the referenced '''Separation Profile'''.
|
[[File:Separation and review bind only.png]]
|}
|}
 
===Configuring Separation Review===
'''Separation Review''' is a configuration of the '''Review''' activity of a '''Batch Process'''.<br/>
<tabs style="margin:20px">
<tab name="1-3" style="margin:25px">
{|
| style="padding: 25px" |
1. First, create a '''Batch Process'''.<br/>
2. Add a '''Batch Process Step'''.<br/>
3. Set the '''Activity Type''' property to ''Review''.<br/>
|| [[File:separation_and_review_31a.png|1000px]]
|}
</tab>
<tab name="4-6" style="margin:25px">
{|
| style="padding: 25px" |
4. Click the '''Batch Views''' property then its ellipsis button.<br/>
5. In the '''Tab Page Setup Collection Editor''' window, click the '''Add''' button.
*This will add an item to the list named '''<NO CONTROL TYPE>'''.
6. In the '''Batch View Type''' property drop-down, select ''Separation Viewer''.<br/>
|| [[File:separation_and_review_31b.png|1000px]]
|}
</tab>
<tab name="7-9" style="margin:25px">
{|
| style="padding: 25px" |
7. Click the '''Control Settings''' property then its ellipsis button.<br/>
8. In the '''Separation Viewer Settings''' window click the drop-down arrow next to the '''Separation Settings''' property to expose further properties.<br/>
9. Click the '''Scope''' property, then its corresponding drop-down arrow, and from the drop-down node tree, select a '''Content Type'''.<br/>
|| [[File:separation_and_review_31c.png|1000px]]
|}
</tab>
</tabs>
<p/>
 
===Property Details===
'''''Attachment Rules''''' – Configure how certain document types should be attached to a regular document type.<br/>
'''''Respect Original Page Numbers''''' – If a PDF or multi-page TIF file is used to create individual pages, Grooper will keep track of the page’s original number and use that value to prevent the last page of one document from being attached to the first page of the next document.<br/>
'''''Bind Only''''' – If this is set to true, all pages will remain individual pages when Separation Review completes, which defeats the purpose of this module. As of this writing, setting this property to False seems to make Separation Review easier to complete.<br/>
'''''Goto First Invalid Item''''' – Separation Review will auto-select the first item that couldn’t be assigned a document type because that item’s confidence didn’t meet or exceed the threshold for confident classification set on the Content Model.<br/>
'''''Flag Messages''''' – Optional property to predefine acceptable messages to assign to items in Separation Review.
 
===Interface Overview===
{| class="wikitable"
{| class="wikitable"
| colspan="2" style="padding: 10px" | [[File:Separation and review 14.png|center|1000px]]
| colspan="2" style="padding: 10px" | [[File:Separation and review 14.png|center|1000px]]
|-
|-
| style="padding: 10px" | [[File:Separation and review 15.png|right|300px]] || style="vertical-align:top;" | The documents are highlighted with two alternating colors to easily distinguish how separation will occur without needing to check the name of the '''Document Type'''.<br/><br/>
| style="padding: 10px" | [[File:Separation and review 15.png|right]] || style="vertical-align:top;" | The documents are highlighted with two alternating colors to easily distinguish how separation will occur without needing to check the name of the '''Document Type'''.<br/><br/>
If an item has red text, this means the item has a classification flag. In this image, hovering over Page 9 would show a tooltip with the reason the item is flagged.<br/><br/>
If an item has red text, this means the item has a classification flag. In this image, hovering over Page 9 would show a tooltip with the reason the item is flagged.<br/><br/>
This image also shows pages because the '''Separate''' activity has the '''Bind Only''' property set to ''True'', which will leave the pages as loose pages in the batch until the '''Separate''' command is chosen at the end of the review process. If the '''Bind Only''' property is set to ''False'', this image will show folders because the '''Separate''' activity proceeded to separate the pages into folders.
This image also shows pages because the '''Separate''' activity has the '''Bind Only''' property set to ''True'', which will leave the pages as loose pages in the batch until the '''Separate''' command is chosen at the end of the review process. If the '''Bind Only''' property is set to ''False'', this image will show folders because the '''Separate''' activity proceeded to separate the pages into folders.
|-
|-
| style="padding: 10px" | [[File:Separation and review 16.png|right|300px]] || style="vertical-align:top;" | This is a list view showing a list of candidates and their respective confidences based on the current selection.
| style="padding: 10px" | [[File:Separation and review 16.png|right]] || style="vertical-align:top;" | This is the '''Candidates''' list view showing a list of candidates and their respective confidences based on the current selection.
|}
|}


====Appending and Prepending to Classified Groups of Documents====
===Appending and Prepending to Classified Groups of Documents===
<tabs style="margin:20px">
<tabs style="margin:20px">
<tab name="Step1" style="margin:25px">
<tab name="Step1" style="margin:25px">
{| class="wikitable"
{| class="wikitable"
| style="padding: 10px" | Right-clicking on Page 9 allows selection of the '''Prepend to Next''' object command. Ctrl+N is the shortcut key. || Page 9 is now considered Page 1 of 4.
| style="padding: 10px" | Right-clicking on Page 9 allows selection of the '''Prepend to Next''' object command. Ctrl+N is the shortcut key. || style="padding: 10px" | Page 9 is now considered Page 1 of 4.
|-
|-
| style="padding: 10px" | [[File:Separation and review 17.png|center]] || [[File:Separation and review 18.png|center]]  
| style="padding: 10px" | [[File:Separation and review 17.png|center]] || style="padding: 10px" | [[File:Separation and review 18.png|center]]  
|}
|}
</tab>
</tab>
<tab name="Step 2" style="margin:25px">
<tab name="Step 2" style="margin:25px">
{| class="wikitable"
{| class="wikitable"
| style="padding: 10px" | Right-clicking  on Page 13 allows selection of '''Append to Previous''' object command. Ctrl+P is the shortcut key. || Page 13 is now considered Page 5 of 5.
| style="padding: 10px" | Right-clicking  on Page 13 allows selection of '''Append to Previous''' object command. Ctrl+P is the shortcut key. || style="padding: 10px" | Page 13 is now considered Page 5 of 5.
|-
|-
| style="padding: 10px" | [[File:Separation and review 19.png|center]] || [[File:Separation and review 20.png|center]]  
| style="padding: 10px" | [[File:Separation and review 19.png|center]] || style="padding: 10px" | [[File:Separation and review 20.png|center]]  
|}
|}
</tab>
</tab>
</tabs>
</tabs>
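
The effect of the two commands on page grouping can be sketched in Python. This is a simplified model, not Grooper's implementation: the `prepend_to_next` and `append_to_previous` functions and the list-of-lists representation are hypothetical, with each inner list standing for one color group of pages.

```python
from typing import List

Groups = List[List[str]]


def prepend_to_next(groups: Groups, index: int) -> Groups:
    """Merge the group at `index` onto the front of the group below it."""
    if index < 0 or index >= len(groups) - 1:
        raise IndexError("there is no following group to prepend to")
    merged = groups[index] + groups[index + 1]
    return groups[:index] + [merged] + groups[index + 2:]


def append_to_previous(groups: Groups, index: int) -> Groups:
    """Merge the group at `index` onto the end of the group above it."""
    if index <= 0 or index >= len(groups):
        raise IndexError("there is no preceding group to append to")
    merged = groups[index - 1] + groups[index]
    return groups[:index - 1] + [merged] + groups[index + 1:]


# Mirrors the two steps above: Page 9 joins the next group, then Page 13
# joins the group that now precedes it.
groups = [["Page 9"], ["Page 10", "Page 11", "Page 12"], ["Page 13"]]
after_step_1 = prepend_to_next(groups, 0)
after_step_2 = append_to_previous(after_step_1, 1)
```

After the first call, Page 9 is the first of four pages in its group; after the second, Page 13 is the fifth of five.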


====Multi-selecting Pages and Classifying Them====
===Multi-selecting Pages and Classifying Them===
<tabs style="margin:20px">
<tab name="Step1" style="margin:25px">
The keyboard should be the only control needed to work through '''Separation Review'''. When one or more pages are selected, start typing the name of the '''Document Type'''. This will filter the '''Candidates''' list view. Typing “paid” in this example reduces the candidate list to five '''Document Types''' because their names have the word “paid” in them.
<br/><br/>
[[File:separation_and_review_21a.png]]
</tab>
<tab name="Step2" style="margin:25px">
When the desired '''Document Type''' is in the list, pressing the Tab button will cause the focus to change to the '''Candidates''' list view. Use the Up and Down buttons to highlight the correct document type.
<br/><br/>
[[File:separation_and_review_22.png]]
</tab>
<tab name="Step3" style="margin:25px">
Pressing Enter when the correct '''Document Type''' is selected will classify the pages as the document type highlighted in the '''Candidates''' list view.<br/><br/>
[[File:separation_and_review_23.png]]
</tab>
</tabs>
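
The type-to-filter behavior can be approximated with a few lines of Python. This is a sketch under stated assumptions rather than the viewer's actual matching logic: `filter_candidates` is a hypothetical function, the Document Type names are invented for illustration, and a case-insensitive substring match is assumed.

```python
from typing import List


def filter_candidates(candidates: List[str], typed: str) -> List[str]:
    """Keep the candidate names that contain the typed text."""
    needle = typed.strip().lower()
    # Assumption: matching is a case-insensitive substring test.
    return [name for name in candidates if needle in name.lower()]


# Hypothetical Document Type names, not taken from a real Content Model.
candidates = ["Invoice Paid", "Invoice Unpaid", "Paid Receipt", "Purchase Order"]
matches = filter_candidates(candidates, "paid")
```

Here `matches` keeps the three names containing "paid" and drops "Purchase Order", narrowing the list before Tab and the arrow keys are used to pick a type.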


==Property Details==
===Select Span===
===Separation===
{|class="wikitable"
====Training Scope====
| style="padding: 10px" | At times you need a multi-selection that covers every page of a particular document. You can select one page with the mouse or keyboard and then Shift- or Ctrl-select the rest, but the '''Select Span''' object command (hotkey Ctrl+S) makes that unnecessary: select any page of the document and run the command. || style="padding: 10px" | After the '''Select Span''' object command is used, all the pages of the document are selected. From here, the Ctrl+UP and Ctrl+DOWN hotkeys move the entire document up or down in the list of documents.
This makes it possible to alter the separation training and logic if a Document Type has special attributes, pages, or layout that would normally confuse the ESP Separation logic.
|-
* Normal
| style="padding: 10px" | [[File:separation_and_review_24.png]] || style="padding: 10px" | [[File:separation_and_review_25.png]]
** This is the classic version of capturing training features for a document type.
|}
* FirstLast
** Only trains the first and last page of a document type. See Use Cases for more information.
* FirstOnly
** Only trains the first page of a document type. See Use Cases for more information.


====Repeating Last Page====
===Completing Separation Review===
Some Document Types require duplicate last pages. Normally, Separation would see a last page, complete the separation for a document, and leave duplicate last pages as loose pages. Enabling this property will reduce operator work in Separation Review by reducing the amount of page appending commands.
<tabs style="margin:20px">
* This property is only available when a Document Type’s Pagination is set to Structured.
<tab name="Step1" style="margin:25px">
Above the list of pages is the '''Separate''' button. It will only be enabled when all pages have a classification assigned.
<br/><br/>
-Starting '''Separation Review'''-
<br/>
[[File:separation_and_review_26.png]]
<br/><br/>
-When all pages are classified-
<br/>
[[File:separation_and_review_27.png]]
</tab>
<tab name="Step2" style="margin:25px">
When all pages are classified and the '''Separate''' button is clicked, the list will change and show all pages separated into classified documents. Notice the Folder icons and Folder names.
<br/><br/>
[[File:separation_and_review_28.png|center]]
</tab>
<tab name="Step3" style="margin:25px">
Clicking the button next to the separate button will undo the separation and revert folders back to pages.
<br/><br/>
-All pages are separated into folders.-
<br/>
[[File:separation_and_review_29.png]]
<br/><br/>
Clicking the (blue arched arrow) '''Undo''' button will undo all manual classification performed during '''Separation Review''' and revert the batch of pages to their previous state when '''Separation Review''' started. Note that deleted pages are not restored; deletion is permanent.
</tab>
</tabs>
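
The rule that '''Separate''' is only enabled once every page is classified can be modeled with a brief Python sketch. It is illustrative only: the `Page` class, the `starts_document` flag, and both functions are hypothetical and do not reflect Grooper's object model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    number: int
    document_type: Optional[str] = None  # None means not yet classified
    starts_document: bool = False        # hypothetical binding-point marker


def can_separate(pages: List[Page]) -> bool:
    """True only when there are pages and all of them are classified."""
    return bool(pages) and all(p.document_type is not None for p in pages)


def separate(pages: List[Page]) -> List[Tuple[str, List[int]]]:
    """Build (document type, page numbers) folders from classified pages."""
    if not can_separate(pages):
        raise ValueError("every page needs a classification before separating")
    folders: List[Tuple[str, List[int]]] = []
    for page in pages:
        if page.starts_document or not folders:
            folders.append((page.document_type, [page.number]))
        else:
            folders[-1][1].append(page.number)
    return folders
```

A batch with even one unclassified page fails `can_separate`, which corresponds to the disabled button shown at the start of review.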


====Secondary Page Extractor====
===Object Commands from the Context Menu===
Any result created by this extractor identifies the page is not the first or last page of a document.
{| class="wikitable"
* This property is only available when a Document Type’s Pagination is set to Unstructured.
| style="padding: 10px" | [[File:separation_and_review_30.png]] || style="padding: 10px" |
===Separation Review===
'''Append to Previous''' – Attaches selected item(s) to the preceding items above the selection.<br/>
'''Prepend to Next''' – Attaches selected item(s) to the following items below the selection.<br/>
'''Delete''' – Deletes selected item(s).<br/>
'''Select Span''' – Shortcut command to highlight all items in the same color group.<br/>
'''Separate''' – Applies separation to items currently classified.<br/>
'''Unseparate''' – Removes separation from items currently in classified folders.<br/>
'''Move Up''' – Moves the selected item(s) up in the item order.<br/>
'''Move Down''' – Moves the selected item(s) down in the item order.<br/>
'''Flag Item''' – Assign a flag to an item for other workflow needs post-Separation Review.<br/>
'''Goto Next Loose Page''' – Auto-selects the next page not confidently assigned a document type.<br/>
'''Remove PDF Version''' – Removes an existing pdf file attached to the item.<br/>
'''Image''' – Provides a selection of image adjustment commands to apply to the current page.<br/>
'''Scan Once''' – Changes the current display image format.<br/>
'''Send To''' – Provides a selection of export commands to apply to the current page.<br/>
'''Properties''' – Displays metadata of a selected page.<br/>
|}


==Version Differences==
==Version Differences==
===Separation===
Separation in Grooper 2.9 adds three new properties to Document Types.


# Training Scope
===Grooper 2.80 - Classification Review===
# Repeating Last Page
# Secondary Page Extractor


The Pagination property on a Document Type will determine if the new property will appear. See Property Details for more information.
Grooper 2.80 and previous versions relied on Classification Review with its various controls to separate loose pages, merge selected pages into documents, and correct misclassified folders or classify unclassified folders. This user interface required a moderate-to-large effort to complete classification and separation, especially when reverting one or more folders to loose pages in order to combine the same pages into separate documents and classifying each newly created document.


<tabs style="margin:20px">
===Grooper 2.9 - Separation Review===
<tab name="Structured Document Properties" style="margin:25px">
 
{| class="wikitable"
'''Separation Review''' is a new '''Attended Client''' in Grooper version 2.90.
! style="padding: 5px" | Grooper 2.8
! Grooper 2.9
|-
| style="padding: 10px" | [[File:separation_and_review_01.png]] || style="padding: 10px" | [[File:separation_and_review_02.png]]
|}
</tab>
<tab name="Unstructured Document Properties" style="margin:25px">
{| class="wikitable"
! style="padding: 5px" | Grooper 2.8
! Grooper 2.9
|-
| style="padding: 10px" | [[File:separation_and_review_03.png]] || style="padding: 10px" | [[File:separation_and_review_04.png]]
|}
</tab>
<tab name="Fixed Document Properties" style="margin:25px">
{| class="wikitable"
! style="padding: 5px" | Grooper 2.8
! Grooper 2.9
|-
| style="padding: 10px" | [[File:separation_and_review_05.png]] || style="padding: 10px" | [[File:separation_and_review_06.png]]
|}
</tab>
<tab name="Extended Document Properties" style="margin:25px">
{| class="wikitable"
! style="padding: 5px" | Grooper 2.8
! Grooper 2.9
|-
| style="padding: 10px" | [[File:separation_and_review_07.png]] || style="padding: 10px" | [[File:separation_and_review_08.png]]
|}
</tab>
</tabs>


===Separation Review===
When using the ''ESP Auto Separation'' provider to separate and classify documents, the '''Separation Review''' module should replace '''Classification Review''', which will also still be available for legacy support or other use cases. Don’t be concerned about this being a new module. Slap the name “Classification Review 2.0” on Separation Review because Separation Review really is a true upgrade and efficiently redesigned approach of '''Classification Review'''. Consider it a godsend!
====Grooper 2.8 Classification Review====
Grooper 2.8 and previous versions relied on Classification Review with its various controls to separate loose pages, merge selected pages into documents, and correct misclassified folders or classify unclassified folders. This user interface required a moderate-to-large effort to complete classification and separation, especially when reverting one or more folders to loose pages in order to combine the same pages into separate documents and classifying each newly created document.


====Grooper 2.9 Separation Review====
The design of '''Separation Review''' invites the heavy use of the keyboard and keyboard shortcuts to make quick work of any items needing corrective action. Many quality-of-life improvements now exist and are quickly realized if one has previous experience using '''Classification Review'''.
Starting with Grooper 2.9, Separation Review replaces '''[[Classification Review]]''', which will also still be available for legacy support or other use cases. Don’t be concerned about this being a new module. Slap the name “Classification Review 2.0” on Separation Review because Separation Review really is a true upgrade and efficiently redesigned approach of Classification Review. Consider it a godsend!


The design of Separation Review invites the heavy use of the keyboard and keyboard shortcuts to make quick work of any items needing corrective action. Many quality-of-life improvements now exist and are quickly realized if one has previous experience using Classification Review.
Even though this is a review client for separation, it is really designed to work in concert with ''ESP Auto Separation'' which also classifies the separated documents.  Therefore, some classification tasks take place in this module. In the [[#How To|How To]] section, items in the list are color coded. The list shows current classification of the items. This doesn’t necessarily mean they are currently separated into documents. The actual separation occurs later. During use of '''Separation Review''', using the various context menu commands will assist and ensure pages are classified correctly and ready for proper separation.


Even though this is Separation Review, some classification tasks take place in this module. In the How-To section, items in the list are color coded. The list shows current classification of the items. This doesn’t necessarily mean they are currently separated into documents. The actual separation occurs later. During use of Separation Review, using the various context menu commands will assist and ensure pages are classified correctly and ready for proper separation.
[[Category:Articles]]
[[Category:Version 2.90]]

Latest revision as of 11:07, 5 August 2025


This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.



This is an example of the Separation Review Attended Client interface.

Separation Review is an attended activity client allowing human review of document separation during the Separate activity before loose pages are placed in document folders.


About

Grooper uses various approaches to separate loose pages into document folders. Different Separation Providers establish the separation points (or binding points) for Batch Folder creation differently.

Separation Review was designed specifically with the ESP Auto Separation provider in mind. "ESP" performs page-level classification (along with other methods) on a Batch of loose pages to both separate and classify the Batch. ESP Auto Separation reduces but does not eliminate the manual work needed to separate and classify documents. Separation Review is a new review module designed to make the manual work quick and easy.

  • Please Note: Separation Review will only be effective for documents separated by the ESP Auto Separation provider. If you are using a different Separation Provider, you should use a different Review control to verify Grooper separated documents correctly.

Use Cases

Use the Separation Review activity (in a Batch Process) any time a user needs to verify that Separation and Classification are 100% accurate. The type of documents being processed (complexity, OCR result variances, etc.) can determine whether a user will need Separation Review.

Following is a (by no means exhaustive) list of industries whose documents have been shown to frequently require Separation Review. These documents are considered unstructured document types due to the complex and extremely variable language present on the document pages.

  • Oil & Gas
  • Legal
  • Healthcare


Documents that may optionally need Separation Review will depend on the priority of desired accuracy. These documents are considered structured document types because the pages contain predictable locations of repeated fields, tables, and sections used in both simple and complex classification techniques.

  • Accounts Receivable
  • Accounts Payable
  • Tax Documents
  • HR Documents
  • Healthcare

How To

Separation Review will always take place after a Separate activity is performed in a Batch Process.

If you want to take advantage of Separation Review, you don't actually want the Separate step to place the pages into folders. Rather, you want to review the Separation Provider's logic before creating Batch Folders in the Batch.

You must set the Bind Only property to True for the configured Separation Provider.

This can be done on the Separate activity itself or on the referenced Separation Profile.

Configuring Separation Review

Separation Review is a configuration of the Review activity of a Batch Process.

1. First, create a Batch Process.
2. Add a Batch Process Step.
3. Set the Activity Type property to Review.

4. Click the Batch Views property then its ellipsis button.
5. In the Tab Page Setup Collection Editor window, click the Add button.

  • This will add an item to the list named <NO CONTROL TYPE>.

6. In the Batch View Type property drop-down, select Separation Viewer.

7. Click the Control Settings property then its ellipsis button.
8. In the Separation Viewer Settings window click the drop-down arrow next to the Separation Settings property to expose further properties.
9. Click the Scope property, then its corresponding drop-down arrow, and from the drop-down node tree, select a Content Type.

Property Details

Attachment Rules – Configure how certain document types should be attached to a regular document type.
Respect Original Page Numbers – If a PDF or multi-page TIF file is used to create individual pages, Grooper will keep track of each page’s original number and use that value to prevent the last page of one document from being attached to the first page of the next document.
Bind Only – If this is set to true, all pages will remain individual pages when Separation Review completes, which defeats the purpose of this module. As of this writing, setting this property to False seems to make Separation Review easier to complete.
Goto First Invalid Item – Separation Review will auto-select the first item that couldn’t be assigned a document type because that item’s confidence didn’t meet or exceed the threshold for confident classification set on the Content Model.
Flag Messages – Optional property used to predefine the acceptable messages that can be assigned to items in Separation Review.
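
The Goto First Invalid Item behavior can be pictured as a simple threshold check. The following Python sketch is illustrative only and is not Grooper code: the function name, the list of confidences, and the 0.0 to 1.0 confidence scale are all assumptions made for the example.

```python
from typing import Optional, Sequence


def first_invalid_index(confidences: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first item whose confidence is below the threshold.

    Hypothetical helper: an item counts as "invalid" here when its confidence
    did not meet or exceed the threshold, mirroring the description above.
    """
    for index, confidence in enumerate(confidences):
        if confidence < threshold:
            return index
    return None  # every item met or exceeded the threshold


# Assumed example values on a 0.0-1.0 scale.
page_confidences = [0.92, 0.88, 0.41, 0.95, 0.30]
selected = first_invalid_index(page_confidences, threshold=0.60)
```

In this sketch the third item (index 2) would be auto-selected, because it is the first one that falls short of the assumed threshold.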

Interface Overview

The documents are highlighted with two alternating colors to easily distinguish how separation will occur without needing to check the name of the Document Type.
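
As a rough mental model of the two alternating colors, the Python sketch below assigns a color to each page based on which document it will belong to. It is not Grooper code: the per-page document ids, the function name, and the color names are hypothetical placeholders.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def alternate_colors(document_ids: Sequence[int],
                     colors: Tuple[str, str] = ("color_a", "color_b")) -> List[str]:
    """Give every page one of two colors so that adjacent documents differ.

    `document_ids` holds one (hypothetical) id per page, in batch order.
    Consecutive pages with the same id are treated as one document.
    """
    result: List[str] = []
    for group_number, (_, pages) in enumerate(groupby(document_ids)):
        color = colors[group_number % 2]  # flip color at each document boundary
        result.extend(color for _ in pages)
    return result


page_colors = alternate_colors([1, 1, 2, 2, 2, 3])
```

Pages of the first and third documents share one color and the pages of the second document get the other, so each boundary is visible without reading the Document Type names.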

If an item has red text, this means the item has a classification flag. In this image, hovering over Page 9 would show a tooltip with the reason the item is flagged.

This image also shows pages because the Separate activity has the Bind Only property set to True, which leaves the pages as loose pages in the batch until the Separate command is chosen at the end of the review process. If the Bind Only property is set to False, this image will show folders because the Separate activity proceeded to separate the pages into folders.

This is the Candidates list view showing a list of candidates and their respective confidences based on the current selection.

Appending and Prepending to Classified Groups of Documents

Right-clicking on Page 9 allows selection of the Prepend to Next object command. Ctrl+N is the shortcut key. Page 9 is now considered Page 1 of 4.
Right-clicking on Page 13 allows selection of the Append to Previous object command. Ctrl+P is the shortcut key. Page 13 is now considered Page 5 of 5.
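
The effect of the two commands on page positions can be sketched in Python. This is a conceptual model only, not Grooper's implementation: pages are represented by a list of hypothetical document ids, and the example assumes that Pages 10 through 12 already form one three-page document, which is one reading consistent with the page counts quoted above.

```python
from typing import List, Tuple


def prepend_to_next(doc_ids: List[int], index: int) -> None:
    """Attach the page at `index` to the document of the page below it."""
    if index < len(doc_ids) - 1:
        doc_ids[index] = doc_ids[index + 1]


def append_to_previous(doc_ids: List[int], index: int) -> None:
    """Attach the page at `index` to the document of the page above it."""
    if index > 0:
        doc_ids[index] = doc_ids[index - 1]


def position_in_document(doc_ids: List[int], index: int) -> Tuple[int, int]:
    """Return (page number within its document, total pages in that document)."""
    start = index
    while start > 0 and doc_ids[start - 1] == doc_ids[index]:
        start -= 1
    end = index
    while end < len(doc_ids) - 1 and doc_ids[end + 1] == doc_ids[index]:
        end += 1
    return index - start + 1, end - start + 1


# Pages 9..13 of the batch; ids 0 and 2 mark pages not yet attached to document 1.
ids = [0, 1, 1, 1, 2]

prepend_to_next(ids, 0)                       # Page 9 joins the document below it
page_9 = position_in_document(ids, 0)         # (1, 4): "Page 1 of 4"

append_to_previous(ids, 4)                    # Page 13 joins the document above it
page_13 = position_in_document(ids, 4)        # (5, 5): "Page 5 of 5"
```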

Multi-selecting Pages and Classifying Them

The keyboard should be the only control needed to work through Separation Review. When one or more pages are selected, start typing the name of the Document Type. This will filter the Candidates list view. Typing “paid” in this example reduces the candidate list to five Document Types because their names have the word “paid” in them.

When the desired Document Type is in the list, pressing the Tab key moves the focus to the Candidates list view. Use the Up and Down arrow keys to highlight the correct Document Type.

Pressing Enter classifies the selected pages as the Document Type highlighted in the Candidates list view.
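
The type-to-filter behavior described above can be approximated with a substring match. The Python sketch below is an illustration, not Grooper code: the function name and the Document Type names are made up, and case-insensitive substring matching is an assumption about how the filter works.

```python
from typing import List, Sequence


def filter_candidates(candidates: Sequence[str], typed_text: str) -> List[str]:
    """Keep only the candidate names that contain the typed text (case-insensitive)."""
    needle = typed_text.casefold()
    return [name for name in candidates if needle in name.casefold()]


# Hypothetical Document Type names, for illustration only.
document_types = ["Invoice Paid", "Paid Receipt", "Unpaid Notice", "Lease", "Division Order"]
matches = filter_candidates(document_types, "paid")
```

With these made-up names, typing "paid" narrows the list to the three names that contain that text, leaving a short list to finish with the Tab, arrow, and Enter keys.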

Select Span

At times it is necessary to multi-select all the pages belonging to a particular document. Selecting one page with the mouse or keyboard and then Shift- or Ctrl-selecting the rest is possible, but the Select Span command makes this unnecessary. Simply select any page of a document and use the Select Span object command (hotkey Ctrl+S) to select all the pages of that document. From there, the Ctrl+UP and Ctrl+DOWN hotkeys move the entire document up or down in the list of documents.
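
Select Span followed by a move can be modeled as finding a contiguous run of pages and shifting it. The Python sketch below is a conceptual illustration rather than Grooper's implementation: the item tuples, the document ids, and the choice to move the span one position per keypress are all assumptions.

```python
from typing import List, Tuple

Item = Tuple[str, int]  # (page name, hypothetical document id)


def select_span(items: List[Item], index: int) -> Tuple[int, int]:
    """Return the (start, end) indexes of the contiguous pages sharing the page's document id."""
    doc_id = items[index][1]
    start = index
    while start > 0 and items[start - 1][1] == doc_id:
        start -= 1
    end = index
    while end < len(items) - 1 and items[end + 1][1] == doc_id:
        end += 1
    return start, end


def move_span_up(items: List[Item], start: int, end: int) -> Tuple[int, int]:
    """Move the span one position up by relocating the item above it to just below it."""
    if start == 0:
        return start, end  # already at the top of the list
    above = items.pop(start - 1)
    items.insert(end, above)
    return start - 1, end - 1


batch = [("Page 1", 1), ("Page 2", 2), ("Page 3", 2), ("Page 4", 2)]
span = select_span(batch, 2)            # selecting "Page 3" selects its whole document
new_span = move_span_up(batch, *span)   # the three-page document now comes first
```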

Completing Separation Review

Above the list of pages is the Separate button. It will only be enabled when all pages have a classification assigned.
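
The rule for enabling the Separate button amounts to an "all pages classified" check. A minimal Python sketch follows, assuming an unclassified page is represented by `None`; the function name and the Document Type names are hypothetical and not taken from Grooper.

```python
from typing import Optional, Sequence


def can_separate(page_types: Sequence[Optional[str]]) -> bool:
    """Return True only when every page has a classification assigned."""
    return all(doc_type is not None for doc_type in page_types)


# Hypothetical batch states: one page still unclassified, then fully classified.
in_progress = ["Lease", None, "Lease", "Division Order"]
finished = ["Lease", "Lease", "Lease", "Division Order"]

button_enabled_before = can_separate(in_progress)
button_enabled_after = can_separate(finished)
```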

-Starting Separation Review-


-When all pages are classified-

When all pages are classified and the Separate button is clicked, the list will change and show all pages separated into classified documents. Notice the Folder icons and Folder names.

Clicking the button next to the Separate button will undo the separation and revert folders back to pages.

-All pages are separated into folders.-


Clicking the (blue arched arrow) Undo button will undo all manual classification performed during Separation Review and revert the batch of pages to the state it was in when Separation Review started. Note that deleted pages are not restored: deletion is permanent.

Object Commands from the Context Menu

Append to Previous – Attaches selected item(s) to the preceding items above the selection.
Prepend to Next – Attaches selected item(s) to the following items below the selection.
Delete – Deletes selected item(s).
Select Span – Shortcut command to highlight all items in the same color group.
Separate – Applies separation to items currently classified.
Unseparate – Removes separation from items currently in classified folders.
Move Up – Moves the selected item(s) up in the item order.
Move Down – Moves the selected item(s) down in the item order.
Flag Item – Assign a flag to an item for other workflow needs post-Separation Review.
Goto Next Loose Page – Auto-selects the next page not confidently assigned a document type.
Remove PDF Version – Removes an existing PDF file attached to the item.
Image – Provides a selection of image adjustment commands to apply to the current page.
Scan Once – Changes the current display image format.
Send To – Provides a selection of export commands to apply to the current page.
Properties – Displays metadata of a selected page.

Version Differences

Grooper 2.80 - Classification Review

Grooper 2.80 and previous versions relied on Classification Review with its various controls to separate loose pages, merge selected pages into documents, and correct misclassified folders or classify unclassified folders. This user interface required a moderate-to-large effort to complete classification and separation, especially when reverting one or more folders to loose pages in order to combine the same pages into separate documents and classifying each newly created document.

Grooper 2.90 - Separation Review

Separation Review is a new Attended Client in Grooper version 2.90.

When using the ESP Auto Separation provider to separate and classify documents, the Separation Review module should replace Classification Review, which will still be available for legacy support or other use cases. Don’t be concerned about this being a new module. Slap the name “Classification Review 2.0” on Separation Review, because it really is a true upgrade to, and an efficient redesign of, Classification Review. Consider it a godsend!

The design of Separation Review invites the heavy use of the keyboard and keyboard shortcuts to make quick work of any items needing corrective action. Many quality-of-life improvements now exist and are quickly realized if one has previous experience using Classification Review.

Even though this is a review client for separation, it is really designed to work in concert with ESP Auto Separation which also classifies the separated documents. Therefore, some classification tasks take place in this module. In the How To section, items in the list are color coded. The list shows current classification of the items. This doesn’t necessarily mean they are currently separated into documents. The actual separation occurs later. During use of Separation Review, using the various context menu commands will assist and ensure pages are classified correctly and ready for proper separation.