2023:Scope (Property)

Latest revision as of 10:15, 2 March 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

The Scope property of a Batch Process Step, as it relates to an Activity, determines at which level in a Batch hierarchy the Activity runs.

Activities can be scoped to different levels in a Batch:

  • Batch - One task will be created for the entire batch.
  • Folder - One task will be created for each folder at a specific level within the batch.
  • Page - One task will be created for each page.
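
As a rough illustration of the three options above, the following Python sketch counts how many tasks each scope would produce for a small Batch. The `Folder` class and the `task_count` function are hypothetical names invented for this example and are not part of Grooper; the Batch contents are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch or Batch Folder (not a Grooper type)."""
    name: str
    pages: int = 0  # pages held directly by this container
    folders: list["Folder"] = field(default_factory=list)


def folders_at_level(node: Folder, level: int) -> list[Folder]:
    """Return the folders `level` steps below `node` (level 0 is `node` itself)."""
    if level == 0:
        return [node]
    found = []
    for child in node.folders:
        found.extend(folders_at_level(child, level - 1))
    return found


def total_pages(node: Folder) -> int:
    """Count every page under `node`, regardless of folder level."""
    return node.pages + sum(total_pages(child) for child in node.folders)


def task_count(batch: Folder, scope: str, folder_level: int = 1) -> int:
    """Number of tasks the article describes for each Scope setting."""
    if scope == "Batch":
        return 1  # one task for the entire batch
    if scope == "Folder":
        return len(folders_at_level(batch, folder_level))  # one per folder at that level
    if scope == "Page":
        return total_pages(batch)  # one per page
    raise ValueError(f"unknown scope: {scope}")


# Made-up Batch: two level 1 folders, one of which has two level 2 folders.
batch = Folder("Batch", folders=[
    Folder("Packet A", folders=[Folder("Doc 1", pages=3), Folder("Doc 2", pages=2)]),
    Folder("Packet B", pages=4),
])

print(task_count(batch, "Batch"))      # 1
print(task_count(batch, "Folder", 1))  # 2
print(task_count(batch, "Folder", 2))  # 2
print(task_count(batch, "Page"))       # 9
```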

About

An important thing to understand about Scope is that, for nearly every activity, it tells Grooper what you want to apply the activity to. For example, with Recognize, do you want to affect Pages or Folders? (Keep in mind that in nearly every scenario it is considered best practice to scope Recognize to Page.) Or, with Extract, do you want to extract data from folders at level 1, level 2, and so on?

Batch Scope

The Scope property is occasionally set to Batch. Scoping to Batch is done when the entire contents of a Batch are to be affected as a whole. Separate and Review are activities that commonly use this configuration.

Folder Scope

The most common setting for the Scope property is Folder. Using this setting exposes the Folder Level property, which is set to an integer such as 1, 2, or 3. Understanding Batch hierarchy as it relates to the Folder Level property is important.

Setting the Folder Level property to something other than the default of 1 is most common with the Classify, Extract, and Export activities.

Consider a Batch with a Batch Folder at "Folder Level 1" that has two child Batch Folders at "Folder Level 2". The document at "Folder Level 1" is a packet made up of its two "sub" documents. For this example, the document at "Folder Level 1" is a "Mortgage Packet" document consisting of a "Closing Disclosure" document and a "Universal Residential Loan Application" (U.R.L.A.) document.


Consider this model:


For this example, a Batch Process has two Batch Process Steps configured with the Extract activity. The Scope property on each is set to Folder. However, one has the Folder Level property set to 1, and the other has it set to 2.

The extract step set to "folder level 1" will target the "Mortgage Packet" document and will collect the following information:


The extract step set to "folder level 2" will target both the "Closing Disclosure" and the "U.R.L.A." and will collect the following information:


For this example, the Batch Process also has two Batch Process Steps configured with the Export activity. The Scope property on each is set to Folder. However, one has the Folder Level property set to 1, and the other has it set to 2.

The export step set to "folder level 1" will target the "Mortgage Packet" document and will leverage an Export Behavior on the Content Model that will export the whole "Mortgage Packet" and its contents to a content management system and apply an index field in that system using the "Borrower Name" field.

The export step set to "folder level 2" will target both the "Closing Disclosure" and the "U.R.L.A." and will leverage an Export Behavior set on each Document Type to send their respective data to tables in a database.
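
To make the Folder Level targeting in this example concrete, the sketch below models the "Mortgage Packet" Batch as nested dictionaries and lists which documents a step would target at each level. The dictionary layout and the `targets` function are assumptions made for illustration only; they do not reflect how Grooper stores or queries a Batch.

```python
# Hypothetical nested-dict model of the example Batch (not a Grooper structure).
batch = {
    "name": "Batch",
    "children": [
        {
            "name": "Mortgage Packet",
            "children": [
                {"name": "Closing Disclosure", "children": []},
                {"name": "Universal Residential Loan Application", "children": []},
            ],
        },
    ],
}


def targets(node, folder_level):
    """Names of the folders `folder_level` steps below `node`."""
    if folder_level == 0:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(targets(child, folder_level - 1))
    return names


level_1 = targets(batch, 1)  # what a step with Folder Level 1 would act on
level_2 = targets(batch, 2)  # what a step with Folder Level 2 would act on

print(level_1)  # ['Mortgage Packet']
print(level_2)  # ['Closing Disclosure', 'Universal Residential Loan Application']
```

In this model, both the Extract step and the Export step at a given Folder Level act on the same set of documents; only the work performed on them differs.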

Page Scope

The Scope of Page is unique in that all Batch Page objects are considered a single scope. You may have a Batch with numerous Folder Levels, and each level may have child Batch Pages. However, if an activity's Scope is Page, it doesn't matter at what level in the Batch foldering hierarchy a Batch Page exists. All Batch Pages will be targeted.

Scoping Separate

Scoping for Separate and Review is a bit unique as well. For these activities, Scope is not what you are affecting, but where you are applying the activity.

You might think "I want to separate the pages of this batch into individual folders", and assume the Scope would be Page. This would be an incorrect assumption. With Separate you don't scope it to pages, you Scope it to either Batch or Folder. The separation may be affecting the pages, but the activity itself is pointed at the container of the pages, not the pages themselves.

Setting the Scope to Batch is typical when pages have been physically scanned and they exist at the root of a Batch. This would separate the Batch Page objects of the Batch into Batch Folders.

Setting the Scope to Folder and the Folder Level to 1 (assuming there are documents at "Folder Level 1", and the Split Pages activity has been performed) is typical of digital documents that have been imported. This would separate the Batch Page objects of the "Folder Level 1" Batch Folders into Batch Folders that would exist at "Folder Level 2".

Scoping Review

Review is interesting because you want to consider how the work is being done. Again, the Scope in this case is not pointed at what you are reviewing, but rather at the container whose contents you are reviewing. Let's say you have a Batch with 5 folders at level 1.

Assume the following:

  • Batch Process Step
    • Activity: Review
    • Scope: Folder
    • Folder Level: 1

This will create 5 Review Jobs, one for each document at Folder Level 1. Each Job will have a single Task.

Conversely, assume the following:

  • Batch Process Step
    • Activity: Review
    • Scope: Batch

This will create one Review Job containing 5 Tasks to confirm.

This, of course, is a consideration of how end users interact with Review. Are five users each individually assigned their own Review Job with a single Task, or is one user completing a single Job with five Tasks?
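
The difference between the two configurations can be sketched as a simple grouping. The `review_jobs` function below is a hypothetical illustration that only restates the Job and Task counts described above for folders at level 1; it is not Grooper code, and the folder names are made up.

```python
def review_jobs(folder_names, scope):
    """Return a list of Jobs, where each Job is a list of Task names.

    Only models the two cases described in the article:
    Scope = Folder (Folder Level 1) and Scope = Batch.
    """
    if scope == "Folder":
        # One Job per level 1 folder, each holding a single Task.
        return [[name] for name in folder_names]
    if scope == "Batch":
        # One Job for the whole Batch, holding one Task per level 1 folder.
        return [list(folder_names)]
    raise ValueError(f"unsupported scope for this sketch: {scope}")


folders = ["Doc 1", "Doc 2", "Doc 3", "Doc 4", "Doc 5"]  # made-up names

by_folder = review_jobs(folders, "Folder")
by_batch = review_jobs(folders, "Batch")

print(len(by_folder), [len(job) for job in by_folder])  # 5 [1, 1, 1, 1, 1]
print(len(by_batch), [len(job) for job in by_batch])    # 1 [5]
```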

Scoping "Data View"

The "Data View" of the Review activity is unique. It has a property called Processing Level. This "level" is relative to the Scope property set on the Batch Process Step.

For example, assume a Batch has a document at "Folder Level 1". That document consists of two "sub-documents" that would exist at "Folder Level 2".

Assume the following:

  • Batch Process Step
    • Activity: Review
    • Scope: Folder
    • Folder Level: 1
  • "Data View" of the Review activity
    • Processing Level: Level1

This will create 2 Review Jobs, one for each document at "Folder Level 2". Each Job would have a single Task.

Conversely, assume the following:

  • Batch Process Step
    • Activity: Review
    • Scope: Batch
  • "Data View" of the Review activity
    • Processing Level: Level2

This will create the same number of Jobs and Tasks as the example above.

Finally, assume the following:

  • Batch Process Step
    • Activity: Review
    • Scope: Batch
  • "Data View" of the Review activity
    • Processing Level: Level1

This will create 1 Job for the single document at "Folder Level 1". This single Job will consist of 2 Tasks, one for each document at "Folder Level 2".

Scoping Recognize

Another consideration with Scope is processor efficiency. As mentioned earlier the best practice for the Recognize activity is to set its Scope to Page.

Consider a Batch with just one document at Folder Level 1. Consider also that this document has 1,000 pages. If the Scope were set to Folder and the Folder Level to 1, a Job would get created with one Task. As a result, only a single CPU thread would pick up that single Task, and it would take a very long time to recognize the text on that document's 1,000 pages.

However, if (following best practice) you set the Recognize Scope to Page, a Job will get created with 1,000 Tasks (one Task per page object). Therefore, depending on how you've structured your Activity Processing services, you could have a wide array of CPU threads tackling the Tasks independently. This would greatly decrease the time required to recognize the text on the document.
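
A back-of-the-envelope estimate shows why this matters. The sketch below assumes, purely for illustration, that each page takes 2 seconds to recognize and that 8 threads are available; both figures and the `estimated_seconds` function are hypothetical and do not describe actual Grooper performance or scheduling.

```python
import math


def estimated_seconds(tasks, pages_per_task, seconds_per_page, threads):
    """Rough wall-clock estimate, assuming tasks run in waves of `threads` at a time."""
    waves = math.ceil(tasks / threads)
    return waves * pages_per_task * seconds_per_page


SECONDS_PER_PAGE = 2  # assumed value
THREADS = 8           # assumed value

# Scope = Folder, Folder Level = 1: one Task covering all 1,000 pages.
folder_scope = estimated_seconds(1, 1000, SECONDS_PER_PAGE, THREADS)

# Scope = Page: 1,000 Tasks of one page each, spread across the threads.
page_scope = estimated_seconds(1000, 1, SECONDS_PER_PAGE, THREADS)

print(folder_scope)  # 2000
print(page_scope)    # 250
```

Under these assumptions the Folder-scoped step cannot use more than one thread no matter how many are available, while the Page-scoped step finishes roughly eight times sooner.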

Example: Separate > Undo Separation

One example of how scope is used in Grooper is seen below. In this example the Separate activity using the Undo Separation provider was run on a Batch containing multiple folder levels. The activity was run at three scope levels:

  1. Scope: Batch
  2. Scope: Folder
    • Folder Level: 1
  3. Scope: Folder
    • Folder Level: 2

The original Batch with three Batch Folder levels.
Undo Separation ran at the Batch scope.
  • All folders are removed.
Undo Separation ran at Folder > Level 1 scope.
  • All folders below the first level are removed.
Undo Separation ran at Folder > Level 2 scope.
  • All folders below the second level are removed.
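
The three outcomes can be sketched with a small tree model. Everything in the block below is a hypothetical illustration: the dictionary layout, the `flatten`, `undo_separation`, and `depth` functions, and the order in which pages are collected are all assumptions, not Grooper's implementation of Undo Separation.

```python
def make_batch():
    """Made-up Batch with three folder levels; pages sit in the level 3 folders."""
    return {"pages": [], "folders": [
        {"pages": [], "folders": [
            {"pages": [], "folders": [
                {"pages": ["p1", "p2"], "folders": []},
                {"pages": ["p3"], "folders": []},
            ]},
        ]},
    ]}


def flatten(container):
    """Pull every page from descendant folders into `container`, then drop the folders."""
    for child in container["folders"]:
        flatten(child)
        container["pages"].extend(child["pages"])
    container["folders"] = []


def undo_separation(batch, level=0):
    """level 0 models Batch scope; level N models Folder scope at Folder Level N."""
    if level == 0:
        flatten(batch)
    else:
        for child in batch["folders"]:
            undo_separation(child, level - 1)


def depth(node):
    """Number of folder levels remaining beneath `node`."""
    if not node["folders"]:
        return 0
    return 1 + max(depth(child) for child in node["folders"])


for level in (0, 1, 2):
    batch = make_batch()
    undo_separation(batch, level)
    print(level, depth(batch))  # prints "0 0", then "1 1", then "2 2"
```

In this model the pages are never discarded; they are simply lifted into the container the activity was pointed at.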

Activities by Scope Options

Every activity in Grooper is listed here, organized by the scoping options available to it.

Batch, Folder, or Page

  • Execute
    • Applicable scope is dependent on the command executed by this step.

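Batch and Folder

  • Deduplicate
  • Export
  • Remove Level
  • Review
  • Send Mail
  • Separate

Folder and Page

  • Burst Book
  • Correct
  • Detect Language
  • Image Processing
  • Recognize
  • Train Lexicon

Batch Only

  • Batch Transfer
  • Dispose Batch
  • Launch Process
  • Spawn Batch

Folder Only

  • Apply Rules
  • Classify
  • Clip Frames
  • Convert Data
  • Detect Frames
  • Extract
  • Initialize Card
  • Merge
  • Render
  • Split Pages
  • Split Text
  • Text Transform
  • Translate
  • XML Transform

Page Only

  • Redact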