2023:Scope (Property): Difference between revisions

From Grooper Wiki



Revision as of 11:34, 20 March 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


The Scope property of an Activity determines at which level in a Batch hierarchy the Activity runs.

Activities can run at different levels in a Batch.

They can run on the following:

  • Batch
  • Batch Folders within the batch (including document folders and generic folders)
  • Individual Batch Pages


About

An important thing to understand about Scope is that, for nearly every activity, it tells Grooper what you want to apply the activity to. For example, with Recognize, do you want to affect Pages or Folders? (Keep in mind, however, that in nearly every scenario it is considered best practice to scope Recognize to Page.) Or, with Extract, do you want to extract data from folders at level 1, level 2, and so on?
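The idea of targeting folders at a given level can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's API: the `Folder` class and the `folders_at_level` function are hypothetical names, and the Batch is represented simply as the root of a folder tree.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch or Batch Folder (not a Grooper type)."""
    name: str
    folders: list["Folder"] = field(default_factory=list)


def folders_at_level(root: Folder, level: int) -> list[Folder]:
    """Return the folders that sit `level` steps below the root (the Batch)."""
    current = [root]
    for _ in range(level):
        # Step one level down: gather the children of everything found so far.
        current = [child for folder in current for child in folder.folders]
    return current


batch = Folder("Batch", [
    Folder("Doc A", [Folder("A.1"), Folder("A.2")]),
    Folder("Doc B", [Folder("B.1")]),
])

level_1_names = [f.name for f in folders_at_level(batch, 1)]
level_2_names = [f.name for f in folders_at_level(batch, 2)]
print(level_1_names)  # ['Doc A', 'Doc B']
print(level_2_names)  # ['A.1', 'A.2', 'B.1']
```

In this model, an activity scoped to Folder with Folder Level 1 would act on "Doc A" and "Doc B", while Folder Level 2 would act on their child folders.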

Scoping Review

The exceptions to this are Separate and Review. For these activities, Scope is not what you are affecting, but where you are applying the activity.

You might think "I want to separate the pages of this batch into individual folders" and assume the Scope would be Page. This would be an incorrect assumption. With Separate, you don't scope to pages; you scope to a specific Folder Level, and the contents of that Folder Level are separated.

Review is interesting because you need to consider how the work is being done. Again, the Scope in this case does not point at what you are reviewing, but at where the content you are reviewing lives. Let's say you have a Batch with 5 folders at level 1. If you set the Scope for Review to Folder and the Folder Level to 1, you will create 5 Review jobs, one for each document at Folder Level 1. If, however, you set the Scope to Batch, it will create one Review job containing 5 Tasks to confirm.

Page Scope

The Scope of Page is unique in that all Batch Page objects are considered a single scope. You may have a Batch with numerous Folder Levels, and each level may have child Batch Pages. However, if an activity's Scope is Page, it doesn't matter at what level in the Batch folder hierarchy a Batch Page exists. All pages will be targeted.
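As a rough illustration of this behavior, the sketch below gathers every page in a folder tree regardless of depth. The `Node` class, the `all_pages` function, and the page names are hypothetical and only model the idea; they are not part of Grooper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Batch or Batch Folder holding pages."""
    name: str
    pages: list[str] = field(default_factory=list)
    folders: list["Node"] = field(default_factory=list)


def all_pages(node: Node) -> list[str]:
    """Collect pages from this node and every descendant, at any depth."""
    found = list(node.pages)
    for child in node.folders:
        found.extend(all_pages(child))
    return found


batch = Node("Batch", pages=["loose-1"], folders=[
    Node("Doc A", pages=["a-1", "a-2"], folders=[Node("A.1", pages=["a1-1"])]),
    Node("Doc B", pages=["b-1"]),
])

page_targets = all_pages(batch)
print(page_targets)  # ['loose-1', 'a-1', 'a-2', 'a1-1', 'b-1']
```

Pages directly under the Batch, under a level 1 folder, and under a level 2 folder all end up in the same list, which mirrors how Page scope ignores folder depth.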

Scoping Recognize

Another consideration with Scope is processor efficiency. As mentioned earlier, the best practice for the Recognize activity is to set its Scope to Page.

Consider a Batch with just one document at Folder Level 1, and suppose this document has 1,000 pages. If the Scope were set to Folder and the Folder Level to 1, a Job would be created with one Task. As a result, only a single CPU thread would pick up that single Task, and it would take a very long time to recognize the text on that document's 1,000 pages.

However, if (following best practice) you set the Recognize Scope to Page, a Job will be created with 1,000 Tasks (one Task per page object). Therefore, depending on how you've structured your Activity Processing Services, you could have a wide array of CPU threads tackling the Tasks independently. This would greatly decrease the time required to recognize the text on the document.
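The arithmetic behind this can be sketched as follows. The task counts (1 versus 1,000) come from the example above; the thread count of 8 and the 2 seconds per page are invented numbers for illustration only, and the function names are hypothetical. The estimate also ignores scheduling overhead and assumes every page takes the same time.

```python
import math


def task_count(scope: str, folders_at_level: int, pages: int) -> int:
    """One Task per targeted object: per page, or per folder at the level."""
    if scope == "Page":
        return pages
    if scope == "Folder":
        return folders_at_level
    raise ValueError(f"unsupported scope: {scope}")


def estimated_seconds(tasks: int, pages: int, threads: int,
                      seconds_per_page: float) -> float:
    """Idealized wall-clock time: each thread works one Task at a time."""
    pages_per_task = pages / tasks
    rounds = math.ceil(tasks / threads)  # how many Tasks the busiest thread handles
    return rounds * pages_per_task * seconds_per_page


PAGES = 1000            # from the example: one document with 1,000 pages
THREADS = 8             # assumed number of available CPU threads
SECONDS_PER_PAGE = 2.0  # assumed recognition time per page

folder_tasks = task_count("Folder", folders_at_level=1, pages=PAGES)
page_tasks = task_count("Page", folders_at_level=1, pages=PAGES)

folder_time = estimated_seconds(folder_tasks, PAGES, THREADS, SECONDS_PER_PAGE)
page_time = estimated_seconds(page_tasks, PAGES, THREADS, SECONDS_PER_PAGE)

print(folder_tasks, folder_time)  # 1 2000.0
print(page_tasks, page_time)      # 1000 250.0
```

Under these assumed numbers, Folder scope leaves seven of the eight threads idle, while Page scope keeps all of them busy.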

Example: Separate > Undo Separation

One example of how scope is used in Grooper is seen below. In this example the Separate activity using the Undo Separation provider was run on a Batch containing multiple folder levels. The activity was run at three scope levels:

  1. Scope: Batch
  2. Scope: Folder
    • Folder Level: 1
  3. Scope: Folder
    • Folder Level: 2
The original Batch with three Batch Folder levels.
Undo Separation ran at the Batch scope.
  • All folders are removed.
Undo Separation ran at Folder > Level 1 scope.
  • All folders below the first level are removed.
Undo Separation ran at Folder > Level 2 scope.
  • All folders below the second level are removed.
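
The three outcomes above can be modeled with a small tree-flattening sketch. This is not Grooper's implementation: `Node`, `undo_separation`, and the other names are hypothetical, `folder_level=0` is used here to stand in for Batch scope, and the sketch assumes that pages from removed folders are kept and moved up into the scoped container (the text above only states that the folders are removed).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Batch or Batch Folder."""
    name: str
    pages: list[str] = field(default_factory=list)
    folders: list["Node"] = field(default_factory=list)


def collect_pages(node: Node) -> list[str]:
    """Gather pages from this node and all of its descendants."""
    pages = list(node.pages)
    for child in node.folders:
        pages.extend(collect_pages(child))
    return pages


def nodes_at_level(root: Node, level: int) -> list[Node]:
    """Level 0 is the Batch itself; level N is N folders deep."""
    current = [root]
    for _ in range(level):
        current = [child for node in current for child in node.folders]
    return current


def undo_separation(batch: Node, folder_level: int = 0) -> None:
    """Remove every folder below the scoped level (assumed: pages move up)."""
    for target in nodes_at_level(batch, folder_level):
        target.pages = collect_pages(target)
        target.folders = []


def depth(node: Node) -> int:
    """Number of folder levels remaining beneath this node."""
    return 0 if not node.folders else 1 + max(depth(c) for c in node.folders)


def make_batch() -> Node:
    # A Batch with three Batch Folder levels, as in the example.
    return Node("Batch", folders=[
        Node("L1", folders=[
            Node("L2", folders=[Node("L3", pages=["p1", "p2"])]),
        ]),
    ])


depths = {}
for level in (0, 1, 2):
    batch = make_batch()
    undo_separation(batch, folder_level=level)
    depths[level] = depth(batch)

print(depths)  # {0: 0, 1: 1, 2: 2}
```

Running at the Batch leaves no folder levels, running at level 1 leaves one, and running at level 2 leaves two, which matches the three results listed above.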