Five Phases of Grooper (Concept)
<blockquote>{{#lst:Glossary|Five Phases of Grooper}}</blockquote>
==About==
'''Grooper''' is a transient system: documents are meant to move through it. The work done in '''Grooper''' builds the ''scaffolding'' that allows the movement of documents ''through'' '''Grooper'''. The goal of moving documents through '''Grooper''' is to produce data that is impactful to your business and store it in a backend system of your choice. In working with '''Grooper''', it is best to think about the flow of documents through the system as five linear phases.
== Five Phases of Grooper ==
[[File:five_phases_01a.png]]
=== What are the Five Phases of Grooper? ===
The '''Five Phases of Grooper''' are a framework for conceptualizing the path documents take through Grooper, which helps a designer more easily understand how to model with the tools available.
[[File:five_phases_02b.png]]
=== Five Process Phases ===
# '''Acquire''' the documents by bringing them into a '''[[Batch]]''' in '''Grooper'''.
#* This is the first phase in the life of a document in '''Grooper'''. It is essentially the act of getting the document(s) into '''Grooper''' by means such as scanning or some type of electronic import.
# '''Condition'''
#* After documents come into '''Grooper''', they next need to be made ready for processing, or conditioned. This consists of '''[[Image Processing]]''' and '''[[Recognize|text recognition]]'''.
# '''Organize'''
#* With ''cleaned'' up documents and access to the documents' text, it's critical to next identify and organize them, or, in '''Grooper''' terms, '''[[Classify (Activity)|Classify]]''' them. '''Grooper''' doesn't know what to do with a document unless it knows what type of document it is, and that's the point of the activities performed in this phase. As an example, think about a person given a stack of documents who is tasked with putting those documents in a filing cabinet. In order for the documents to go into the right area of the filing cabinet, the person would have to know the differences between the documents and be able to identify them.
# '''Collect'''
#* Now knowing what document is what, it is time to '''[[Extract (Activity)|Extract]]''' the desired information from the documents. From a construction perspective, the ''scaffolding'' required to accomplish the activities performed in this phase takes the most time and effort.
# '''Deliver'''
#* Finally, it's time for the collected data to make its way out of '''Grooper''' to whatever destination system is defined by your business via '''[[Export (Activity)|Export]]'''. This is, in a way, an inverse of '''Acquire''', and the systems documents come in through often mirror the systems they go out to.
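The linear ordering of the phases above can be sketched in a few lines of Python. This is only an illustration of the concept, not part of Grooper: the names <code>Phase</code>, <code>next_phase</code>, and <code>is_linear</code> are hypothetical and do not correspond to any Grooper API.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five phases, numbered in the order documents pass through them.

    Hypothetical model for illustration only; not a Grooper type.
    """
    ACQUIRE = 1
    CONDITION = 2
    ORGANIZE = 3
    COLLECT = 4
    DELIVER = 5


def next_phase(current):
    """Return the phase that follows `current`, or None after DELIVER."""
    if current is Phase.DELIVER:
        return None
    return Phase(current + 1)


def is_linear(phases):
    """Check that a sequence of phases never moves backwards."""
    return all(a <= b for a, b in zip(phases, phases[1:]))
```

Modeling the phases as an ordered enumeration makes the "five linear phases" idea explicit: each phase has exactly one successor, and a document's path can be checked for out-of-order steps.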
[[File:five_phases_03.png]]
=== Five Phases Activity Examples ===
Below are all the activities you can perform in '''Grooper''', organized by the "phase" they ''belong'' to.
==== Acquire ====
* Scan (via the [[Review]] activity and the [[Scan Viewer]])
* Digital files are acquired via an [[Import Provider]], not an Activity step in a Batch Process.
==== Condition ====
Text recognition
* [[Recognize]]
Image file conditioning
* [[Image Processing]]
PDF and TIF conditioning
* [[Split Pages]]
Microfiche conditioning
* [[Clip Frames]]
* [[Detect Frames]]
* [[Initialize Card]]
Book photo conditioning
* [[Burst Book]]
File format normalization
* [[Render]]
Language conditioning
* [[Detect Language]]
* [[Translate]]
==== Organize ====
* [[Separate]]
* [[Classify]]
* [[Remove Level]]
* [[Deduplicate]]
Text file organization
* [[Split Text]]
==== Collect ====
* [[Extract]]
The following typically happen after Extract runs. They relate to normalizing data post-collection.
* [[Apply Rules]]
* [[Convert Data]]
==== Deliver ====
* [[Export]]
* [[Dispose Batch]]
These activities build files out of processed content before Export.
* [[Merge]]
* [[Text Transform]]
* [[XML Transform]]
==== Misc/Multiple Phases ====
These activities either don't fit well into one of the five phase categories or could potentially be in more than one phase, depending on how they are used.
* [[Review]]
* [[Execute]]
* [[Correct]]
* [[Redact]]
* [[Spawn Batch]]
* [[Send Mail]]
* [[Train Lexicon]]
* [[Launch Process]]
[[File:five_phases_04.png]]
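The grouping of activities by phase can also be written down as a simple lookup table. The sketch below copies the activity names from the lists in this section; the names <code>ACTIVITIES_BY_PHASE</code>, <code>MULTI_PHASE</code>, and <code>phase_of</code> are hypothetical and are not a Grooper API.

```python
# Hypothetical lookup table built from the lists in this section.
# It is a reading aid only and does not reflect any Grooper API.
ACTIVITIES_BY_PHASE = {
    # Acquisition happens by scanning in Review or through an Import
    # Provider, so no dedicated activity is listed for this phase.
    "Acquire": [],
    "Condition": [
        "Recognize", "Image Processing", "Split Pages", "Clip Frames",
        "Detect Frames", "Initialize Card", "Burst Book", "Render",
        "Detect Language", "Translate",
    ],
    "Organize": [
        "Separate", "Classify", "Remove Level", "Deduplicate", "Split Text",
    ],
    "Collect": ["Extract", "Apply Rules", "Convert Data"],
    "Deliver": [
        "Export", "Dispose Batch", "Merge", "Text Transform", "XML Transform",
    ],
}

# Activities that do not fit a single phase.
MULTI_PHASE = [
    "Review", "Execute", "Correct", "Redact", "Spawn Batch",
    "Send Mail", "Train Lexicon", "Launch Process",
]


def phase_of(activity):
    """Return the phase an activity is listed under in this section."""
    for phase, activities in ACTIVITIES_BY_PHASE.items():
        if activity in activities:
            return phase
    if activity in MULTI_PHASE:
        return "Misc/Multiple Phases"
    raise KeyError(activity)
```

A table like this can be handy when reviewing a Batch Process: looking up each step's phase shows at a glance whether the steps follow the five-phase order.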
Latest revision as of 10:11, 2 March 2025
The "Five Phases of Grooper" is a conceptual term that seeks to build understanding of how documents are processed through Grooper.