2.80:Layered OCR (OCR Engine)
Revision as of 17:02, 16 December 2019
Layered OCR is an OCR engine that enables you to run multiple OCR profiles in a "layered" manner.
About
This engine is designed to read documents with specialized print types (such as MICR or handwriting) and to bring accuracy as close to 100% as possible for background elements on forms.
How it works
- The "Main OCR Profile" establishes the base OCR output.
- Each subsequent layer is configured to run an additional OCR Profile and an "Extractor". These run independently of the "Main OCR Profile", so specialized OCR Profiles can be created and used to ensure the desired data is extracted properly.
- The extraction results of each subsequent layer are then merged into the final OCR output.
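The three steps above can be sketched in Python. This is only an illustration of the flow, not Grooper's API: `Layer`, `layered_ocr`, the position-keyed `OcrOutput` dictionary, and the toy profiles are all hypothetical names invented for this example. It also assumes that a layer sharing the Main OCR Profile runs its Extractor against the current merged output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; none of these names come from Grooper.
Position = Tuple[int, int]                      # (line, column) of a piece of text
OcrOutput = Dict[Position, str]                 # recognized text keyed by position
OcrProfile = Callable[[str], OcrOutput]         # page id -> OCR output
Extractor = Callable[[OcrOutput], OcrOutput]    # keeps only the matches of interest


@dataclass
class Layer:
    profile: OcrProfile
    extractor: Extractor


def layered_ocr(page: str, main_profile: OcrProfile, layers: List[Layer]) -> OcrOutput:
    # Step 1: the main profile establishes the base OCR output.
    merged = dict(main_profile(page))
    for layer in layers:
        # Step 2: run a new OCR pass only when the layer's profile differs
        # from the main one (assumption: otherwise reuse the merged output).
        if layer.profile is main_profile:
            source = merged
        else:
            source = layer.profile(page)
        # Step 3: merge the extractor's matches over the base output.
        merged.update(layer.extractor(source))
    return merged


# Toy profiles: the main pass misreads the digits, the specialized pass does not.
def main_profile(page: str) -> OcrOutput:
    return {(0, 0): "Routing:", (0, 1): "l23A5"}


def micr_profile(page: str) -> OcrOutput:
    return {(0, 0): "R0uting", (0, 1): "12345"}


def digits_only(output: OcrOutput) -> OcrOutput:
    return {pos: text for pos, text in output.items() if text.isdigit()}


result = layered_ocr("page-1", main_profile, [Layer(micr_profile, digits_only)])
```

In this toy run the label comes from the main pass and the digits come from the specialized layer, which is the point of keeping the two profiles separate.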
Use Cases
Mixed Print Types
Layered OCR is most useful when you need to extract text from documents with multiple print types.
Label Repair
You can also use Layered OCR to "repair" lines of text, eliminating inaccuracies in field labels and greatly simplifying data extraction.
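One way to picture label repair is fuzzy matching each OCR line against a list of known field labels and substituting the clean label when the match is close enough. The sketch below uses Python's standard `difflib` module as a stand-in for the fuzzy match; `repair_labels`, the sample lines, and the 0.75 cutoff are assumptions made for this example and do not reflect how Grooper implements its Extractors.

```python
import difflib
from typing import List


def repair_labels(ocr_lines: List[str], known_labels: List[str],
                  cutoff: float = 0.75) -> List[str]:
    """Replace each OCR line with its closest known label, if one is close enough."""
    repaired = []
    for line in ocr_lines:
        # get_close_matches returns the best candidates at or above the cutoff.
        match = difflib.get_close_matches(line, known_labels, n=1, cutoff=cutoff)
        # Lines with no sufficiently close label (e.g. field values) are kept as-is.
        repaired.append(match[0] if match else line)
    return repaired


lines = ["Emp1oyee Narne:", "John Smith", "Date 0f Birth:"]
labels = ["Employee Name:", "Date of Birth:"]
fixed = repair_labels(lines, labels)
```

Here the two misread labels are replaced with their clean forms while the field value "John Smith" is left untouched, so later extraction can rely on exact label text.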
Properties
| General | |
| Main OCR Profile | The OCR Profile to be used to establish the base OCR output. |
| Layers | Defines one or more OCR Layers which supplement or repair output from the Main OCR Profile. |
Layers
Each layer is configured independently and has the following configurable properties.
| General | |
| OCR Profile | The OCR Profile to execute. If it differs from the Main OCR Profile, a new OCR operation is performed using this profile; the Extractor is executed against the results of that operation, and any matches are merged into the main OCR output. If it is the Main OCR Profile, no new OCR is performed; the Extractor is executed against the main OCR output, and any matches are merged back into it. For example, in this mode one could fuzzy match full lines of field labels on a structured form and repair the OCR results with the corrected output of the fuzzy match operation. |
| Extractor | The Extractor used here MUST meet special requirements, or it will not produce output instances that can be used for OCR repair. Check the "Log" tab when testing OCR to see error messages for invalid matches. |

