2.80:Layered OCR (OCR Engine): Difference between revisions
Configadmin (talk | contribs) |
Dgreenwood (talk | contribs) No edit summary |
<blockquote style="font-size:14pt">
''Layered OCR'' enables you to run secondary [[OCR Profile]]s on a single page. The OCR results from these secondary OCR Profiles are merged with (or ''layered'' on top of) the primary OCR Profile's results.
</blockquote>
== About ==
You can use ''Layered OCR'' by selecting it as your '''''OCR Engine''''' in an '''OCR Profile'''. Although it is not itself an [[OCR Engine]] in the way Transym or Tesseract is, it lets you obtain OCR text from multiple OCR Profiles, each using its own OCR engine.
[[File:Layered OCR 01.png|center|1000px]]
Different OCR engines have advantages in specific cases. Transym performs well in most cases, but it does poorly with certain specialized print types, such as [https://en.wikipedia.org/wiki/Magnetic_ink_character_recognition MICR] or handwriting, where another engine may perform better. Microsoft's [https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ Azure Computer Vision] recognizes handwriting better than most OCR engines (but requires a license key from Microsoft). Google's [https://en.wikipedia.org/wiki/Tesseract_(software) Tesseract] can be trained to recognize specific fonts. Grooper ships with both Transym and Tesseract as selectable OCR engines, and includes training files for the MICR, OCR-A, and OCR-B fonts.
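The idea of layering one profile's results on top of another's can be illustrated with a small sketch. This is not Grooper's implementation or API (Grooper is configured through its interface, not code); the <code>Word</code>, <code>Region</code>, and <code>layer_results</code> names are hypothetical, and a fixed rectangle stands in for whatever mechanism selects the area the secondary profile should cover. The sketch assumes each engine returns words with bounding boxes, keeps the primary words outside the target region, and takes the secondary words inside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and its bounding box (hypothetical structure)."""
    text: str
    x: int
    y: int
    w: int
    h: int


@dataclass(frozen=True)
class Region:
    """Rectangle on the page where the secondary results should win."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, word: Word) -> bool:
        # A word belongs to the region if its center point falls inside it.
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        return (self.x <= cx < self.x + self.w
                and self.y <= cy < self.y + self.h)


def layer_results(primary, secondary, region):
    """Merge two OCR results: secondary inside the region, primary elsewhere."""
    kept = [w for w in primary if not region.contains(w)]
    layered = [w for w in secondary if region.contains(w)]
    # Return words in rough reading order: top to bottom, then left to right.
    return sorted(kept + layered, key=lambda w: (w.y, w.x))


# Illustrative data only: the primary engine misreads the bottom line,
# the secondary engine misreads the body text.
primary = [Word("Pay", 10, 10, 30, 10), Word("garbled", 10, 200, 300, 12)]
secondary = [Word("Pzy", 10, 10, 30, 10), Word("A123456789A", 10, 200, 300, 12)]
micr_region = Region(0, 190, 400, 30)

merged = layer_results(primary, secondary, micr_region)
print([w.text for w in merged])
```

Deciding by a word's center point rather than its full box is a simplification; a real merge would also need a policy for words that straddle the region boundary.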
{|style="margin:auto; text-align:center"
|colspan=2|For the check below, an OCR Profile using Transym performed well overall but failed to read the MICR line at the bottom. Tesseract read the MICR line but had trouble recognizing other parts of the check.
|-
|colspan=2|[[file:Layered OCR Checks 01.png]]
|-
|Transym accurately reads the check's text ''except'' for the MICR line.||Tesseract has issues with other parts of the check.
|-
|[[file:Layered OCR Checks 02.png|border]]||[[file:Layered OCR Checks 03.png|border]]
|-
|colspan=2|With ''Layered OCR'', you can use an OCR Profile using Transym for your primary (baseline) OCR results (seen in teal) and target the MICR line with an extractor to pull in the results from an OCR Profile using Tesseract (seen in orange).
|-
|colspan=2|[[file:Layered OCR Checks 04.png|border]]
|}
Layered OCR has three basic steps.
#
<!--
This engine is designed to read documents with specialized print types (such as [https://en.wikipedia.org/wiki/Magnetic_ink_character_recognition MICR], handwriting, etc.) and to ensure as close to 100% accuracy as possible for background elements on forms.
Revision as of 09:59, 9 July 2020



