Azure OCR (OCR Engine)

From Grooper Wiki
Revision as of 14:16, 3 April 2025

This article is about the current version of Grooper.

Note that some content may still need to be updated.


Azure OCR is an OCR Engine option for OCR Profiles that utilizes Microsoft Azure's Read API. Azure's Read engine is an AI-based text recognition software that uses a convolutional neural network (CNN) to recognize text. Compared to traditional OCR engines, it yields superior results, especially for handwritten text and poor quality images. Furthermore, Grooper supplements Azure's results with those from a traditional OCR engine in areas where traditional OCR is better than the Read engine.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2024). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

About

Azure OCR is different from traditional OCR Engines. It uses Microsoft Azure's Read OCR engine, which is based on a CNN (convolutional neural network). This deep learning neural network recognizes characters from full images (as opposed to the segmentation-based matrix matching methods traditional OCR engines use).

Compared to traditional OCR, Microsoft's Azure Read engine has far higher overall accuracy when recognizing text from an image. It is capable of recognizing hand-printed characters, which many traditional OCR engines cannot do at all. It is also as good as or better than traditional OCR engines on a wide variety of machine print fonts, including specialized fonts like MICR. Furthermore, due to the way this neural network has been trained, the Azure Read engine is less dependent on image pre-processing than traditional OCR engines. This eliminates the need for complicated IP Profiles when using Azure OCR in Grooper.

However, the Azure Read engine does have some drawbacks compared to traditional OCR. Unlike traditional OCR engines, such as Transym, the Azure Read engine does not return the pixel-accurate position of characters. It only gives an approximation. This can cause problems for extractors that rely on character/text positions, such as Labeled Value, Labeled OMR, or Tabular Layout. The Azure Read engine also does not always capture data correctly from "data dense" documents littered with text in tables. This is particularly the case for tabular data with cells containing single characters (such as "0" or "1" or "A"). This can make collecting certain data structures problematic.

Grooper's implementation of the Azure Read engine compensates for these shortcomings (and more). Using our Azure OCR offering, a traditional OCR engine (Transym) runs in parallel with Azure's Read engine. Results from Transym supplement Azure's results with more accurate character positions and values in areas where we have found Azure to be deficient. This gives us the best of both worlds: the strengths of the Azure Read engine and the strengths of traditional OCR engines.

Supported image types: JPEG, PNG, BMP, PDF, and TIFF
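
As an illustration of the list above, files can be screened by extension before they are sent for recognition. This is a minimal sketch, not part of Grooper: the function name and the mapping from image types to file extensions are assumptions made for this example.

```python
from pathlib import Path

# Assumption: these extensions correspond to the supported image types
# listed above (JPEG, PNG, BMP, PDF, and TIFF).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff"}


def is_supported_image(path: str) -> bool:
    """Return True if the file extension matches a supported image type."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive, so "scan.TIFF" passes while "scan.gif" does not.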

FYI

You will need an API key from Azure when setting up an OCR Profile that uses Azure OCR. If you do not have an API key and need some instructions on how to create one in Azure, visit the following link in our "Grooper and AI" article:

Azure OCR Quickstart

Azure OCR strengths over traditional OCR

Azure OCR has many advantages over traditional OCR engines (such as Transym).

  • Azure OCR is less dependent on image processing than traditional OCR engines.
  • It can handle poor quality images and even photos taken with a digital camera or phone camera without the aid of an IP Profile.
  • Azure OCR has exceptionally good handwritten text recognition. Traditional OCR engines have poor handwriting recognition or none at all.
  • Azure OCR's machine print recognition is generally as good as or better than that of traditional OCR engines. This is particularly the case for lexical data (i.e. words).


In the screenshots below, we can see the difference between using traditional OCR and Azure OCR on a document that has small text and handwriting.

  1. In the first screenshot, we can see the result of using traditional OCR. The traditional OCR is not equipped to handle handwriting, and the small print with minimal spaces between the characters makes it very difficult for traditional OCR.


  1. In this second screenshot, we have used Azure OCR on the same document. Azure OCR relies on a trained CNN rather than individual character analysis. It does a much better job of returning accurate data for this particular document, even in the handwritten sections.

How Grooper overcomes Azure OCR drawbacks

Microsoft Azure OCR by itself is not perfect. There are things that traditional OCR is better at capturing than Azure.

Some things Azure OCR is not as adept at on its own include:

  • Pixel-perfect character positions. Azure's character position data is more of an approximation than an exact location. This can be especially problematic when using extractors that rely heavily on positioning data (such as the Labeled OMR extractor or the Tabular Layout table extract method).
  • Totally accurate recognition of numeric data. This is particularly the case for small numbers on highly dense documents (those that have a lot of text on them). It is often the case Azure will miss small digits like 0s and 1s.


Grooper's implementation of Azure OCR does not rely solely on Azure for recognizing text. Instead, when you select Azure OCR as your OCR Engine, both Azure and a traditional OCR engine will run, and Grooper will combine the results.

  • By default, Grooper runs an implementation of the Transym OCR engine. However, users may customize the traditional OCR engine by selecting an OCR Profile using the Traditional OCR Profile property.

Next, we will detail an example where Azure OCR fails on its own and Grooper supplements the results with a traditional OCR engine.

  1. In the screenshot below, we are looking at the Diagnostics page after running Recognize configured with Azure OCR on a document.
  2. In the Diagnostics, the "Azure Words.tif" will show what Azure OCR by itself returned.
  3. In this case, there are two 0s that are not being captured at all by Azure OCR. They are small numbers that have been skipped.
  4. We also see that Azure OCR found all numeric values in the PAID AMT column in the table, but the positioning data is not accurate.


  1. If we select the "Alignment.tif" in the Diagnostics tree on the left, we can see the combined result of Azure OCR and the traditional OCR Engine.
  2. The characters and text segments on the document highlighted in orange are corrections made from the results of traditional OCR. The traditional OCR Engine detected the 0s that Azure OCR missed.
  3. Grooper also determined that the traditional OCR Engine did a better job at recognizing one of the numeric values whose position data was not accurately detected by Azure OCR.


The result of combining both OCR Engines is what Grooper will actually recognize from the document.
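
The idea of supplementing one engine's results with another's can be sketched in a few lines. This is only an illustration of the concept, not Grooper's actual alignment logic, which is not described in this article. The Word type, the overlap test, and the supplement function are all hypothetical, and the sketch covers only the "missed text" case (not position corrections).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """A recognized word and its bounding box in pixels (hypothetical type)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Word, b: Word) -> bool:
    """True if the two bounding boxes share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def supplement(azure_words: List[Word], traditional_words: List[Word]) -> List[Word]:
    """Keep every Azure word and add traditional words Azure missed entirely."""
    merged = list(azure_words)
    for word in traditional_words:
        # A traditional word that overlaps no Azure word is treated as missed text.
        if not any(overlaps(word, a) for a in azure_words):
            merged.append(word)
    # Sort in rough reading order: top to bottom, then left to right.
    return sorted(merged, key=lambda w: (w.top, w.left))
```

In this sketch, a small "0" found only by the traditional engine is added to the combined result, while words both engines found are kept from the Azure side.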

Azure OCR in a Docker container

Azure AI services may be hosted in a Docker container. This lets you "self-host" Azure OCR. It lets you use the same APIs available in Azure, but on-premises. Reasons to do this include compliance, security or other operational concerns.

More information on deploying Azure AI services in containers can be found in Microsoft's Azure AI containers overview.

When connecting to Azure AI Services hosted in a Docker container, you must enter the URL for the Azure AI container with the "vision" endpoint specified. The URL should look like this:

http://<container ip-address>:5000/vision

Be aware:

  • "vision" must be lower-case.
  • 5000 is the default port for the container.
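
The URL rules above can be captured in a small helper. This is a minimal sketch for illustration only: both function names are hypothetical, and the example IP address is made up.

```python
def container_vision_url(host: str, port: int = 5000) -> str:
    """Build the URL for an Azure AI container with the "vision" endpoint."""
    # 5000 is the container's default port; "vision" must be lower-case.
    return f"http://{host}:{port}/vision"


def has_vision_endpoint(url: str) -> bool:
    """Check that the URL ends with the lower-case "vision" path segment."""
    return url.rstrip("/").endswith("/vision")
```

For example, container_vision_url("10.0.0.5") produces "http://10.0.0.5:5000/vision", and has_vision_endpoint rejects a URL ending in "/Vision".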

How to

To use Azure OCR you will need to add and configure an OCR Profile. Then you will need to add that OCR Profile to your Recognize Batch Process Step. Then you can test your Step or run the Batch Process when complete.

Setting up the OCR Profile

Adding an OCR Profile

  1. Right-click on the Project or folder inside of your Project in your Node Tree where you want to add your OCR Profile.
  2. Hover over "Add".
  3. Click on "OCR Profile..."


  1. Enter in your desired name for your OCR Profile in the Name property field.
  2. Click "EXECUTE" in the top right-hand corner of the pop-up window to create your OCR Profile.


  1. Now you should have a new OCR Profile in your Node Tree.


Configuring the OCR Profile

  1. Click the hamburger icon to the right of the OCR Engine property to access the drop down menu.
  2. Select Azure OCR from the drop down menu.


  1. Copy and paste your unique API Key into the API Key property and select your API Region from the drop down menu accessed by clicking on the hamburger icon next to the property.
  2. Optionally, you can add a Traditional Ocr Profile. If this property is left blank, Grooper will run a basic Traditional OCR Engine (Transym) in addition to Azure OCR. If you would like to override the default, you can select a different OCR Profile here.
  3. Click the save icon in the top right of the property grid to save your changes.
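
If you want to confirm an API key and region outside of Grooper, the request to Azure's Read API can be assembled as sketched below. Grooper makes this call for you, so this is purely illustrative. It assumes the regional endpoint form and version 3.2 of the Read API, which may differ from what your Azure resource or Grooper version uses. The function name is hypothetical, and the request is only built, not sent.

```python
from typing import Dict, Tuple


def build_read_request(api_key: str, region: str) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for a Read "analyze" request (not sent here)."""
    # Assumption: regional endpoint form and Read API version 3.2.
    url = f"https://{region}.api.cognitive.microsoft.com/vision/v3.2/read/analyze"
    headers = {
        # The API key is passed in this subscription-key header.
        "Ocp-Apim-Subscription-Key": api_key,
        # Raw image bytes are posted as the request body.
        "Content-Type": "application/octet-stream",
    }
    return url, headers
```

The returned URL and headers could then be passed to an HTTP client of your choice along with the image bytes.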

Adding the OCR Profile to the Recognize step

You will need to add a Batch Process Step configured with the Recognize Activity to your Batch Process. In the Step Properties, ensure the Activity is set to Recognize and the Scope is appropriate for your processing level. For help with setting up your Batch Process, take a look at our Batch Process article.

  1. Add and select the Recognize step in your Batch Process in the Node Tree.
  2. Click on the hamburger icon to the right of the OCR Profile property to access the navigation drop down.
  3. Navigate to and select the OCR Profile that has been configured with the Azure OCR engine.


  1. Finish configuring your Batch Process Step and then click the save icon located in the top right of the Step Properties property grid to save your changes.

Segment Reprocessing incompatibility

Azure OCR is NOT compatible with the Segment Reprocessing feature. The Read API will return an error if you attempt to configure this property in an OCR Profile. The image snippets Grooper uses to reprocess the segments fall below Azure's minimum image dimensions.
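
To see why small snippets are rejected, consider a simple size check. This is a minimal sketch: the 50 x 50 and 10,000 x 10,000 pixel limits are assumptions taken from Microsoft's published Read API input requirements and should be verified against the current Azure documentation. The function name is hypothetical.

```python
# Assumption: limits from Microsoft's published Read API input requirements.
MIN_SIDE = 50       # smallest allowed width or height, in pixels
MAX_SIDE = 10000    # largest allowed width or height, in pixels


def within_read_limits(width: int, height: int) -> bool:
    """Return True if both dimensions fall inside the assumed Read API limits."""
    return MIN_SIDE <= min(width, height) and max(width, height) <= MAX_SIDE
```

A full letter-size page scanned at 300 DPI (2550 x 3300 pixels) passes this check, while a segment snippet only 18 pixels tall does not.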