Azure OCR (OCR Engine)

From Grooper Wiki



Revision as of 11:00, 3 October 2024

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.


This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

Azure OCR is an OCR Engine option for OCR Profiles that utilizes Microsoft Azure's Read API. Azure's Read engine is an AI-based text recognition software that uses a convolutional neural network (CNN) to recognize text. Compared to traditional OCR engines, it yields superior results, especially for handwritten text and poor quality images. Furthermore, Grooper supplements Azure's results with those from a traditional OCR engine in areas where traditional OCR is better than the Read engine.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2024). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

About

Azure OCR differs from traditional OCR Engines: it is based on a convolutional neural network (CNN), meaning it is AI-based. Because of how this neural network was trained, Azure OCR is less dependent on Image Processing.

Compared with traditional OCR, Azure OCR is far more accurate at recognizing handwritten text. However, Azure OCR alone does not return exact character positions; it only approximates them. This can cause problems for extractors that rely on character or text positions, such as Labeled Value, Labeled OMR, or Tabular Layout. Azure OCR also does not always capture small numeric values such as 1s and 0s, which can make collecting some data problematic.
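
To see why word-level coordinates can only approximate character positions, consider the minimal Python sketch below. It assumes the engine reports a single axis-aligned box per word and divides that box evenly among the word's characters. The `Box` type and `estimate_char_boxes` function are hypothetical names used for illustration; they are not part of Grooper or Azure, and this is not necessarily how either product derives character positions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels (hypothetical helper type)."""
    left: float
    top: float
    width: float
    height: float


def estimate_char_boxes(word: str, word_box: Box) -> list[Box]:
    """Split a word-level box into equal-width slices, one per character.

    This is only an approximation: real glyphs vary in width, so a narrow
    character such as "1" receives a slice wider than its true extent.
    """
    if not word:
        return []
    slice_width = word_box.width / len(word)
    return [
        Box(word_box.left + i * slice_width, word_box.top, slice_width, word_box.height)
        for i in range(len(word))
    ]


# Illustrative values only: an 80-pixel-wide word box for the text "1040".
boxes = estimate_char_boxes("1040", Box(left=100, top=50, width=80, height=20))
for ch, box in zip("1040", boxes):
    print(ch, box.left, box.width)
```

Every character receives the same width here, so an extractor that depends on where a specific character sits could be off by several pixels.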

To compensate, a traditional OCR Engine (Transym) runs alongside Azure OCR, because traditional OCR is highly effective at obtaining position data and can capture these smaller values. A traditional OCR Engine is more dependent on Image Processing, so when Azure OCR is selected, a default set of Image Processing steps is applied to the document to improve the traditional OCR Engine's accuracy.

Grooper attempts to return the most accurate results from both the Azure OCR and the traditional OCR Engine.
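
This article does not describe how Grooper merges the two result sets. The Python sketch below shows only one hypothetical way such a merge could work, assuming each engine returns words with axis-aligned boxes. It keeps the AI engine's text, borrows the traditional engine's box when both engines agree on the text, and adds words that only the traditional engine found. All names (`Word`, `overlap_ratio`, `merge_results`) and the 0.5 threshold are illustrative assumptions, and the sample words are made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A recognized word and its box (hypothetical helper type)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def overlap_ratio(a: Word, b: Word) -> float:
    """Intersection area divided by the smaller of the two box areas."""
    w = min(a.right, b.right) - max(a.left, b.left)
    h = min(a.bottom, b.bottom) - max(a.top, b.top)
    if w <= 0 or h <= 0:
        return 0.0
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    return (w * h) / min(area_a, area_b)


def merge_results(ai_words: list[Word], trad_words: list[Word],
                  threshold: float = 0.5) -> list[Word]:
    """Combine two word lists (an assumed strategy, not Grooper's)."""
    merged: list[Word] = []
    covered: set[int] = set()  # traditional words that overlap an AI word
    for ai in ai_words:
        best = None
        for i, trad in enumerate(trad_words):
            if overlap_ratio(ai, trad) >= threshold:
                covered.add(i)
                # Same text: prefer the traditional engine's box.
                if best is None and trad.text == ai.text:
                    best = trad
        merged.append(best if best is not None else ai)
    # Words the AI engine missed entirely, such as small numeric values.
    merged.extend(t for i, t in enumerate(trad_words) if i not in covered)
    return merged


ai = [Word("Total", 10, 10, 60, 30), Word("Smith", 100, 10, 160, 32)]
trad = [Word("Total", 12, 11, 58, 29), Word("Srnith", 101, 11, 159, 31),
        Word("10", 200, 10, 215, 30)]
merged = merge_results(ai, trad)
print([(w.text, w.left) for w in merged])
```

In this made-up sample, "Total" takes the traditional engine's box, the misread "Srnith" is discarded in favor of the AI engine's "Smith", and the small value "10" is recovered from the traditional engine.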


How To

To use Azure OCR you will need to add and configure an OCR Profile. Then you will need to add that OCR Profile to your Recognize Batch Process Step. Then you can test your Step or run the Batch Process when complete.

Setting up the OCR Profile

Adding an OCR Profile

  1. Right-click on the Project or folder inside of your Project in your Node Tree where you want to add your OCR Profile.
  2. Hover over "Add".
  3. Click on "OCR Profile..."
  4. Enter your desired name for your OCR Profile in the Name property field.
  5. Click "EXECUTE" in the top right-hand corner of the pop-up window to create your OCR Profile.
  6. You should now have a new OCR Profile in your Node Tree.


Configuring the OCR Profile

  1. Click the hamburger icon to the right of the OCR Engine property to access the drop-down menu.
  2. Select Azure OCR from the drop-down menu.
  3. Copy and paste your unique API Key into the API Key property, then select your API Region from the drop-down menu, accessed by clicking the hamburger icon next to the property.
  4. Optionally, you can add a Traditional Ocr Profile. If this property is left blank, Grooper will run a basic traditional OCR Profile that is the default for Azure OCR. To override the default, select a different OCR Profile here.
  5. Click the save icon in the top right of the property grid to save your changes.
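
For context on what the API Key and API Region are used for, the Python sketch below builds a request to Azure's Read API outside of Grooper. It is not Grooper's internal code. It assumes version 3.2 of the Read REST API and a regional endpoint of the form `https://<region>.api.cognitive.microsoft.com`; your Azure resource may use a custom endpoint instead. The function names, the `eastus` region, and the placeholder key are illustrative.

```python
from __future__ import annotations

import urllib.request

# Assumed API version; check your Azure resource for the version it supports.
READ_PATH = "/vision/v3.2/read/analyze"


def build_read_request(api_key: str, region: str,
                       image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits an image for OCR."""
    url = f"https://{region}.api.cognitive.microsoft.com{READ_PATH}"
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,        # the API Key
        "Content-Type": "application/octet-stream",  # raw image bytes
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers,
                                  method="POST")


def extract_lines(result: dict) -> list[str]:
    """Collect the text of every recognized line from a finished Read result."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


# Placeholder values; nothing is sent over the network here.
request = build_read_request("YOUR-API-KEY", "eastus", b"...image bytes...")
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen` should return an `Operation-Location` header. A client then polls that URL with the same key header until the reported status is `succeeded`, at which point the JSON body can be passed to `extract_lines`.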

Adding the OCR Profile to the Recognize Step

You will need to add a Batch Process Step configured with the Recognize Activity to your Batch Process. You will also need to configure the Step Properties such as the Activity and Scope. For help with setting up your Batch Process, take a look at our Batch Process article.

  1. Add and select the Recognize Step in your Batch Process in the Node Tree.
  2. Click on the hamburger icon to the right of the OCR Profile property to access the navigation drop-down.
  3. Navigate to and select the OCR Profile that has been configured with the Azure OCR OCR Engine.
  4. Finish configuring your Batch Process Step, then click the save icon located in the top right of the Step Properties property grid to save your changes.

Glossary