Labeled OMR (Value Extractor)

From Grooper Wiki

Revision as of 15:39, 15 January 2026

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.


This article is about the current version of Grooper.

Note that some content may still need to be updated.

An example of checkboxes.

Labeled OMR is a Value Extractor used to output OMR checkbox labels. It determines whether labeled checkboxes are checked or not. If checked, it outputs the label(s) or a Boolean true/false value as the result.

Introduction

OMR (Optical Mark Recognition) boxes are small shapes printed on documents, typically squares or circles, that users fill in or check to indicate a choice (for example, ☑ Yes ☐ No). Grooper detects whether each box is checked and converts these marks into data values for a Data Field.

Labeled OMR detects checkboxes based on nearby labels. It can:

  • Use a configured Label Extractor to find labels.
  • Automatically use labels from a Label Set defined on the Data Field (no Label Extractor required).
  • Optionally use a header label to disambiguate one group of checkboxes from other similar groups on the same page.

How it differs from other Value Extractors:

  • Unlike text-based extractors (e.g., Pattern, List, or Data Type), Labeled OMR reads visual checkboxes and links them to nearby text labels.
  • Unlike Ordered OMR (region + ordered positions) and Zonal OMR (manually defined zones), Labeled OMR anchors to labels found on the page and then locates checkboxes near those labels.

Differences vs. Ordered OMR and Zonal OMR:

  • Ordered OMR assigns values based on the fixed order of checkboxes within a rectangular region. Use it when label text is unreliable or absent, but the positions and order of boxes are consistent.
  • Zonal OMR uses manually configured zones for each checkbox. Use it when checkbox locations are fixed and known per page layout.
  • Labeled OMR uses labels detected at runtime and finds nearby checkboxes. Use it when forms have variable placement or multiple repeated groups, and labels are the reliable anchor.
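
The label-anchored approach can be pictured with a small sketch. This is an illustration of the general idea only, not Grooper's implementation: the `Box` and `Label` classes, the `checked_labels` function, the coordinates, and the `max_distance` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    x: float        # centre x of a detected checkbox (hypothetical units)
    y: float        # centre y of a detected checkbox
    checked: bool   # whether the box was detected as marked


@dataclass
class Label:
    text: str
    x: float
    y: float


def checked_labels(labels, boxes, max_distance=60.0):
    """Return the text of every label whose nearest box is checked.

    A label is ignored when no box lies within max_distance of it.
    """
    results = []
    if not boxes:
        return results
    for label in labels:
        # Anchor on the label, then look for the closest detected box.
        nearest = min(boxes, key=lambda b: hypot(b.x - label.x, b.y - label.y))
        distance = hypot(nearest.x - label.x, nearest.y - label.y)
        if distance <= max_distance and nearest.checked:
            results.append(label.text)
    return results


labels = [Label("Yes", 100, 200), Label("No", 220, 200), Label("Undecided", 340, 200)]
boxes = [Box(80, 200, False), Box(200, 200, True), Box(320, 200, False)]
print(checked_labels(labels, boxes))  # ['No']
```

Because the search starts from the label rather than from a fixed zone or a fixed order, the same logic still works if the whole group moves elsewhere on the page.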

When to use

Use Labeled OMR when:

  • Checkbox choices are printed with identifiable labels near each box (e.g., Yes, No, Undecided).
  • The form may contain repeated groups or variable layouts that make fixed zones or strict ordering unsuitable.
  • You want automatic label awareness via the Data Field's Label Set or Choice List.

Real-world example (preferred over Ordered OMR or Zonal OMR):

  1. A multi-page survey where “Yes ☐ No ☐ Undecided ☐” appears under several different questions with varying positions. Because labels are reliable, Labeled OMR can find the correct group under a specific header (e.g., "Attending next semester?") and read the checkboxes near those labels, even if the exact location shifts among pages and documents.

Prerequisites:

  • For rectangular checkboxes, ensure the page has layout data including Box Removal obtained during Recognize or Image Processing.
  • Provide labels either by:
    • Configuring a Label Extractor, or
    • Defining a Label Set on the Data Field.

How to configure Labeled OMR

There are a few different ways to configure the Labeled OMR extractor: with Label Extractors, with List Values, or with Label Sets. We will first discuss obtaining Layout Data for our documents, then go through each method of providing labels to Labeled OMR.

Prerequisite: layout data for documents

Before configuring a Labeled OMR extractor, you must ensure you have Layout Data on your documents that includes the detection of boxes. To do this, you will need to configure an IP Profile with (at minimum) a Box Detection IP Step. You will then need to reference the IP Profile in either an Image Processing or Recognize Step in a Batch Process.

  1. In your node tree, create a new IP Profile if you do not already have one available and add a Box Detection IP Step.
  2. Reference your IP Profile in one of the following ways:
    • If your documents require OCR, reference the IP Profile on the OCR Profile you will be using on your Recognize Step.
    • If your documents do not require OCR, reference the IP Profile on the Alternate IP property on your Recognize Step.
    • If you find that your documents need more comprehensive Image Processing, you can instead run the IP Profile on an Image Processing Batch Process Step.
  3. Save your changes to your Recognize Step.
  4. Navigate to the "Activity Tester" tab of the Recognize Step and test on the Batch.
    • If you have an Activity Processor running you can submit a job to run the Recognize Step, otherwise, select the objects you want to run Recognize on and click the test icon.
  5. Now select the object you ran Recognize on and click the Renditions icon located at the top right of the Document Viewer.
  6. Select the "Layout" view from the drop down.
  7. Now you should be able to see the layout data that was collected.
    • Empty boxes will be highlighted in pink.
    • Checked boxes will be highlighted in green.

FYI

If you would like to follow along with the demo below, download the Project and Batch at the beginning of this Wiki Article.

Configuring Labeled OMR: using Label Extractors

There are three methods for setting up a Labeled OMR extractor. The first is to configure a Label Extractor on the Labeled OMR extractor.

  1. On the Data Field, set the Value Extractor property to Labeled OMR.
  2. Expand the Labeled OMR sub properties.
  3. Set the Label Extractor to a List Match and open the List Match editor by clicking the "..." icon to the right of the property.
  4. Type in the name of each of the OMR options. Hit Enter on your keyboard after each entry.
  5. When finished, click "OK" in the top right of the pop up window.
  6. (Optional) Set an extractor on the Header Extractor property to return the label or header for the OMR information. In our example below, we used a List Match.
    • Setting a Header Extractor is not always necessary, but can help Grooper better understand where the OMR information is located on the document.
    • A Header Extractor can be useful when the text for the OMR choices shows up in other areas on the document.
  7. Click over to the "Tester" tab and test your extraction to ensure the desired data is extracted properly.
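
To see why a Header Extractor helps when the same option text appears in more than one place, consider the following sketch. It is a simplified illustration, not how Grooper matches headers: the `find_option_lines` function, the `window` parameter, and the sample lines are hypothetical, and it only handles single-word options.

```python
def find_option_lines(lines, header, options, window=2):
    """Return {option: line_index} for options found at or just after the header.

    Only the header line and the next `window` lines are searched, so
    identical option text elsewhere on the page is ignored.
    """
    header_index = next(
        (i for i, line in enumerate(lines) if header.lower() in line.lower()),
        None,
    )
    if header_index is None:
        return {}
    found = {}
    last = min(header_index + window + 1, len(lines))
    for i in range(header_index, last):
        tokens = lines[i].lower().split()
        for option in options:
            if option.lower() in tokens:
                # Keep the first line on which each option appears.
                found.setdefault(option, i)
    return found


lines = [
    "Enrolled full time?",
    "Yes No Undecided",
    "Attending next semester?",
    "Yes No Undecided",
]
print(find_option_lines(lines, "Attending next semester?", ["Yes", "No", "Undecided"]))
# {'Yes': 3, 'No': 3, 'Undecided': 3}
```

Without the header, "Yes", "No", and "Undecided" on line 1 would be equally valid matches. The header narrows the search to the group that belongs to the question of interest.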


Configuring Labeled OMR: using List Values

Instead of configuring a Label Extractor for the OMR options, you can type the text of the OMR options into the List Values property. Listing the options in List Values also gives the Data Field a drop-down menu during review, so a Reviewer can select the correct option when the extraction is missing or incorrect rather than typing in the full text.

To set up your Labeled OMR extractor using List Values:

  1. On the Data Field, set the Value Extractor property to Labeled OMR.
  2. Scroll to the bottom of the property grid and locate the List Values property.
  3. Click the "⮞" to expand out the List Values sub properties.
  4. Click the "..." to the right of the Local Entries property to open the editor.
  5. Type in the text for the OMR options on the document, hitting enter after each one.
  6. Click "OK" in the top right of the editor.
  7. (Optional) Expand the Value Extractor sub properties and set an extractor on the Header Extractor property to return the label or header for the OMR information.
    • Setting a Header Extractor is not always necessary, but can help Grooper better understand where the OMR information is located on the document.
    • A Header Extractor can be useful when the text for the OMR choices shows up in other areas on the document.
  8. Click over to the "Tester" tab and test your extraction to ensure the desired data is extracted properly.
    • You should see the option to access a drop down for the Data Field which will have all the OMR options you typed into the List Values.
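
The role List Values play during review can be sketched as a simple membership check. This is a conceptual illustration under assumed names (`resolve_field` and its return shape are hypothetical), not Grooper's validation logic.

```python
def resolve_field(extracted, list_values):
    """Return (value, needs_review) for an extracted OMR result.

    The value is accepted only if it matches one of the list values
    (ignoring case and surrounding whitespace). Otherwise the field is
    flagged so a reviewer can pick from list_values.
    """
    normalized = {value.casefold(): value for value in list_values}
    if extracted is not None:
        key = extracted.strip().casefold()
        if key in normalized:
            return normalized[key], False
    return None, True


options = ["Yes", "No", "Undecided"]
print(resolve_field("yes", options))    # ('Yes', False)
print(resolve_field(None, options))     # (None, True)
print(resolve_field("Maybe", options))  # (None, True)
```

In the flagged cases, the reviewer would choose one of the entries in `options` from the drop-down instead of typing a value.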


Configuring Labeled OMR: using Labelsets

Labeled OMR is a Labelset-aware Value Extractor. That means Labeled OMR can work with a Labeling Behavior set on your Content Model. For Labeled OMR, Labelsets work in combination with List Values: the List Values provide the text that will appear in the Data Field upon extraction, and the Labelsets give Grooper the information it needs to determine which OMR option is checked on the document.

Prerequisite: You will need to enable a Labeling Behavior on your Content Model before you will have access to the "Labels" tab. For more information on how to set up a Labeling Behavior, visit the Labeling Behavior wiki page.

  1. On the Data Field, set the Value Extractor property to Labeled OMR.
  2. Scroll to the bottom of the property grid and locate the List Values property.
  3. Click the "⮞" to expand out the List Values sub properties.
  4. Click the "..." to the right of the Local Entries property to open the editor.
  5. Type in text for the OMR options on the document, hitting enter after each one.
    • When using Labelsets with Labeled OMR, the List Values do not need to match the text on the document.
  6. Click "OK" in the top right of the editor.
  7. Navigate to the Content Model and click on the "Labels" tab.
  8. Make sure your documents are Classified.
  9. Collect text on the document for each of the OMR option labels.
    • The names of the labels will be the same as the text you typed in for the List Values.
  10. Save your changes to your Labels.
  11. Navigate to your Data Model and test your extraction.
    • You should see the text from the List Values returned in the Data Field with the ability to select another OMR option using the drop down on the Data Field text box.
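
The division of labor between List Values and Labelsets can be sketched as a mapping. This is a conceptual illustration only, not Grooper's implementation: the `label_set` and `checked_on_page` dictionaries, the `field_values` function, and the sample values are hypothetical.

```python
# Key: the List Value shown in the Data Field (also the label's name).
# Value: the text collected from the document for that label.
label_set = {
    "Attending": "Yes",
    "Not attending": "No",
    "Unsure": "Undecided",
}

# Assumed result of checkbox detection, keyed by the on-document label text.
checked_on_page = {"Yes": False, "No": True, "Undecided": False}


def field_values(label_set, checked_on_page):
    """Return the List Values whose collected label text has a checked box."""
    return [
        name
        for name, page_text in label_set.items()
        if checked_on_page.get(page_text, False)
    ]


print(field_values(label_set, checked_on_page))  # ['Not attending']
```

The document prints "No", but the Data Field receives "Not attending". This is why the List Values do not have to match the text on the document when labels are collected through a Labelset.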