OMR Reader (Result Processor)

From Grooper Wiki

This article is about the current version of Grooper.

Note that some content may still need to be updated.


OLDER TECH DETECTED!!

The OMR Reader Result Processor is still configurable for Data Type extractors in Grooper. However, this approach has been largely outdated since version 2021.

The Labeled OMR extractor is now generally preferred over OMR Reader.


The OMR Reader in Grooper is a result processor configured on Data Type extractor nodes. It is designed to detect and extract values from optical marks (most commonly checkboxes) on scanned documents. It lets organizations automate the reading of forms, surveys, ballots, and other documents where selections are indicated by marking boxes, so that structured data can be captured from paper-based sources accurately and efficiently.

What is OMR Reader?

The OMR Reader is a result processor that associates each extracted label (such as a question or option) with a nearby optical mark (checkbox) on a document. It determines whether each checkbox is checked or unchecked and outputs the result in a format suitable for downstream processing. The OMR Reader is highly configurable, allowing users to tailor its behavior to match the layout and requirements of their forms.

Key Use Cases

  • Automated survey and ballot processing
  • Reading attendance or consent checkboxes
  • Extracting selections from standardized tests
  • Any scenario where user input is captured via checkboxes or similar marks

How does OMR Reader work?

The OMR Reader operates by searching for checkboxes (OMR boxes) near each label detected by an upstream extractor. It uses spatial rules to determine which box is associated with which label, based on configurable properties such as direction and distance. The result is a set of extracted values indicating which options were selected.
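
The spatial matching described above can be illustrated with a minimal Python sketch. This is not Grooper's implementation or API: the `Rect`, `Box`, `gap`, and `find_box` names are hypothetical, and the rule used here (nearest box in an allowed direction, within a maximum edge-to-edge distance, overlapping the label on the other axis) is an assumption about how direction and distance might combine.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (origin at top-left)."""
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


@dataclass(frozen=True)
class Box:
    """A detected checkbox (hypothetical structure, not a Grooper type)."""
    rect: Rect
    checked: bool


def gap(label: Rect, box: Rect, direction: str) -> Optional[float]:
    """Edge-to-edge distance from label to box in one direction.

    Returns None if the box is not on that side of the label or does not
    overlap the label on the perpendicular axis.
    """
    if direction in ("left", "right"):
        if box.bottom <= label.top or box.top >= label.bottom:
            return None  # no vertical overlap
        distance = label.left - box.right if direction == "left" else box.left - label.right
    elif direction in ("above", "below"):
        if box.right <= label.left or box.left >= label.right:
            return None  # no horizontal overlap
        distance = label.top - box.bottom if direction == "above" else box.top - label.bottom
    else:
        raise ValueError(f"unknown direction: {direction}")
    return distance if distance >= 0 else None


def find_box(label: Rect, boxes: Sequence[Box],
             directions: Sequence[str], max_distance: float) -> Optional[Box]:
    """Return the nearest box in any allowed direction within max_distance."""
    best, best_distance = None, max_distance
    for box in boxes:
        for direction in directions:
            distance = gap(label, box.rect, direction)
            if distance is not None and distance <= best_distance:
                best, best_distance = box, distance
    return best


# Example: one box just left of the label, one far to the right.
label = Rect(left=100, top=50, width=80, height=12)
boxes = [Box(Rect(80, 50, 12, 12), checked=True),
         Box(Rect(300, 50, 12, 12), checked=False)]
match = find_box(label, boxes, directions=["left"], max_distance=20)
```

With these sample coordinates, searching left with a maximum distance of 20 matches the first box, while searching right with the same limit matches nothing because the second box is 120 units away.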

Configuration Overview

To use the OMR Reader effectively, configure the following key properties:

  • Box Location: Specifies the direction(s) (e.g., left, right, above, below) in which to search for checkboxes relative to each label.
  • Max Distance: Sets the maximum allowed distance between a label and its associated checkbox.
  • Mode: Determines how checked/unchecked status is interpreted and output. Options include:
    • CheckOne: Only one box may be checked per label.
    • CheckMulti: Multiple boxes may be checked per label.
    • Boolean: Each label is associated with a single box, outputting a true/false value.
  • Separator String (CheckMulti mode only): Defines how multiple checked values are joined in the output.
  • Value If Checked and Value If Unchecked (Boolean mode only): Specify the output values for checked and unchecked boxes.
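
To make the three modes concrete, here is a small Python sketch of how the output might be formatted once each label's checked state is known. It is illustrative only: `format_omr_output` and its parameter names are hypothetical, and the behavior when CheckOne finds zero or several checked boxes (returning `None`) is an assumption, since the text above does not specify it.

```python
from typing import Optional, Sequence, Tuple


def format_omr_output(options: Sequence[Tuple[str, bool]],
                      mode: str,
                      separator: str = ", ",
                      value_if_checked: str = "true",
                      value_if_unchecked: str = "false") -> Optional[str]:
    """Format (label, is_checked) pairs according to the output mode."""
    checked = [label for label, is_checked in options if is_checked]

    if mode == "CheckOne":
        # Assumption: anything other than exactly one checked box yields no value.
        return checked[0] if len(checked) == 1 else None

    if mode == "CheckMulti":
        # Join every checked label with the separator string.
        return separator.join(checked)

    if mode == "Boolean":
        # One label, one box: map its state to the configured output values.
        if len(options) != 1:
            raise ValueError("Boolean mode expects exactly one label/box pair")
        return value_if_checked if options[0][1] else value_if_unchecked

    raise ValueError(f"unknown mode: {mode}")


survey = [("Yes", False), ("No", True), ("Undecided", False)]
colors = [("Red", True), ("Green", False), ("Blue", True)]

single = format_omr_output(survey, "CheckOne")                    # "No"
multi = format_omr_output(colors, "CheckMulti", separator="; ")   # "Red; Blue"
consent = format_omr_output([("Consent", True)], "Boolean",
                            value_if_checked="Y", value_if_unchecked="N")  # "Y"
```

Note how the separator only matters in CheckMulti mode and the checked/unchecked values only matter in Boolean mode, mirroring the property list above.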

Typical Workflow

  1. Preprocessing: Ensure documents are processed with Box Detection or Box Removal during Image Processing or Recognize activities. This step is essential for reliable checkbox detection.
  2. Label Extraction: Use an upstream extractor to identify labels (questions, options, etc.) on the document.
  3. OMR Reader Processing: OMR Reader is configured in the Data Type's "OMR Reader" settings. First, set the "Post Processing" property to OMR Reader. Then, configure the OMR Reader settings to search for checkboxes near each label. After the Data Type collects the labels, OMR Reader determines the checked status of the box associated with each label and outputs the results according to the configured mode and formatting.
  4. Review and Export: Extracted values are available for review, validation, or export to downstream systems.

OMR Reader vs. Labeled OMR, Ordered OMR, and Zonal OMR

Grooper provides several specialized OMR extractors, each suited to different form layouts and extraction needs. Understanding the differences helps you choose the right tool for your scenario.

In general, keep the following in mind:

  • OMR Reader is a Result Processor added to a Data Type's configuration. It post-processes a Data Type's results. The Data Type returns labels, then OMR Reader determines if those labels are near checked boxes.
  • Labeled OMR, Ordered OMR, and Zonal OMR are all Value Extractor types. They can be referenced anywhere in Grooper where you can configure a Value Extractor.
  • OMR Reader is older than all other OMR extractors.
  • Labeled OMR is generally the easiest to configure and the most flexible OMR extractor. It is preferred over OMR Reader (as well as Ordered OMR and Zonal OMR) in most cases.

OMR Reader (Result Processor)

The OMR Reader is a result processor that works in conjunction with upstream extractors to associate labels with checkboxes based on spatial rules.

  • It is not itself a label extractor, but rather a processor that links extracted labels to nearby checkboxes.
  • The OMR Reader is ideal for scenarios where you want to flexibly associate labels and checkboxes using spatial configuration, rather than relying on fixed zones or strict order.
  • Labeled OMR is recommended as the newest and most flexible option for OMR extraction, but OMR Reader remains useful for many forms.

Labeled OMR

Labeled OMR is designed for forms where checkboxes are positioned near text labels.

  • Associates checkboxes with nearby text using a label extractor or Label Sets defined on the parent Data Field.
  • Supports both rectangular and circular checkboxes.
  • Can use a header extractor to disambiguate between multiple groups of checkboxes.
  • Ideal for forms where the meaning of each checkbox is determined by its proximity to a label (e.g., survey questions, grouped options).
  • Automatically uses Label Sets if the label extractor is left empty.

Example use case: A survey where each question has a Yes/No/Undecided checkbox next to the text.

Ordered OMR

Ordered OMR is intended for forms where checkboxes appear in a fixed order within a defined region, but may lack reliable text labels.

  • You define a rectangular region containing the checkboxes and provide a list of output values.
  • The extractor assigns each value to a checkbox based on its detected order (left-to-right or top-to-bottom).
  • The "Flow Direction" property controls whether checkboxes are ordered horizontally or vertically.
  • Best for structured forms where the position of each checkbox determines its meaning.

Example use case: A standardized test answer row where each bubble corresponds to a specific answer choice (A, B, C, D).
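
The order-based assignment can be sketched in a few lines of Python. This is a simplified illustration rather than Grooper's logic: `read_ordered` is a hypothetical helper, the boxes are assumed to have already been limited to the defined region, and the `"horizontal"` / `"vertical"` strings are stand-ins for whatever values the "Flow Direction" property actually accepts.

```python
from typing import List, Sequence, Tuple

# (x, y, checked): position of a detected box and its checked state.
DetectedBox = Tuple[float, float, bool]


def read_ordered(boxes: Sequence[DetectedBox],
                 values: Sequence[str],
                 flow_direction: str = "horizontal") -> List[str]:
    """Assign output values to boxes by position order; return checked values."""
    if len(boxes) != len(values):
        raise ValueError("expected one output value per detected box")
    if flow_direction == "horizontal":
        ordered = sorted(boxes, key=lambda b: b[0])  # left-to-right
    elif flow_direction == "vertical":
        ordered = sorted(boxes, key=lambda b: b[1])  # top-to-bottom
    else:
        raise ValueError(f"unknown flow direction: {flow_direction}")
    return [value for box, value in zip(ordered, values) if box[2]]


# A test answer row: boxes detected in arbitrary order, second bubble filled.
row = [(220, 40, False), (100, 41, False), (160, 39, True), (280, 40, False)]
answer = read_ordered(row, ["A", "B", "C", "D"])  # ["B"]
```

Because meaning comes purely from position, a missed or spurious box shifts every later value, which is why the sketch refuses to proceed when the box count and value count differ.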

Zonal OMR

Zonal OMR is used for forms with a fixed layout, where the position of each checkbox is known in advance.

  • You manually define each checkbox's location (zone) and output value.
  • The extractor reads the checked/unchecked state from the specified regions.
  • Supports Boolean, CheckOne, and CheckMulti output modes.
  • Offers precise control over which checkboxes are read and how they are aligned, including registration and distance tolerance settings.

Example use case: A ballot where each candidate's checkbox is always in the same position on the page.
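
As a rough illustration of zone-based reading, the Python sketch below matches each predefined zone to the nearest detected box and accepts it only within a distance tolerance. The `Zone` class and `read_zones` function are hypothetical, the nearest-centre rule is an assumption, and registration (aligning the page before matching) is not modeled here.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Zone:
    """A manually defined checkbox location and the value it represents."""
    value: str
    x: float  # expected box centre, page coordinates
    y: float


def read_zones(zones: Sequence[Zone],
               detected: Sequence[Tuple[float, float, bool]],
               tolerance: float) -> List[str]:
    """Return the values of zones whose matching detected box is checked.

    detected holds (x, y, checked) tuples for box centres found on the page.
    """
    selected = []
    for zone in zones:
        nearest = min(
            detected,
            key=lambda b: math.hypot(b[0] - zone.x, b[1] - zone.y),
            default=None,
        )
        if nearest is None:
            continue  # no boxes detected at all
        distance = math.hypot(nearest[0] - zone.x, nearest[1] - zone.y)
        if distance <= tolerance and nearest[2]:
            selected.append(zone.value)
    return selected


ballot = [Zone("Candidate A", 50, 100), Zone("Candidate B", 50, 140)]
found = [(52, 101, False), (49, 142, True)]
votes = read_zones(ballot, found, tolerance=5)  # ["Candidate B"]
```

If the scan is shifted by more than the tolerance, no zone finds a box and nothing is returned, which is the kind of failure that page registration is meant to prevent.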

Choosing the Right OMR Tool

Extractor | Best For | How Checkboxes Are Matched | Configuration
Labeled OMR | Forms with labels near checkboxes | By proximity to text labels | Configure label extractor or use Label Sets
Ordered OMR | Checkboxes in a fixed order, no reliable labels | By position/order in a region | Define region and output values
Zonal OMR | Fixed-layout forms, known checkbox positions | By manually defined zones | Manually specify each checkbox's location and value
OMR Reader | Flexible association of labels and checkboxes | By spatial rules (direction, distance) | Configure direction, distance, and output mode

Best Practices

  • Always preprocess documents with Box Detection or Box Removal for reliable checkbox detection.
  • Choose the OMR extractor that matches your form's layout and data requirements.
  • Test extraction on representative samples and adjust settings for optimal accuracy.
  • Use Label Sets and Data Field configuration to streamline label management where possible.

See also