Barcode Detection and Barcode Removal

From Grooper Wiki

Barcode Detection and Barcode Removal are two closely related IP Commands in Grooper. Both detect barcodes and decode their values. Both store the barcodes' locations and values in the page's layout data file. This data can be used downstream for various purposes, including by the Find Barcode extractor. While Barcode Detection only detects barcodes, Barcode Removal will also remove barcodes (and optionally "barcode-like" artifacts) from the image.

What is Barcode Detection?

Barcode Detection is an IP Command that detects and reads barcode data. The detected barcode information is stored as part of the page's layout data.

It can be added as a step in an IP Profile or IP Group. The command is designed to identify a wide range of barcode types (including 1D, 2D, and postal barcodes) on scanned images or digital documents. Barcodes are commonly used for indexing, routing, and automating document workflows.


When the Barcode Detection command is executed, it:

  1. Optionally preprocesses the image (such as binarization or cropping to a region of interest).
  2. Applies one or more Barcode Readers to detect and decode barcodes.
  3. Saves layout data describing each barcode's position, size, and encoded value.
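
The three steps above can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's API: the `BarcodeRecord` type, the `binarize` and `detect_barcodes` functions, the fixed threshold of 128, and the `fake_reader` stand-in are all hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; these are not Grooper's API.
Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


@dataclass
class BarcodeRecord:
    symbology: str
    value: str
    left: int
    top: int
    width: int
    height: int


Reader = Callable[[Image], List[BarcodeRecord]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Step 1: return a black (0) and white (255) copy of a grayscale image."""
    return [[0 if px < threshold else 255 for px in row] for row in image]


def detect_barcodes(image: Image, readers: List[Reader]) -> List[BarcodeRecord]:
    """Run the three steps and return the records that would be saved as layout data."""
    working = binarize(image)       # preprocessing happens on a copy
    records: List[BarcodeRecord] = []
    for reader in readers:          # Step 2: apply each enabled reader
        records.extend(reader(working))
    return records                  # Step 3: position, size, and value per barcode


def fake_reader(image: Image) -> List[BarcodeRecord]:
    """Stand-in for a real reader; always reports one made-up Code 39 barcode."""
    return [BarcodeRecord("Code 39", "INV-001", left=10, top=5, width=120, height=40)]


page = [[200, 30], [90, 255]]
layout_data = detect_barcodes(page, [fake_reader])
```

Note that `page` itself is never modified, which mirrors the point that Barcode Detection collects data without altering the image.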

Use cases for Barcode Detection

Grooper can use the layout data generated by Barcode Detection to:

  • Supply layout data to the Find Barcode extractor.
  • Extract barcode values for indexing or validation.
  • Separate documents based on a barcode's presence or value (using the "Barcode Detected" Separation Event).
  • Classify documents based on barcode presence or value (either during separation or using the Find Barcode extractor).
  • Capture barcode data for integration with external systems.
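
As a rough illustration of the separation use case, the sketch below starts a new document at every page whose layout data contains a barcode. It is a simplified model under stated assumptions: the page dictionaries, the `barcodes` key, and the `separate_on_barcode` function are hypothetical and do not reflect how Grooper stores layout data or implements its Separation Events.

```python
from typing import Dict, List


def separate_on_barcode(pages: List[Dict]) -> List[List[Dict]]:
    """Group pages into documents, starting a new one at each page with a barcode."""
    documents: List[List[Dict]] = []
    for page in pages:
        # A barcode page opens a new document; so does the very first page.
        if page["barcodes"] or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents


batch = [
    {"name": "page 1", "barcodes": ["INV-001"]},
    {"name": "page 2", "barcodes": []},
    {"name": "page 3", "barcodes": ["INV-002"]},
    {"name": "page 4", "barcodes": []},
]
documents = separate_on_barcode(batch)
```

With this sample batch, pages 1 and 2 land in the first document and pages 3 and 4 in the second.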

General configuration steps

To use Barcode Detection:

  1. Right-click an IP Profile or IP Group to add a Barcode Detection IP Step.
  2. Select "Add Command" then "Feature Detection" then "Barcode Detection".
  3. Enable and configure Barcode Readers to match the barcode symbologies present in your documents.
  4. Adjust "Image Preprocessing" settings as needed.
    • These settings optimize barcode visibility, improving Barcode Detection's ability to detect barcodes. Preprocessing adjustments only occur prior to detecting barcodes and do not alter the final image.
    • "Binarization Settings" control how color or grayscale images are turned black and white. Barcode Detection must occur on a black and white image.
    • "Region of Interest" restricts detection to a predefined zone on the page.
  5. Test the IP Step/IP Group/IP Profile using the "Tester" tab. Use the Diagnostics panel's images and files to visualize results.
    • Ensure barcodes are detected as you expect. Review the "Barcodes.jpg" diagnostic. Detected barcodes are highlighted.
    • Ensure barcodes are decoded as you expect. Review the "Execution Log.txt" diagnostic. This will list all detected barcodes and their values.
    • Fine-tune the Barcode Detection settings as needed.
  6. The Barcode Detection step will execute as part of the IP Profile's execution flow.

The detected barcode information is stored as layout data, making it available for downstream activities such as data extraction, classification, or routing.
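
The "Region of Interest" idea from the steps above can be pictured as cropping a working copy of the page, detecting within the crop, and shifting any result back into page coordinates. The sketch below assumes that model; the `crop` and `to_page_coords` helpers and the `(left, top, width, height)` rectangle layout are hypothetical, and Grooper's internal handling may differ.

```python
from typing import List, Tuple

Image = List[List[int]]
Rect = Tuple[int, int, int, int]  # (left, top, width, height), hypothetical layout


def crop(image: Image, roi: Rect) -> Image:
    """Return a copy of the pixels inside the region of interest."""
    left, top, width, height = roi
    return [row[left:left + width] for row in image[top:top + height]]


def to_page_coords(found: Rect, roi: Rect) -> Rect:
    """Shift a rectangle found inside the crop back to full-page coordinates."""
    return (found[0] + roi[0], found[1] + roi[1], found[2], found[3])


page = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
roi = (1, 1, 2, 2)
zone = crop(page, roi)                              # only this zone is searched
barcode_on_page = to_page_coords((0, 1, 2, 1), roi)  # a hit found inside the zone
```

Because `crop` returns new rows, the original page is untouched, consistent with preprocessing not altering the final image.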

What is Barcode Removal?

Barcode Removal is an IP Command that detects, reads and digitally removes barcodes from an image. The detected barcode information is stored as part of the page's layout data.


It can be added as a step in an IP Profile or IP Group. Barcode Removal has two purposes:

  • Like Barcode Detection, it can identify a wide range of barcode types—including 1D, 2D, and postal barcodes—on scanned images or digital documents. Barcodes are commonly used for indexing, routing, and automating document workflows.
  • Barcode Removal goes one step further: it also digitally erases barcodes from the image. This eliminates unwanted barcode content before OCR, where barcodes may interfere with results, and can ultimately improve downstream processing such as data extraction.


When the Barcode Removal command is executed, it:

  1. Optionally preprocesses the image (such as binarization or cropping to a region of interest).
  2. Applies one or more Barcode Readers to detect and decode barcodes.
  3. Generates a dropout mask to cover each detected barcode.
    • Optionally, the dropout mask can include "barcode-like" regions. These are artifacts that resemble barcodes but could not be decoded (useful for cleaning up non-text noise).
  4. Expands the mask by a configurable border to ensure the entire barcode and its quiet zone are removed.
  5. Removes the masked regions from the image using the "Dropout Method" settings.
  6. Saves layout data describing each barcode's position, size, and encoded value.
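
Steps 3 through 5 above (build a mask, expand it by a border, drop out the masked pixels) can be sketched as follows. This is a minimal model assuming rectangular masks and a solid fill; the `expand` and `remove_barcodes` functions and the rectangle layout are hypothetical, not Grooper's implementation.

```python
from typing import List, Tuple

Image = List[List[int]]
Rect = Tuple[int, int, int, int]  # (left, top, width, height), hypothetical layout


def expand(rect: Rect, border: int, img_w: int, img_h: int) -> Rect:
    """Grow a rectangle by `border` pixels on every side, clamped to the image."""
    left, top, width, height = rect
    new_left = max(0, left - border)
    new_top = max(0, top - border)
    new_right = min(img_w, left + width + border)
    new_bottom = min(img_h, top + height + border)
    return (new_left, new_top, new_right - new_left, new_bottom - new_top)


def remove_barcodes(image: Image, barcodes: List[Rect], border: int, fill: int = 255) -> Image:
    """Return a copy of the image with each expanded barcode region filled in."""
    img_h, img_w = len(image), len(image[0])
    output = [row[:] for row in image]
    for rect in barcodes:
        left, top, width, height = expand(rect, border, img_w, img_h)
        for y in range(top, top + height):
            for x in range(left, left + width):
                output[y][x] = fill
    return output


page = [[0] * 5 for _ in range(5)]                 # an all-black 5x5 test image
cleaned = remove_barcodes(page, [(2, 2, 1, 1)], border=1)
```

Here a one-pixel barcode at (2, 2) with a border of 1 clears the 3x3 block around it, which shows why an overly large border can erase nearby content.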

Use cases for Barcode Removal

Barcode Removal's use cases are the same as Barcode Detection plus:

  • Cleaning up document images before OCR to prevent barcodes from interfering with text recognition.
  • Removing barcode-like noise or damaged barcodes that cannot be decoded.
  • Redacting barcodes if not needed/wanted on the final exported image.

General configuration steps

  1. Right-click an IP Profile or IP Group to add a Barcode Removal IP Step.
  2. Select "Add Command" then "Feature Removal" then "Barcode Removal".
  3. Enable and configure Barcode Readers to match the barcode symbologies present in your documents.
  4. Adjust "Image Preprocessing" settings as needed.
    • These settings optimize barcode visibility, improving Barcode Removal's ability to detect barcodes. Preprocessing adjustments only occur prior to detecting barcodes and do not alter the final image.
    • "Binarization Settings" control how color or grayscale images are turned black and white. Barcode detection must occur on a black and white image.
    • "Region of Interest" restricts detection to a predefined zone on the page.
  5. (Optional) Enable "Remove Regions" to remove "barcode-like" regions. This is useful for clearing noise that resembles barcodes before OCR, even if Grooper cannot decode it as a barcode.
  6. Adjust the "Border Expand" property as needed.
    • This will control how thick a border around the detected barcode should be removed.
    • Expanding this value will help ensure no stray barcode remnants remain, but setting it too high may remove nearby content.
  7. Adjust the "Dropout Method" as needed. There are two Dropout Methods to choose from:
    • Fill (Default) - Replaces barcodes with a solid color.
      • By default, the Fill Color is set to "none". When set to "none", Grooper detects a color from the image's background and uses it as the fill.
      • This method is good for temporarily removing barcodes when applied by an OCR Profile.
      • This method is good for redacting barcodes when applied permanently by Image Processing when the image's background is a solid color.
    • Inpaint - Digitally restores masked regions by estimating and filling in missing or damaged areas using advanced algorithms.
      • When removing barcodes that overlap text, this method can help preserve overlapping text.
      • This method is good for redacting barcodes when applied permanently by Image Processing on images with patterned backgrounds.
  8. Test the IP Step/IP Group/IP Profile using the "Tester" tab. Use the Diagnostics panel's images and files to visualize results.
    • Ensure barcodes are detected as you expect. Review the "Barcodes.jpg" diagnostic. Detected barcodes are highlighted.
    • Ensure barcodes are decoded as you expect. Review the "Execution Log.txt" diagnostic. This will list all detected barcodes and their values.
    • Ensure barcodes are removed as you expect. Review the "Dropout Mask.jpg" and "Output Image.jpg" diagnostics. The "Dropout Mask.jpg" image shows the dropout mask in red. The "Output Image.jpg" image shows the final image, as altered according to the "Dropout Method" settings.
    • Fine-tune the Barcode Removal settings as needed.
  9. The Barcode Removal step will execute as part of the IP Profile's execution flow.
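
To make the Fill method's "none" setting more concrete, here is one simple way a background color could be guessed: take the most common pixel value on the page. This heuristic is an assumption chosen for illustration; the article does not describe how Grooper actually detects the background color, and `pick_fill_color` is a hypothetical name.

```python
from collections import Counter
from typing import List, Optional

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def pick_fill_color(image: Image, fill_color: Optional[int] = None) -> int:
    """Return the configured fill color, or guess the background when it is None."""
    if fill_color is not None:
        return fill_color
    # Assumed heuristic: the most frequent pixel value is the background.
    counts = Counter(px for row in image for px in row)
    return counts.most_common(1)[0][0]


page = [
    [255, 255, 0],
    [255, 0, 255],
]
auto_color = pick_fill_color(page)        # no Fill Color set, so guess
fixed_color = pick_fill_color(page, 128)  # an explicit Fill Color wins
```

A guess like this works well on solid backgrounds and poorly on patterned ones, which fits the guidance above to prefer Inpaint for patterned backgrounds.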

Similarities and differences between Barcode Detection and Barcode Removal

Similarities

The big takeaway: Both Barcode Detection and Barcode Removal detect barcodes.

  • Both commands detect barcodes and store their locations and encoded values in the Batch Page's layout data.
  • Both commands use the same Barcode Readers.
    • They use the same barcode detection technology to locate barcode regions on images.
    • They support the same range of barcode symbologies and can be configured to target specific types.
  • Both can operate on all common pixel formats and convert images for preprocessing as needed (both have the same "Binarization Settings").
  • Both provide diagnostic output to assist with configuration and troubleshooting.
  • Both are included as steps in an IP Profile, allowing them to be combined with other image processing operations.

Differences

The big takeaway: Barcode Detection does not alter the image. Barcode Removal does alter the image.

  • Both Barcode Detection and Barcode Removal detect barcodes and store their information in the Batch Page's layout data.
  • Only Barcode Removal removes barcode regions from images, masking them out before further processing.
    • This is most useful for image cleanup prior to OCR.
    • Barcodes will be removed from an image permanently when Barcode Removal is applied by the Image Processing activity.
    • Barcodes will be removed in-memory prior to OCR when Barcode Removal is applied by the Recognize activity.
  • Only Barcode Removal has Feature Removal properties such as "Remove Regions", "Border Expand", and "Dropout Method".

When to use each command

  • Use Barcode Detection when you only need to extract barcode values for indexing, validation, or workflow automation without altering the image.
  • Use Barcode Removal when you want to eliminate barcodes or to remove barcode-like noise prior to OCR. Barcode Removal will extract barcode values as well.

Barcode Readers

The Barcode Detection and Barcode Removal commands both support the same Barcode Reader engines. Each Barcode Reader is designed to read a different set of barcode symbologies, locating barcodes on a page and decoding their values. Understanding the differences helps you select the right reader(s) for your documents.

Standard Reader

  • Purpose: General-purpose barcode detection using Grooper's standard engine. Supports a broad range of symbologies, including types not covered by the specialized readers.
  • Use cases: Serves as the primary Barcode Reader for symbologies not covered by the specialized readers. It also acts as a fallback or supplement for the specialized readers.
  • Barcode Symbologies:
    • Australia Post
    • Codabar
    • Code 11
    • Code 128
    • Code 32
    • Code 39
    • Code 93
    • Data Matrix
    • EAN-13
    • EAN-8
    • Interleaved 2 of 5 (I2of5)
    • Intelligent Mail
    • ITF-14
    • Micro QR
    • Patch
    • PDF417
    • PLANET
    • Postnet
    • Plus 2 (2-digit supplementals)
    • Plus 5 (5-digit supplementals)
    • QR Code
    • Royal Mail 4-State (RM4SCC)
    • RSS 14
    • RSS Limited
    • Telepen
    • UPC-A
    • UPC-E
    • Aztec
    • Pharmacode

1D Reader

  • Purpose: Advanced detection of linear (one-dimensional) barcodes.
  • Use cases: Ideal for documents with shipping labels, inventory tags, or forms using linear barcodes.
  • Barcode Symbologies:
    • Codabar
    • Code 128
    • Code 39
    • Pharmacode

2D Reader

  • Purpose: Advanced detection of two-dimensional barcodes.
  • Use cases: Suitable for forms, packaging, or documents containing Data Matrix codes.
  • Barcode Symbologies:
    • Data Matrix

Postal Reader

  • Purpose: Specialized detection of postal barcode symbologies.
  • Use cases: Designed for mail pieces, envelopes, or documents used in postal automation and tracking.
  • Barcode Symbologies:
    • Intelligent Mail
    • Postnet


When are IP Profiles executed?

An IP Profile is executed whenever Grooper needs to process an image using a defined sequence or hierarchy of image processing operations. Execution typically occurs in the following scenarios:

  • By the Image Processing activity: The Image Processing activity will apply the IP Profile and permanently alter the image.
  • By an OCR Profile: OCR Profiles configured with an IP Profile will run the IP Profile on an image prior to handing it to the OCR engine. The image will not be permanently altered.
  • By the Recognize activity's "Alternate IP" configuration: IP Profiles executed by this configuration will only execute feature detection commands (such as Barcode Detection) to collect layout data.
  • During a Review step: Users can manually execute an IP Profile from the Thumbnail Viewer (if configured to allow the user to do so).

Execution follows the order and logic defined in the IP Profile, including any conditional flow control or branching. Each step or group within the profile is applied in sequence, transforming the input image and producing results for each subsequent step.
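
The sequential part of this execution model can be sketched as a simple loop in which each step receives the previous step's output. The sketch leaves out conditional flow control and branching, and the `run_profile`, `invert`, and `threshold` names are hypothetical stand-ins rather than Grooper commands.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)
Step = Callable[[Image], Image]


def run_profile(image: Image, steps: List[Step]) -> Image:
    """Apply each step in order, starting from a copy of the input image."""
    result = [row[:] for row in image]
    for step in steps:
        result = step(result)
    return result


def invert(image: Image) -> Image:
    """Example step: swap dark and light pixels."""
    return [[255 - px for px in row] for row in image]


def threshold(image: Image) -> Image:
    """Example step: force every pixel to black (0) or white (255)."""
    return [[0 if px < 128 else 255 for px in row] for row in image]


original = [[10, 200]]
processed = run_profile(original, [invert, threshold])
```

In this model, the difference between the scenarios listed above is what happens to `processed`: a permanent run would store it in place of `original`, while a temporary run (such as one made on behalf of an OCR Profile) would use it and then discard it.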