Barcode Detection and Barcode Removal

From Grooper Wiki
Revision as of 11:03, 7 August 2025

Barcode Detection and Barcode Removal are two closely related IP Commands in Grooper. Both detect barcodes and decode their values. Both store the barcodes' locations and values in the page's layout data file. This data can be used downstream for various purposes, including by the Find Barcode extractor. While Barcode Detection only detects barcodes, Barcode Removal will also remove barcodes (and optionally "barcode-like" artifacts) from the image.

What is Barcode Detection?

Barcode Detection is an IP Command in Grooper. It can be added to a step in an IP Profile or IP Group. The Barcode Detection command is designed to identify a wide range of barcode types—including 1D, 2D, and postal barcodes—on scanned images or digital documents. Barcodes are commonly used for indexing, routing, and automating document workflows.


When the Barcode Detection command is executed, it:

  1. Optionally preprocesses the image (such as binarization or cropping to a region of interest).
  2. Applies one or more Barcode Readers to detect and decode barcodes.
  3. Returns detected barcodes' locations and decoded values, storing them as layout data for downstream use.
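
The three steps above can be pictured as a small pipeline. The sketch below is illustrative only and does not use Grooper's API: `Barcode`, `detect_barcodes`, and `stub_reader` are hypothetical names, and the reader is a stub that returns a fixed result rather than decoding anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]            # grayscale pixels, 0 (black) to 255 (white)
Rect = Tuple[int, int, int, int]   # left, top, right, bottom (right/bottom exclusive)


@dataclass
class Barcode:
    symbology: str
    value: str
    location: Rect


Reader = Callable[[Image], List[Barcode]]


def detect_barcodes(image: Image, readers: List[Reader],
                    roi: Optional[Rect] = None) -> Dict[str, List[Barcode]]:
    # Step 1: optional preprocessing, here just cropping to a region of interest.
    left, top = (roi[0], roi[1]) if roi else (0, 0)
    work = [row[roi[0]:roi[2]] for row in image[roi[1]:roi[3]]] if roi else image

    # Step 2: apply every enabled reader to the working image.
    found: List[Barcode] = []
    for reader in readers:
        for bc in reader(work):
            l, t, r, b = bc.location
            # Translate crop-relative coordinates back to page coordinates.
            found.append(Barcode(bc.symbology, bc.value,
                                 (l + left, t + top, r + left, b + top)))

    # Step 3: return locations and values as "layout data"; the image is untouched.
    return {"barcodes": found}


def stub_reader(image: Image) -> List[Barcode]:
    # Hypothetical reader: pretends it decoded one Code 39 barcode.
    return [Barcode("Code 39", "INV-001", (0, 0, 2, 1))]


page = [[255] * 4 for _ in range(3)]
layout = detect_barcodes(page, [stub_reader], roi=(1, 1, 4, 3))
```

Note that the function returns data and never writes to `page`, mirroring the fact that Barcode Detection does not alter the image.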

Use cases for Barcode Detection

Barcode Detection lets Grooper use the data encoded in barcodes for numerous purposes, including:

  • Extracting barcode values for indexing or validation.
  • Automating document separation based on a barcode's presence or value.
  • Automating document classification based on barcode presence or value.
  • Capturing barcode data for integration with external systems.
  • Routing documents based on barcode content.

General configuration steps

To use Barcode Detection:

  1. Right-click an IP Profile or IP Group to add a Barcode Detection IP Step.
  2. Select "Add Command" then "Feature Detection" then "Barcode Detection".
  3. Enable and configure Barcode Readers to match the barcode symbologies present in your documents.
  4. Set preprocessing options (if needed)
    • "Binarization Settings" control how the image is temporarily converted to a black and white image. Barcode Detection must occur on a black and white image. However, this adjustment only occurs in preprocessing. It will not affect the final image.
    • "Region of Interest" restricts detection to a predefined zone on the page.
  5. Test the IP Step/IP Group/IP Profile using the "Tester" tab. Use the Diagnostics panel's images and files to visualize results.
    • Ensure barcodes are detected as you expect. Review the "Barcodes.jpg" diagnostic. Detected barcodes are highlighted.
    • Ensure barcodes are decoded as you expect. Review the "Execution Log.txt" diagnostic. This will list all detected barcodes and their values.
    • Fine-tune the Barcode Detection settings as needed.
  6. The Barcode Detection step will execute as part of the IP Profile's execution flow.

The detected barcode information is stored as layout data, making it available for downstream activities such as data extraction, classification, or routing.
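
The point in step 4 that binarization is temporary can be shown with a short sketch. It assumes a fixed global threshold, which is only one possible approach; the behavior of Grooper's actual "Binarization Settings" is not described here, and `binarize` is a hypothetical helper.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def binarize(image: Image, threshold: int = 128) -> Image:
    """Return a new black-and-white copy; the input image is not modified."""
    return [[0 if px < threshold else 255 for px in row] for row in image]


page = [[12, 200, 90], [250, 40, 130]]
bw = binarize(page)   # working copy that the readers would see
# `page` still holds the original grayscale values afterwards.
```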

What is Barcode Removal?

Barcode Removal is an IP Command in Grooper. It can be added to a step in an IP Profile or IP Group. Barcode Removal has two purposes.

  • Like Barcode Detection, it can identify a wide range of barcode types—including 1D, 2D, and postal barcodes—on scanned images or digital documents. Barcodes are commonly used for indexing, routing, and automating document workflows.
  • Barcode Removal goes one step further: it also digitally erases barcodes from the image. This eliminates unwanted barcode content before OCR, where barcodes may interfere with results, and can ultimately improve downstream processing such as data extraction.


When the Barcode Removal command is executed, it:

  1. Optionally preprocesses the image (such as binarization or cropping to a region of interest).
  2. Applies one or more Barcode Readers to detect and decode barcodes.
  3. Returns detected barcodes' locations and decoded values, storing them as layout data for downstream use.
  4. Generates a dropout mask to cover each detected barcode.
    • Optionally, the dropout mask can include "barcode-like" regions. These are artifacts that resemble barcodes but could not be decoded (useful for cleaning up non-text noise).
  5. Expands the mask by a configurable border to ensure the entire barcode and its quiet zone are removed.
  6. Removes the masked regions from the image using the "Dropout Method" settings.
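
Steps 4 through 6 can be sketched as follows. This is a simplified model, not Grooper's implementation: `expand`, `build_mask`, and `fill_dropout` are hypothetical helpers, barcode locations are plain rectangles, and the dropout is a solid fill with a color chosen by the caller.

```python
from typing import List, Tuple

Image = List[List[int]]            # grayscale pixels, 0 (black) to 255 (white)
Rect = Tuple[int, int, int, int]   # left, top, right, bottom (right/bottom exclusive)


def expand(rect: Rect, border: int, width: int, height: int) -> Rect:
    """Grow a rectangle by `border` pixels on every side, clamped to the page."""
    left, top, right, bottom = rect
    return (max(0, left - border), max(0, top - border),
            min(width, right + border), min(height, bottom + border))


def build_mask(width: int, height: int, rects: List[Rect], border: int) -> List[List[bool]]:
    """Mark every pixel that falls inside an expanded barcode rectangle."""
    mask = [[False] * width for _ in range(height)]
    for rect in rects:
        left, top, right, bottom = expand(rect, border, width, height)
        for y in range(top, bottom):
            for x in range(left, right):
                mask[y][x] = True
    return mask


def fill_dropout(image: Image, mask: List[List[bool]], fill: int) -> Image:
    """Return a copy of the image with masked pixels replaced by `fill`."""
    return [[fill if mask[y][x] else px for x, px in enumerate(row)]
            for y, row in enumerate(image)]


page = [[255] * 5 for _ in range(5)]
page[2][2] = 0   # stands in for a barcode
page[0][0] = 0   # stands in for nearby text that must survive

mask = build_mask(5, 5, [(2, 2, 3, 3)], border=1)
cleaned = fill_dropout(page, mask, fill=255)
```

With a border of 1, the one-pixel "barcode" becomes a 3×3 mask. A larger border would eventually reach the pixel at the top-left corner, which is the trade-off described for "Border Expand".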

Use cases for Barcode Removal

Barcode Removal's use cases are the same as Barcode Detection plus:

  • Cleaning up document images before OCR to prevent barcodes from interfering with text recognition.
  • Removing barcode-like noise or damaged barcodes that cannot be decoded.
  • Ensuring that only relevant content remains on the image for downstream processing.

General configuration steps

To use Barcode Removal:

  1. Right-click an IP Profile or IP Group to add a Barcode Removal IP Step.
  2. Select "Add Command" then "Feature Removal" then "Barcode Removal".
  3. Enable and configure Barcode Readers to match the barcode symbologies present in your documents.
  4. Set preprocessing options (if needed)
    • "Binarization Settings" control how the image is temporarily converted to a black and white image. Barcode Detection must occur on a black and white image. However, this adjustment only occurs in preprocessing. It will not affect the final image.
    • "Region of Interest" restricts detection to a predefined zone on the page.
  5. (Optional) Enable "Remove Regions" to remove "barcode-like" regions. This clears noise that resembles barcodes before OCR, even if Grooper cannot decode it as a barcode.
  6. Adjust the "Border Expand" property as needed.
    • This will control how thick a border around the detected barcode should be removed.
    • Expanding this value will help ensure no stray barcode remnants remain, but setting it too high may remove nearby content.
  7. Adjust the "Dropout Method" as needed. There are two Dropout Methods to choose from:
    • Fill (Default) - Replaces barcodes with a solid color.
      • By default, the Fill Color is set to "none". When set to "none" Grooper will detect a color from the image's background and use that.
      • This method is good for temporarily removing barcodes when applied by an OCR Profile.
      • This method is good for redacting barcodes when applied permanently by Image Processing when the image's background is a solid color.
    • Inpaint - Digitally restores masked regions by estimating and filling in missing or damaged areas using advanced algorithms.
      • When removing barcodes that overlap text, this method can help preserve overlapping text.
      • This method is good for redacting barcodes when applied permanently by Image Processing on images with patterned backgrounds.
  8. Test the IP Step/IP Group/IP Profile using the "Tester" tab. Use the Diagnostics panel's images and files to visualize results.
    • Ensure barcodes are detected as you expect. Review the "Barcodes.jpg" diagnostic. Detected barcodes are highlighted.
    • Ensure barcodes are decoded as you expect. Review the "Execution Log.txt" diagnostic. This will list all detected barcodes and their values.
    • Ensure barcodes are removed as you expect. Review the "Dropout Mask.jpg" and "Output Image.jpg" diagnostics. The "Dropout Mask.jpg" image shows the dropout mask in red. The "Output Image.jpg" image shows the final image, as altered according to the "Dropout Method" settings.
    • Fine-tune the Barcode Removal settings as needed.
  9. The Barcode Removal step will execute as part of the IP Profile's execution flow.
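
Step 7 notes that a Fill Color of "none" makes Grooper pick a color from the image's background. How Grooper does this is not documented here. The sketch below assumes one simple heuristic, taking the most common pixel value outside the dropout mask, and `estimate_background` is a hypothetical helper.

```python
from collections import Counter
from typing import List

Image = List[List[int]]   # grayscale pixels, 0 (black) to 255 (white)
Mask = List[List[bool]]   # True where a pixel will be dropped out


def estimate_background(image: Image, mask: Mask, default: int = 255) -> int:
    """Most common pixel value outside the mask (an assumed heuristic)."""
    counts = Counter(
        px
        for row, mask_row in zip(image, mask)
        for px, masked in zip(row, mask_row)
        if not masked
    )
    if not counts:
        # Every pixel is masked, so fall back to a default color.
        return default
    return counts.most_common(1)[0][0]


page = [[240, 240, 0], [240, 0, 0], [240, 240, 240]]
mask = [[px == 0 for px in row] for row in page]
fill_color = estimate_background(page, mask)
```

A heuristic like this works on a solid background, which is consistent with the guidance above to prefer Inpaint on patterned backgrounds.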

Barcode Readers

The Barcode Detection and Barcode Removal commands both support the same Barcode Reader engines. Each Barcode Reader is designed to read a different set of barcode symbologies, locating barcodes on a page and decoding their values. Understanding the differences helps you select the right reader(s) for your documents.

Standard Reader

  • Purpose: General-purpose barcode detection using Grooper's standard engine. Supports a broad range of symbologies, including types not covered by the specialized readers.
  • Use cases: Serves as the primary Barcode Reader for symbologies not covered by the specialized readers, and as a fallback or supplement to the specialized readers otherwise.
  • Barcode Symbologies:
    • Australia Post
    • Codabar
    • Code 11
    • Code 128
    • Code 32
    • Code 39
    • Code 93
    • Data Matrix
    • EAN-13
    • EAN-8
    • Interleaved 2 of 5 (I2of5)
    • Intelligent Mail
    • ITF-14
    • Micro QR
    • Patch
    • PDF417
    • PLANET
    • Postnet
    • Plus 2 (2-digit supplementals)
    • Plus 5 (5-digit supplementals)
    • QR Code
    • Royal Mail 4-State (RM4SCC)
    • RSS 14
    • RSS Limited
    • Telepen
    • UPC-A
    • UPC-E
    • Aztec
    • Pharmacode

1D Reader

  • Purpose: Advanced detection of linear (one-dimensional) barcodes.
  • Use cases: Ideal for documents with shipping labels, inventory tags, or forms using linear barcodes.
  • Barcode Symbologies:
    • Codabar
    • Code 128
    • Code 39
    • Pharmacode

2D Reader

  • Purpose: Advanced detection of two-dimensional barcodes.
  • Use cases: Suitable for forms, packaging, or documents containing Data Matrix codes.
  • Barcode Symbologies:
    • Data Matrix

Postal Reader

  • Purpose: Specialized detection of postal barcode symbologies.
  • Use cases: Designed for mail pieces, envelopes, or documents used in postal automation and tracking.
  • Barcode Symbologies:
    • Intelligent Mail
    • Postnet
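
The symbology lists above can be treated as a lookup table when deciding which readers to enable. The sketch below transcribes the lists into a Python dictionary, with parenthetical notes dropped from the names. `READERS` and `readers_for` are hypothetical and are not part of Grooper.

```python
from typing import Dict, List, Set

# Symbologies per reader, transcribed from the lists in this article.
READERS: Dict[str, Set[str]] = {
    "Standard Reader": {
        "Australia Post", "Codabar", "Code 11", "Code 128", "Code 32",
        "Code 39", "Code 93", "Data Matrix", "EAN-13", "EAN-8",
        "Interleaved 2 of 5", "Intelligent Mail", "ITF-14", "Micro QR",
        "Patch", "PDF417", "PLANET", "Postnet", "Plus 2", "Plus 5",
        "QR Code", "Royal Mail 4-State", "RSS 14", "RSS Limited",
        "Telepen", "UPC-A", "UPC-E", "Aztec", "Pharmacode",
    },
    "1D Reader": {"Codabar", "Code 128", "Code 39", "Pharmacode"},
    "2D Reader": {"Data Matrix"},
    "Postal Reader": {"Intelligent Mail", "Postnet"},
}


def readers_for(symbology: str) -> List[str]:
    """Names of every reader whose list includes the given symbology."""
    return [name for name, supported in READERS.items() if symbology in supported]


data_matrix_readers = readers_for("Data Matrix")
qr_readers = readers_for("QR Code")
```

A symbology listed under more than one reader, such as Data Matrix, can be read by either the Standard Reader or its specialized reader. One listed only under the Standard Reader, such as QR Code, has no specialized alternative.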

Similarities and differences between Barcode Detection and Barcode Removal

Similarities

The big takeaway: Both Barcode Detection and Barcode Removal detect barcodes.

  • Both commands detect barcodes and store their locations and encoded values in the Batch Page's layout data.
  • Both commands use the same Barcode Readers.
    • They use the same barcode detection technology to locate barcode regions on images.
    • They support the same range of barcode symbologies and can be configured to target specific types.
  • Both can operate on all common pixel formats and convert images for preprocessing as needed (both have the same "Binarization Settings").
  • Both provide diagnostic output to assist with configuration and troubleshooting.
  • Both are included as steps in an IP Profile, allowing them to be combined with other image processing operations.

Differences

The big takeaway: Barcode Detection does not alter the image. Barcode Removal does alter the image.

  • Both Barcode Detection and Barcode Removal detect barcodes and store their information in the Batch Page's layout data.
  • Only Barcode Removal removes barcode regions from images, masking them out before further processing.
    • This is most useful for image cleanup prior to OCR.
    • Barcodes will be removed from an image permanently when Barcode Removal is applied by the Image Processing activity.
    • Barcodes will be removed in-memory prior to OCR when Barcode Removal is applied by the Recognize activity.
  • Only Barcode Removal has Feature Removal properties such as "Remove Regions", "Border Expand" and "Dropout Method".

When to use each command

  • Use Barcode Detection when you only need to extract barcode values for indexing, validation, or workflow automation without altering the image.
  • Use Barcode Removal when you want to eliminate barcodes or to remove barcode-like noise prior to OCR. Barcode Removal will extract barcode values as well.

When are IP Profiles executed?

An IP Profile is executed whenever Grooper needs to process an image using a defined sequence or hierarchy of image processing operations. Execution typically occurs in the following scenarios:

  • By the Image Processing activity: The Image Processing activity will apply the IP Profile and permanently alter the image.
  • By an OCR Profile: OCR Profiles configured with an IP Profile will run the IP Profile on an image prior to handing it to the OCR engine. The image will not be permanently altered.
  • By the Recognize activity's "Alternate IP" configuration: IP Profiles executed by this configuration will only execute feature detection commands (such as Barcode Detection) to collect layout data.
  • During a Review step: Users can manually execute an IP Profile from the Thumbnail Viewer (if configured to allow the user to do so).

Execution follows the order and logic defined in the IP Profile, including any conditional flow control or branching. Each step or group within the profile is applied in sequence, transforming the input image and producing results for each subsequent step.
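
The sequential behavior described above, along with the difference between permanent and temporary application, can be modeled in a few lines. This is a conceptual sketch only: `run_profile`, `invert`, and `fake_barcode_detection` are hypothetical, steps are plain functions, and conditional flow control is not modeled.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]   # grayscale pixels, 0 (black) to 255 (white)
Layout = Dict[str, list]  # layout data collected along the way
Step = Callable[[Image, Layout], Tuple[Image, Layout]]


def run_profile(image: Image, steps: List[Step]) -> Tuple[Image, Layout]:
    """Apply each step in order; each step sees the previous step's output."""
    layout: Layout = {}
    for step in steps:
        image, layout = step(image, layout)
    return image, layout


def invert(image: Image, layout: Layout) -> Tuple[Image, Layout]:
    # An image-altering step: returns a new image and leaves layout data alone.
    return [[255 - px for px in row] for row in image], layout


def fake_barcode_detection(image: Image, layout: Layout) -> Tuple[Image, Layout]:
    # A feature detection step: leaves the image alone and adds layout data.
    return image, {**layout, "barcodes": [("Code 128", "ABC123")]}


stored_page = [[0, 255]]
processed, layout = run_profile(stored_page, [invert, fake_barcode_detection])

# A caller that saves `processed` over the stored page behaves like the
# Image Processing activity. A caller that passes `processed` to OCR and then
# discards it behaves like an OCR Profile, leaving `stored_page` unchanged.
```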