2023.1:Image Processing (Activity)

From Grooper Wiki

Revision as of 13:19, 30 December 2019

Image Processing is an Unattended Activity that performs two major functions in Grooper's workflow.

  1. To apply image cleanup operations in order to obtain better text data from OCR.
  2. To obtain image-based data (called "features") such as table line locations, barcode information, OMR checkbox states, and more.

This is done by creating an IP Profile, which lists a series of steps called IP Commands. Each IP Command performs an individual image processing function.
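The idea of a profile as an ordered series of commands can be sketched in a few lines of Python. This is only a conceptual model and not Grooper's API: the `IPProfile` class, the `remove_lines` and `despeckle` functions, and the toy pixel-grid image format are all hypothetical names invented for illustration. Each command returns a cleaned image together with any features it found, mirroring the two functions listed above.

```python
from dataclasses import dataclass, field
from typing import Callable

# Toy image format (assumption): rows of pixels, 1 = black, 0 = white.
Image = list[list[int]]
Features = dict[str, object]
IPCommand = Callable[[Image], tuple[Image, Features]]


def remove_lines(image: Image) -> tuple[Image, Features]:
    """Hypothetical command: blank fully black rows and record their positions."""
    line_rows = [y for y, row in enumerate(image) if row and all(row)]
    cleaned = [
        [0] * len(row) if y in line_rows else list(row)
        for y, row in enumerate(image)
    ]
    return cleaned, {"line_rows": line_rows}


def despeckle(image: Image) -> tuple[Image, Features]:
    """Hypothetical command: drop black pixels with no black 4-neighbour."""
    height = len(image)

    def has_neighbor(y: int, x: int) -> bool:
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < len(image[ny]) and image[ny][nx]:
                return True
        return False

    cleaned = [
        [1 if px and has_neighbor(y, x) else 0 for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]
    removed = sum(map(sum, image)) - sum(map(sum, cleaned))
    return cleaned, {"specks_removed": removed}


@dataclass
class IPProfile:
    """An ordered list of commands, applied one after another."""
    commands: list[IPCommand] = field(default_factory=list)

    def run(self, image: Image) -> tuple[Image, Features]:
        features: Features = {}
        current = [list(row) for row in image]  # work on a copy
        for command in self.commands:
            current, found = command(current)
            features.update(found)
        return current, features
```

A profile built as `IPProfile([remove_lines, despeckle])` would first strip the line rows and then remove stray pixels, returning both the cleaned image and the collected features.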

Permanent vs. Temporary Image Processing

The Image Processing activity permanently alters a document's image. However, it is also possible to clean up a document image temporarily and then revert to the original image. This is done during the Recognize activity.

For example, you may have a document where table lines interfere with accurate OCR. However, if you remove these lines during the Image Processing activity, they are removed permanently. This makes the documents harder to review in Data Review, and the archival image stored later no longer looks like the original document.

Instead, you can use an OCR Profile referencing an IP Profile containing a Remove Lines command during Recognize. The image will be temporarily changed according to the IP Profile. Then, OCR will run on the altered image. Last, the image will revert to its original form.

  • Any image-based data targeted by the IP Profile (such as the table line locations in this example) will still be saved to the Batch Page for later use.
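The difference between the two approaches can be illustrated with a small Python sketch. It is a conceptual model only, not Grooper's API: the page dictionary, `image_processing`, `recognize`, `remove_lines`, and `fake_ocr` are hypothetical stand-ins, and the "image" is simply a list of text rows in which a row of dashes plays the role of a table line.

```python
def remove_lines(image: list[str]) -> tuple[list[str], dict]:
    """Toy Remove Lines step: blank rows made only of '-' and record them."""
    line_rows = [y for y, row in enumerate(image) if row and set(row) == {"-"}]
    cleaned = [
        " " * len(row) if y in line_rows else row
        for y, row in enumerate(image)
    ]
    return cleaned, {"line_rows": line_rows}


def fake_ocr(image: list[str]) -> str:
    """Placeholder for OCR: join whatever non-blank rows remain."""
    return " ".join(row.strip() for row in image if row.strip())


def image_processing(page: dict, ip_step) -> None:
    """Permanent: the page's stored image is overwritten."""
    cleaned, features = ip_step(page["image"])
    page["image"] = cleaned
    page["features"].update(features)


def recognize(page: dict, ip_step, ocr) -> None:
    """Temporary: OCR sees the cleaned copy, the stored image is untouched."""
    working_copy, features = ip_step(page["image"])
    page["text"] = ocr(working_copy)
    page["features"].update(features)  # features are still saved to the page


page = {
    "image": ["----------", "Name Total", "----------"],
    "features": {},
    "text": None,
}
recognize(page, remove_lines, fake_ocr)
```

After `recognize` runs in this sketch, `page["text"]` holds the text read from the cleaned copy and `page["features"]` holds the line locations, while `page["image"]` still contains the dashed rows. Calling `image_processing` on the same page would instead replace the stored image with the cleaned version.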