2.80:Shape Removal (IP Command)


This article is about an older version of Grooper. Information may be out of date and UI elements may have changed.


Shape Removal is an IP Command that detects and removes shapes from documents. The shape locations are then stored as part of the object's layout data.

The Shape Removal command builds upon the existing Shape Detection command. Shape Detection finds shapes in a document (such as logos) from a set of sample images given by the user. Shape Removal goes one step further and removes the detected shape from the document. The removed pixels are either filled in with a solid color or inpainted to blend with nearby pixels.

Removing a shape from a document may be helpful if it is interfering with another Grooper activity, such as getting OCR text from the Recognize activity.

Version Differences

Shape Removal is a new feature in version 2.80. Prior to 2.80, shapes could be detected via the Shape Detection command, but Shape Detection was generally used for Visual document classification. Shape Removal would not have been possible in previous versions.

Use Cases

The primary use for the Shape Removal command is to improve a document's readability. Often, images on a page can interfere with OCR results from the Recognize activity. Given sample images of what to look for, Shape Removal can remove those images from a document set; logos are a common target. As well as removing logos to improve OCR results (which can be done without removing them from the final exported documents), Shape Removal can also be used to permanently de-brand exported documents.

Original image | Logo detected and removed

How To: Add Shape Removal to an IP Profile

Before you begin

You must have a Test Batch ready with examples of the shape on the document.  Part of configuring the Shape Removal command is collecting sample images of the shape to be removed.  This guide assumes you’ve already created an IP Profile.

Add Shape Removal to your IP Profile

1. Navigate to your IP Profile in the "IP Profiles" folder in the "Global Resources" folder in the Node Tree.

2. Press the "Add" button to add a new IP Command to your IP Profile.



3. Select the "Feature Removal" category. Then, select "Shape Removal".



Open Shape Detection settings

Before you can remove a shape, you have to find it first.  The "Detection Settings" property controls how your desired shape is found on your documents.  Select the property and click the ellipsis button at the end of it.



This will open the "Shape Detection" window seen below.



Give sample images

Using the "Sample Images" property, you will select the logo or other image you are attempting to remove from your documents.

1.  Select the "Sample Images" property and press the ellipsis button at the end.



2. This will bring up the "Sample Images" window.  Press the "Add" button.



3. This will bring up another window.  From here, you can select a document from a test batch and, using the mouse selection tool, lasso an image with your mouse.  Press "OK" when finished.



4. After pressing OK, you'll return to the first "Sample Images" window.  You can see your selected sample image here.  You may also add more samples by repeating the process using different documents.  Press the "Done" button when finished.



5. You may name the shape using the "Shape Name" property.



Set how similarity is determined

1. The "Proximity Measure" property sets how similarity is determined between sample images and other images.  There are three methods available: SAD (or sum of absolute differences), CrossCorr (or normalized cross-correlation), and SSD (or sum of squared distances).  Each method uses different different equations to compare the pixels in the sample image to the pixels on the document.  SAD is a very simple way to automate searching for sample images.  It measures the absolute difference between each pixel in the sample image with the corresponding pixel in the block its being compared to.  Potentially, SAD may be unreliable given changes in lighting, color, or image degredation, but is generally the go do method for shape detection.



When choosing a Proximity Measure, you may find the "Shape Locations" diagnostic image useful.  Ideally, you will want a method that does not produce false-positive matches like the ones seen below (though take that advice with a grain of salt; there are other properties available to limit these false-positive matches).
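For a more concrete sense of how the three measures differ, below is a minimal NumPy sketch comparing a sample patch to a same-sized block of the document.  This is an illustration of the general equations, not Grooper's actual implementation, and the function names are ours.

    import numpy as np

    def sad(sample, block):
        # Sum of absolute differences: 0 = identical, larger = less similar.
        return np.abs(sample.astype(float) - block.astype(float)).sum()

    def ssd(sample, block):
        # Sum of squared differences: like SAD, but large per-pixel
        # mismatches are penalized much more heavily.
        return ((sample.astype(float) - block.astype(float)) ** 2).sum()

    def ncc(sample, block):
        # Normalized cross-correlation (a common zero-mean variant):
        # 1.0 is a perfect match, and the normalization absorbs uniform
        # brightness or contrast shifts that would throw off SAD and SSD.
        s = sample.astype(float) - sample.mean()
        b = block.astype(float) - block.mean()
        denom = np.sqrt((s ** 2).sum() * (b ** 2).sum())
        return (s * b).sum() / denom if denom else 0.0

Note that SAD and SSD compare raw pixel values, which is why changes in lighting or color can make them unreliable, while the normalization in cross-correlation tolerates uniform brightness changes.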



2. The Grooper logo we've selected as a sample image has a lot of white space around it that isn't actually part of the logo.  That is one reason why we're seeing so many false-positive matches in the image above.  The "Background Differencing" property can help resolve this issue.



Matched images are given a confidence score measuring their similarity to the sample images.  If a shape contains 90% white pixels, any blank region would match at 90% similarity.  With "Background Differencing" enabled, confidence values are scaled down according to the color balance of the sample image.  Grooper attempts to differentiate the image from the (in this case white) background, hence the term "Background Differencing".  See below: setting this property to "True" eliminates all the false-positive matches from before.
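To make the scaling idea concrete, here is one plausible way such a rescaling could work, sketched in Python.  This is our own illustration (assuming a white background and a raw confidence between 0 and 1), not Grooper's actual formula.

    import numpy as np

    def background_differenced(raw_conf, sample, white_thresh=250):
        # Fraction of the sample that is actual content (non-white pixels).
        content = (sample < white_thresh).mean()
        if content == 0:
            return 0.0
        # A 90%-white sample matches any blank region at ~90% raw confidence.
        # Rescaling by the content fraction maps that blank match to ~0%,
        # while a true full match still scores 100%.
        blank = 1.0 - content
        return max(0.0, (raw_conf - blank) / content)

With a sample that is 90% white, a blank region's 90% raw match rescales to 0%, which mirrors the behavior described above: matches at or below the sample's white fraction are effectively eliminated.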



3. The "Minimum Confidence" defines the minimum similarity between the detected image and the sample image between 0% and 100%.



By selecting the "Execution Log" for the Shape Removal step in the image diagnostics panel, you can see the detected image's similarity confidence score.



Account for alternate image scale and angle

It is unlikely detected shapes will be the exact size of your sample.  The shape may be at a skewed angle on some documents as well.  The "Orientation and Scale" properties can help account for this.  If you expect the shape in your documents to be either 10% larger or smaller than your sample image, you can set "Minimum Scale" to 90% and "Maximum Scale" to 110%.  The "Maximum Angle" property, set in degrees, can account for any rotation.



! Any adjustments you make to these properties will increase this command's compute time.  If you expect a drastic change in size or angle (i.e. half or double the sample image's size or rotated a full 90 degrees), it may be appropriate to just add a second sample image.  This will significantly reduce the time it takes to detect the image.
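The compute cost is easy to see if you enumerate the search grid: every angle step is tried at every scale step, so the number of full template sweeps multiplies.  A small illustrative sketch (the parameter names are ours):

    import numpy as np

    def search_grid(max_angle=25, angle_step=5,
                    min_scale=0.9, max_scale=1.1, scale_step=0.1):
        # Each (angle, scale) pair costs one full template sweep over the
        # page, so widening either range multiplies detection time.
        angles = np.arange(-max_angle, max_angle + angle_step, angle_step)
        scales = np.arange(min_scale, max_scale + scale_step / 2, scale_step)
        return [(a, s) for a in angles for s in scales]

    # 11 angles x 3 scales = 33 sweeps instead of 1.
    print(len(search_grid()))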

Preprocessing options

Optionally, there are some temporary alterations you can make to the sample image and documents before shape detection occurs.  (A short sketch illustrating these options follows this list.)



1. The "Processing Resolution" property controls the resolution at which the sample image is compared to a document.  The default resolution used is fairly low (50 dpi).  This is partly a processing time cutting measure.  In general, shape detection doesn't need a perfect match.  Lowering the sample image's resolution lowers the time it takes to scan for it on a document.  It just needs to be close enough that it catches a match.  Using a lower resolution can also help match shapes that are not a "one to one" reproduction of the sample image (as in if they are degraded somewhat on the document).  In a way, it makes the sample a "fuzzy" match.  However, if you lower the resolution too much, it could start matching shapes you don't want.  And, using a higher resolution will make matching tighter.

2. "Binarization" can turn color or grayscale samples into black and white.  Searching for a flat black and white shape on a (also binarized) black and white document may end up giving you more accurate results.  For more information on binarization, visit the Binarize article.

3. "Dilation Factor" will bloat the edges of the sample image.  This is another way of getting a "fuzzy" match from the sample.  It will increase the range of pixels possible to produce a match along a shape's edge.

4. If you know the physical location (give or take) the shape will be on a document, you can limit where Grooper looks for it using the "Region of Interest" property.  Select it and press the ellipsis button at the end.



This will bring up the "Edit Zone" window.  Here, select a document from a Test Batch and, using the "mouse selection tool", lasso the region you expect to find the shape with your mouse.  Press the "OK" button when finished.
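As promised, here is a rough sketch of what these preprocessing options amount to, written with OpenCV.  It is a hypothetical illustration of the concepts, not Grooper's internals; the numbered comments refer to the list items above.

    import cv2
    import numpy as np

    def preprocess(img, dpi, processing_dpi=50, binarize=False,
                   dilation_factor=0, roi=None):
        # 4. Limit the search to a region of interest, if one is set.
        if roi is not None:
            x0, y0, x1, y1 = roi  # pixel coordinates
            img = img[y0:y1, x0:x1]
        # 1. Downscale to the (low) processing resolution: a coarser image
        #    is faster to scan and tolerates small differences, making the
        #    match "fuzzier".
        scale = processing_dpi / dpi
        img = cv2.resize(img, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_AREA)
        # 2. Optionally binarize so shapes are compared as flat black and white.
        if binarize:
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img
            _, img = cv2.threshold(gray, 0, 255,
                                   cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # 3. Optionally bloat the dark strokes: on a white background,
        #    eroding (a minimum filter) thickens dark content.
        if dilation_factor > 0:
            kernel = np.ones((dilation_factor, dilation_factor), np.uint8)
            img = cv2.erode(img, kernel)
        return img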



A brief aside about masks

Once shapes are detected, a "Shape Mask" is created.  This is overlaid on the document where the shape was detected.  You can see the mask by viewing the "Shape Mask" diagnostic image (if no shapes were detected, you will not see this in the diagnostic panel).



Pixels on the document under the Shape Mask create the "Dropout Mask", which will ultimately be removed (more precisely, pixels under the Shape Mask and any contiguous strings of pixels touching the pixels under the mask form the Dropout Mask).  You can view the Dropout Mask by clicking on the "Dropout Mask" image in the diagnostics panel.
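The "contiguous strings of pixels" rule can be pictured with connected components: keep every run of ink that touches the Shape Mask.  The sketch below shows the idea in OpenCV terms; it is our approximation of the described behavior, not Grooper's code.

    import cv2
    import numpy as np

    def dropout_mask(binarized, shape_mask):
        # binarized: 0 = black ink, 255 = white background.
        # shape_mask: 255 where a shape was detected.
        ink = (binarized == 0).astype(np.uint8)
        _, labels = cv2.connectedComponents(ink)
        # Keep every contiguous run of ink touching the Shape Mask, so
        # strokes extending past the mask's edge drop out along with it.
        touched = np.unique(labels[(shape_mask > 0) & (ink > 0)])
        return np.isin(labels, touched).astype(np.uint8) * 255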



As you can see, we have a problem.  While the Shape Mask looks like a slightly blurry silhouette of our sample image (we'll talk about why it's blurry later), the Dropout Mask does not contain the three scales underneath the "G".  Only pixels from the Dropout Mask are removed, giving us a poor result.

Shape Mask | Dropout Mask | Result


This is because the Dropout Mask is created from the binarized version of the document.  Binarization converts color images to black and white by "thresholding" the image.  Thresholding is the process of setting a threshold value on the pixel intensity of the original image.  Essentially, once a midpoint between the most intense ("whitest") and least intense ("blackest") pixel on a page is established, lighter pixels are converted to white and darker pixels are converted to black.  This midpoint (or "threshold") can be set manually or found automatically by a software application.  Use the "Binarized" diagnostic image to see what the Shape Mask is actually being applied to.



If you're not getting the results you want, it's likely that at least part of the problem is the document's binarization settings need adjusting.


Binarize the document

1. Binarization turns color or grayscale images into black and white.  You must binarize a document in order to generate the dropout mask, which determines what pixels from a detected shape are removed. In the Shape Removal command's property panel, select "Binarization" and press the ellipsis button at the end.



2.  This will bring up the "Binarize" window.



Binarization converts color images to black and white by "thresholding" the image. Thresholding is the process of setting a threshold value on the pixel intensity of the original image.  Pixel intensity is a pixel's "lightness" or "brightness".  Essentially, once a midpoint between the most intense ("whitest") and least intense ("blackest") pixel on a page is established, lighter pixels are converted to white and darker are converted to black.  Or put another way, pixels with an intensity value above the threshold are converted to white, and those below the threshold are converted to black.  This midpoint (or "threshold") can be set manually or found automatically by a software application. The Thresholding Method can be set to one of four ways:

  • Simple - Thresholds an image to black and white using a fixed threshold value between 1 and 255.
  • Auto - Selects a threshold value automatically using Otsu's Method.
  • Adaptive - Thresholds pixels based on the intensity of pixels in the local neighborhood.
  • Dynamic - Performs adaptive thresholding, while preserving dark areas on the page.

Each method has its own set of configurable properties. For more information on binarization and each method, visit the Binarize article.
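For intuition, the first three methods map closely to standard OpenCV calls, sketched below.  This parallels, but is not necessarily identical to, what Grooper does internally.

    import cv2

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)

    # "Simple": a fixed threshold; pixels above 140 become white, the rest black.
    _, simple = cv2.threshold(gray, 140, 255, cv2.THRESH_BINARY)

    # "Auto": Otsu's Method picks the threshold that best separates the
    # dark and light pixel populations in the image's histogram.
    otsu_value, auto = cv2.threshold(gray, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # "Adaptive": each pixel is thresholded against the mean of its local
    # neighborhood, which copes with uneven lighting across the page.
    adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY, 31, 10)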

Importantly, the document is not permanently binarized.  It is only temporarily turned black and white to figure out which pixels to remove.  Recall from the previous step, the scales below the "G" in the Grooper logo were not removed.  This is because the threshold determined by the default "Auto" setting was too low.  Those pixels were determined to be more intense than the automatically determined threshold of 140.  If we increase the threshold to 200, these pixels' intensity will fall below the threshold and they will be converted to black, as seen below.

Threshold 140 | Threshold 200


You may be concerned the text quality is affected by increasing the threshold that high.  Remember, we are only temporarily binarizing the image for the purpose of dropping out the shape.  It will have no effect on how the text is read via OCR (unless text is removed as part of the shape).

Remove the shape's pixels

Once the image is binarized and a shape is detected, a dropout mask is created, and pixel locations from the binarized image matching pixel locations from the shape mask are removed.  They aren't physically erased, however.  Rather, they are colored in using one of two "Dropout Methods".  This can be either "Fill" or "Inpaint".



Dropout Method: Fill

Fill is the most common method.  By default, this will replace pixels in the original image with a color matching the image's background.  Alternatively, you can pick which color fills the dropped-out pixels.  In the example we've been using, the background color was identified as a shade of gray, seen in the Output Image below.



You can change what color fills the dropout mask using the "Fill Color" property.  Expand the "Dropout Method" by double clicking it and select "Fill Color".  If it is blank, it is using the background color.
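Conceptually, Fill amounts to painting every Dropout Mask pixel one color, with the background color estimated when no Fill Color is set.  A hedged sketch follows; the background estimate is our own stand-in, not necessarily how Grooper samples it.

    import numpy as np

    def fill_dropout(img, mask, fill_color=None):
        # img: HxWx3 color page; mask: 255 where pixels are dropped out.
        out = img.copy()
        if fill_color is None:
            # No Fill Color configured: estimate the background as the most
            # common pixel value outside the mask.
            pixels = out[mask == 0].reshape(-1, 3)
            colors, counts = np.unique(pixels, axis=0, return_counts=True)
            fill_color = colors[counts.argmax()]
        out[mask > 0] = fill_color
        return out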



You can select a new color by expanding the dropdown menu and using the "Custom", "Web", and "System" tabs.



Setting the color to "White", we get a result closer to what we want.



There is still a faint outline of the logo.  This is because those pixels were turned white during binarization and therefore not included in the dropout mask.  We will resolve this issue using the "Mask Dilation Factor".



This property expands the dropout mask to increase the region of pixels to fill. 


No Mask Dilation Factor | Mask Dilation Factor of "6".  No more logo!


FYI This property can be set to a whole number from -16 to 16.  Where a positive number dilates the dropout mask, a negative number will erode it.  Rather than increasing the region of pixels filled, it will decrease it.  Below is the same logo with a dilation factor of -6.
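In morphological terms, a positive factor is a dilation of the mask and a negative factor is an erosion, something like the following sketch (the kernel sizing is illustrative only):

    import cv2
    import numpy as np

    def adjust_dropout_mask(mask, factor):
        # Positive factor grows the mask, filling a wider halo of pixels;
        # negative factor shrinks it, filling a narrower area.
        if factor == 0:
            return mask
        kernel = np.ones((abs(factor), abs(factor)), np.uint8)
        return cv2.dilate(mask, kernel) if factor > 0 else cv2.erode(mask, kernel)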



Dropout Method: Inpaint

Inpaint fills the dropout mask using color information from pixels around the removed pixels.  This method is designed to match removed pixels to a colored or complex background. Student transcripts are a great example. They often are printed on paper with some kind of patterned background. For our example, the result looks odd because the document has just a white background, but it should demonstrate what is happening.


Remember, there is a small halo of pixels around the dropout mask that were turned white when binarized and thus not included in the dropout mask.  Since these colored pixels are right next to the dropped-out pixels, they are included in the pixel information used to fill the void left by removing the dropout mask.
Much like we resolved the issue before by dilating the dropout mask, we can set the "Mask Dilation Factor" (set to "6" here) to bloat the dropout mask's size, effectively passing over most of these pixels.  Note, it does not entirely turn the area white because there are still light orange and green pixels in the neighborhood used to restore the dropped-out area.


The Inpaint method also has two different methods of filling pixels: "Telea" and "NavierStokes". "Telea" restores pixels by approximating the value of the removed pixels based on the value of the pixels around them.  More or less, if 75% of the surrounding pixels are white and 25% are black, the pixel would become white.  The area of known pixels is called a "neighborhood".  You probably think about housing demographics the same way.  Say you know the household income of every house on a block but one.  75% of them fall into an "upper class" income bracket.  25% fall into "upper middle class".  While that one house's income level could be upper middle class (or even lower), given most of the houses on the block are upper class, it's safer to assume it is upper class as well.

"NavierStokes" uses equations from fluid dynamics to fill in pixels the same way a fluid would fill a void. Imagine pixel colors bleeding into the empty space the same way a liquid would fill a gap. If you had a grey colored liquid and a black colored liquid filling in a gap, they would compete to fill the space in certain ways. If there's less of the black liquid than grey around the gap, ultimately more of the gap will be filled by grey liquid. Furthermore, the black liquid will pool in the gap closer to concentrations of black liquid around the gap. Filling in pixels works much the same way. First, if there's more grey pixels around the empty space, more of that void is going to be filled by grey pixels. Second, if a black pixel is right next to the empty space, at least part of that space should be filled by black pixels.

You can also control the "Inpaint Radius".  This property specifies how large an area around the dropped-out pixels Grooper "looks at" to get a picture of how to fill them in; in other words, how big the neighborhood is.  You can really see the difference between "Telea" and "NavierStokes" when configuring this property.  "Telea" looks at the weighted sum of pixels in the neighborhood around empty pixels to color them.  Increasing the Inpaint Radius increases the size of the neighborhood around the pixel to be filled.  If we increase the Inpaint Radius to "25px", that much larger radius is going to include more white pixels, so we would expect to see at least a lighter image.  However, since "NavierStokes" uses fluid dynamics, this "whiting out" is much less pronounced.  With the radius being larger, there's more "fluid" to draw from, but at some point the void of pixels is filled and the "flow" of pixels into the void should stop.
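Both algorithms are available in OpenCV under the same names, so you can experiment with the radius outside of Grooper.  A brief sketch (the file names are placeholders):

    import cv2

    img = cv2.imread("page.png")
    mask = cv2.imread("dropout_mask.png", cv2.IMREAD_GRAYSCALE)  # 255 = remove

    # Telea: fills from the mask border inward using a weighted average of
    # nearby known pixels; the radius sets the neighborhood size.
    telea = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)

    # Navier-Stokes: propagates surrounding color into the hole the way a
    # fluid fills a void; less prone to "whiting out" at large radii.
    ns = cv2.inpaint(img, mask, 25, cv2.INPAINT_NS)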

Telea and NavierStokes at a 3px Inpaint Radius vs. a 25px Inpaint Radius


Keep in mind for our example, "Fill" worked just fine. "Inpaint" is more suited to match the removed area to a more complicated background than just white.

Dilation Factor vs. Dilation Factor vs. Mask Dilation Factor

You may have noticed we skipped over one Shape Removal property, "Dilation Factor".  You may have also noticed this term has popped up a lot.  There is a "Dilation Factor" property in "Detection Settings".  There is a "Dilation Factor" property in the main "Shape Removal" property panel.  There is a "Mask Dilation Factor" sub-property under the "Dropout Method" property.

The "Dilation Factor" property in the main Shape Removal property panel, dilates the Shape Mask.  This is set to "4" by default.  This is why the shape looks bloated when you look at the "Shape Mask" diagnostic image.  It is dilated by default in an attempt to account for variations from the sample image and the image trying to be removed.  Unlike other dilation factors, this can only be altered to positively dilate the image.  The shape mask cannot be eroded in other words.



Refer to the breakdown below for the differences between the various dilation factors within "Shape Removal".

Dilation Factor (during Shape Detection)
This applies dilation to a sample image in order to aid the shape detection portion of the operation.  This property is found in the "Preprocessing" section of the "Detection Settings" property window.  If applied, it can make for a kind of "fuzzy" image matching. This can help match images on documents that don't exactly match the given sample image. However, it can produce false-positive matches as well.
Dilation Factor (of the Shape Mask)
This applies dilation to the Shape Mask.  It is one of the main properties in the Shape Removal property panel.  This will increase the size of the Dropout Mask: if a dilated Shape Mask overlays pixels from the binarized document that an un-dilated one would not, those pixels are included in the Dropout Mask.
Mask Dilation Factor
This applies dilation to the dropout portion of the operation.  This is a sub-property of the "Dropout Method" property.  After the Dropout Mask is created, and those pixels are removed, applying dilation here fills more pixels in around the removed area.


Property Details

There are four configurable properties available to Shape Removal: Detection Settings, Binarization, Dilation Factor, and Dropout Method. Some of these have substantial subproperties available to them. They are all detailed below.

Detection Settings Details

The properties located in "Detection Settings" are used to set sample images to detect on documents and configure how and where they are detected.  Pressing the ellipsis button at the end of the property will bring up a new window with the properties listed below.


General Properties

Sample Images
Default: 0 sample images
Here, you will capture sample images of the shape you want to detect.  Press the ellipsis button at the end of the property to bring up a new window to add samples.  You will select documents from a test batch and lasso the image to be detected.

Shape Name
Default: (none)
Use this property to type a name used to identify the shape.

Proximity Measure
Default: SAD
This property sets how similarity is determined between sample images and other images.  There are three methods available: SAD (sum of absolute differences), CrossCorr (normalized cross-correlation), and SSD (sum of squared distances).  Each method uses a different equation to compare the pixels in the sample image to the pixels on the document.  SAD is a very simple way to automate searching for sample images.  It measures the absolute difference between each pixel in the sample image and the corresponding pixel in the block it's being compared to.  SAD may be unreliable given changes in lighting, color, or image degradation, but is generally the go-to method for shape detection.

Background Differencing
Default: False
Setting this property to True can help when dealing with shapes with a lot of blank space in the sample image.  Shapes containing mostly white space can be challenging.  If 90% of the image's pixels are white, the Shape Detection operation will match other regions on a document that also contain 90% white pixels.  This can produce a lot of false-positive matches, with high confidence that erroneous regions match the sample.  When Background Differencing is enabled, confidence values are scaled according to the color balance of the sample image.  If the sample contains 90% white pixels, matched regions on the document falling below 90% confidence are effectively removed as matches.

Minimum Confidence
Default: 80%
This is the minimum confidence for a successful match (from 0% to 100%).

Orientation and Scale Properties

Maximum Angle
Default: 0 degrees
This can account for instances when the image on a document is slightly rotated from the sample image's orientation (between 0 and 360 degrees).  Altering this property will also allow you to adjust the "Angle Step" used during detection.  For example, if you set the Maximum Angle to 25 degrees and an Angle Step of 5 degrees, Shape Detection would look for a match that is rotated -25, -20, -15, -10, -5, 0, 5, 10, 15, 20, and 25 degrees from the original image instead of every single degree from -25 to 25.  The Maximum Angle must be an even multiple of the Angle Step (as 25 is a multiple of 5).

Minimum Scale
Default: 100%
This can account for instances when the image on a document is scaled slightly smaller than the sample image (between 10% and 100%).  Altering this property will also allow you to adjust the "Scale Step" used during detection.  For example, if you set the Minimum Scale to 50% and Scale Step to 10%, Shape Detection would look for a match that is 100%, 90%, 80%, 70%, 60%, and 50% the size of the sample image instead of 100%, 99%, 98%, and so on.

Maximum Scale
Default: 100%
This can account for instances when the image on a document is scaled slightly larger than the sample image (between 100% and 400%).  Altering this property will also allow you to adjust the "Scale Step" used during detection.  For example, if you set the Maximum Scale to 150% and Scale Step to 10%, Shape Detection would look for a match that is 100%, 110%, 120%, 130%, 140%, and 150% the size of the sample image instead of 100%, 101%, 102%, and so on.

Preprocessing Properties

Processing Resolution
Default: Dpi50
This sets the resolution at which the image is processed during Shape Detection.  This does not change the output resolution of the document itself.  It only affects the resolution when Grooper is looking for a match to the sample image.  A higher dpi will force a more specific 1:1 match to the sample image.  A lower resolution will allow for a "looser" or "fuzzier" match, accounting for differences in the quality of the sample compared to the document set.

Binarization
Default: Disabled
Binarization converts color images to black and white by "thresholding" the image.  Searching for a flat black and white shape on an (also binarized) black and white document may end up producing more accurate results.  This does not binarize the document itself; it only does so temporarily for Shape Detection.  After detection is performed, the image reverts to its original form.

Thresholding is the process of setting a threshold value on the pixel intensity of the original image.  Pixel intensity is a pixel's "lightness" or "brightness".  Essentially, once a midpoint between the most intense ("whitest") and least intense ("blackest") pixel on a page is established, lighter pixels are converted to white and darker are converted to black.  Or put another way, pixels with an intensity value above the threshold are converted to white, and those below the threshold are converted to black.  This midpoint (or "threshold") can be set manually or found automatically by a software application. The Thresholding Method can be set to one of four ways:

  • Simple - Thresholds an image to black and white using a fixed threshold value between 1 and 255.
  • Auto - Selects a threshold value automatically using Otsu's Method.
  • Adaptive - Thresholds pixels based on the intensity of pixels in the local neighborhood.
  • Dynamic - Performs adaptive thresholding, while preserving dark areas on the page.

Each method has its own set of configurable properties. For more information on binarization and these methods, visit the Binarize article.

Dilation Factor
Default: 0
"Dilation Factor" will bloat the edges of the sample image.  This is another way of getting a "fuzzy" match from the sample.  It will increase the range of pixels able to produce a match along a shape's edge.

Region of Interest (inches)
Default: (0,0) : (0,0)
If you know the physical location (give or take) the shape will be on a document, you can limit where Grooper looks for it using the "Region of Interest" property.  Pressing the ellipsis button at the end of the property will bring up a new window that allows you to lasso the area you expect to find the shape with your mouse.

Binarization Details

Binarization converts color images to black and white by "thresholding" the image.  Once a sample shape is found on a document, the document is binarized in order to target the pixels to be removed.

Thresholding is the process of setting a threshold value on the pixel intensity of the original image.  Pixel intensity is a pixel's "lightness" or "brightness".  Essentially, once a midpoint between the most intense ("whitest") and least intense ("blackest") pixel on a page is established, lighter pixels are converted to white and darker are converted to black.  Or put another way, pixels with an intensity value above the threshold are converted to white, and those below the threshold are converted to black.  This midpoint (or "threshold") can be set manually or found automatically by a software application. The Thresholding Method can be set to one of four ways:

  • Simple - Thresholds an image to black and white using a fixed threshold value between 1 and 255.
  • Auto - Selects a threshold value automatically using Otsu's Method.
  • Adaptive - Thresholds pixels based on the intensity of pixels in the local neighborhood.
  • Dynamic - Performs adaptive thresholding, while preserving dark areas on the page.

Each method has its own set of configurable properties. For more information on binarization and these methods, visit the Binarize article.

Dilation Factor Details

Dilation Factor here (in the main Shape Removal property panel) controls how dilated the Shape Mask is.  The Shape Mask is overlaid on a binarized document after one of the sample shapes is detected.  All pixels falling under the Shape Mask will be dropped out.  Dilating the mask adds a pixel border around the detected shape, effectively expanding its edges.  Since all pixels underneath the Shape Mask will be removed, dilating it can account for small variations between the sample image and the image being removed.  The objective is to bloat the Shape Mask enough to intersect these small variations, but not so much that it intersects other meaningful features on the page, such as text.  Only positive numbers are allowed here, meaning the Shape Mask can only be dilated, not eroded.

Dropout Method Details

This property determines how pixels targeted for removal during the dropout operation are "removed".  They are not deleted; rather, they are colored in to match the image's background.  This can be set to "Fill" or "Inpaint".

The "Fill" method replaces dropped out pixels with a given color.  The "Fill Color" property determines what color is used to fill the pixels.  It defaults to a color determined to match the image's background.  Alternatively, you can pick which color to fill the pixels.  The "Mask Dilation Factor" will dilate the filled shape.  Colored pixels will be added to the shape's borders, increasing the size of the removed area.

"Inpaint" fills the dropout mask using color information from pixels around the removed pixels.  This method is designed to match removed pixels to a colored or complex background. Student transcripts are a great example. They often are printed on paper with some kind of patterned background.  There are two "Inpaint Method" options: Telea and NavierStokes.
  • "Telea" restores pixels by approximating the value of the removed pixels based on the average value of pixels around it.  If 75% of the pixels around it are white and 25% of the pixels around it are black, the pixel would become white.  The area of known pixels used to find this color average is called a "neighborhood".
  • "NavierStokes" uses equations from fluid dynamics to fill in pixels the same way a fluid would fill a void.

The "Inpaint Radius" property specifies how large the area around the dropped out pixels Grooper is "looking at" to get a picture of how to fill it in.  It increases the size of the analyzed "neighborhood" of pixels.  The "Mask Dilation Factor" will dilate the filled shape.  Colored pixels will be added to the shape's borders, increasing the size of the removed area.