2023.1:Image Processing (Activity): Difference between revisions

From Grooper Wiki
 
(21 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{AutoVersion}}
{{AutoVersion}}


{|class="wip-box"
<blockquote>{{#lst:Glossary|Image Processing}}</blockquote>
 
{|class="download-box"
|
|
'''WIP'''
[[File:Asset 22@4x.png]]
|
|
This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023). The first contains one or more '''Batches''' of sample documents. The second contains one or more '''Projects''' with resources used in examples throughout this article.
 
* [[Media:2023.1 Wiki Image-Processing-(Activity) Batch.zip]]
This tag will be removed upon draft completion.
* [[Media:2023.1 Wiki Image-Processing-(Activity) Project.zip]]
|}
|}


<blockquote>{{#lst:Glossary|Image Processing}}</blockquote>


== About ==
== About ==
The '''''Image Processing Activity''''' (generally applied via a '''Batch Process Step''') applies a preconfigured '''IP Profile''' to a document.


The ''Image Processing Activity'' (generally applied via a '''Batch Process Step''') applies a preconfigured '''IP Profile''' to a document.
An '''IP Profile''' lists a series of steps that perform '''''Image Processing''''' functions, called "'''[[IP Command]]s'''".  There are several '''IP Commands''' in Grooper, including ones that remove borders from an image, adjust the skew angle of an image, change the color format of an image, and more. For more information on configuring an '''IP Profile''', visit the [[IP Profile]] wiki page.
 
An '''IP Profile''' lists a series of steps that perform image processing functions, called "'''[[IP Command]]s'''".  There are several '''IP Commands''' in Grooper, including ones that remove borders from an image, adjust the skew angle of an image, change the color format of an image, and more. For more information on configuring an '''IP Profile''', visit the [[IP Profile (Object)|IP Profile]] wiki page.


<div style="padding-left: 1.5em">
<div style="padding-left: 1.5em">
=== Permanent vs. Temporary Image Processing ===
=== Permanent vs. Temporary Image Processing ===


The ''Image Processing Activity'' permanently alters a document's image by applying an '''IP Profile'''.  However, it is possible to temporarily clean up document images to benefit OCR results and revert back to the original document image.  This needs to be done during the ''[[Recognize]] Activity'' rather than the ''Image Processing Activity''.
The '''''Image Processing Activity''''' permanently alters a document's image by applying an '''IP Profile'''.  However, it is possible to temporarily clean up document images to benefit OCR results and revert back to the original document image.  This needs to be done during the '''''[[Recognize]] Activity''''' rather than the '''''Image Processing Activity'''''.


For example, you may have a document where table lines are getting in the way of accurate OCR.  However, if you remove these lines during the '''Image Processing''' activity, they will be permanently removed, making it difficult to review the documents in '''[[Review]]''' and changing the archival image stored later to something that no longer looks like the original document.   
For example, you may have a document where table lines are getting in the way of accurate OCR.  However, if you remove these lines during the '''''Image Processing''''' activity, they will be permanently removed, making it difficult to review the documents in '''[[Review]]''' and changing the archival image stored later to something that no longer looks like the original document.   


Instead, you can use an '''[[OCR Profile]]''' referencing an '''IP Profile''' containing a '''Line Removal''' command during '''Recognize'''.  The image will be temporarily changed according to the '''IP Profile'''.  Then, OCR will run on the altered image.  Last, the image will revert back to its original form.
Instead, you can use an '''[[OCR Profile]]''' referencing an '''IP Profile''' containing a '''Line Removal''' command during '''Recognize'''.  The image will be temporarily changed according to the '''IP Profile'''.  Then, OCR will run on the altered image.  Last, the image will revert back to its original form.


For more information on Temporary Image Processing, please see the [[OCR Profile (Object)|OCR Profile]], [[IP Profile (Object)|IP Profile]], and [[Recognize (Activity)|Recognize]] wiki pages.  
For more information on Temporary '''''Image Processing''''', please see the [[OCR Profile]], [[IP Profile]], and [[Recognize]] wiki pages.  


=== The Image Processing Activity ===
=== The Image Processing Activity ===


If you are more interested in making permanent changes to documents to clean up the pages and improve ''OCR'' results, then you might consider adding an ''Image Processing'' '''Batch Process Step''' to your '''Batch Process'''. The following are just a few things you can do by adding an appropriately configured '''IP Profile''' to your ''Image Processing'' '''Batch Process Step''':  
If you are more interested in making permanent changes to documents to clean up the pages and improve ''OCR'' results, then you might consider adding an '''''Image Processing''''' '''Batch Process Step''' to your '''Batch Process'''. The following are just a few things you can do by adding an appropriately configured '''IP Profile''' to your '''''Image Processing''''' step:  


{|style="margin:auto" cellpadding="10" cellspacing="5"
{|style="margin:auto" cellpadding="10" cellspacing="5"
Line 51: Line 51:
|}
|}


For more examples, instructions, and tips on setting up an IP Profile, take a look at the [[IP Profile (Object)|IP Profile]] wiki article.  
For more examples, instructions, and tips on setting up an IP Profile, take a look at the [[IP Profile]] wiki article.  


</div>
</div>
== How To ==
== How To ==
At the top of this article there are two .zip files you can download and upload to your Grooper environment to follow along with this tutorial. One of those .zip files is the '''Batch''' we will be working with containing 5 documents.
The '''IP Profile''' we will be referencing in our '''''Image Processing''''' activity contains the following '''IP Steps''':
* Normalize
* Auto Orient
* Auto Deskew
* Auto Crop
Below you will see what each document looks like before and after the '''''Image Processing''''' step we will be configuring is run on the '''Batch'''.
{|style="margin:auto" cellpadding="10" cellspacing="5"
|-style="text-align:center"
|style="width:400px"|'''Before''': This page is upside down. OCR will have a difficult time with this document the way it is.
|style="width:400px"|'''After''': The page is now right-side up and readable. We should get much better OCR results with this.
|-
|[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 01.png|center|border]]||[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 02.png|border|center]]
|-style="text-align:center"
|style="width:400px"|'''Before''': This page is right-side up and not askew.
|style="width:400px"|'''After''': Since this page was fine just as it was, no changes were made.
|-
|[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 03.png|center|border|]]||[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 04.png|border|center]]
|-style="text-align:center"
|style="width:400px"|'''Before''': This page is right-side up and not askew.
|style="width:400px"|'''After''': Since this page was fine just as it was, no changes were made.
|-
|[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 05.png|center|border]]||[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 06.png|border|center]]
|-style="text-align:center"
|style="width:400px"|'''Before''': This page is turned slightly askew due to how it was scanned in.
|style="width:400px"|'''After''': Now the page is straight. It looks better and will be easier to perform OCR on.
|-
|[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 07.png|center|border]]||[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 08.png|border|center]]
|-style="text-align:center"
|style="width:400px"|'''Before''': This page is also turned slightly askew due to how it was scanned in.
|style="width:400px"|'''After''': Now the page is straight. It looks better and will be easier to perform OCR on.
|-
|[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 09.png|center|border]]||[[File:2023.1 Image-Processing-(Activity) 02 How-To 01 Before-and-After 10.png|border|center]]
|}


<div style="padding-left: 1.5em">
<div style="padding-left: 1.5em">
=== Adding the Image Processing Step ===
=== Adding the Image Processing Step ===
The first thing we need to do is add an '''''Image Processing''''' '''Batch Process Step''' to our '''Batch Process''' and set the '''''Scope''''' at which the step will run.


# Right click on the '''Batch Process'''.  
# Right click on the '''Batch Process'''.  
Line 67: Line 109:




#<li value=5> Now you should have an ''Image Processing'' '''Batch Process Step''' in your '''Batch Process'''.  
#<li value=5> Now you should have an '''''Image Processing''''' '''Batch Process Step''' in your '''Batch Process'''.  
# By default, the '''''Scope''''' property is set to ''Page''. Generally, you want to keep your '''''Scope''''' to a ''Page'' level for ''Image Processing'' because the step permanently edits an image and must do so at the ''Page'' level.  
# By default, the '''''Scope''''' property is set to ''Page''. Generally, you want to keep your '''''Scope''''' to a ''Page'' level for '''''Image Processing''''' because the step permanently edits an image and must do so at the ''Page'' level.  




Line 75: Line 117:


=== Configuring the Batch Process Step ===
=== Configuring the Batch Process Step ===
Now that we have our '''''Image Processing''''' step in our '''Batch Process''' and have set the '''''Scope''''', we need to configure the properties of the step. We will configure the properties in the right-hand property grid to give instructions to Grooper on how to execute the step.


# We are using an '''IP Profile''' that we have copied and pasted from the "Essentials" '''Project'''. The "Essentials" '''Project''' comes pre-installed with every Grooper Repository.  
# We are using an '''IP Profile''' that we have copied and pasted from the "Essentials" '''Project'''. The "Essentials" '''Project''' comes pre-installed with every Grooper Repository.  
Line 93: Line 137:




=== Image Processing Considerations ===
=== PDF Options: Bursting vs. Rendering ===


Imagine you have a mix of image-based and text-based PDFs in a '''Batch'''.  You could even have PDF files that have a mix of image-based and text-based pages within a single file.  Some of these image-based pages may need permanent image cleanup, using an '''IP Profile''' and the '''Image Processing''' activity.  However, there's generally no reason to apply an '''IP Profile''' to a text-based PDF.
The '''''PDF Options''''' properties only apply if you are processing PDF pages.  If you are using '''''Image Processing''''' to process image pages (JPEGs, TIFs, etc.), you can ignore this section.
* The point of '''Image Processing''' is to clean up an image before handing that image to an OCR engine. You're not going to OCR text-based pages, you're just going to extract their native text data.


'''''Image Processing''''' is designed to selectively apply an '''IP Profile''', depending on the page's type.  The default settings are designed to work for most cases, without further configuration.


So what happens if you feed split pages to the '''Image Processing''' activity, some of which are PDF page objects and others JPEG page objects?
The '''''Image Processing''''' activity will conditionally apply the '''IP Profile''', given the following:
* For JPEG page objects, the '''Image Processing''' activity will apply the '''IP Profile''' no matter what.
* For PDF page objects, it depends.  '''Image Processing''' will ignore PDF pages depending on two things:
*# The PDF page type (image-based or text-based)
*# How two '''Image Processing''' properties are configured: '''''Bursting''''' and '''''Rendering'''''.


{|class="attn-box"
{|class=wikitable
|'''Page Type'''||'''Result'''||'''Notes'''
|-
|JPEG pages||The '''IP Profile''' will be applied in all cases, no matter what.
|
* This is normal behavior for the '''''Image Processing''''' activity.  It generally expects to process images.
|-
|Image-based PDF pages||The '''IP Profile''' will only be applied if '''''Bursting''''' and/or '''''Rendering''''' are enabled.
|
|
&#9888;
* The PDF will be copied over or overwritten as an image with the applied changes.
* The image-based PDF page types are "Single Image" and "Searchable" pages.  For a complete list of PDF page types, visit the [[PDF Page Types]] article.
|-
|Text-based PDF pages||ONLY '''Orient''' or '''Auto-Orient''' steps will be applied if '''''Rendering''''' is enabled.
|
* This will ONLY rotate the PDF's orientation. Since the page does not need OCR, no other image cleanup is needed.
|}
 
{|class="fyi-box"
|-
|
|
{|class="inner-box"
'''FYI'''
|
|
'''BEWARE OF COMMONLY USED TERMS ACROSS MULTIPLE ACTIVITIES'''
If both '''''Bursting''''' and '''''Rendering''''' are enabled, then '''''Bursting''''' will always take priority over '''''Rendering''''' if possible. Image-Based PDFs can be burst while Text-Based PDFs cannot.
|}


Let's take a look at 3 different scenarios: With JPEGs, image-based PDFs, and text-based PDFs.


The '''Image Processing''' activity's '''''Bursting''''' and '''''Rendering''''' properties are related to, but distinct from, the '''Split Pages''' activity's and '''''Rasterize''''' command's '''''Bursting''''' and '''''Rendering''''' properties.
<b><big>JPEGs</big></b>


{|class=wikitable
|'''Bursting'''||'''Rendering'''||'''Result'''
|-
|''Enabled''||''Enabled''|| Image does not need to be burst or rendered because it is already an image. The whole '''IP Profile''' is applied.
|-
|''Disabled''||''Enabled''|| Image does not need to be burst or rendered because it is already an image. The whole '''IP Profile''' is applied.
|-
|''Disabled''||''Disabled''|| Image does not need to be burst or rendered because it is already an image. The whole '''IP Profile''' is applied.
|}


For '''Image Processing''', the '''''Bursting''''' and '''''Rendering''''' properties only pertain to how different types of PDF file types are processed by an '''IP Profile'''.
* There is no difference in the application of '''''Image Processing''''' to a JPEG if either '''''Bursting''''' or '''''Rendering''''' is ''enabled'' or ''disabled''. A JPEG is always processed the same.
|
* A JPEG does not need to be converted to an image because it already is an image.
[[File:2023_Split-Pages_03_Bursting-and-Rendering-PDFs_09.png]]
* The full '''IP Profile''' can be applied to the image.


''The Bursting and Rendering properties in Image Processing's property grid.''
|}
|}


<b><big>Image-Based PDF</big></b>


The '''Image Processing''' activity will conditionally apply the '''IP Profile''', given the following:
{|class=wikitable
{|class=wikitable
|'''Page Type'''||'''Result'''||'''Notes'''
|'''Bursting'''||'''Rendering'''||'''Result'''
|-
|-
|JPEG pages||The '''IP Profile''' will be applied in all cases, no matter what.
|''Enabled''||''Enabled''|| PDF will use '''''Bursting''''' properties and will be burst into an image. The whole '''IP Profile''' is applied.
|
* This is normal behavior for the '''Image Processing''' activity.  It generally expects to process images.
|-
|-
|Image-based PDF pages||The '''IP Profile''' will only be applied if '''''Bursting''''' is enabled.
|''Disabled''||''Enabled''|| PDF will use '''''Rendering''''' properties and will be rendered into an image. The whole '''IP Profile''' is applied.
|
* This will overwrite the PDF with a JPEG image with the applied changes.
|-
|-
|Text-based PDF pages||ONLY an '''Orient''' or '''Auto-Orient''' step will be applied, and only if it is present in the '''IP Profile''' and '''''Rendering''''' is enabled.
|''Disabled''||''Disabled''|| PDF will remain a PDF and the '''IP Profile''' cannot be applied.  
|
* This will ONLY rotate the PDF's orientation.  The page will still remain a PDF after the orientation change is applied.
|}
|}


{|cellpadding=10 cellspacing=5
* For image-based PDFs, with either '''''Bursting''''' or '''''Rendering''''' enabled, it will convert the PDF to an image using the enabled method (with '''''Bursting''''' taking priority). The '''IP Profile''' can then be applied.
|valign=top|
* An '''IP Profile''' can only be applied to images, not PDFs. So if both properties are disabled, then the '''IP Profile''' cannot be applied.  
For example, imagine you have a three page PDF file.
* The first page is a text-based page.
* The second page is an image-based page requiring some image cleanup.
** It needs to be de-skewed and have its border cropped.
* The third page is a text-based page, but needs to be re-oriented.




The '''Image Processing''' activity's '''''Bursting''''' and '''''Rendering''''' properties will allow you to do this.
<b><big>Text-Based PDF</big></b>


In this scenario, an '''IP Profile''' with the following steps would appropriately clean up the pages' problems:
{|class=wikitable
* '''Auto Orient'''
|'''Bursting'''||'''Rendering'''||'''Result'''
* '''Auto Deskew'''
|-
* '''Auto Border Cleanup'''
|''Enabled''||''Enabled''|| PDF will use '''''Rendering''''' properties and will be rendered into an image since it cannot be burst. ONLY '''Orient''' or '''Auto-Orient''' will be applied from the '''IP Profile'''.
|-
|''Disabled''||''Enabled''|| PDF will use '''''Rendering''''' properties and will be rendered into an image. ONLY '''Orient''' or '''Auto-Orient''' will be applied from the '''IP Profile'''.
|-
|''Disabled''||''Disabled''|| PDF will remain a PDF and the '''IP Profile''' cannot be applied.
|}


With '''''Bursting''''' and '''''Rendering''''' enabled, the '''Image Processing''' activity would affect the pages in the following ways:
* For text-based PDFs, there is no reason to run any '''IP Profile Step''' other than '''Orient''' or '''Auto-Orient''': the page already has native text, so OCR (and the image cleanup that supports it) is unnecessary.
* As a text-based PDF page with no orientation issues, the first page would not be processed at all.
* With '''''Rendering''''' enabled, it will render the PDF to an image (text-based PDFs cannot be burst). Then ONLY apply '''Orient''' or '''Auto-Orient''' if present in the '''IP Profile'''.  
* As an image-based PDF page, the second page would be processed by the '''IP Profile'''. Since the '''IP Profile''' made changes, the PDF page is overwritten with the updated image.
* Just like with Image-Based PDFs, the '''IP Profile''' can only be applied to images, not PDFs. So if both properties are disabled, then nothing from the '''IP Profile''' can be applied.   
* As a text-based PDF page with orientation issues, ONLY the '''Auto Orient''' step would be applied to the page.  The remaining steps would be ignored.
|valign=top|
[[File:Split-pages-ip-graphic.png]]
|}


</div>
</div>

Latest revision as of 12:25, 28 April 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


Image Processing is an Activity that enhances Batch Page images and optimizes them for better OCR text recognition and data extraction results.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.


About

The Image Processing Activity (generally applied via a Batch Process Step) applies a preconfigured IP Profile to a document.

An IP Profile lists a series of steps that perform Image Processing functions, called "IP Commands". There are several IP Commands in Grooper, including ones that remove borders from an image, adjust the skew angle of an image, change the color format of an image, and more. For more information on configuring an IP Profile, visit the IP Profile wiki page.

Permanent vs. Temporary Image Processing

The Image Processing Activity permanently alters a document's image by applying an IP Profile. However, it is possible to temporarily clean up document images to benefit OCR results and revert back to the original document image. This needs to be done during the Recognize Activity rather than the Image Processing Activity.

For example, you may have a document where table lines are getting in the way of accurate OCR. However, if you remove these lines during the Image Processing activity, they will be permanently removed, making it difficult to review the documents in Review and changing the archival image stored later to something that no longer looks like the original document.

Instead, you can use an OCR Profile referencing an IP Profile containing a Line Removal command during Recognize. The image will be temporarily changed according to the IP Profile. Then, OCR will run on the altered image. Last, the image will revert back to its original form.

For more information on Temporary Image Processing, please see the OCR Profile, IP Profile, and Recognize wiki pages.
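
The difference between the two approaches can be sketched in a few lines of Python. This is only a toy model, not Grooper's API: the "image" is a list of text rows, and every name here (`remove_lines`, `apply_profile`, `toy_ocr`, `recognize_with_temporary_cleanup`, `image_processing_permanent`) is a hypothetical stand-in chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: an "image" is a list of text rows,
# and an IP step is any function that maps an image to an image.
Image = List[str]
IPStep = Callable[[Image], Image]


def remove_lines(image: Image) -> Image:
    """Toy 'Line Removal': drop rows made up only of '-' characters."""
    return [row for row in image if set(row) != {"-"}]


def apply_profile(image: Image, profile: List[IPStep]) -> Image:
    """Run each step in order on a copy, leaving the input untouched."""
    result = list(image)
    for step in profile:
        result = step(result)
    return result


def toy_ocr(image: Image) -> str:
    """Toy 'OCR': join the rows into one string."""
    return " ".join(row.strip() for row in image)


def recognize_with_temporary_cleanup(image: Image, profile: List[IPStep]) -> str:
    """Temporary cleanup: OCR sees the cleaned copy, the original is kept."""
    cleaned = apply_profile(image, profile)
    return toy_ocr(cleaned)


def image_processing_permanent(page: Dict[str, Image], profile: List[IPStep]) -> None:
    """Permanent cleanup: the page's stored image is replaced."""
    page["image"] = apply_profile(page["image"], profile)
```

In this model, calling `recognize_with_temporary_cleanup` leaves the page's stored image exactly as it was, while `image_processing_permanent` overwrites it, which mirrors why table lines removed by the Image Processing activity are no longer visible later in Review.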

The Image Processing Activity

If you are more interested in making permanent changes to documents to clean up the pages and improve OCR results, then you might consider adding an Image Processing Batch Process Step to your Batch Process. The following are just a few things you can do by adding an appropriately configured IP Profile to your Image Processing step:

This page is slightly askew. A Deskew IP Step can correct this.
The off-white background of this page can make certain things hard to read. A Binarize IP Step can change the image to black and white.
This page has a dark border around it that can make OCR more difficult. The Auto Border Crop and Border Fill IP Steps can remove the border.

For more examples, instructions, and tips on setting up an IP Profile, take a look at the IP Profile wiki article.

How To

At the top of this article there are two .zip files you can download and upload to your Grooper environment to follow along with this tutorial. One of those .zip files is the Batch we will be working with containing 5 documents.

The IP Profile we will be referencing in our Image Processing activity contains the following IP Steps:

  • Normalize
  • Auto Orient
  • Auto Deskew
  • Auto Crop

Below you will see what each document looks like before and after the Image Processing step we will be configuring is run on the Batch.

Before: This page is upside down. OCR will have a difficult time with this document the way it is. After: The page is now right-side up and readable. We should get much better OCR results with this.
Before: This page is right-side up and not askew. After: Since this page was fine just as it was, no changes were made.
Before: This page is right-side up and not askew. After: Since this page was fine just as it was, no changes were made.
Before: This page is turned slightly askew due to how it was scanned in. After: Now the page is straight. It looks better and will be easier to perform OCR on.
Before: This page is also turned slightly askew due to how it was scanned in. After: Now the page is straight. It looks better and will be easier to perform OCR on.


Adding the Image Processing Step

The first thing we need to do is add an Image Processing Batch Process Step to our Batch Process and set the Scope at which the step will run.

  1. Right click on the Batch Process.
  2. Hover over "Add Activity", then hover over "Cleanup & Recognition". Then click on "Image Processing..."
  3. When the "Add Activity" window pops up, you can change the Step Name if you like. In this tutorial we are going to keep it as the default of "Image Processing".
  4. Click "EXECUTE" at the top right corner of the "Add Activity" window.


  5. Now you should have an Image Processing Batch Process Step in your Batch Process.
  6. By default, the Scope property is set to Page. Generally, you want to keep your Scope to a Page level for Image Processing because the step permanently edits an image and must do so at the Page level.



Configuring the Batch Process Step

Now that we have our Image Processing step in our Batch Process and have set the Scope, we need to configure the properties of the step. We will configure the properties in the right-hand property grid to give instructions to Grooper on how to execute the step.

  1. We are using an IP Profile that we have copied and pasted from the "Essentials" Project. The "Essentials" Project comes pre-installed with every Grooper Repository.
  2. Click the hamburger icon to the right of the IP Profile property.
  3. Navigate to and select the IP Profile.


  1. If you want Grooper to save an unedited copy of the file attached to the document, click the check box next to Enable Undo to set the property to True.


  1. You can click the hamburger icon to the right of the Compression property to set a custom format for this Batch Process Step. If left as the default (none), then it will use the compression settings specified on the root node of the repository.
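
As a rough mental model of the Enable Undo and Compression properties described above, consider the following Python sketch. It is not Grooper code: `StepSettings`, `effective_compression`, `process_page`, and the dictionary keys are hypothetical names, and the compression values are placeholder strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepSettings:
    """Hypothetical model of two properties on the step."""
    enable_undo: bool = False
    compression: Optional[str] = None  # None models the default "(none)"


def effective_compression(step: StepSettings, root_compression: str) -> str:
    """Use the step's own compression if set, else fall back to the root's."""
    if step.compression is not None:
        return step.compression
    return root_compression


def process_page(page: dict, new_image: str, step: StepSettings) -> None:
    """Replace the page image, keeping an unedited copy when undo is enabled."""
    if step.enable_undo:
        page["undo_copy"] = page["image"]
    page["image"] = new_image
```

The point of the sketch is the fallback order: a compression setting on the step wins, and only when it is left at the default does the repository-level setting apply.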


PDF Options: Bursting vs. Rendering

The PDF Options properties only apply if you are processing PDF pages. If you are using Image Processing to process image pages (JPEGs, TIFs, etc.), you can ignore this section.

Image Processing is designed to selectively apply an IP Profile, depending on the page's type. The default settings are designed to work for most cases, without further configuration.

The Image Processing activity will conditionally apply the IP Profile, given the following:

{|class=wikitable
|'''Page Type'''||'''Result'''||'''Notes'''
|-
|JPEG pages||The IP Profile will be applied in all cases, no matter what.
|
* This is normal behavior for the Image Processing activity. It generally expects to process images.
|-
|Image-based PDF pages||The IP Profile will only be applied if Bursting and/or Rendering are enabled.
|
* The PDF will be copied over or overwritten as an image with the applied changes.
* The image-based PDF page types are "Single Image" and "Searchable" pages. For a complete list of PDF page types, visit the PDF Page Types article.
|-
|Text-based PDF pages||ONLY Orient or Auto-Orient steps will be applied if Rendering is enabled.
|
* This will ONLY rotate the PDF's orientation. Since the page does not need OCR, no other image cleanup is needed.
|}

FYI

If both Bursting and Rendering are enabled, then Bursting will always take priority over Rendering if possible. Image-Based PDFs can be burst while Text-Based PDFs cannot.

Let's take a look at 3 different scenarios: With JPEGs, image-based PDFs, and text-based PDFs.

JPEGs

{|class=wikitable
|'''Bursting'''||'''Rendering'''||'''Result'''
|-
|''Enabled''||''Enabled''|| Image does not need to be burst or rendered because it is already an image. The whole IP Profile is applied.
|-
|''Disabled''||''Enabled''|| Image does not need to be burst or rendered because it is already an image. The whole IP Profile is applied.
|-
|''Disabled''||''Disabled''|| Image does not need to be burst or rendered because it is already an image. The whole IP Profile is applied.
|}
  • There is no difference in the application of Image Processing to a JPEG if either Bursting or Rendering is enabled or disabled. A JPEG is always processed the same.
  • A JPEG does not need to be converted to an image because it already is an image.
  • The full IP Profile can be applied to the image.


Image-Based PDF

{|class=wikitable
|'''Bursting'''||'''Rendering'''||'''Result'''
|-
|''Enabled''||''Enabled''|| PDF will use Bursting properties and will be burst into an image. The whole IP Profile is applied.
|-
|''Disabled''||''Enabled''|| PDF will use Rendering properties and will be rendered into an image. The whole IP Profile is applied.
|-
|''Disabled''||''Disabled''|| PDF will remain a PDF and the IP Profile cannot be applied.
|}
  • For image-based PDFs, with either Bursting or Rendering enabled, it will convert the PDF to an image using the enabled method (with Bursting taking priority). The IP Profile can then be applied.
  • An IP Profile can only be applied to images, not PDFs. So if both properties are disabled, then the IP Profile cannot be applied.


Text-Based PDF

{|class=wikitable
|'''Bursting'''||'''Rendering'''||'''Result'''
|-
|''Enabled''||''Enabled''|| PDF will use Rendering properties and will be rendered into an image since it cannot be burst. ONLY Orient or Auto-Orient will be applied from the IP Profile.
|-
|''Disabled''||''Enabled''|| PDF will use Rendering properties and will be rendered into an image. ONLY Orient or Auto-Orient will be applied from the IP Profile.
|-
|''Disabled''||''Disabled''|| PDF will remain a PDF and the IP Profile cannot be applied.
|}
  • For text-based PDFs, there is no reason to run any IP Profile Step other than Orient or Auto-Orient: the page already has native text, so OCR (and the image cleanup that supports it) is unnecessary.
  • With Rendering enabled, it will render the PDF to an image (text-based PDFs cannot be burst). Then ONLY apply Orient or Auto-Orient if present in the IP Profile.
  • Just like with Image-Based PDFs, the IP Profile can only be applied to images, not PDFs. So if both properties are disabled, then nothing from the IP Profile can be applied.
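
The three scenarios above can be summarized as a small decision function. The sketch below is an illustration of the rules in the tables, not Grooper's implementation. The page-type labels (`"image"`, `"image_pdf"`, `"text_pdf"`), the function names, and the step-name strings are all assumptions made for the example. The case of Bursting enabled with Rendering disabled is not listed in the tables; its handling here is inferred from the statements that Bursting takes priority where possible and that text-based PDFs cannot be burst.

```python
from typing import List, Optional

# Step names treated as orientation steps (assumed spelling).
ORIENT_STEPS = {"Orient", "Auto Orient"}


def conversion_method(page_type: str, bursting: bool, rendering: bool) -> Optional[str]:
    """Return how a page would be turned into an image, or None if it is not converted."""
    if page_type == "image":
        return None  # already an image, no conversion needed
    if page_type == "image_pdf":
        if bursting:
            return "burst"  # Bursting takes priority when possible
        return "render" if rendering else None
    if page_type == "text_pdf":
        return "render" if rendering else None  # text-based PDFs cannot be burst
    raise ValueError(f"unknown page type: {page_type}")


def steps_applied(page_type: str, bursting: bool, rendering: bool,
                  profile: List[str]) -> List[str]:
    """Return the IP Profile steps that would run on a page of this type."""
    if page_type == "image":
        return list(profile)  # images always get the whole profile
    if conversion_method(page_type, bursting, rendering) is None:
        return []  # page stays a PDF, nothing is applied
    if page_type == "text_pdf":
        return [step for step in profile if step in ORIENT_STEPS]
    return list(profile)


# Example with the profile used in this tutorial:
profile = ["Normalize", "Auto Orient", "Auto Deskew", "Auto Crop"]
print(steps_applied("text_pdf", True, True, profile))  # ['Auto Orient']
```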