2.90:Content Action: Difference between revisions
Dgreenwood (talk | contribs) |
Dgreenwood (talk | contribs) No edit summary |
||
| (17 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
__NOINDEX__ | |||
{|cellpadding=10 cellspacing=5 style="margin:12px" | |||
|- | |||
|style="font-size:200%; background-color:#662d91; color:white; width:28px; text-align:center"|'''‼''' | |||
|style="border: 4px solid #662d91"| | |||
'''OLD TECH DETECTED!!!''' | |||
As of version 2021, '''Content Action''' no longer exists in Grooper. It has been replaced by the '''[[Split Pages]]''' and '''[[Merge]]''' activities and PDF '''[[Execute]]''' commands. | |||
|} | |||
<blockquote style="font-size:14pt"> | <blockquote style="font-size:14pt"> | ||
'''Content Action''' is an '''Activity''' providing additional functionality for multi-page file formats (PDF and TIF files) through one of five actions: ''Split'', ''Merge'', ''ClearChildren'', ''ClearContent'', and ''RepairPDF''. | '''Content Action''' is an '''Activity''' providing additional functionality for multi-page file formats (PDF and TIF files) through one of five actions: ''Split'', ''Merge'', ''ClearChildren'', ''ClearContent'', and ''RepairPDF''. | ||
| Line 155: | Line 165: | ||
See below for more information on each '''Content Action''' action type changes in version '''2021'''. | See below for more information on each '''Content Action''' action type changes in version '''2021'''. | ||
=== Split === | === Split (Becomes the Split Pages Activity) === | ||
[[File:Content-action-about-split-18.png|thumb|left|The '''Content Action''' ''Split'' action is replaced by the '''Split Pages''' activity.]] | [[File:Content-action-about-split-18.png|thumb|left|The '''Content Action''' ''Split'' action is replaced by the '''Split Pages''' activity.]] | ||
| Line 191: | Line 201: | ||
|} | |} | ||
=== Merge === | {|cellpadding="10" cellspacing="5" | ||
|-style="background-color:#36b0a7; color:white" | |||
|style="font-size:14pt"|'''FYI''' | |||
| | |||
Splitting out pages in a PDF can be time consuming, particularly for large multi-page PDFs. This is due to basic thread processing requirements. For every page in the file, Grooper must create a new '''Batch Page''' object with its own PDF content, essentially drawing a new image for each split page rendered (that is, for each '''Batch Page''' object created). This "drawing" operation is called "rasterization". However, the '''Split Pages''' activity uses only one processing thread per document folder in the '''Batch'''. If your documents are on the larger side, this can cause a bottleneck in your '''Batch Process''' while it waits for '''Split Pages''' to finish rasterizing each page in the file. | |||
In '''Grooper Version 2021''', this process can be sped up with the ''Rasterize'' '''Execute''' command. The idea here is to apply the '''Split Pages''' activity to break out the PDF into ''un-rendered'' pages first. Then, fully render them with the ''Rasterize'' command. This speeds up the process by allowing the time consuming rendering and rasterizing of each page's image to run multithreaded, with one thread available to process each split '''Batch Page''' instead of one thread for the single '''Batch Folder''' they were split from. | |||
: 1. To do this, first run the '''Split Pages''' activity at the appropriate '''Batch Folder''' level with ''both'' the '''''Image Bursting''''' and '''''Rendering''''' properties ''Disabled''. | |||
: 2. Then, run the '''Execute''' command on the '''Batch Page''' level, selecting the ''Batch Page'' '''''Object Type''''' and the ''Rasterize'' '''''Command'''''. | |||
This is a significant improvement over the '''Content Action''' ''Split'' action. Speeding up the split operation with the ''Rasterize'' '''Execute''' command was not possible before '''Version 2021'''. | |||
|} | |||
=== Merge (Becomes the Merge Activity) === | |||
[[File:Content-action-about-split-19.png|thumb|left|The '''Content Action''' ''Merge'' action is replaced by the '''Merge''' activity.]] | [[File:Content-action-about-split-19.png|thumb|left|The '''Content Action''' ''Merge'' action is replaced by the '''Merge''' activity.]] | ||
<br clear = all> | <br clear = all> | ||
{|cellpadding=10 cellspacing=5 | |||
The | |valign=top style="width:50%"| | ||
The '''Content Action''' ''Merge'' action is now its own activity, named '''Merge'''. | |||
You still have two options for merging child content into a multipage filetype, controlled by the '''''Merge Format''''' property. | You still have two options for merging child content into a multipage filetype, controlled by the '''''Merge Format''''' property. | ||
# PDF | # ''PDF Format'' | ||
# TIF | # ''TIF Format'' | ||
| | |||
[[File:Content-action-2021changes-03.png]] | |||
|- | |||
|valign=top| | |||
'''Changes - PDF Format''' | |||
''' | The ''TIF Format'' option configuration is unchanged. However, there is a lot more you can do with the ''PDF Format'' with the '''Merge''' activity in '''2021''' than you could with the ''Merge'' '''Content Action''' in '''2.90'''. | ||
# The '''''Make Searchable''''' property has moved to '''''Build Options''''' > '''''Searchable'''''. | |||
# The '''''Linearized''''' property has moved to '''''Build Options''''' > '''''Linearized'''''. | |||
# The '''''Jpeg Quality''''' property has been replaced with the Boolean '''''Compressed''''' property. | |||
# The '''''PDF Page Source''''' and '''''Prefer Child Versions''''' properties have been normalized to a single '''''Always Build''''' property. | |||
#* Setting this property to ''True'' will ''always'' generate the PDF from the document folder's child '''Batch Page''' and '''Batch Folder''' objects, even if a PDF version already exists on the document folder. This property setting mostly applies to PDF generation outside of the '''Merge''' activity (for example, PDF generation during the '''Export''' activity). By default, the '''Merge''' activity uses a document folder's child objects' content to generate a PDF file. That said, if you find the '''Merge''' activity is ''not'' replacing a document folder's native file, try enabling this property as your first troubleshooting step. | |||
| | |||
[[File:Content-action-2021changes-04.png]] | |||
|- | |||
|valign=top| | |||
'''New Properties - PDF Format''' | |||
'''New Properties''' | # Additional '''''Build Options''''' | ||
#* This includes options for inserting bookmarks, deduplicating content used to build the PDF, and generating a PDF with timestamps for hash-based testing mechanisms. | |||
# '''''Display Mode''''' - Lets you configure the initial viewing mode when the PDF is opened (in a PDF viewer such as Adobe Acrobat). | |||
#* For example, setting this to ''Bookmarks'' will open the PDF with the Bookmarks tab displayed. | |||
# '''''Viewer Preferences''''' - Lets you configure how the viewing application (such as Adobe Acrobat) should initialize when the PDF is viewed. | |||
#* For example, the ''Hide Toolbar'' option will hide the toolbar when the PDF is viewed. | |||
# '''''Generate Mode''''' - This property will generate diagnostic versions of the PDF allowing the text-behind content to be reviewed for quality. | |||
#* This can be a tool to help you assess the accuracy of your OCR results, both the character-level accuracy and the alignment of the characters with the text pixels on the document's image. | |||
#* The '''''Searchable''''' option ''must'' be enabled for this to work. | |||
| | |||
[[File:Content-action-2021changes-05.png]] | |||
|- | |||
|valign=top| | |||
'''New Properties - All Formats''' | |||
# '''''Clear on Completion''''' – Lets you delete or keep the pages that were merged. | # '''''Clear on Completion''''' – Lets you delete or keep the pages that were merged. | ||
# '''''Output Filename''''' – Lets you specify a filename for the merged file. | |||
#* Otherwise, the file will be named according to the '''Document Type''' assigned to the folder during classification (or in the case of unclassified document folders, generically as "Document (#).pdf") | |||
| | |||
[[File:Content-action-2021changes-06.png]] | |||
|} | |||
=== ClearChildren, ClearContent, and RepairPDF (Become Execute Commands) === | |||
[[File:Content-action-about-split-20.png|thumb|left|The '''Content Action''' ''ClearChildren'', ''ClearContent'', and ''RepairPDF'' actions are replaced by the '''Execute''' activity.]] | |||
<br clear = all> | |||
The remaining '''Content Action''' actions are applied using the '''Execute''' activity. The '''Execute''' activity is designed to process a variety of simple document processing commands. These are simple manipulations of one kind of object or another. Do "x" command to "y" type of object. That's it. Typically, no further property configuration is required other than listing what command you want to apply to what object type. | |||
Such is also the case for the ''ClearChildren'', ''ClearContent'' and ''RepairPDF'' actions. These are very simple actions requiring no property configuration. For example, ''ClearChildren'' just deletes the child objects of a '''Batch Folder''' in a '''Batch'''. No further configuration required. It is a very simple command we're applying to '''Batch Folder''' objects in the '''Batch'''. | |||
Because they are simple execution commands applied to a particular type of object (the '''Batch Folder''' object), these actions can be applied using the '''Execute''' activity. In fact, the '''Execute''' activity could always functionally apply these actions. In version '''2021''', we decided to trim some fat and normalize this functionality into a single '''Activity'''. Going forward, the ''ClearChildren'', ''ClearContent'', and ''RepairPDF'' actions will be applied using the '''Execute''' activity. | |||
{|cellpadding=10 cellspacing=5 | |||
|valign=top style="width:50%"| | |||
The '''Execute''' activity applies one or more execution commands to an object type. | |||
To configure the activity, you must define what command you want to execute. Do this using the '''''Commands''''' property. | |||
| | |||
[[File:Content-action-2021changes-07.png]] | |||
|- | |||
|valign=top| | |||
Pressing the ellipsis button at the end of the '''''Commands''''' property will bring up a list editor window to add one or more '''''Commands'''''. | |||
# Press the "Add" button to add a '''''Command'''''. | |||
#* There are two components of a '''''Command''''': | |||
#*# The '''''Object Type''''' the command is applied to. | |||
#*# The specific '''''Command''''' executed. | |||
# Depending on what you want to do (the ''ClearChildren'', ''ClearContent'', and ''RepairPDF'' actions in our case), the first step is figuring out what the appropriate '''''Object Type''''' is. | |||
| | |||
[[File:Content-action-2021changes-08.png]] | |||
|- | |||
|valign=top| | |||
Once you've selected an '''''Object Type''''', the '''''Command''''' property allows you to choose the specific command you want to apply to that object in a '''Batch'''. | |||
See below for the analogous '''Execute''' '''''Object Type''''' and '''''Command''''' combination for the ''ClearChildren'', ''ClearContent'', and ''RepairPDF'' actions. | |||
| | |||
[[File:Content-action-2021changes-09.png]] | |||
|} | |||
<tabs style="margin:20px"> | |||
<tab name="ClearChildren" style="margin:20px"> | |||
=== ClearChildren === | === ClearChildren === | ||
{|cellpadding=10 cellspacing=5 | |||
|valign=top style="width:30%"| | |||
'''''Object Type''''' - ''Batch Folder'' | |||
'''''Command''''' - ''Clear Children'' | |||
| | |||
[[File:Content-action-2021changes-10.png]] | |||
|} | |||
</tab> | |||
<tab name="ClearContent" style="margin:20px"> | |||
=== ClearContent === | === ClearContent === | ||
''' | {|cellpadding=10 cellspacing=5 | ||
|valign=top style="width:25%"| | |||
'''''Object Type''''' - ''Batch Folder'' | |||
'''''Command''''' - ''Remove Attachment'' | |||
Note: You may have noticed one of the '''Batch Folder''' '''''Commands''''' is ''Remove PDF Version''. | |||
''Remove Attachment'' and ''Remove PDF Version'' are similar in that they both remove files attached to the '''Batch Folder''' object in the filestore. | |||
The difference is ''Remove PDF Version'' will remove a '''Grooper''' generated PDF file. | |||
''Remove Attachment'' will remove the native file version of the document folder (or in other words, the original imported PDF file associated with the '''Batch Folder'''). | |||
| | |||
[[File:Content-action-2021changes-11.png]] | |||
|} | |||
</tab> | |||
<tab name="RepairPDF" style="margin:20px"> | |||
=== RepairPDF === | === RepairPDF === | ||
''' | {|cellpadding=10 cellspacing=5 | ||
|valign=top style="width:30%"| | |||
'''''Object Type''''' - ''PDF Document'' | |||
'''''Command''''' - ''Repair'' | |||
| | |||
[[File:Content-action-2021changes-12.png]] | |||
|} | |||
</tab> | |||
</tabs> | |||
[[Category:Articles]] | |||
[[Category:Version 2.90]] | |||
Latest revision as of 10:57, 5 August 2025
| ‼ |
OLD TECH DETECTED!!! As of version 2021, Content Action no longer exists in Grooper. It has been replaced by the Split Pages and Merge activities and PDF Execute commands. |
Content Action is an Activity providing additional functionality for multi-page file formats (PDF and TIF files) through one of five actions: Split, Merge, ClearChildren, ClearContent, and RepairPDF.
Most commonly, this Activity is used to split multipage documents in a Batch, creating child Batch Page objects for each page in the file (using the Split action). This Activity is also used to merge child Batch Pages and Batch Folders into a multipage file, stored on the parent Batch Folder (using the Merge action). Though less common, there is additional functionality to delete child objects (using the ClearChildren action), remove a PDF file from a Batch Folder (using the ClearContent action), and repair a PDF file (using the RepairPDF action).
About
|
The Content Action activity manipulates the content files of a Batch Folder in a Batch, either a native file stored on the Batch Folder (typically a multipage PDF or TIF file) or the Batch Folder's child folder and page objects. What happens is determined by its Action property. This can be one of five choices:
- Split
- Merge
- ClearChildren
- ClearContent
- RepairPDF
Each Action option has its own property configuration options (with ClearChildren, ClearContent, and RepairPDF having no further configuration required). |
Split
|
The Split Action is the most commonly used functionality for the Content Action activity. It will split out the pages of an imported multipage PDF or TIF file. When PDF files are imported into a new Batch, a Batch Folder object is created for each multipage PDF file imported. The PDF file is stored as the "native file version" of the Batch Folder. It lives in the file store location associated with the Batch Folder (or in layman's terms it lives "on the document folder"). However, what if Grooper needs to process each page instead of the full document? It needs a page-level object to do page-level processing. This is what the Split Action accomplishes. It takes that native file living on the Batch Folder and creates child Batch Page objects from it, one Batch Page object for each page in the native multipage file. |
||
|
For this Batch, a single PDF was imported when the Batch was created.
|
||
|
You may further configure the Split Action with the Split Options properties. These properties control how the child page objects are created, including their resolution, file format, and color depth settings. |
||
Merge
|
The Merge Action will create a multipage PDF or TIF file, stored on the Batch Folder, from the Batch Folder's child objects. Some people think of the Merge Action as Split in reverse. Whereas Split creates child Batch Page objects out of a PDF or TIF file on the parent document folder, Merge creates a PDF or TIF file, stored on the parent document folder, out of child Batch Page objects.
Note: A PDF or TIF file will be merged from all child content for a Batch Folder. A single file will be created even if the Batch Folder has its own subfolders with their own child pages. One common use of the Merge action is to create a single PDF from an email, merging the email message text file with an attachment document (often itself a PDF file). |
||
|
In this case, the Batch Folder in this Batch is just a generic folder with three child Batch Page objects.
|
||
|
The child content can be merged into a PDF or TIF file. Depending on which format you choose, there are additional options for creating the merged file. For example, the PDF format includes an option to include the OCR text data for each page in the merged PDF via the Make Searchable property. |
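The relationship between Split and Merge can be sketched with a toy model. This is only an illustration, not Grooper's object model: the `Page` and `Folder` classes, the `split` and `merge` functions, and the depth-first page order used when merging nested folders are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Page:
    """Hypothetical stand-in for a Batch Page."""
    content: str


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch Folder.

    native_file models a multipage PDF or TIF as a list of page contents.
    """
    native_file: Optional[List[str]] = None
    children: List[Union["Folder", Page]] = field(default_factory=list)


def split(folder: Folder) -> None:
    """Create one child Page per page of the folder's native file."""
    folder.children = [Page(c) for c in (folder.native_file or [])]


def collect_pages(folder: Folder) -> List[str]:
    """Gather page contents from all levels below the folder (depth-first, assumed order)."""
    contents: List[str] = []
    for child in folder.children:
        if isinstance(child, Page):
            contents.append(child.content)
        else:
            contents.extend(collect_pages(child))
    return contents


def merge(folder: Folder) -> None:
    """Build a single multipage file on the folder from all child content."""
    folder.native_file = collect_pages(folder)


# An imported three-page file is split into three child pages.
doc = Folder(native_file=["p1", "p2", "p3"])
split(doc)

# A folder with a subfolder still merges into one file.
email = Folder(children=[Page("message"), Folder(children=[Page("att-1"), Page("att-2")])])
merge(email)
```

In this model, `split` followed by `merge` reproduces the original page list, which is the sense in which Merge is "Split in reverse".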
ClearChildren
|
The ClearChildren Action is a destructive action. Whereas the Split action creates objects, the ClearChildren action deletes them. When applied to a Batch Folder, ClearChildren will delete all child pages and folders below it. Furthermore, it deletes all child objects. If you have a hierarchy of Batch Folder and Batch Pages with their own child Batch Folders and Batch Pages, all of them are deleted. Not just the Batch Pages. Not just the Batch Pages at the first child level. ClearChildren clears all children. |
||
|
The ClearChildren Action has no further configuration options. |
||
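The "clears all children" behavior described above can be pictured with a small tree model. This is a minimal sketch under stated assumptions: the `Node` class and the `clear_children` function are hypothetical names for illustration, not part of Grooper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a Batch Folder or Batch Page."""
    name: str
    kind: str  # "folder" or "page"
    children: List["Node"] = field(default_factory=list)


def count_descendants(node: Node) -> int:
    """Count every object below the node, at every level."""
    return sum(1 + count_descendants(child) for child in node.children)


def clear_children(folder: Node) -> int:
    """Delete everything below the folder and return how many objects were removed."""
    removed = count_descendants(folder)
    folder.children.clear()
    return removed


doc = Node("Document 1", "folder", [
    Node("Page 1", "page"),
    Node("Attachment", "folder", [
        Node("Page 1", "page"),
        Node("Page 2", "page"),
    ]),
])

removed = clear_children(doc)  # removes the page, the subfolder, and the subfolder's pages
```

Removing the first-level children also discards everything nested under them, so no separate pass over the subfolders is needed in this model.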
ClearContent
|
The ClearContent Action is a destructive action. Whereas the Merge action creates a file stored on a document folder, the ClearContent action deletes it. When applied to a Batch Folder, ClearContent will delete a native PDF (or TIF) version, if present. |
||
|
The ClearContent Action has no further configuration options. |
||
RepairPDF
Changes to Content Action in Version 2021
There is a big change to the Content Action activity in Grooper Version 2021. It doesn't exist anymore.
Don't fret! Its functionality is still accessible depending on the Action type.
- For the Split action, a new activity named Split Pages replaces and supplements its functionality.
- For the Merge action, a new activity named Merge replaces and supplements its functionality.
- The ClearChildren, ClearContent and RepairPDF actions are replaced by different commands using the Execute activity.
Why did we do this? This has to do with our evolving "Smart PDF Architecture". In version 2021, we started digging into ways we can more fully utilize the capabilities of the PDF file format. The big ticket item for this is the PDF Generate Behavior. However, anything PDF related ended up getting touched, including splitting and merging PDFs. As the split and merge capabilities grew in version 2021 with increased attention to the PDF file format, it made more sense to isolate these two document processing functions as whole activities (Split Pages and Merge respectively) than as two different property configurations of a single activity (the Split and Merge action types for the Content Action activity).
As for the remaining actions (ClearChildren, ClearContent, and RepairPDF), there was always a way of accomplishing the exact same thing in a Batch Process using the Execute activity. What's the difference between the Content Action activity set to ClearChildren and the Execute activity set to a Clear Children command for a Batch Folder? Nothing. There is no difference. They do the exact same thing. They delete the child objects of a Batch Folder in both cases.
Why have two activities that do the same thing? To simplify things, we just got rid of the Content Action activity entirely. In version 2021, you'll use the analogous Execute activity command for the ClearChildren, ClearContent, and RepairPDF actions.
See below for more information on how each Content Action action type changes in version 2021.
Split (Becomes the Split Pages Activity)

|
Changes
The Content Action Split action is now its own activity, named Split Pages. Some properties have had their names changed and/or moved around in the property grid.
|
|
New Properties
|
| FYI |
Splitting out pages in a PDF can be time consuming, particularly for large multi-page PDFs. This is due to basic thread processing requirements. For every page in the file, Grooper must create a new Batch Page object with its own PDF content, essentially drawing a new image for each split page rendered (that is, for each Batch Page object created). This "drawing" operation is called "rasterization". However, the Split Pages activity uses only one processing thread per document folder in the Batch. If your documents are on the larger side, this can cause a bottleneck in your Batch Process while it waits for Split Pages to finish rasterizing each page in the file.
In Grooper Version 2021, this process can be sped up with the Rasterize Execute command. The idea here is to apply the Split Pages activity to break out the PDF into un-rendered pages first. Then, fully render them with the Rasterize command. This speeds up the process by allowing the time consuming rendering and rasterizing of each page's image to run multithreaded, with one thread available to process each split Batch Page instead of one thread for the single Batch Folder they were split from.
1. To do this, first run the Split Pages activity at the appropriate Batch Folder level with both the Image Bursting and Rendering properties Disabled.
2. Then, run the Execute command on the Batch Page level, selecting the Batch Page Object Type and the Rasterize Command.
This is a significant improvement over the Content Action Split action. Speeding up the split operation with the Rasterize Execute command was not possible before Version 2021. |
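The threading difference described in this note can be sketched in Python. This is an illustrative model only, not Grooper code: `rasterize` is a hypothetical placeholder for the expensive rendering step, and the worker count and folder names are assumptions. It contrasts one task per document folder with a cheap split followed by one task per page.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def rasterize(page_id: str) -> str:
    """Placeholder for the expensive page rendering step (hypothetical)."""
    return f"{page_id}:rendered"


def render_per_folder(folders: Dict[str, int], workers: int = 4) -> Tuple[List[str], int]:
    """One task per folder: each task renders all of its folder's pages in sequence."""
    def work(item: Tuple[str, int]) -> List[str]:
        name, page_count = item
        return [rasterize(f"{name}/p{i}") for i in range(1, page_count + 1)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_folder = list(pool.map(work, folders.items()))
    pages = [page for group in per_folder for page in group]
    return pages, len(folders)  # rendered pages, number of parallel tasks


def split_then_rasterize(folders: Dict[str, int], workers: int = 4) -> Tuple[List[str], int]:
    """Split into un-rendered pages first, then render with one task per page."""
    # Step 1: cheap split, no rendering yet.
    unrendered = [f"{name}/p{i}" for name, n in folders.items() for i in range(1, n + 1)]
    # Step 2: every page is an independent unit of work for the pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = list(pool.map(rasterize, unrendered))
    return pages, len(unrendered)


folders = {"DocA": 3, "DocB": 5}
pages_a, tasks_a = render_per_folder(folders)
pages_b, tasks_b = split_then_rasterize(folders)
```

Both functions produce the same rendered pages. The difference is the size of each unit of work: with two folders the first approach can keep at most two threads busy, while the second exposes eight independent page tasks to the pool.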
Merge (Becomes the Merge Activity)

|
The Content Action Merge action is now its own activity, named Merge. You still have two options for merging child content into a multipage filetype, controlled by the Merge Format property: PDF Format and TIF Format.
|
|
|
Changes - PDF Format
The TIF Format option configuration is unchanged. However, there is a lot more you can do with the PDF Format with the Merge activity in 2021 than you could with the Merge Content Action in 2.90.
|
|
|
New Properties - PDF Format
|
|
|
New Properties - All Formats
|
ClearChildren, ClearContent, and RepairPDF (Become Execute Commands)

The remaining Content Action actions are applied using the Execute activity. The Execute activity is designed to process a variety of simple document processing commands. These are simple manipulations of one kind of object or another. Do "x" command to "y" type of object. That's it. Typically, no further property configuration is required other than listing what command you want to apply to what object type.
Such is also the case for the ClearChildren, ClearContent and RepairPDF actions. These are very simple actions requiring no property configuration. For example, ClearChildren just deletes the child objects of a Batch Folder in a Batch. No further configuration required. It is a very simple command we're applying to Batch Folder objects in the Batch.
Because they are simple execution commands applied to a particular type of object (the Batch Folder object), these actions can be applied using the Execute activity. In fact, the Execute activity could always functionally apply these actions. In version 2021, we decided to trim some fat and normalize this functionality into a single Activity. Going forward, the ClearChildren, ClearContent, and RepairPDF actions will be applied using the Execute activity.
|
The Execute activity applies one or more execution commands to an object type. To configure the activity, you must define what command you want to execute. Do this using the Commands property. |
|
|
Pressing the ellipsis button at the end of the Commands property will bring up a list editor window to add one or more Commands.
|
|
|
Once you've selected an Object Type, the Command property allows you to choose the specific command you want to apply to that object in a Batch. See below for the analogous Execute Object Type and Command combination for the ClearChildren, ClearContent, and RepairPDF actions. |
ClearContent
|
Object Type - Batch Folder
Command - Remove Attachment
Remove Attachment and Remove PDF Version are similar in that they both remove files attached to the Batch Folder object in the filestore. The difference is Remove PDF Version will remove a Grooper generated PDF file. Remove Attachment will remove the native file version of the document folder (or in other words, the original imported PDF file associated with the Batch Folder). |
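For reference, the three Object Type and Command pairings listed on this page can be collected into a small lookup table. This is only a note-taking sketch: the pairings come from this article, but `ExecuteCommand`, `LEGACY_ACTION_TO_EXECUTE`, and `migrate` are hypothetical names and not part of any Grooper API.

```python
from typing import NamedTuple


class ExecuteCommand(NamedTuple):
    """One entry in the Execute activity's Commands list (illustrative model)."""
    object_type: str
    command: str


# Pairings as documented in this article for version 2021.
LEGACY_ACTION_TO_EXECUTE = {
    "ClearChildren": ExecuteCommand("Batch Folder", "Clear Children"),
    "ClearContent": ExecuteCommand("Batch Folder", "Remove Attachment"),
    "RepairPDF": ExecuteCommand("PDF Document", "Repair"),
}


def migrate(action: str) -> ExecuteCommand:
    """Return the Execute pairing that replaces a 2.90 Content Action action."""
    try:
        return LEGACY_ACTION_TO_EXECUTE[action]
    except KeyError:
        # Split and Merge became the Split Pages and Merge activities instead.
        raise ValueError(f"{action!r} has no Execute command equivalent") from None


repair = migrate("RepairPDF")
```

Split and Merge are deliberately absent from the table because they are replaced by whole activities rather than Execute commands.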





























