2021:Split Pages (Activity)
Split Pages is an activity that will split a multi-page PDF or TIF document into individual pages.
When applied to a Batch Folder with an attached PDF or TIF file, the Split Pages activity creates a Batch Page object for each page in the file. Each Batch Page is created as a child of the Batch Folder.
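The parent/child relationship described above can be pictured with a minimal Python sketch. This is not Grooper code: the `BatchFolder` and `BatchPage` classes, the `split_pages` function, and the `invoice.pdf` file name are all hypothetical stand-ins used only to illustrate the "one child page object per page in the file" idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BatchPage:
    """Hypothetical stand-in for a Batch Page object."""
    page_number: int  # 1-based position in the source file


@dataclass
class BatchFolder:
    """Hypothetical stand-in for a Batch Folder with an attached file."""
    attached_file: str
    page_count: int  # number of pages in the attached PDF or TIF
    children: List[BatchPage] = field(default_factory=list)


def split_pages(folder: BatchFolder) -> List[BatchPage]:
    """Create one child BatchPage per page of the attached file."""
    folder.children = [BatchPage(n) for n in range(1, folder.page_count + 1)]
    return folder.children


# Assumed example: a three-page file attached to a single folder.
folder = BatchFolder(attached_file="invoice.pdf", page_count=3)
split_pages(folder)
print([page.page_number for page in folder.children])
```

After the call, the folder still exists as a single object, but it now has one child page object for each page of its attached file.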
About
Split Pages is often a critical component of a Batch Process where documents are imported into new Batches from a digital source (as opposed to scanned paper documents). When a digital file is imported into Grooper, two things happen:
We can also process this document at this point. We can apply Grooper activities to this Batch Folder at the folder level (by setting a Batch Process Step's Scope property to Folder). An activity running at the folder level can manipulate the content in the attached file. For example, if we ran the Recognize activity at the folder level, it would obtain text data from the attached PDF file.
Why Split Pages?
There are two reasons to use the Split Pages activity to split out pages from a multi-page document.
- To apply activities that require Batch Page objects to function.
  - Chiefly the Image Processing and Separate activities.
- To increase compute efficiency.
  - A Batch Folder is a single object, which is processed by a single processing thread. If you split out the attached document's pages, each page becomes its own object in the Batch. Each page is likewise processed by a single thread, but with multiple page objects present, multiple threads can work on the document at once (one for each page).
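The threading argument can be sketched in Python with a standard thread pool. This is only an illustration of the "one object, one thread" reasoning, not how Grooper schedules work: the `work_items`, `process`, and `run` functions, the item names, and the `max_threads` default are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def work_items(page_count: int, split: bool) -> List[str]:
    """List the objects available to process (names are illustrative)."""
    if split:
        # After splitting, every page is its own object.
        return [f"page-{n}" for n in range(1, page_count + 1)]
    # Without splitting, the folder is the only object.
    return ["folder"]


def process(item: str) -> str:
    """Placeholder for an activity's per-object work."""
    return f"processed {item}"


def run(page_count: int, split: bool, max_threads: int = 4) -> List[str]:
    items = work_items(page_count, split)
    # Each object is handled by exactly one thread; more objects means
    # more tasks that the pool can run at the same time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(process, items))


print(len(run(page_count=5, split=False)))  # one task: the folder
print(len(run(page_count=5, split=True)))   # five tasks: one per page
```

With a single folder object, the pool has only one task to hand out no matter how many threads it has. With five page objects, up to `max_threads` of them can be worked on concurrently.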


