2023.1:PDF Data Mapping (Behavior): Difference between revisions

From Grooper Wiki
 
(148 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{WIP}}
{{AutoVersion}}
{{AutoVersion}}
<section begin="glossary" />
 
<blockquote>
<blockquote>{{#lst:Glossary|PDF Data Mapping}}</blockquote>
'''''PDF Data Mapping''''' is a '''Content Type''' '''''Behavior''''' designed to create an exportable PDF file with additional native PDF elements.
 
</blockquote>
<section end="glossary" />
'''''PDF Data Mapping''''' builds a data rich "Smart PDF" from a document folder's content.  Classification results, extracted data, and more can be used to insert native PDF elements into the generated PDF.
'''''PDF Data Mapping''''' builds a data rich "Smart PDF" from a document folder's content.  Classification results, extracted data, and more can be used to insert native PDF elements into the generated PDF.


Line 18: Line 15:
|
|
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1).  The first contains a '''Project''' with resources used in examples throughout this article.  The second contains one or more '''Batches''' of sample documents.
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1).  The first contains a '''Project''' with resources used in examples throughout this article.  The second contains one or more '''Batches''' of sample documents.
* [[Media:2023.1 Wiki PDF-Data-Mapping Batch.zip]]
* [[Media:2023.1 Wiki PDF-Data-Mapping Batches.zip]]
* [[Media:2023.1 Wiki PDF-Data-Mapping Project.zip]]
* [[Media:2023.1 Wiki PDF-Data-Mapping Project.zip]]
|}
|}


== About ==
== About ==
 
The '''''PDF Data Mapping''''' behavior allows Grooper users to more fully leverage the capabilities of the PDF file type.  The standard PDF '''''Export Format''''' (and '''''Merge Format''''') in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon '''''Export''''' (or '''''Merge''''').   
The '''''PDF Data Mapping''''' behavior allows Grooper users to more fully leverage the capabilities of the PDF file type.  The standard PDF '''''Export Format''''' in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon '''Export'''.   


However, this is just the "display information" required to open and read the document.  There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text.  PDF content can also include metadata, keywords, bookmarks, annotations, and more!   
However, this is just the "display information" required to open and read the document.  There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text.  PDF content can also include metadata, keywords, bookmarks, annotations, and more!   


'''''PDF Data Mapping''''' expands Grooper's standard PDF generation capabilities.  It creates an exportable PDF file that includes additional content available to the PDF file type.  '''''PDF Data Mapping''''' merges Grooper collected data like classification results and extracted data into the PDF by mapping these values to native PDF elements like bookmarks and annotations.
'''''PDF Data Mapping''''' expands Grooper's standard PDF generation capabilities.  It creates an exportable PDF file that includes additional content available to the PDF file type.  '''''PDF Data Mapping''''' merges data collected by Grooper into the PDF by mapping these values to native PDF elements like bookmarks and annotations.


The expanded '''''PDF Data Mapping''''' functionality can be divided into three categories:
The expanded '''''PDF Data Mapping''''' functionality can be divided into three categories:
* '''''Annotations'''''
* '''''Annotations''''': Highlight important text, insert comments, and embed interactive widgets like editable form fields and checkboxes.
* '''''Bookmarks'''''
* '''''Bookmarks''''':  Organize complex documents with bookmarks linking to child documents and/or extracted '''Data Fields'''.
* '''''Metadata'''''
* '''''Metadata''''':  Alter the PDF's default metadata, add searchable keywords and export custom metadata using data collected by Grooper.


=== Annotations ===
=== Annotations ===
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
Annotations are additional objects you can add to PDF documents.
Annotations are native PDF elements used to highlight and comment on text in a PDF file. For '''''PDF Data Mapping''''', "annotations" also refers to interactive "widgets" such as checkbox and text form fields.  The '''''Annotations''''' functionality allows you to embed many of these native PDF annotations and widgets into Grooper generated PDFs.
* These annotations can increase readability, such as using a highlight annotation to call out important information.
* These annotations can add components for the reader to interact with the document, such as checkboxes and signature widgets.


'''''Annotations''''' can serve many purposes:
* '''''Annotations''''' can increase readability, such as using a highlight annotation to call out important information.
* '''''Annotations''''' can add components for the reader to interact with the document, such as checkboxes and signature widgets.


'''''PDF Data Mapping''''' can add the following kinds of annotations:
 
'''''PDF Data Mapping''''' can add the following kinds of annotations/widgets:
# Highlighting
# Highlighting
# Radio group buttons
# Radio group buttons
Line 96: Line 95:


'''''PDF Data Mapping''''' allows Grooper to alter and store additional metadata, including:
'''''PDF Data Mapping''''' allows Grooper to alter and store additional metadata, including:
# The PDF's default metadata fields, including its Title, Author, Subject values and more.
# The PDF's default metadata fields, including its "Title", "Author", "Subject" and more.
# Keywords
# Keywords
# Custom metadata fields
# Custom metadata fields
Line 145: Line 144:
[[File:2023.1 PDF-Data-Mapping 03-02.png]]
[[File:2023.1 PDF-Data-Mapping 03-02.png]]


== How To: Configure PDF Data Mapping ==
== About the documents used in these tutorials ==
=== About the documents used in this tutorial ===
The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail a more specific set up for a '''''PDF Data Mapping'''''.  This is a packet of documents from a single applicant containing a cover page and five different kinds of documents.
The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail a more specific set up for a '''''PDF Data Mapping'''''.  This is a packet of documents from a single applicant containing a cover page and five different kinds of documents.


Line 154: Line 152:
* Easily navigable bookmarks
* Easily navigable bookmarks


{|cellpadding=10 cellspacing=5
{|class="how-to-table"
|valign=top style="width:40%"|
|
'''Cover Page and Application'''
'''Cover Page and Application'''


Line 200: Line 198:
|valign=top|
|valign=top|
[[File:Pdf-generate-howto-docset-06.png]]
[[File:Pdf-generate-howto-docset-06.png]]
|-
|}
|style="width:40%" valign=top|
=== Notes on PDF Data Mapping, child documents and bookmarking ===
'''Notes on how this source file was separated in Grooper'''
{|class="how-to-table"
 
|
The original document was imported as a single document into Grooper.  We have separated it into child documents which will allow us to insert bookmarks for each separated document.
The original document was imported as a single document into Grooper.  We have separated it into child documents which will allow us to insert bookmarks for each separated document.


Line 217: Line 215:
|}
|}


=== Configuring Annotations ===
== How To: Configure Annotations ==
Annotations are native PDF elements used to highlight and comment on text in a PDF file.  For '''''PDF Data Mapping''''', "annotations" also refers to interactive "widgets" such as checkbox and text form fields.  In this tutorial we will configure at least one example of each '''''Annotation''''' option.
* '''''Text Annotation''''' - Inserts a text-based comment in the PDF.
* '''''Highlight Annotation''''' - Highlights text on the PDF.
* '''''Radio Group Widget''''' - Inserts a group of selectable radio buttons in the PDF.
* '''''Checkbox Widget''''' - Inserts checkable checkboxes in the PDF.
* '''''Signature Widget''''' - Inserts a signature block in the PDF.
* '''''Textbox Widget''''' - Inserts an editable form field in the PDF.


In this tutorial we will configure at least one example of each '''''Annotation''''' option.
{|class="attn-box"
* '''''Text Annotation'''''
|
* '''''Highlight Annotation'''''
&#9888;
* '''''Radio Group Widget'''''
|
* '''''Checkbox Widget'''''
'''''BE AWARE:  PDF Data Mapping cannot insert annotations on PDF pages with form fields.'''''
* '''''Signature Widget'''''
 
* '''''Textbox Widget'''''
If a PDF page is form-fillable, it is ill-advised to insert annotations and widgets on top of these form fields.  This can result in a corrupted PDF when it is generated by '''''Merge''''' or '''''Export'''''.  '''''PDF Data Mapping''''' will not allow you to insert annotations and widgets on PDF pages with form fields.
|}


==== Prereqs - Data Fields and extracted data ====
=== Prereqs: Data Fields and extracted data ===
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
* For '''''Annotations''''' this means extracted '''Data Fields'''.
* For '''''Annotations''''' this means '''Data Fields'''.
* The '''''Extract''''' activity must run ''before'' '''''Merge''''' or '''''Export''''' generates the PDF.
* Data must be saved for each '''Data Field''' prior to the PDF being generated.
** The '''''Extract''''' activity must run ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
** If performing user assisted data review, the '''''Review''''' activity must complete ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.




Each of the '''''Annotation Types''''' references a '''Data Field''' in a '''Data Model''' as part of its configuration.  If the '''Data Field''' does not collect data during the '''Extract''' activity, the '''''PDF Data Mapping''''' won't know where to place the annotation.
Each of the '''''Annotation Types''''' references a '''Data Field''' in a '''Data Model''' as part of its configuration.  If the '''Data Field''' does not collect data during the '''Extract''' activity, the '''''PDF Data Mapping''''' won't know where to place the annotation.


===== About the Data Model used for this tutorial =====
==== About the Data Model used for this tutorial ====
 
The '''Data Model''' we're working with has several '''Data Fields''' that will allow '''''PDF Data Mapping''''' to place annotations and widgets.
The '''Data Model''' we're working with has several '''Data Fields''' that will allow '''''PDF Data Mapping''''' to place annotations and widgets.


Line 242: Line 251:
The "Last Name", "First Name" and "Middle Initial" '''Data Fields''' (in the "Applicant Information" '''Data Section''') will demonstrate the '''''Highlight Annotation'''''.
The "Last Name", "First Name" and "Middle Initial" '''Data Fields''' (in the "Applicant Information" '''Data Section''') will demonstrate the '''''Highlight Annotation'''''.
* These fields use '''''Labeled Value''''' to extract field values next to a label.
* These fields use '''''Labeled Value''''' to extract field values next to a label.
* Be aware, nearly any extractor type can be used to insert a highlight annotation.  Grooper just needs a location on the document to draw the highlight boundaries.
* Be aware, nearly any kind of Value Extractor can be used to insert a highlight annotation.  Grooper just needs a location on the document to draw the highlight boundaries.
|style="width:20% !important"|
|style="width:20% !important"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-02.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-02.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-08.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-08.png]]
Line 254: Line 263:
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the radio group widget as long as its '''''Check Mode''''' is set to ''CheckOne''.
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the radio group widget as long as its '''''Check Mode''''' is set to ''CheckOne''.
|
|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-03.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-03.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-09.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-09.png]]
Line 263: Line 272:
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the checkbox widget.
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the checkbox widget.
|
|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-04.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-04.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-10.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-10.png]]
Line 272: Line 281:
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
|
|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-05.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-05.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-11.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-11.png]]
Line 283: Line 292:
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
|
|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-06.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-06.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-12.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-12.png]]
|-
|-
|style="width:33% !important"|
|style="width:33% !important"|
The "Processed" '''Data Field''' will demonstrate the '''''Text Annotation'''''.
The "IsProcessed" '''Data Field''' will demonstrate the '''''Text Annotation'''''.
* '''''Text Annotation''''' inserts a text comment in the PDF.
* '''''Text Annotation''''' inserts a text comment in the PDF.
** Compare this to a '''''Textbox Widget''''' which adds an actual form field to the PDF to store a field value.
** Compare this to a '''''Textbox Widget''''' which adds an actual form field to the PDF to store a field value.
Line 295: Line 304:
** This is a technique common to '''''Text Annotation''''' use cases and will be explained in further depth below.
** This is a technique common to '''''Text Annotation''''' use cases and will be explained in further depth below.
|style="width:20% !important"|
|style="width:20% !important"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-01.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-01.png]]
|style="text-align: center;"|
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-07.png]]
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-07.png]]
|}
|}


=== Adding Annotations ===
'''''PDF Data Mapping''''' inserts various types of PDF annotations and widgets by configuring its '''''Annotations''''' property.  Users can add one or more '''''Annotation Types''''' to the '''''Annotations''''' list.  Adding a new '''''Annotation''''' to the list is simple.
With a '''''PDF Data Mapping''''' behavior added to a '''Content Type''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Select the '''''Annotations''''' property and press the ellipsis button at the end.
# This will bring up the '''''Annotations''''' editor.
# Press the "+" button.
# Select the '''''Annotation Type''''' you want to add from the dropdown list.
[[File:2023.1 PDF-Data-Mapping 03 02 02 Adding-Annotations-01.png]]
# <li value=6> Once added, you will see the '''''Annotation Type''''' added to the '''''Annotations''''' list.
# All '''''Annotation Types''''' will have a set of '''''General''''' properties to configure.
# Some '''''Annotation Types''''' have additional properties you can configure.
#* For example, the '''''Highlight Annotation''''' has '''''Appearance''''' properties you can configure to adjust the highlight's color and other appearance properties.
# Press "OK" when finished.
[[File:2023.1 PDF-Data-Mapping 03 02 02 Adding-Annotations-02.png]]
==== Notes on shared properties ====
All '''''Annotation Types''''' share a set of '''''General''''' properties.
* '''''Fields'''''
** Use this property to select the '''Data Fields''' you want to map to the PDF annotation.
** The '''''Fields''''' property is ''required''.
*** One or more '''Data Fields''' must be selected to generate the annotation.
*** If you don't select any '''Data Fields''' ''or'' the selected '''Data Fields''' are not extracted, '''''PDF Data Mapping''''' will not insert an annotation in the output PDF.
*** Be aware, all '''Data Fields''' are selected by default.
* '''''Padding'''''
** The '''''Padding''''' property can adjust the size of the annotation.
** Grooper uses a '''Data Field's''' result instance to draw the annotation's boundaries.
*** The size of the '''Data Field's''' instance may be too small for what you want to appear on the output PDF.
*** If so, use '''''Padding''''' to increase the annotation's size on the PDF generated by '''''PDF Data Mapping'''''.
* '''''Allow Edit'''''
** '''''Allow Edit''''' refers to a reader's ability to edit the annotation as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the annotation (or widget).
** Enabling this property (turning it ''True'') will allow users to fully adjust the annotation in the PDF, including its size, location and other properties.
** Be aware, even when ''False'', users will still be able to interact with widgets, such as the '''''Checkbox Widget''''' or '''''Textbox Widget'''''.
* '''''Print'''''
** In a PDF viewing application, like Adobe Acrobat, all annotations and widgets '''''PDF Data Mapping''''' generates will be visible. The '''''Print''''' property determines whether or not the annotation is visible when the PDF is printed.
** Be aware, the default is ''False''.
*** Grooper presumes the "Smart PDF" output by '''''PDF Data Mapping''''' will be opened in a PDF viewer (where all annotations will be visible).
*** Grooper also presumes if you want to print the PDF, you want something more like the original document printed, not the one with additional PDF elements Grooper inserts.  If you ''do'' want those annotations and widgets visible when the PDF is printed, set '''''Print''''' to ''True''.
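
As a rough illustration of what '''''Padding''''' does to an annotation's boundaries, here is a minimal Python sketch.  It is not Grooper code: the <code>Rect</code> type and the <code>pad_rect</code> and <code>pad_uniform</code> functions are hypothetical names invented for this example, and the sketch assumes coordinates in inches measured from the page's top-left corner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical bounding box in inches, origin at the page's top-left.

    This is an illustration only; it is not a Grooper type.
    """
    left: float
    top: float
    right: float
    bottom: float


def pad_rect(rect, left=0.0, top=0.0, right=0.0, bottom=0.0):
    """Grow a rectangle outward by a separate amount on each edge."""
    return Rect(
        rect.left - left,
        rect.top - top,
        rect.right + right,
        rect.bottom + bottom,
    )


def pad_uniform(rect, amount):
    """Grow a rectangle by the same amount on all four edges."""
    return pad_rect(rect, amount, amount, amount, amount)


# A field instance that is 2in wide and 0.5in tall, padded by 0.1in per side.
field_bounds = Rect(left=1.0, top=2.0, right=3.0, bottom=2.5)
padded = pad_uniform(field_bounds, 0.1)
```

Under these assumptions, a uniform padding of 0.1in widens and heightens the annotation by 0.2in in total, since each edge moves outward independently.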
=== Annotation Types ===


==== Annotation Types ====
There are currently six types of annotations Grooper can add to the PDF it creates:
* '''''[[#Highlight Annotation|Highlight Annotation]]'''''
* '''''[[#Radio Group Widget|Radio Group Widget]]'''''
* '''''[[#Checkbox Widget|Checkbox Widget]]'''''
* '''''[[#Signature Widget|Signature Widget]]'''''
* '''''[[#Textbox Widget|Textbox Widget]]'''''
* '''''[[#Text Annotation|Text Annotation]]'''''


===== Highlight Annotation =====
==== Highlight Annotation ====
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
Line 311: Line 373:
* Like all '''''Annotations''''', this highlight can be printable or not.  When the '''''Print''''' property is ''False'', the highlight will show up when viewed in a PDF viewer but not if the PDF is printed.
* Like all '''''Annotations''''', this highlight can be printable or not.  When the '''''Print''''' property is ''False'', the highlight will show up when viewed in a PDF viewer but not if the PDF is printed.


In this example, we will use the '''''Highlight Annotation''''' to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form.
 
In this example, we will use the '''''Highlight Annotation''''' to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form.  To configure this '''''Annotation''''' we will:
* Select the '''Data Fields''' we wish to highlight.
* Adjust how we want the highlight to look.
|
|
{|
{|
Line 317: Line 382:
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-08.png]]
[[File:Pdf-generate-howto-08.png]]
|-
|-
Line 323: Line 388:
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-09.png]]
[[File:Pdf-generate-howto-09.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Highlight Annotation''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
 
# In the '''''Annotations''''' collection editor, press the "+" button to add the '''''Highlight Annotation''''' annotation.
# Use the '''''Fields''''' property to select the '''Data Fields''' you wish to highlight.
#* Refer to the previous tab if you are unclear how we got to this window.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# Select ''Highlight Annotation'' from the list.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 03 Highlight Annotation 01.png]]
|-
|valign=top|
# This will add a '''''Highlight Annotation''''' to the '''''Annotations''''' list.
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to highlight.  Click the ellipsis icon next to the '''''Fields''''' property to select which '''Data Fields''' you wish to highlight.
#* Whatever result is returned by the selected '''Data Fields''' will be used to create the highlighted annotation.
# In the window that pops up, mark the checkboxes next to the '''Data Fields''' you wish to highlight.
# In the window that pops up, mark the checkboxes next to the '''Data Fields''' you wish to highlight.
#* In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''. Once collected by the '''Extract''' activity, Grooper will know where these results are located on the document.  The '''''Highlight Annotation''''' annotation will then highlight the document as seen in the "''After Annotation''" image above.
#* In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''.
|
#* Be aware, these fields must be extracted by the '''''Extract''''' activity or nothing will be highlighted.
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 03 Highlight Annotation 02.png]]
# Press "OK" when finished.
|-
 
|valign=top|
[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-01.png]]
Optionally, you can control how the highlight looks: its color, size, opacity and whether or not there's a stroke around the highlighted rectangle.


# For instance, we set the '''''Padding''''' property to ''0.1in''
#* This will increase the size of the highlight rectangle by 0.1 inches on all sides.
#* All annotations have the ability to be padded to increase their size, not just '''''Highlight Annotation'''''.
#* You can also expand the '''''Padding''''' property's sub-properties to adjust specific configurations for padding the '''''Left''''', '''''Top''''', '''''Right''''', and '''''Bottom''''' edges.
# While we did not choose to do so, you can add a colored border around the highlighted rectangle by choosing a '''''Border Style''''' (such as ''Solid'' for a solid border or ''Dashed'' for a dashed line border)
#* The '''''Border Color''''' and '''''Border Width''''' properties will further help you configure the border produced.
#* Note:  While the '''''Border Color''''' and '''''Border Width''''' properties are configured to ''64, 64, 64'' and ''1pt'' by default, the '''''Border Style''''' is set to ''None'' by default.  With no border produced, these properties are ignored.  They will not be used to create a border until you choose a '''''Border Style'''''.
# We also set the '''''Fill Color''''' to ''Yellow''.
#* Grooper defaults to green.  This is the same green you see extraction results highlighted when you're testing out extractors in Grooper.
#* You can select colors using a dropdown list or use comma-separated values in the RGB color space.  For example, "yellow" is also ''255, 255, 128'' in the RGB color space.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 03 Highlight Annotation 03.png]]
|}


===== Radio Group Widget =====
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
{|cellpadding=10 cellspacing=5
#* Adjusting '''''Padding''''' for '''''Highlight Annotations''''' is common.  In this example, we increased the highlight's size by 0.1 in on each side.
|valign=top style="width:50%"|
# Determine whether the annotation should be editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if so.
#* Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.


[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-02.png]]




#<li value=6> Adjust the highlight's appearance, as desired, using the '''''Appearance''''' properties.
# Most commonly, users will adjust the '''''Fill Color'''''.
#* Use the dropdown to select from a list of system colors.
#* Or, enter an RGB value using the format <code>#, #, #</code>
#* This property defaults to the "Grooper green" highlight seen in '''Review's''' '''''Data View'''''.  In this example, we've changed it to ''Yellow''.
# Press "OK" when finished (or continue adding more '''''Annotations''''').
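
To make the <code>#, #, #</code> color format concrete, the following Python sketch validates a comma-separated RGB string.  The <code>parse_rgb</code> function is a hypothetical helper written for this article; how Grooper itself parses the '''''Fill Color''''' value is not documented here, so treat this only as a way to sanity-check a value before typing it in.

```python
def parse_rgb(text):
    """Parse a string like "255, 255, 128" into an (r, g, b) tuple.

    Hypothetical helper for illustration; not part of Grooper.
    Raises ValueError if the text is not three integers from 0 to 255.
    """
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values")

    channels = []
    for part in parts:
        if not part.isdecimal():
            raise ValueError(f"not a whole number: {part!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"channel out of range: {value}")
        channels.append(value)
    return tuple(channels)


# Example value: a comma-separated RGB triple in the "#, #, #" format.
fill_color = parse_rgb("255, 255, 128")
```

Each channel must be a whole number from 0 to 255, so a value such as <code>300, 0, 0</code> is rejected by this check.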


The '''''Radio Group Widget''''' annotation allows you to add radio buttons to the document. Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.  This '''''Annotation Type''''' uses OMR extraction techniques (such as '''''Labeled OMR''''' and '''''Zonal OMR''''') to find existing checkboxes on the document.  A group of radio buttons is then overlaid on top of the checkboxes when the '''''PDF Data Mapping''''' behavior builds the PDF file.
[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-03.png]]


For example, we will create a '''''Radio Group Widget''''' annotation from the "US Citizen" '''Data Field's''' result.  We have two choices, either "Yes" or "No".  Only one or the other can be chosen.  So, this is well suited for a radio button group.
==== Radio Group Widget ====
|
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Radio Group Widget''''' overlays a group of radio button PDF elements on top of where a Grooper extractor finds OMR checkboxes on a document. 
* Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.
** Note radio buttons (inserted by '''''Radio Group Widget''''') differ from checkboxes (inserted by '''''Checkbox Widget''''').  For radio buttons, only one choice out of a group may be selected.  For checkboxes, any number of choices may be selected.
* The '''Data Field(s)''' this annotation references ''must'' use an OMR extractor to return results: '''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR'''''.
** This extractor ''must also'' have its '''''Mode''''' set to ''CheckOne'' (only one box out of many may be checked/selected).
* '''''PDF Data Mapping''''' will insert one radio button for each checkbox the extractor locates.
|valign=top|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-13.png]]
[[File:Pdf-generate-howto-13.png]]
|-
|-
Line 383: Line 442:
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-14.png]]
[[File:Pdf-generate-howto-14.png]]
|}
|}
|}
|}
With a '''''Radio Group Widget''''' added to the '''''Annotations''''' list:
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the group of radio buttons.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
#* In our case, we are choosing the "US Citizen" '''Data Field'''.
#* Be aware, this field must (1) use an OMR extractor to return results, (2) have its '''''Mode''''' set to ''CheckOne'', (3) have already been extracted by the '''''Extract''''' activity and (4) have located checkboxes during extraction, or no radio buttons will be placed.
# Press "OK" when finished.
[[File:2023.1 PDF-Data-Mapping 03 04 02 Radio-Group-Widget-01.png]]
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine whether the annotation should be editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if so.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the widget (press a radio button).
# Press "OK" when finished (or continue adding more '''''Annotations''''').
[[File:2023.1 PDF-Data-Mapping 03 04 02 Radio-Group-Widget-02.png]]
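
The four conditions a '''Data Field''' must meet before radio buttons are placed can be summarized as a checklist.  The Python sketch below is only a way of reasoning through them: <code>FieldState</code> and <code>radio_group_problems</code> are hypothetical names invented for this example and do not correspond to any Grooper object or API.

```python
from dataclasses import dataclass

# The OMR extractor types named in this article.
OMR_EXTRACTORS = {"Labeled OMR", "Ordered OMR", "Zonal OMR"}


@dataclass
class FieldState:
    """Hypothetical summary of a Data Field's setup (not a Grooper type)."""
    extractor: str     # extractor type used by the field
    mode: str          # the extractor's check mode, e.g. "CheckOne"
    extracted: bool    # True once the Extract activity has run
    boxes_found: int   # number of checkboxes located during extraction


def radio_group_problems(field):
    """Return the reasons a radio group would not be placed (empty if none)."""
    problems = []
    if field.extractor not in OMR_EXTRACTORS:
        problems.append("field does not use an OMR extractor")
    if field.mode != "CheckOne":
        problems.append("extractor mode is not CheckOne")
    if not field.extracted:
        problems.append("field has not been extracted yet")
    if field.boxes_found < 1:
        problems.append("no checkboxes were located")
    return problems


us_citizen = FieldState("Labeled OMR", "CheckOne", extracted=True, boxes_found=2)
problems = radio_group_problems(us_citizen)
```

An empty list means all four conditions are satisfied.  Any entry in the list names a condition to correct before running '''''Merge''''' or '''''Export'''''.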
===== Be Aware:  Annotations are overlaid on a page's image =====


{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
|valign=top style="width:50%"|
# In the '''''Annotations''''' collection editor, click the "+" button to add the '''''Radio Group Widget''''' annotation.
'''''BE AWARE:''''' The '''''Radio Group Widget''''' overlays radio buttons on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
* Notice the original image for this document used checkboxes, not radio buttons.  We see an "X" inside of a square box.
# Select ''Radio Group Widget'' from the list.
|valign=top|
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 04 Radio Group Widget 01.png]]
[[File:Pdf-generate-howto-18.png]]
|-
|-
|valign=top|
|valign=top|
# This will add a '''''Radio Group Widget''''' to the '''''Annotations''''' list.
You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the radio buttons.  Click the ellipsis icon to the right of the '''''Fields''''' property to select these '''Data Fields'''.
* In this case, the boxes were detected by the "detection only" '''''Box Detection''''' IP command and not removed by the "detection and removal" '''''Box Removal''''' command.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the radio buttons.
* '''''Box Detection''''' finds and stores the checkbox locations and check states but does not actually alter the image in any way.
#* You may use the '''''Padding''''' property to adjust the size of the radio button if you desire.
#* These '''Data Fields''' ''must'' use an OMR based extraction method ('''''Labeled OMR''''', '''''Ordered OMR''''', or '''''Zonal OMR''''') to insert the radio buttons.
# In the "Fields" window that pops up, click the checkbox next to the '''Data Fields''' you wish to use to create the group of radio buttons.
#* In our case, we are choosing the "US Citizen" '''Data Field'''.  Once collected by the '''Extract''' activity, Grooper will know which results you want to use to create the radio buttons.  This will include the checkbox locations and check states stored in the document's layout data.  The '''''Radio Group Widget''''' annotation will then insert radio buttons into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 04 Radio Group Widget 02.png]]
[[File:Pdf-generate-howto-19.png]]
|-
|-
|valign=top|
|valign=top|
Let's briefly look at this "US Citizen" '''Data Field''' and see what's happening behind the scenes when '''''PDF Data Mapping''''' creates the radio buttons.
Maybe you care about this, and maybe you don't.  If you do, use '''''Box Removal''''' instead. 
 
* '''''Box Removal''''' will also find and store the checkbox locations and their check states, but it will ''also'' digitally remove the checkboxes from the document's image.  This will allow Grooper to extract the checkboxes and allow '''''PDF Data Mapping''''' to overlay the radio buttons on a field of blank pixels.
# We have selected the "US Citizen" '''Data Field''' in the '''Grooper''' Node Tree.
* Run '''''Box Removal''''' in an '''IP Profile''' using the '''''Image Processing''''' activity prior to running the '''''Extract''''' activity to do this.
# This '''Data Field''' uses the ''Labeled OMR'' extractor to return its result, looking for checkboxes next to the labels "Yes" and "No" on the document.
# We're going to test the extraction by going to the "Tester" tab.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 04 Radio Group Widget 03.png]]
|-
|valign=top|
|valign=top|
# Click the play icon to test the extraction.
[[File:Pdf-generate-howto-20.png]]
# The box next to "Yes" is checked.  This is ultimately the result returned to the "US Citizen" '''Data Field'''.
#* This is how the '''''Radio Group Widget''''' annotation knows where to place the radio button.  The data instance used to insert the PDF radio button is drawn around the detected box (in this case highlighted in green in the Document Viewer).
#* Since this is the detected checked result, the radio button is configured as "pressed" upon outputting the generated PDF.
# '''''Labeled OMR''''' on this document is returning "Yes" as the result of extraction.
# The box next to "No" is not checked.  The '''''Radio Group Widget''''' will also create radio buttons for the unchecked boxes next to labels on the document as well.
#* The alternate candidate data instances are used to insert the other PDF radio buttons in the group (in this case highlighted in red in the Document Viewer).
#* The unchecked boxes ''must'' be detected from a '''Box Detection''' or '''Box Removal''' '''IP Command''' in order to be inserted in the generated PDF.  They ''must'' be present in the document's layout data file ''before'' the '''Extract''' activity runs.
#* Since this is detected as an unchecked result, the radio button is not pressed upon outputting the generated PDF.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 04 Radio Group Widget 04.png]]
|}
|}


{|class="fyi-box"
==== Checkbox Widget ====
|
{|cellpadding=10 cellspacing=5
'''FYI'''
|
{|class="inner-box" cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
In the case of every '''''Annotation Type''''', '''''PDF Data Mapping''''' inserts the annotation by overlaying it on top of the document.  This can be important to keep in mind for all annotations but is often particularly relevant when inserting radio buttons using the '''''Radio Group Widget'''''.
The '''''Checkbox Widget''''' inserts one or more form-fillable checkboxes into the PDF on top of where a Grooper extractor finds OMR checkboxes.
* Checkboxes are common PDF elements used to indicate a choice from one or many options.
** Note checkboxes (inserted by '''''Checkbox Widget''''') differ from radio buttons (inserted by '''''Radio Group Widget''''').  For radio buttons, only one choice out of a group may be selected.  For checkboxes, any number of choices may be selected.
* The '''Data Field(s)''' this annotation references ''must'' use an OMR extractor to return results:  '''''Labeled OMR''''', '''''Ordered OMR''''', or '''''Zonal OMR'''''
* However, this extractor may use any of the OMR '''''Modes''''' (''CheckOne'', ''CheckMulti'' or ''Boolean'').
* '''''PDF Data Mapping''''' will insert a simple checkbox PDF element for each checkbox the extractor locates.


Notice the original image for this document used checkboxes, not radio buttons. We see an "X" inside of a square box.
 
|
In this example, we will create a '''''Checkbox Widget''''' for the checkboxes extracted using the "Checklist" '''Data Field'''.  This is a '''''Labeled OMR''''' extractor that uses the ''CheckMulti'' '''''Mode''''', indicating any number of the labeled checkboxes may be checked.  Checked or not, the ''Checkbox Widget'' will insert a checkbox element into the generated PDF.
[[File:Pdf-generate-howto-18.png]]
|valign=top|
{|
|style="text-align:center"|
''Before Annotation''
|-
|style="text-align:center"|
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-03.png]]
|-
|style="text-align:center"|
''After Annotation''
|-
|-
|valign=top|
|style="text-align:center"|
The radio button annotations are simply overlaid on the page's image.  You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-04.png]]
|}
|}
 
With a '''''Checkbox Widget''''' added to the '''''Annotations''''' list:
 
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the checkboxes.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
#* In our case, we are choosing the "Checklist" '''Data Field'''.
#* Be aware, this field must (1) use an OMR extractor to return results, (2) have already been extracted by the '''''Extract''''' activity, and (3) have located checkboxes during extraction, or no checkboxes will be placed.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-01.png]]
 
 
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine if you need to change whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the widget (check the checkboxes).
# Press "OK" when finished (or continue adding more '''''Annotations''''').
 
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-02.png]]


In this case, the boxes were stored in the layout data using the '''Box Detection''' '''IP Command'''.  This will find and store the checkbox locations and check states, but not actually alter the image in any way.
{|class="attn-box"
|
&#9888;
|
|
[[File:Pdf-generate-howto-19.png]]
'''''BE AWARE:''''' The '''''Checkbox Widget''''' overlays checkboxes on a page's image.  Any printed checkbox on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
|-
|valign=top|
Maybe you care about this, and maybe you don't.  If you do, you may consider using the '''Box Removal''' '''IP Command''' instead.  '''Box Removal''' will also find and store the checkbox locations and their check states, but it will ''also'' digitally remove the checkboxes from the document's image.


In this case, the boxes were stored in the layout data using the '''Box Removal''' '''IP Command'''.  Since the boxes are removed before the '''Export''' activity, the edges of the boxes are not present on the final image.  The radio button annotations are placed on blank pixels.
For more information, [[#Be Aware: Annotations are overlaid on a page's image|see above.]]
|
[[File:Pdf-generate-howto-20.png]]
|}
|}
|}
 
===== Checkbox Widget =====
==== Signature Widget ====
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
The '''''Signature Widget''''' inserts a signature block into the PDF.
* Signature blocks allow PDFs to capture digital signatures.  This allows you to create a document that can be digitally signed straight from Grooper on export.
* The '''Data Field(s)''' this annotation references ''will typically'' use a zonal extractor to define where the signature block should be: '''''Detect Signature''''' or '''''Highlight Zone''''' most commonly
* Other Value Extractors may work, but these are most typical.  '''''PDF Data Mapping''''' will insert the signature block using the geometric boundaries of the extraction instance.  Zonal extractors are well suited to define fixed boundaries of extraction results.




 
In this example, we will create a '''''Signature Widget''''' annotation for the signature line on the application form, using the "Signature" '''Data Field''' of our '''Data Model'''.  The '''''Signature Widget''''' will insert an interactable signature element into the generated PDF.
 
'''''PDF Data Mapping''''' also has the capability to insert form-fillable checkboxes, using the '''''Checkbox Widget''''' '''''Annotation Type'''''.  This '''''Annotation Type''''' also uses OMR extraction techniques (such as ''Labeled OMR'' and ''Zonal OMR'') to find existing checkboxes on the document.  It works a lot like the '''''Radio Group Widget''''' annotation, except that editable checkboxes, rather than radio buttons, are overlaid on the document.
 
For example, we will create a '''''Checkbox Widget''''' annotation for the checkboxes in the "Checklist" section of this document, the "Application", "Proposal Summary", "Essay", "Resume" and "Recommendation Letter" '''Data Fields'''.  These are Boolean OMR checkboxes, returning "true" if the box next to the corresponding label is checked, and "false" if unchecked.  In either case, checked or not, the ''Checkbox Widget'' will insert an editable checkbox element into the generated PDF.
|
|
{|
{|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|[[File:Pdf-generate-howto-23.png]]
bad picture
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
bad picture
[[File:Pdf-generate-howto-24.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Signature Widget''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
# In the '''''Annotations''''' collection editor, click the "+" button to add the '''''Checkbox Widget''''' annotation.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
# Select ''Checkbox Widget'' from the list.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 05 Checkbox Widget 01.png]]
|-
|valign=top|
# This will add a '''''Checkbox Widget''''' to the '''''Annotations''''' list.
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the checkboxes.  Click the ellipsis icon to the right of the '''''Fields''''' property to select these '''Data Fields'''.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the checkboxes.
#* You may use the '''''Padding''''' property to adjust the size of the checkboxes if you desire.
#* These '''Data Fields''' ''must'' use an OMR based extraction method ('''''Labeled OMR''''', '''''Ordered OMR''''', or '''''Zonal OMR''''') to insert the checkboxes.
# In the window that pops up, check the boxes next to the '''Data Fields''' you wish to use to create the checkboxes.
#* In our case, we are choosing the "Application", "Proposal Summary", "Essay", "Resume" and "Recommendation Letter" '''Data Fields'''.  Once collected by the '''Extract''' activity, Grooper will know which results you want to use to create the checkboxes.  This will include the checkbox locations and check states stored in the document's layout data.  The '''''Checkbox Widget''''' annotation will then insert checkboxes into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 05 Checkbox Widget 02.png]]
|}


===== Signature Widget =====
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the signature block.
{|cellpadding=10 cellspacing=5
#* Press the ellipsis button at the end of the '''''Fields''''' property.
|valign=top style="width:50%"|
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
#* In our case, we are choosing the "Signature" '''Data Field'''.
#* Be aware, this field must (1) have already been extracted by the '''''Extract''''' activity and (2) have drawn a zone defining the location and size of the signature block (most commonly, '''''Detect Signature''''' or '''''Highlight Zone''''' is used to do this).
# Press "OK" when finished.


[[File:2023.1 PDF-Data-Mapping 04 03 04 Signature-Widget-01.png]]




#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine if you need to change whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the element (submit a signature).
# Press "OK" when finished (or continue adding more '''''Annotations''''').


Form-fillable signature boxes can be inserted using the '''''Signature Widget''''' annotation.  This '''''Annotation Type''''' uses a zonal extraction type (such as '''''Detect Signature''''' or '''''Highlight Zone''''') to draw the boundaries of the inserted signature widget.  This allows you to create a document that can be digitally signed straight from Grooper upon exporting the generated PDF.
[[File:2023.1 PDF-Data-Mapping 04 03 04 Signature-Widget-02.png]]


For example, we will create a '''''Signature Widget''''' annotation for the signature line on the application form, using the "Signature" '''Data Field''' of our '''Data Model'''.  The '''''Signature Widget''''' will insert an interactable signature element into the generated PDF.
{|class="attn-box"
|
&#9888;
|
|
'''''BE AWARE:''''' The '''''Signature Widget''''' overlays a signature block on a page's image.  If present, any printed signature on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
For more information, [[#Be Aware: Annotations are overlaid on a page's image|see above.]]
|}
==== Textbox Widget ====
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Textbox Widget''''' inserts text-editable form fields into the generated PDF.
* Form fields allow PDFs to collect and store data entered by a user.
* Users can configure a '''''Textbox Widget''''' to create blank form fields or form fields with a value Grooper extracts already populated.
** For blank form fields, the '''Data Field(s)''' this annotation references should use '''''Highlight Zone''''' to place a blank zone where the field should be inserted.
** For populated form fields, the '''Data Field(s)''' this annotation references can use any extractor that returns a single-instance value (most typically '''''Labeled Value''''').
*** This allows Grooper to not only generate a PDF with form fields where they weren't present in the source document, but prefill them with data Grooper collects.
* Be aware, a '''''Textbox Widget''''' differs from a '''''Text Annotation'''''.  Where '''''Textbox Widget''''' will insert a text-editable form field, '''''Text Annotation''''' adds a text comment to the PDF.
|valign=top|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-23.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-04.png]]
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-24.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-05.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
In this example, we will use the '''''Textbox Widget''''' to insert a form field for the "Signature Date" '''Data Field'''.  This field uses '''''Labeled Value''''' to extract the date. '''''PDF Data Mapping''''' will overlay the form field on top of the extraction result.
|valign=top style="width:40%"|
* FYI:  We will also adjust the generated widget's size using the '''''Padding''''' property. This is common when configuring '''''Textbox Widgets''''' when the font size you want to use for the form field is larger than the printed typeface on the document.
# In the '''''Annotations''''' collection editor, click the "+" button to add the '''''Signature Widget''''' annotation.
 
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
 
# Select ''Signature Widget'' from the list.
With a '''''Textbox Widget''''' added to the '''''Annotations''''' list:
|valign=top|
 
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 06 Signature Widget 01.png]]
# Use the '''''Fields''''' property to select the '''Data Field(s)''' you wish to use to create text-editable form fields.
|-
#* Press the ellipsis button at the end of the '''''Fields''''' property.
|valign=top|
# In the window that pops up, mark the checkboxes next to the '''Data Field(s)''' you wish to select.
# This will add a '''''Signature Widget''''' to the '''''Annotations''''' list.
#* In our case, we are choosing the "Signature Date" '''Data Field'''.
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the signature box.  Click the ellipsis icon to the right of the '''''Fields''''' property to select these '''Data Fields'''.
#* Be aware, these fields must be extracted by the '''''Extract''''' activity or no textbox will be generated.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the signature box widget.
# Press "OK" when finished.
#* You may use the '''''Padding''''' property to adjust the size of the signature box if you desire.
#* Zonal based extraction methods (such as '''''Detect Signature''''' and '''''Highlight Zone''''') are typically used as the '''Data Field's''' extractor type.
# When the window pops up, check the boxes next to the '''Data Fields''' you wish to use to create the signature box.
#* In our case, we are choosing the "Signature" '''Data Field'''. Once collected by the '''Extract''' activity, Grooper will be supplied the size and location of the '''Data Field's''' extraction zone, which will form the size and location of the PDF signature widget. The '''''Signature Widget''''' annotation will then insert the form-fillable signature box into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 06 Signature Widget 02.png]]
|}


Just like any '''''Annotation Type''''', the extraction result from the '''Data Field''' is critical for placing the signature annotation on the generated PDF.  Let's look at the "Signature" '''Data Field's''' result to understand a little better how these results are used to create the signature widget.
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-01.png]]


In our case, we're using the '''''Detect Signature''''' extractor type to supply these results.  The '''''Detect Signature''''' extractor is perfectly suited for the '''''Signature Widget''''' '''''Annotation Type'''''. 
* It actually combines both Zonal and OMR based extraction techniques to determine if a signature is present in the zone.  It sets the boundaries of where you expect to find a signature using Zonal based methods and detects if the signature is present by counting the percentage of filled pixels in the zone, which is the basis of OMR based extraction methods.  You can then output different values if the zone is filled above or below a certain percentage.  In this case, the extractor returns "Not Signed" because there aren't enough pixels present in the extraction zone to count as filled.  If there were a signature present, there'd be more pixels present, accounting for a higher filled percentage.
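
The fill-percentage idea described above can be sketched in a few lines of Python.  This is only an illustration of the general technique, not Grooper's implementation: the function names, the binary pixel grid, and the 5% threshold are all hypothetical choices made for this example.

```python
from typing import List


def fill_percentage(zone: List[List[int]]) -> float:
    """Return the percentage of filled pixels in a binary zone.

    `zone` is a hypothetical stand-in for the pixels inside the
    extraction zone: 1 means a dark (filled) pixel, 0 means blank.
    """
    total = sum(len(row) for row in zone)
    if total == 0:
        return 0.0
    filled = sum(sum(row) for row in zone)
    return 100.0 * filled / total


def signature_status(zone: List[List[int]], threshold_percent: float = 5.0) -> str:
    """Report "Signed" when the zone is filled at or above the threshold.

    The 5% default is an assumed value for illustration only.
    """
    if fill_percentage(zone) >= threshold_percent:
        return "Signed"
    return "Not Signed"


# A blank 4 x 10 zone: no filled pixels at all.
blank_zone = [[0] * 10 for _ in range(4)]

# The same zone with 4 of its 40 pixels filled (10%).
signed_zone = [row[:] for row in blank_zone]
for column in range(2, 6):
    signed_zone[1][column] = 1
```

With these inputs, <code>signature_status(blank_zone)</code> falls below the threshold and <code>signature_status(signed_zone)</code> meets it.  Either way, the zone's boundaries are still available to place the widget.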


This is great for our purposes because it gives us the exact information we need for the '''''Signature Widget''''', which is an extraction zone.  Grooper needs a data instance indicating the size and location for the generated signature widget.
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
* But wait, there's more!  We also get some bonus information about whether or not there's a signature present.  Does the '''''Signature Widget''''' '''''Annotation Type''''' need to know if there's a signature present?  No.  It does not.  It will place the widget no matter what the result is.  But might that information be otherwise useful to you?  Probably.
#* Adjusting '''''Padding''''' for '''''Textbox Widgets''''' is common if the desired font size in the textbox differs from that printed on the source document.  In this example, we increased the textbox's size by 0.1 in on each side.
# Determine if you need to change whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to edit the value ''inside'' the textbox.  To configure that, use the '''''Read Only''''' property.
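
To picture what a 0.1 in '''''Padding''''' on each side does to a widget's boundaries, here is a minimal Python sketch.  The <code>Rect</code> type, the <code>pad</code> function and the sample coordinates are hypothetical, and the sketch assumes all measurements are in inches.  It is not how Grooper stores or applies padding internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle in inches, measured from the page's top-left corner."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def pad(rect: Rect, padding: float) -> Rect:
    """Grow the rectangle outward by `padding` on every side."""
    return Rect(
        left=rect.left - padding,
        top=rect.top - padding,
        right=rect.right + padding,
        bottom=rect.bottom + padding,
    )


# Hypothetical extraction result: a date printed in a 1.5 in x 0.25 in area.
extracted = Rect(left=1.0, top=2.0, right=2.5, bottom=2.25)
padded = pad(extracted, 0.1)
```

Padding every side by 0.1 in makes the box 0.2 in wider and 0.2 in taller overall, which leaves room for a larger font than the one printed on the source document.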


{|cellpadding=10 cellspacing=5
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-02.png]]
|valign=top style="width:40%"|
# We have selected the "Signature" '''Data Field''' in our '''Data Model'''.
# This '''Data Field''' uses the '''''Detect Signature''''' extractor to draw the extraction zone used to insert the signature widget.
# This extractor uses the '''''Text Region''''' '''''Location''''' option.
# This gives us the ability to anchor the extraction zone to an extractable text anchor, using the '''''Text Extractor''''' property.
#* In this case we've anchored the zone to the word "Signature" outlined in blue in the document viewer.  Where do we want to place the extraction zone (and ultimately the signature widget)?  On the signature line.  How do we know where that line is?  It's above the text label "Signature".
# The extraction zone itself is drawn using the '''''Translation''''' and '''''Adjustment''''' properties.
#* This allows us to set the size ('''''Adjustment''''') and location ('''''Translation''''') of the extraction zone (and ultimately the signature widget) relative to the '''''Text Extractor's''''' result. 
#* The extraction zone will be the green rectangle in the document viewer.
# Click over to the "Tester" tab and test the extraction.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 06 Signature Widget 03.png]]
|-
|valign=top|
# When the '''''PDF Data Mapping''''' behavior builds the PDF, using the '''''Signature Widget''''' annotation, the extraction zone's size and location forms the inserted signature widget.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 06 Signature Widget 04.png]]
|}


===== Textbox Widget =====
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|


#<li value=6> Adjust the textbox's other properties as desired.
#* These properties give you the ability to adjust the font and font size inside the textbox.
#* Please note: If you want to prevent a reader from editing the Grooper collected value inside the textbox, turn '''''Read Only''''' to ''True''.
# Press "OK" when finished (or continue adding more '''''Annotations''''').
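
The difference between '''''Allow Edit''''', '''''Print''''' and '''''Read Only''''' can be pictured with the flag bits the PDF specification defines for annotations and form fields.  The sketch below is an assumption about how such properties ''could'' map onto those bits.  This article does not state which flags Grooper actually writes, and the <code>widget_flags</code> function is hypothetical.

```python
# Bit positions as defined by the PDF specification (ISO 32000-1).
ANNOT_FLAG_PRINT = 1 << 2      # annotation flag bit 3: shown when printed
ANNOT_FLAG_LOCKED = 1 << 7     # annotation flag bit 8: cannot be moved or resized
FIELD_FLAG_READ_ONLY = 1 << 0  # form field flag bit 1: value cannot be changed


def widget_flags(allow_edit: bool, print_: bool, read_only: bool) -> tuple:
    """Return (annotation_flags, field_flags) for a textbox widget.

    Assumed mapping, for illustration only:
      - Allow Edit = False -> set the Locked annotation flag
      - Print = True       -> set the Print annotation flag
      - Read Only = True   -> set the ReadOnly field flag
    """
    annotation_flags = 0
    if print_:
        annotation_flags |= ANNOT_FLAG_PRINT
    if not allow_edit:
        annotation_flags |= ANNOT_FLAG_LOCKED
    field_flags = FIELD_FLAG_READ_ONLY if read_only else 0
    return annotation_flags, field_flags


# Widget locked in place and hidden in print, value still editable.
defaults = widget_flags(allow_edit=False, print_=False, read_only=False)

# Same widget, but the prefilled value is protected from changes.
protected = widget_flags(allow_edit=False, print_=False, read_only=True)
```

The two settings are independent: locking the widget as a PDF element does not stop a reader from typing in it, and protecting its value does not stop a reader from moving it.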


[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-03.png]]


==== Text Annotation ====
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Text Annotation''''' inserts a text comment in the PDF. 
* This has two primary uses:
** Insert comments into the PDF that are viewable when opening the PDF in a PDF viewer, but not printable.
** Print a simple text note on a page.
*** Commonly, users will want to print a word like "PROCESSED" on the output PDF.  This notes the document has been processed through Grooper.
* The '''Data Field(s)''' may use any kind of extractor as long as it produces a result with (1) a location on the page to place the comment and (2) a text value to add to the comment.
* Be aware, a '''''Textbox Widget''''' differs from a '''''Text Annotation'''''.  Where '''''Textbox Widget''''' will insert a text-editable form field, '''''Text Annotation''''' adds a text comment to the PDF.




The '''''Textbox Widget''''' '''''Annotation Type''''' will insert editable text boxes into the generated PDF.  One simple way to use this functionality is to use the '''''Highlight Zone''''' extractor type to place a blank zone where you want to place an empty text box on the PDF.  However, any extractor type can be used to define the textbox's location.  Furthermore, if the '''Data Field''' used to create the annotation collects a value during the '''Extract''' activity, not only will a textbox be inserted into the generated PDF, but it will be prefilled with the '''Data Field's''' extracted value upon export.
In this example, we will use a '''''Text Annotation''''' to print the word "PROCESSED" on the first page of the PDF generated by '''''PDF Data Mapping'''''.
* We will use the "IsProcessed" '''Data Field''' to do this.  The extraction logic to make this happen requires a less-than-common technique.  We will show you how we build this '''Data Field''' in the [[#Technique: "IsProcessed" Data Field]] section.


For example, we will use the '''''Textbox Widget''''' functionality to fill out the blank coversheet on the first page of our application packet.  We will end up using a '''''Highlight Zone''''' extractor to define the size and location of the text box.  However, we're going to go one step further and populate the '''Data Fields''' used with some information from other '''Data Fields''' in our '''Data Model'''.  In the end, '''''PDF Data Mapping''''' will not only insert editable textboxes into the generated PDF, but also fill them in with text, leaving this blank coversheet automatically populated with information collected during the '''Extract''' activity.
|valign=top|
|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-28.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-04.png|500px]]
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-29.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-05.png|500px]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Text Annotation''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
 
# In the '''''Annotations''''' collection editor, click the "+" button to add the '''''Textbox Widget''''' annotation.
# Use the '''''Fields''''' property to select the '''Data Field(s)''' you wish to use to insert the text comment.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# Select ''Textbox Widget'' from the list.
# In the window that pops up, mark the checkboxes next to the '''Data Fields''' you wish to select.
|valign=top|
#* In our case, we are choosing the "IsProcessed" '''Data Field'''.
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 01.png]]
#* Be aware, these fields must (1) be extracted by the '''''Extract''''' activity and (2) hold a location and value or no comment will be added.
|-
# Press "OK" when finished.
|valign=top|
 
# This will add a '''''Textbox Widget''''' to the '''''Annotations''''' list.
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-02.png]]
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the textboxes.  Click the ellipsis icon to the right of the '''''Fields''''' property to select these '''Data Fields'''.
 
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the textbox widget.  If that '''Data Field''' collected a value during the '''Extract''' activity, it will also be filled with the returned value.
# In the window that pops up, check the boxes next to the '''Data Fields''' you wish to use to create the textboxes.
#* In our case, we are choosing the "Candidate", "Title of Proposal" and "Country of Travel" '''Data Fields'''. Once collected by the '''Extract''' activity, Grooper will be supplied the sizes and locations of the '''Data Field's''' data instances for each result.  This will form the size and location of the textbox widget. The ''Textbox Widget'' annotation will then insert the form-fillable textbox into the generated PDF as seen in the "''After Annotation''" image above.  These boxes will also be prefilled with the extraction results from each '''Data Field'''.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 02.png]]
|-
|valign=top|
The '''''Textbox Widget''''' annotation has some additional configuration options as well.


# As with all '''''Annotation Types''''', you can optionally adjust the size of the annotation using the '''''Padding''''' property.
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# You can also change the font and font size of the editable text in the textbox using the '''''Font Name''''' and '''''Font Size'''''.
# Determine whether the annotation should be editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if so.
|
#* Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 03.png]]
#* In our case, we ''do'' want this comment printed when the document is printed.  So, we've changed '''''Print''''' to ''True''.
|-
|valign=top|
Behind the scenes, there are at least two things going on in how we've set up extraction for these '''Data Fields''', which ultimately supplies the result used to insert the ''Textbox Widget'' annotation.


First, we used the ''Highlight Zone'' extractor type to draw the textbox, defining the size and location of the annotation upon generating the PDF.
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-02.png]]


# We have selected the "Candidate" '''Data Field''' in our '''Data Model'''.
# Each '''Data Field's''' '''''Value Extractor''''' is set to ''Highlight Zone''.
# We used the ''Relative Region'' '''''Location''''' option to anchor an extraction zone to the box next to the label "Candidate".
#* This will form the size and location of the inserted textbox annotation.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 04.png]]
|-
|valign=top|
Second, we used an expression to return a value, using the results of other '''Data Fields''' in our '''Data Model'''.


# We've used the '''''Calculated Value''''' property (in '''''Calculate Mode''''' ''Always Set'') to return the full name of the candidate extracted by the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''.
#<li value=6> Adjust the comment's appearance, as desired, using the '''''Appearance''''' properties.
#* The full expression is as follows: <code>Applicant_Information.First_Name + " " + Applicant_Information.Middle_Initial + " " + Applicant_Information.Last_Name</code>
#* Users may change the comment's font and font size with the '''''Font Name''''' and '''''Font Size''''' properties.
# This will take the extraction results of these three '''Data Fields''' and concatenate them with space characters in between.
#* Users may select a '''''Fill Color''''' and '''''Text Color''''' in one of two ways:
|
#** Using the dropdown to select from a list of system colors
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 05.png]]
#** Or, entering an RGB value using the format <code>#, #, #</code>
|-
#** Be aware, there is no true "transparent" '''''Fill Color''''' option. The selectable ''Transparent'' option is a system color that equates to "white".
|valign=top|
# Press "OK" when finished (or continue adding more '''''Annotations''''').
# However, if we go to the "Tester" tab...
# ... and test extraction, we're going to get an error.
#* We're in the wrong scope!  We need to go up to the '''Data Model's''' level and test extraction there.  We need the full '''Data Model's''' results to do what we're trying to do here.  When testing extraction on just the "Candidate" '''Data Field''', Grooper can't "see" the results of the "Last Name", "First Name" and "Middle Initial" '''Data Fields''' to combine them.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 06.png]]
|-
|valign=top|
# Once we test extraction on the '''Data Model''', we'll see what results are actually collected by the '''Extract''' activity.
# Make sure you're on the "Tester" tab and test the extraction.
# The '''''Calculated Value''''' expression we configured forms one result for the "Candidate"...
# ...using the results of the "Last Name", "First Name" and "Middle Initial" '''Data Fields'''.
# With a result returned and zone drawn upon extract, the ''Textbox Widget'' annotation has all the information it needs to place the form-fillable textbox and fill it with the results.
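
The concatenation performed by the '''''Calculated Value''''' expression can be sketched outside of Grooper.  The Python below is only an illustration of the string logic, not Grooper's expression syntax.  The function name <code>candidate_full_name</code> is hypothetical, and the sample values assume our example applicant.

```python
def candidate_full_name(first_name: str, middle_initial: str, last_name: str) -> str:
    """Join the three name parts with single space characters,
    mirroring: First_Name + " " + Middle_Initial + " " + Last_Name."""
    return first_name + " " + middle_initial + " " + last_name


# Sample values assumed from the article's example applicant.
print(candidate_full_name("Dog", "O", "Doggerson"))  # prints "Dog O Doggerson"
```

Note that plain concatenation like this leaves two consecutive spaces if the middle initial is empty.  If that matters for your documents, you may want to account for it in your own expression.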


[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-03.png]]


{|class="fyi-box"
===== Technique:  "IsProcessed" Data Field =====
|
'''FYI'''
|
This certainly isn't the '''only''' way to set up a '''Data Field''' for a ''Textbox Widget''.  This is just how we did it for the point of illustrating the ''Textbox Widget'' functionality.  You are not '''required''' to use the ''Highlight Zone'' extractor type.  You can use whatever extractor type best suits your document's needs.  Often Grooper users will use the ''Reference'' extractor to point to a '''Data Type's''' results and adjust the size of the ''Textbox Widget'' using its '''''Padding''''' property.
|}
|
[[File:2023 PDF Data Mapping - 2023 02 How To 02 Annotations 07 Textbox Widget 07.png]]
|}


===== Text Annotation =====
To print the word "PROCESSED" on the PDF, we used a specific technique.  A '''''Text Annotation''''' just needs two things from a '''Data Field''' to insert the annotation:  (1) a location on the page to place the comment and (2) a text value to add to the comment.  The word "PROCESSED" did not exist on the source PDF.  So, we had to figure out a way to use a '''Data Field''' to ''generate'' a result rather than extract it.


The '''''Text Annotation''''' inserts a text comment in the PDF. This has two primary uses:
We did this in essentially two steps:
* Insert comments into the PDF that are viewable when opening the PDF in a PDF viewer, but not printable.
# Use the '''''Highlight Zone''''' extractor to define where the annotation should be printed.
* Print a simple text note on a page.
# Use a '''''Calculated Value''''' to define the text we want to print (the word "PROCESSED").
** Commonly, users will want to print a word like "PROCESSED" on the output PDF.  This notes the document has been processed through Grooper.


To be continued


=== Configure PDF Data Mapping for Bookmarks ===
This gives a '''''Text Annotation''''' everything it needs to insert the comment: (1) a location and (2) some text.


<tabs style="margin:20px">
== How To: Configure Bookmarks ==
<tab name="About" style="margin:20px">
=== About ===


Bookmarks in PDFs aid readers when navigating through multipage documents.  '''''PDF Data Mapping''''' can insert bookmarks into the generated PDF to take advantage of this functionality.  This can be done in one of two ways (or both):
Bookmarks in PDFs aid readers when navigating through multipage documents.  '''''PDF Data Mapping''''' can insert bookmarks into the generated PDF to take advantage of this functionality.  This can be done in one of two ways (or both):


# Using a '''Batch Folder's''' child document folders.
# Using a document folder's ('''Batch Folder''') child folders ('''Batch Folder''').
# Using the document's extracted '''Data Fields'''.
# Using a document folder's extracted '''Data Fields'''.


{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
We will focus on the bookmarking method (as it is more common).  Often, you will import a file into Grooper containing multiple documents that you want to separate and classify, but that otherwise all belong together in one way or another.
In this tutorial we take an application packet separated into component child documents and use '''''PDF Data Mapping's''''' '''''Bookmarking''''' property to create bookmarks for each one.


Such is the case with our study abroad application packet.  The application packet as a whole consists of five separate and distinguishable documents.
The application packet as a whole consists of five separate and distinguishable documents.
# The application itself (and a coversheet)
# The application itself (and a coversheet)
# A proposal summary
# A proposal summary
Line 709: Line 753:
|-
|-
|valign=top|
|valign=top|
Our goal is to create a bookmark in the generated PDF file for each of these component documents (or child documents as we will come to call them).   
Our goal is to create a bookmark in the generated PDF file for each of these component documents (child documents).   


Rather than exporting five separate PDF files, one for each component document, we will export a single PDF for the whole packet with navigable bookmarks corresponding to each component document.
Rather than exporting five separate PDF files, one for each component document, we will export a single PDF for the whole packet with navigable bookmarks.
# Application - For the application itself (and its coversheet)
 
# Proposal Summary - For the proposal summary
 
# Resume - For the student's resume
We will also demonstrate how to use '''Data Fields''' for bookmarking.  This allows us to insert PDF bookmarks for locations of extracted data.
# Rec Letter - For the letter of recommendation
* The "Signature" bookmark in this example would take the reader to the signature line of the PDF, using the location extracted by the "Signature" '''Data Field''' in our '''Data Model'''.
# Essay - For the essay
|
|
[[File:Pdf-generate-about-06.png]]
[[File:2023.1 PDF-Data-Mapping 01 02 About-Bookmarks-01.png]]
|}
|}


</tab>
=== Bookmarking Option 1:  Child document/folder bookmarks ===
<tab name="Prereqs - Split Pages, Separation, and Classification" style="margin:20px">
 
=== Prereqs - Split Pages, Separation, and Classification ===
There are two ways the '''''Bookmarking''''' feature can insert bookmarks into a PDF generated by '''''PDF Data Mapping'''''.
# '''It can insert a bookmark for each child document/folder.'''
# It can insert a bookmark for selected (single instance) '''Data Fields'''.
 
This section will detail how to insert bookmarks using child documents.
 
==== Option 1 Prereqs: Separated child documents ====
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
 
If enabled, '''''Bookmarking''''' will automatically add bookmarks to a PDF if a document has ''child'' documents in the '''Batch's''' folder hierarchy.
* If a document at folder level 1 is exported and has two child documents, the generated PDF will have two bookmarks.
* Clicking on the bookmark will take the reader to that child document's page in the PDF.
 
 
For this to work:
* The parent document folder must have separable child pages.
** Either from scanning pages in with a scanner or using the '''''Split Pages''''' activity to generate pages from an imported PDF.
* These child pages must then be separated into child folders.
** Either using a '''Separation Profile''' when scanning or using the '''''Separate''''' activity.


{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
In order to accomplish this goal, we're going to have to do some things to this application packet before we configure '''''PDF Data Mapping'''''.


By the end of it, we're looking for a '''Batch''' whose documents have a structure like this.  The documents in this batch consist of two '''Batch Folder''' levels.
Technically speaking, that's all you need.  '''''PDF Data Mapping''''' will add PDF bookmarks for every child document and name it using each child folder's name.
# '''Folder Level 1''':  This is the parent document folder. It is the container for the full document.  All seven pages of the application packet in this case.
* Be aware, without classifying the child documents these names will just be "Folder (1)" "Folder (2)" "Folder (3)" and so on.
# '''Folder Level 2''':  These are the child document folders for the parent document.  They are the containers for each component document of the full application packet.


This is what we want to end up with.  How did we get there?  Long story short, we have some document separation and classification requirements before we can insert bookmarks in the generated PDF.  The bookmarks are inserted for each child document folder and named after their classified '''Document Type's''' name.  In order to do that, we need to split out the pages of the imported document, separate them into child document folders, and classify them first.
{|style="text-align:center;"
|
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 01 Prereqs 01.png]]
''Not separated''
|-
|valign=top|
The full application document came into Grooper like this. A 7 page PDF file with each of these 5 component documents was imported into a new '''Batch'''.  This is now the parent document folder at '''Folder Level 1'''.


But there's documents in them there document!  How do we get them out?
''No child folders''
|
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 01 Prereqs 02.png]]
''Separated''
|-
|valign=top|
First, we need to use the '''Split Pages''' activity to create child '''Batch Page''' objects. 


This will split out the pages of the imported PDF file, creating one child '''Batch Page''' for each page in the PDF on the parent document folder.  Now we have page objects we can manipulate in our '''Batch'''.
''Has child folders''
|
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 01 Prereqs 03.png]]
''Separated and classified''
 
''Has child document folders.''
 
|-
|-
|valign=top|
|valign=top|
Now that we have '''Batch Page''' objects in our '''Batch''', we can use the '''Separate''' activity to insert the second folder level. This is the first step in organizing these pages into child documents. We need to distinguish between one collection of pages as a document and another collection of pages as a document. Creating folders is the first part of that equation.
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-01.png]]
|valign=top|
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-02.png]]
|valign=top|
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-03.png]]


Now, we have child document folders for this parent document folder, but they are just blank folders.  There is nothing to distinguish one folder from the next.
''What about Page 1 there?''


{|class="attn-box"
''Is it in a folder?  No.  Then it won't get a bookmark.''
|
&#9888;
|
By default, the '''Separate''' activity runs on the '''Batch''' level scope, inserting folders at Folder Level 1.  When separating child documents like this, you will need to change the '''''Scope''''' property of the '''Separate''' activity to run it at the '''Folder Level 1''' scope.  This will separate the loose pages of folders at Level 1, inserting child document folders at Level 2 below the parent folder at Level 1.
|}
|}
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 01 Prereqs 04.png]]
|-
|valign=top|
And, that's the second part of the organization equation, classification.  Next, these folders will be assigned a '''Document Type''' from our '''Content Model''' using the '''Classify''' activity.


{|class="attn-box"
==== Adding bookmarks for child documents/folders ====
|
'''''PDF Data Mapping''''' will create bookmarks for child documents/folders by default.  There is no configuration required besides enabling the '''''Bookmarking''''' property.
&#9888;
 
|
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
By default, the '''Classify''' activity runs on the '''Folder Level 1''' scope, classifying document folders at the first folder level in the '''Batch''' hierarchy.  We want to classify the child document folders at '''Folder Level 2'''. When classifying child document folders like this, you will need to change the '''''Scope''''' property of the '''Classify''' activity to run at the '''Folder Level 2''' scope.
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Bookmarking''''' property to ''Enabled''.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 05 04 Bookmarks-for-Child-Docs-01.png]]
 
 
That's it!  It's that simple! 
 
As long as the document folder '''''PDF Data Mapping''''' is applied to has child documents/folders, bookmarks will be created for each child document.
 
 
[[File:2023.1 PDF-Data-Mapping 05 04 Bookmarks-for-Child-Docs-02.png|900px]]
 
=== Bookmarking Option 2:  Data Field bookmarks ===
 
There are two ways the '''''Bookmarking''''' feature can insert bookmarks into a PDF generated by '''''PDF Data Mapping'''''.
# It can insert a bookmark for each child document/folder.
# '''It can insert a bookmark for selected (single instance)''' '''Data Fields'''.
 
This section will detail how to insert bookmarks using '''Data Fields'''.  This allows '''''PDF Data Mapping''''' to bookmark important field value locations extracted by Grooper in the output PDF.
 
==== Option 2 Prereqs: Data Fields and extracted data ====
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
 
'''''Bookmarking''''' can also insert PDF bookmarks using extracted data and their location.  '''Data Fields''' collect results using extractors which return results from the source document.  '''''Bookmarking''''' will use these results' locations to embed this kind of bookmark.
 
For this to work:
* You must have these '''Data Fields''' defined in a '''Data Model''' and configured to return results.
* Data must be saved for each '''Data Field''' prior to the PDF being generated.
** The '''''Extract''''' activity must run ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
** If performing user assisted data review, the '''''Review''''' activity must complete ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
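
The ordering requirement above can be expressed as a simple check.  This is a minimal Python sketch, assuming a '''Batch Process''' can be represented as a plain list of activity names.  The function <code>pdf_step_order_ok</code> and the list representation are hypothetical and are not part of Grooper.

```python
def pdf_step_order_ok(steps: list[str]) -> bool:
    """Return True if Extract (and Review, when present) comes before
    the first step that generates the PDF (Merge or Export).

    Returns False when there is no Extract step or no PDF-generating step.
    """
    pdf_steps = [i for i, name in enumerate(steps) if name in ("Merge", "Export")]
    if not pdf_steps or "Extract" not in steps:
        return False
    first_pdf_step = min(pdf_steps)
    required = [steps.index("Extract")]
    if "Review" in steps:
        required.append(steps.index("Review"))
    return all(i < first_pdf_step for i in required)


# Hypothetical step list for illustration only.
process = ["Split Pages", "Separate", "Classify", "Extract", "Review", "Export"]
print(pdf_step_order_ok(process))  # prints True
```

If the check fails for your own step order, the selected '''Data Fields''' would have no saved results when the PDF is generated, so there would be nothing to bookmark.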
 
==== Adding bookmarks for Data Fields ====


Furthermore, that parent document folder would need a '''Document Type''' assigned to it at some point as well.  The '''Batch Process''' for this '''Batch''' might have two '''Classify''' activities.  One would run on Folder Level 1 to classify the parent document folder and another would run on Folder Level 2 to classify the child document folders.
'''''PDF Data Mapping''''' will insert bookmarks for extracted '''Data Field''' value locations by simply selecting which '''Data Field(s)''' you want to bookmark.
|}
* Please note: Only single-instance '''Data Fields''' may be bookmarked.


Now, we have everything we need to configure the bookmarking functionality of '''''PDF Data Mapping'''''.  Bookmarks will be created every time a new child document is encountered and named after the '''Document Type''' assigned to that folder. 


When the full PDF is generated, a bookmark named "Application" will be inserted at the first page of the PDF. That child document is two pages long. The third page of the full PDF will be the proposal summary. So a bookmark named "Proposal Summary" will be inserted at page three. A "Resume" bookmark will be inserted at page four. And so on.
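
The page arithmetic behind these bookmarks can be sketched in a few lines of Python.  This is only an illustration, not how Grooper is implemented.  The function name <code>bookmark_pages</code> is hypothetical.  The page counts for "Application", "Proposal Summary" and "Resume" follow the description above, while the split of the remaining three pages between "Rec Letter" and "Essay" is an assumption.

```python
def bookmark_pages(child_docs: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Given (document type name, page count) pairs in packet order,
    return (bookmark name, 1-based start page) pairs."""
    bookmarks = []
    next_page = 1
    for name, page_count in child_docs:
        bookmarks.append((name, next_page))
        next_page += page_count
    return bookmarks


# Seven pages total.  The last two page counts are assumed for illustration.
packet = [
    ("Application", 2),
    ("Proposal Summary", 1),
    ("Resume", 1),
    ("Rec Letter", 1),
    ("Essay", 2),
]
print(bookmark_pages(packet))
# prints [('Application', 1), ('Proposal Summary', 3), ('Resume', 4), ('Rec Letter', 5), ('Essay', 6)]
```

Each bookmark lands on the first page of its child document, which is why unclassified or unfoldered pages never receive one.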
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
|
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 01 Prereqs 05.png]]
# Change the '''''Bookmarking''''' property to ''Enabled'' and expand its sub-properties.
|}
# Select '''''Data Elements''''' and press the ellipsis button at the end.
# A "Data Elements" selection editor will appear.
# Select the '''Data Field''' whose location you wish to bookmark.
#* Please note:  Only single-instance '''Data Fields''' may be bookmarked.
# Press "OK" when finished selecting '''Data Fields'''.
# Press "OK" when finished configuring '''''PDF Data Mapping'''''.
 
[[File:2023.1 PDF-Data-Mapping 05 02 02 Bookmarks-for-Data-Fields-01.png]]


{|class="fyi-box"
|
'''FYI'''
|
There are many ways to separate and classify documents, including ''[[ESP Auto Separation]]'' which both separates and classifies documents with a single activity (just '''Separate''').  But this is the general idea to get us where we need to go. 


One way or another, create classified child document folders from a parent document folder.  That way when we generate the PDF for the parent document folder upon export, bookmarks will be created for the classified child document folders.
As long as the document folder '''''PDF Data Mapping''''' is applied to has extracted the selected '''Data Field(s)''' with an '''''Extract''''' activity, bookmarks will be created for each '''Data Field''' selected.
|}
</tab>
<tab name="Add the Behavior and Configure It for Bookmarking" style="margin:20px">
=== Add the Behavior and Configure It for Bookmarking===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
Bookmarking is one of the configuration options for the ''PDF Data Mapping'' '''''Behavior'''''.  A '''Content Type''' '''''Behavior''''' can tell an activity (specifically the '''Export''' activity, in the case of '''''PDF Data Mapping''''') how to use the '''Content Type''' to do something (in this case, how to use the '''Content Model's''' '''Document Types''' to insert bookmarks into the PDF upon export).


# All '''''Behaviors''''' are added to a '''Content Type''' object.
[[File:2023.1 PDF-Data-Mapping 05 02 02 Bookmarks-for-Data-Fields-02.png|center]]
#* We will add the '''''PDF Data Mapping''''' behavior to this '''Content Model''' named "PDF Data Mapping - UNESCO Packet".
# All '''''Behaviors''''' are added using the '''''Behaviors''''' property.  Select the '''''Behaviors''''' property and press the ellipsis button at the end to add '''''PDF Data Mapping'''''.
# In the '''''Behaviors''''' editor window that pops up, click the "+" button to add a '''''Behavior'''''.
# Choose ''PDF Data Mapping'' from the list.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 02 Add Behavior and Configure It 01.png]]
|-
|valign=top|
# Once added, you will see '''''PDF Data Mapping''''' added to the list on the left.  Select it.
# To enable the bookmarking functionality, in the right panel, click the checkbox next to '''''Bookmarking''''' property.
# Open up the subproperties and we see we have two Label properties. Here you can change the '''''Label Style''''' and '''''Label Color''''' to your preference.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 02 Add Behavior and Configure It 02.png]]
|-
|valign=top|
For our purposes, this is all we need to configure at this point.  However, be aware of the '''''Bookmarking''''' configuration options.


# Click the ellipsis icon to the right of the '''''Data Elements''''' property.
== How To: Configure Metadata ==
# In the new "Data Elements" window that pops up, click the check boxes next to the elements you want bookmarked.
#* You can add bookmarks to any of the data elements. You can expand the '''Data Sections''' to add individual '''Data Fields''' within those sections as well if you like. However, if you add a '''Data Field''' that is a child of a '''Data Section''', then that '''Data Section''' must be added too.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 03 Bookmarks 02 Add Behavior and Configure It 03.png]]
|}
</tab>
</tabs>
=== Configure PDF Data Mapping for Metadata ===


<tabs style="margin:20px">
<tab name="About" style="margin:20px">
=== About ===
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|style="width:50%" valign=top|
|style="width:50%" valign=top|
The '''''PDF Data Mapping''''' behavior has the ability to create and insert additional metadata into the generated PDF as well, using information collected during Grooper's document processing.  The metadata you are able to create falls into one of three categories:
The '''''PDF Data Mapping''''' behavior has the ability to create and insert additional metadata into the generated PDF as well, using information collected during Grooper's document processing.  The metadata you are able to create falls into one of three categories:


# Editing the PDF's default metadata fields.
# Editing the PDF's default metadata fields, including:
#*This includes the following metadata fields that are standard to every PDF file:
#* Title
#** Title
#* Author
#** Author
#* Subject
#** Subject
#* Created Date
#** Created Date
#* Modified Date
#** Modified Date
#* Application
#** Application (Used to establish the "creator" application which created the original file.  This can be useful if the original file was created in a different application, like Microsoft Word, and converted to a PDF format by Grooper with a '''''PDF Data Mapping''''' behavior.)
# Creating custom metadata fields
#* This is done using extracted '''Data Field''' values collected during the '''Extract''' activity.
# Adding "Keywords" to the PDF metadata  
# Adding "Keywords" to the PDF metadata  
#* This can be done using expression based or extraction based methods.
#* This can be done using expression based or extraction based methods.
# Creating custom metadata fields and values
#* Custom metadata can be stored for any (single instance) '''Data Field''' values collected during the '''Extract''' activity.


{|class="attn-box"
{|class="attn-box"
Line 851: Line 900:
&#9888;
&#9888;
|
|
Notice what's not included in this list is the exported document's ''filename'' (e.g. "Im_a_file.pdf").  Filename mappings are always configured using an ''Export Behavior''.
Notice what's not included in this list is the exported document's ''filename'' (e.g. "Im_a_file.pdf").  Filename mappings are always configured using an '''''Export Behavior'''''.
|}
|}
|
|valign=top|
[[File:Pdf-generate-howto-43.png]]
[[File:2023.1 PDF-Data-Mapping 01 02 About-Metadata-01.png]]
|}
|}


=== Prereqs: Data extraction ===
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
For '''''Metadata''''', data coming from Grooper can be mapped to the PDF in one of two ways:
# Using '''Data Field''' results
#* To embed custom PDF metadata, the custom fields are generated from '''Data Fields''' in the document's '''Data Model''' and their collected results.
#* This means the document ''must'' be processed by the '''''Extract''''' activity in order to create and populate these custom fields.
#* Or, if performing user assisted data review, the values must be previously recorded during the '''''Review''''' activity.
# Using code expressions
#* In the case of the default PDF metadata fields and keywords, expressions can be used to populate the metadata.
#* This gives you access to not only extracted '''Data Field''' results but also system data, classification information, and various functions to manipulate it.
=== Mapping default PDF metadata ===
'''''PDF Data Mapping's''''' '''''Metadata''''' settings can edit a PDF's default metadata values for its "Title", "Author", "Subject", "Application", "Created" and "Modified" properties.


</tab>
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
<tab name="Prereqs - Data Extraction" style="margin:20px">
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
=== Prereqs - Data Extraction ===
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# Use a code expression to create custom values for the following default PDF metadata:
#* '''''Title''''' for the PDF's "Title" field
#* '''''Author''''' for the PDF's "Author" field
#* '''''Creator''''' for the PDF's "Application" field
#* '''''Subject''''' for the PDF's "Subject" field
#* '''''Creation Date''''' for the PDF's "Created" field
#* '''''Modification Date''''' for the PDF's "Modified" field
# Press "OK" when finished.


If we're going to insert some metadata into these PDFs, that data has to come from somewhere. In broad terms, the metadata creation is done in one of two ways (or a combination of the two):
[[File:2023.1 PDF-Data-Mapping 06 01 Default-Metadata-01.png]]


# Using expression based creation
#* In the case of the default PDF metadata fields and keywords, expressions can be used to populate the metadata.  This gives you access to system data, classification information, extracted '''Data Field''' results, and various .NET functions to manipulate it.
# Using '''Data Field''' results
#* In the case of the custom PDF metadata, the custom fields are generated from '''Data Fields''' in the document's '''Data Model''' and their collected results from the '''Extract''' activity.
#* This means the document ''must'' be processed by the '''Extract''' activity in order to create and populate these custom fields.
</tab>
<tab name="Add the Behavior and Enable Metadata" style="margin:20px">
=== Add the Behavior and Enable Metadata ===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
Metadata is one of the configuration options for the '''''PDF Data Mapping''''' behavior.  A '''Content Type''' '''''Behavior''''' can tell an activity (specifically the '''Export''' activity, in the case of '''''PDF Data Mapping''''') how to use the '''Content Type''' to do something (how to use the '''Content Model's''' collected '''Data Fields''' and other information to edit the generated PDF's metadata, in this case).


# All '''''Behaviors''''' are added to a '''Content Type''' object.
In our example, we made the following changes to the default PDF metadata:
#* We will add the '''''PDF Data Mapping''''' behavior to this '''Content Model''' named "PDF Data Mapping - UNESCO Packet".
* '''''Title''''':
# All '''''Behaviors''''' are added using the '''''Behaviors''''' property.  Select the '''''Behaviors''''' property and press the ellipsis button at the end to add the '''''PDF Data Mapping''''' behavior.
** This defaults to the expression <code>CurrentDocument.ContentTypeName</code>. This will make the title whatever the document's '''Document Type''' classification is.
# In the '''''Behaviors''''' editor window that pops up, click the "+" button to add a '''''Behavior'''''.
** We did not change Grooper's default.
# Choose ''PDF Data Mapping'' from the list.
* '''''Author''''':
|
** This defaults to the expression <code>LDAP.CurrentUserDisplayName</code>.  This will set the author to the Windows username for the Grooper user or service who created the PDF.
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 01 Add Behavior and Enable Metadata 01.png]]
** We changed this to evaluate to the applicant's first name, middle initial, and last name as collected by Grooper using the following expression:
|-
*** <code>$"{Applicant_Information.First_Name} {Applicant_Information.Middle_Initial} {Applicant_Information.Last_Name}"</code>
|valign=top|
* '''''Creator''''':
# Once added, you will see '''''PDF Data Mapping''''' added to the list on the left.  Select it.
** This will adjust the PDF's "Application" field.  This field is left blank by default.
# To enable the metadata functionality, in the right panel, click the checkbox next to the '''''Metadata''''' property.
** We changed this to the simple string <code>"Grooper PDF Data Mapping"</code>.
|
* '''''Subject''''':
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 01 Add Behavior and Enable Metadata 02.png]]
** This field is left blank by default.
|}
** We changed this to use the value of the "Proposal Title" '''Data Field''' in the "Proposal Information" '''Data Section''' with the expression <code>Proposal_Information.Proposal_Title</code>
</tab>
* '''''Creation Date''''':
<tab name="Edit Default PDF Metadata" style="margin:20px">
** This sets the PDF's "Created" date value and defaults to the expression <code>DateTime.Now</code>.  This returns the current system time of your machine at the time the PDF is generated.   
=== Edit Default PDF Metadata ===
** We did not change Grooper's default.
* '''''Modification Date''''':
** This sets the PDF's "Modified" date value and defaults to the expression <code>DateTime.Now</code>.  This returns the current system time of your machine at the time the PDF is generated. 
** We did not change Grooper's default.


Once enabled, the first six '''''Metadata''''' sub-properties all pertain to the default PDF metadata fields Grooper can edit:  Title, Author, Subject, Creation Date, Modified Date, and Creator.
=== Mapping Keywords ===


These are edited with code expressions.
The '''''Metadata''''' settings can add terms to the PDF's "Keywords" field in one of two ways:
# Using a code expression
# Using an extractor ('''Data Type''', '''Value Reader''' or '''Field Class''')


{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
# The '''''Title''''' property corresponds to the PDF's "Title" field.
#* By default, this expression is set to <code>CurrentDocument.ContentTypeName</code>
#** This will make the title whatever the document's '''Document Type''' classification is.
#** In our case, these document folders are assigned the "UNESCO Application Packet" '''Document Type''' of our '''Content Model'''.
# The '''''Author''''' property corresponds to the PDF's "Author" field.
#* By default, this expression is set to <code>LDAP.CurrentUserDisplayName</code>
#** This will make the author the display name of whatever user is logged into the machine exporting the documents.
#* We've changed this to <code>Candidate</code>
#** This will make the author the result of the "Candidate" '''Data Field''' (which is "Dog O Doggerson" for our example document).
# The '''''Creator''''' property corresponds to the PDF's "Application" field.
#* This field is intended to be used when generating PDFs from different file types.  For example, if the file was originally a Microsoft Word document, you might enter <code>"Microsoft Word"</code> to fill this field.
#* This field is blank by default, and we have left it so.
# The '''''Subject''''' property corresponds to the PDF's "Subject" field.
#* This field is blank by default.
#* We've decided to populate this field with the extracted proposal title, using the results of the "Title of Proposal" '''Data Field''' and the expression <code>Title_of_Proposal</code>
#** Note: Spaces in '''Data Fields''' must be replaced with underscores in expressions.
# The '''''Creation Date''''' and '''''Modification Date''''' properties correspond to the PDF's "Created" and "Modified" fields.
#* By default, these both use the expression <code>DateTime.Now</code>
#** This will return the current system time of your machine at the time of export.
|valign=top|
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 02 Edit Default PDF Metadata 01.png]]
|-
|valign=top|
# When we open the document in Adobe Acrobat and view these fields using the "Document Properties" window, you can see the metadata this configuration generated for the PDF.
|
[[File:Pdf-generate-howto-46.png]]
|}
</tab>
<tab name="Add Keywords" style="margin:20px">
=== Add Keywords ===


Grooper can add keywords into the PDF's "Keywords" field in one of two ways, either using an expression or a referenced extractor's results.
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# To add keyword terms with a code expression, add the expression to the '''''Keywords''''' property.
#* This expression should evaluate to a string value.  This string will be added to the PDF's "Keywords" field.
# To add keyword terms with an extractor, reference the extractor with the '''''Keywords Extractor''''' property.
#* This extractor should return a string value.  This string will be added to the PDF's "Keywords" field.
# Press "OK" when finished.


{|cellpadding=10 cellspacing=5
[[File:2023.1 PDF-Data-Mapping 06 02 Keywords-01.png]]
|style="width:40%" valign=top|
In our case, we're going to use an expression to determine if the word count of the "Essay" document in the application packet is "Long", "Short", or "Normal".


# We will use the results of the "Essay Word Count" '''Data Field''' of our '''Data Model''' to do this.
# This '''Data Field's''' extraction is configured to count the number of words in the essay.


If the word count is above 600 words, we'll call that a long essay.  If it's below 400 words, we'll call that a short essay.  And if it's anywhere in between, we'll call it a normal essay.
In our example, we used an expression to insert a keyword based on the word count of the "Essay" document in the application packet.
* "Short Essay" for essays under 400 words
* "Long Essay" for essays over 600 words
* "Normal Essay" for essays between 400 and 600 words


The expression below uses a series of nested conditional statements using the IIf() function to accomplish this. 


: <code>IIf(Essay_Information.Essay_Word_Count > 600, "Long Essay", IIf(Essay_Information.Essay_Word_Count > 400, "Normal Essay", "Short Essay"))</code>
We also used an extractor to add a "signed" keyword if the application was signed and "not signed" if the application was not signed.


If the result is greater than 600, the keyword evaluates to "Long Essay".  Otherwise, if the result is greater than 400, the keyword evaluates to "Normal Essay".  If neither condition is met, the keyword evaluates to "Short Essay".
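As a rough illustration only, the same branching can be sketched in Python.  This is not Grooper expression syntax, and the function name <code>essay_keyword</code> is hypothetical.  The boundaries follow the nested IIf() expression as written, so a count of exactly 400 falls through to "Short Essay" and exactly 600 to "Normal Essay".

```python
def essay_keyword(word_count: int) -> str:
    """Mirror the nested IIf() expression: test > 600 first, then > 400."""
    if word_count > 600:
        return "Long Essay"
    if word_count > 400:
        return "Normal Essay"
    # Everything at or below 400 words lands in the final "else" branch.
    return "Short Essay"


# 485 is the word count extracted from the example document.
print(essay_keyword(485))  # Normal Essay
```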
=== Mapping custom Metadata ===
|
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 03 Add Keywords 01.png]]
|-
|valign=top|
To use this expression to add the keyword to the generated PDF's metadata, we will configure the '''''Keywords''''' property.


# In the '''''Metadata''''' sub-properties, select the '''''Keywords''''' property and click the ellipsis button at the end.
{|class="attn-box"
# In the expression editor that pops up, enter the expression you wish to use to create the keywords.
#* As is the case with any expression editor, Grooper's IntelliSense code completion will aid you when writing your code expressions.
# Click "OK" when finished.
|
|
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 03 Add Keywords 02.png]]
&#9888;
|-
|valign=top|
# When we open the generated PDF in Adobe Acrobat and view the "Document Properties" window, you can see the metadata this configuration generated for the PDF.
#* The keyword "Normal Essay" has been added to the keywords list.
#* The extracted value for the "Essay Word Count" field was 485, which is less than 600 and greater than 400.  Evaluated by our '''''Keywords''''' expression, that returns a value of "Normal Essay".
|
|
[[File:Pdf-generate-howto-49.png]]
Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".
* Consider these names reserved.
* If you are attempting to export '''Data Field''' values as custom PDF metadata, they ''cannot'' share any reserved names.  You will need to rename the '''Data Field''' in Grooper to a unique name.
|}
|}
</tab>
<tab name="Add Custom Metadata" style="margin:20px">
=== Add Custom Metadata ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
Last but not least, you can add custom metadata fields to the generated PDF using extraction results from the document's '''Data Model'''.  A custom metadata field is generated for every '''Data Field''' you choose in the '''Content Type's''' '''Data Model'''.


# Remember, we add '''''Behaviors''''' to '''Content Types''' (Typically a '''Content Model''' or a '''Document Type'''). In this case we're adding the '''''PDF Data Mapping''''' behavior to the '''Content Model'''
'''''PDF Data Mapping's''''' '''''Metadata''''' feature can store custom metadata as well, exporting '''Data Field''' values to custom PDF metadata fields.  This is a way for Grooper to save '''Data Field''' values directly to the PDF.
# '''Content Models''' and '''Document Types''' can have their own '''Data Models''' as one of their children. Configuring '''''PDF Data Mapping''''' on the '''Content Model''', we will utilize ''its'' '''Data Model''' to export this custom metadata.  
* '''''BE AWARE:''''' Only single-instance data can be exported to a PDF's custom metadata.
# This '''Data Model''' is configured with several '''Data Fields'''. These '''Data Fields''' will collect information about the "UNESCO Application Packet" and its component documents, such as the applicant's name and information about the proposal.
** '''Data Fields''' at the root of a '''Data Model''' or in single instance '''Data Sections''' can be exported.
#* This will be done during the '''Extract''' activity.  Once collected, '''''PDF Data Mapping''''' can insert the results into the generated PDF, creating one custom metadata field and corresponding result for each '''Data Field''' and its extracted result.
** '''Data Fields''' in multi-instance '''Data Sections''' and '''Data Column''' values ''cannot'' be exported.
|
 
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 04 Add Custom Metadata 01.png]]
 
|-
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
|
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
To do this, we will use the '''''Export Data Fields''''' option of ''PDF Data Mapping's'' '''''Metadata''''' properties.
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# Turn '''''Export Data Fields''''' to ''True''.
# Use the '''''Field Filter''''' editor to select a specific set of '''Data Fields''' to export.  Otherwise, all '''Data Fields''' will be exported to custom PDF metadata fields.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 06 03 Custom-Metadata-01.png]]
 
 
In our example, we exported all '''Data Fields''' to the generated PDF's custom fields.  Custom metadata can be viewed using Adobe Acrobat. Go to "Document Properties...".  Then select the "Custom" tab.  All selected '''Data Fields''' will be exported to this "Custom Properties" list in the PDF.
* FYI: Spaces and other special characters in a '''Data Field's''' name will be replaced with underscores (i.e. "Field_Name")
* FYI: '''Data Fields''' in single instance '''Data Sections''' will be named using dot notation (i.e. "Section_Name.Field_Name")
 
[[File:2023.1 PDF-Data-Mapping 06 03 Custom-Metadata-02.png|center]]
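The two naming notes above can be sketched in Python.  This is only an illustration of the naming pattern, not Grooper's implementation.  The function names are hypothetical, and treating every character other than a letter or digit as "special" is an assumption.

```python
import re
from typing import Optional


def _underscore(name: str) -> str:
    # Assumption: anything other than a letter or digit counts as "special".
    return re.sub(r"[^A-Za-z0-9]", "_", name)


def custom_metadata_name(field: str, section: Optional[str] = None) -> str:
    """Build the custom metadata name for a Data Field.

    Root-level fields use the cleaned field name alone.  Fields in a
    single instance Data Section are prefixed with the section name
    using dot notation.
    """
    key = _underscore(field)
    return f"{_underscore(section)}.{key}" if section else key


print(custom_metadata_name("Field Name"))
print(custom_metadata_name("Proposal Title", "Proposal Information"))
```

The second call produces <code>Proposal_Information.Proposal_Title</code>, matching the name shown in the "Custom" tab for that '''Data Field'''.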


# In the '''''Metadata''''' sub-properties, click the check box next to the '''''Export Data Fields''''' property to change it from ''False'' to ''True''
== How To: Configure Piece Info ==
# By default, once you enable this property, Grooper will export all available '''Data Fields''' to the '''Content Type''' on which '''''PDF Data Mapping''''' is configured.
#* You can be more selective about what you want to include using the '''''Field Filter''''' property.
#* This will give you a drop down list of all the '''Data Field''' nodes available for custom PDF metadata creation.  You can check the box next to which ones you wish to include, leaving those '''Data Fields''' you wish to exclude unchecked.
|
[[File:2023 PDF Data Mapping - 2023 02 How To 04 Metadata 04 Add Custom Metadata 02.png]]
|-
|valign=top|
# When we open the generated PDF in Adobe Acrobat and view the "Document Properties" window, you can see the custom metadata generated in the "Custom" tab.
# The '''Data Fields'''' names show up in the "Names" column.
#* Note: '''Data Fields''' in '''Data Sections''' will have their names appended to the '''Data Section's''' name.  For example the "Proposal Title" '''Data Field''' in the "Proposal Information" '''Data Section's''' name translates to "Proposal_Information.Proposal_Title".
# The '''Data Field's''' result, collected by the '''Extract''' activity, shows up in the "Value" column.


{|class="attn-box"
{|class="important-box"
|
|
&#9888;
'''!!'''
|
|
Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".
'''''BE AWARE:  PIECE INFO IS STILL UNDER DEVELOPMENT'''''


You may run into an issue upon export if you have '''Data Fields''' in your '''Data Model''' that share one of these names.  If using the '''''Metadata''''' creation capabilities of '''''PDF Data Mapping''''', consider these names "taken" and adjust the name of the '''Data Field''' to be something different. For example, in this case a '''Data Field''' returning the title of the proposal listed on the application was changed from "Title" to "Title of Proposal".
Please consider the '''''Piece Info''''' feature in "beta" at this time.  This feature will be more fully documented once fully developed.
|}
|}
"PieceInfo" is a PDF dictionary of additional data stored by other applications.  For example, when you save a PDF from Adobe Illustrator, PieceInfo will store the original Illustrator file (which allows the PDF to be edited in Illustrator as if it were the original).  PieceInfo can be stored at the document level for the whole PDF or at the page level for one or more pages in the PDF.
'''''PDF Data Mapping''''' uses PieceInfo dictionaries to store extracted '''Data Field''' values as a PDF dictionary embedded in the document's structure by enabling and configuring the '''''Piece Info''''' settings.
* Contrast this with the '''''Metadata''''' settings, which store '''Data Field''' values as custom metadata fields in the document properties.
* '''''Piece Info''''' is unique in that it can export data from a '''Data Table''' ''<u>in very specific scenarios</u>''.  Using the '''''Key Column''''' property, it can build the dictionary from ''<u>only two</u>'' columns in a table, ''<u>and only if</u>'' one of those columns acts as a "key" with unique values for each extracted row.
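The "key column" constraint can be pictured with a small Python sketch.  It only illustrates why the key values must be unique for each row.  It does not show how Grooper writes the PDF dictionary, and the function name, row layout and column names are all hypothetical.

```python
from typing import Dict, List


def build_key_value_dict(rows: List[Dict[str, str]],
                         key_column: str,
                         value_column: str) -> Dict[str, str]:
    """Collapse two table columns into one dictionary.

    Each row contributes one entry.  A repeated key would overwrite an
    earlier row, so duplicates are rejected instead of silently lost.
    """
    result: Dict[str, str] = {}
    for row in rows:
        key = row[key_column]
        if key in result:
            raise ValueError(f"duplicate key {key!r}: key column values must be unique")
        result[key] = row[value_column]
    return result


# Hypothetical extracted table rows with a unique "Code" column.
rows = [
    {"Code": "A", "Amount": "10"},
    {"Code": "B", "Amount": "20"},
]
print(build_key_value_dict(rows, "Code", "Amount"))
```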
=== PieceInfo at document level vs PieceInfo at page level ===
With '''''Piece Info''''' enabled and configured, '''''PDF Data Mapping''''' will store the dictionary at either the document level or on a page, depending on the '''Batch's''' folder structure.
Imagine a '''Batch Folder''' that looks like this:
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-01.png|center]]
If '''''PDF Data Mapping''''' with '''''Piece Info''''' is configured for a parent document's '''Document Type''', the PieceInfo dictionary is stored at the document level in the PDF.
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-02.png|center]]
If '''''PDF Data Mapping''''' with '''''Piece Info''''' is configured for a child document's '''Document Type''', the PieceInfo dictionary is stored at the page level, on the first page of that child document in the PDF.
* In this example '''''PDF Data Mapping''''' with '''''Piece Info''''' was configured for the "Green" '''Document Type'''.
** With a PDF generated for the parent document folder, the output PDF will be 5 pages long total (because there are a total of five pages in the three child document folders).
** Page 1 of the child document folder "Green (2)" will be page 2 in the output PDF.
** The PieceInfo dictionary will therefore be stored in page 2 of the output PDF.
* Be aware, it doesn't matter if the child document is a multipage document with extracted results on multiple pages.  The PieceInfo dictionary is only stored once, on the first page only.
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-03.png|center]]
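The page arithmetic in this example can be sketched in Python.  The function name is hypothetical, and the per-folder page counts are an assumption: the text only states that the three child folders hold five pages in total and that "Green (2)" starts on page 2, so the first child must have one page.  The split of the remaining four pages is made up for illustration.

```python
from typing import List


def piece_info_page(child_page_counts: List[int], child_index: int) -> int:
    """Return the 1-based output page holding the child's PieceInfo.

    The dictionary is stored once, on the first page of the child
    document, so only the pages of the preceding children matter.
    """
    return sum(child_page_counts[:child_index]) + 1


# Assumed page counts for the three child folders (five pages total).
page_counts = [1, 2, 2]

# "Green (2)" is the second child folder, at index 1.
print(piece_info_page(page_counts, 1))  # 2
```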
{|class="fyi-box"
|
|
[[File:Pdf-generate-howto-52.png]]
'''FYI'''
|-
|valign=top|
# You can also access this data using the "Additional Metadata..." button in the "Description" tab.
# Select the "Advanced" item.
# You'll see all the generated custom metadata listed under the "<nowiki>http://ns.adobe.com/pdfx/1.3/</nowiki>" node.
|
|
[[File:Pdf-generate-howto-53.png]]
You can inspect PieceInfo with Adobe Acrobat Pro.
 
For inspecting PieceInfo at the document level:
* Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").  Select "Options" > "Browse Internal PDF Structure...".  Click the Lightbulb icon.  Expand "The document root" and look for "PieceInfo".  Expand "PieceInfo" and look for whatever you named your dictionary in the '''''Piece Info''''' configuration.
 
For inspecting PieceInfo at the page level:
* Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").  Select "Options" > "Browse Internal PDF Structure...".  Click the Page icon.  Expand a Page and look for "PieceInfo".  Expand "PieceInfo" and look for whatever you named your dictionary in the '''''Piece Info''''' configuration.
|}
|}
</tab>
 
</tabs>
=== Known Piece Info Issues ===
 
'''Issue #1:  The Elements Property'''
 
The '''''Elements''''' property does nothing.  Its original intent was to be a kind of filter that allowed for simpler configuration of the '''''Fields''''' property.  However, it was never fully implemented.  It has been deemed an unnecessary property and will be removed in future versions.
 
'''Issue #2:  Page Level Classification'''
 
When separating and classifying documents using '''''ESP Auto Separation''''', Grooper performs page-level classification.  This can cause '''''Piece Info''''' to create a blank PDF PieceInfo dictionary for every page in certain '''''PDF Data Mapping''''' configurations.


== How To: Generate the PDF using Merge or Export ==
A '''''PDF Data Mapping''''' configuration is applied when Grooper builds a PDF.  This will happen when one of two activities is applied to a '''Batch Folder''':
* Either the '''''Export''''' activity
* Or the '''''Merge''''' activity
In either case, three conditions must be met for Grooper to create a PDF with the additional '''''PDF Data Mapping''''' settings.
# The '''Batch Folder''' being processed ''must'' be assigned a '''Document Type''' that inherits the '''''PDF Data Mapping''''' behavior.
#* '''''PDF Data Mapping''''' will need to be configured for that '''Document Type''', its parent '''Content Category''' or its parent '''Content Model'''.
# A '''''PDF Format''''' must be added.
#* For the '''''Export''''' activity: To the '''''Export Formats''''' configuration in the '''''Export Behavior'''''.
#* For the '''''Merge''''' activity:  To the '''''Merge Format''''' configuration.
# The '''''PDF Format's''''' '''''Always Build''''' property should be set to ''True''.
#* This will ensure a new output file will be generated in cases where an imported PDF is already attached to the '''Batch Folder''' in Grooper.
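The three conditions can be summarized as a small Python checklist.  This is a sketch of the logic only, not Grooper code.  The <code>ContentType</code> class, its attributes and both function names are hypothetical, and the sketch treats '''''Always Build''''' as required in every case, which is the safe reading of the third condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Document Type, Content Category or Content Model."""
    name: str
    has_pdf_data_mapping: bool = False
    parent: Optional["ContentType"] = None


def inherits_pdf_data_mapping(node: Optional[ContentType]) -> bool:
    # Walk up from the Document Type through its parents.
    while node is not None:
        if node.has_pdf_data_mapping:
            return True
        node = node.parent
    return False


def mapping_will_apply(doc_type: Optional[ContentType],
                       pdf_format_added: bool,
                       always_build: bool) -> bool:
    """True only when all three conditions from the list above hold."""
    return inherits_pdf_data_mapping(doc_type) and pdf_format_added and always_build


model = ContentType("Content Model", has_pdf_data_mapping=True)
category = ContentType("Content Category", parent=model)
doc_type = ContentType("Document Type", parent=category)
print(mapping_will_apply(doc_type, pdf_format_added=True, always_build=True))  # True
```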

Latest revision as of 10:59, 2 September 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


PDF Data Mapping is a Behavior that enhances PDF files generated by the Merge or Export activities with metadata, bookmarks, annotations and/or different kinds of widgets.

PDF Data Mapping builds a data rich "Smart PDF" from a document folder's content. Classification results, extracted data, and more can be used to insert native PDF elements into the generated PDF.

PDF elements that can be mapped from Grooper generated results include:

  • Bookmarks
  • Metadata
  • PDF Annotations (such as text highlighting, checkbox widgets and signature widgets)

You may download the ZIP(s) below and upload it into your own Grooper environment (version 2023.1). The first contains a Project with resources used in examples throughout this article. The second contains one or more Batches of sample documents.

About

The PDF Data Mapping behavior allows Grooper users to more fully leverage the capabilities of the PDF file type. The standard PDF Export Format (and Merge Format) in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon Export (or Merge).

However, this is just the "display information" required to open and read the document. There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text. PDF content can also include metadata, keywords, bookmarks, annotations, and more!

PDF Data Mapping expands Grooper's standard PDF generation capabilities. It creates an exportable PDF file that includes additional content available to the PDF file type. PDF Data Mapping merges data collected by Grooper into the PDF by mapping these values to native PDF elements like bookmarks and annotations.

The expanded PDF Data Mapping functionality can be divided into three categories:

  • Annotations: Highlight important text, insert comments, and embed interactive widgets like editable form fields and checkboxes.
  • Bookmarks: Organize complex documents with bookmarks linking to child documents and/or extracted Data Fields.
  • Metadata: Alter the PDF's default metadata, add searchable keywords and export custom metadata using data collected by Grooper.

Annotations

Annotations are native PDF elements used to highlight and comment text in a PDF file. For PDF Data Mapping, "annotations" also refer to interactable "widgets" such as checkbox and text form fields. The Annotations functionality allows you to embed many of these native PDF annotations and widgets into Grooper generated PDFs.

Annotations can serve many purposes:

  • Annotations can increase readability, such as using a highlight annotation to call out important information.
  • Annotations can add components for the reader to interact with the document, such as checkboxes and signature widgets.


PDF Data Mapping can add the following kinds of annotations/widgets:

  1. Highlighting
  2. Radio group buttons
  3. Checkboxes
  4. Signature boxes
  5. Editable text boxes


Grooper uses information from Data Elements in a Data Model collected during the Extract activity to add these annotations.

  • For example, if Grooper extracts a "Name" field and you want that highlighted on the output PDF, you can use the "Highlight Annotation" to highlight the name Grooper extracted on the document.

FYI

The size of all these annotations can also be adjusted using a Padding property if the size of the extracted data instance is too small for your needs.

Bookmarks

Bookmarks provide easy navigation for multipage PDF documents. PDF Data Mapping can generate bookmarks in one of two ways:

  1. Bookmarks can be generated for extracted Data Field locations.
  2. When exporting a document folder that has child document folders, bookmarks can be generated for each "sub-document".
    • This is the default bookmarking behavior and requires no configuration. Bookmarks will be named however the child document folders are named.


In this example, this document is an application packet for a study abroad program. It has both kinds of bookmarks.

  • The "Signature" bookmark is from an extracted Data Field. It will take the reader to a signature location on the PDF.
  • The rest were generated for each child document in the document folder (Batch Folder) that was exported. PDF Data Mapping inserted a bookmark for each sub-document. The selected "Resume (4)" bookmark in the image took the reader to the resume page in the PDF.

FYI

Bookmarks generated for child document folders will be named whatever the documents are named.

  • A document folder's (Batch Folder) name defaults to its classified Document Type and document number. Here, "Application (2)", "Proposal Summary (3)", "Resume (3)", and so on.
  • A document folder's name can be changed if you edit the Document Type's Caption property. This will then change the bookmark's name.
    • Be aware, the document must be extracted for the Caption to be applied and its name changed.

Metadata

Metadata refers to a PDF file's content beyond the information required to display the document (the page images and encoded text data). Prior to implementing the PDF Data Mapping functionality, Grooper only had access to edit minimal PDF metadata upon export (notably the PDF's file name).

PDF Data Mapping allows Grooper to alter and store additional metadata, including:

  1. The PDF's default metadata fields, including its "Title", "Author", "Subject" and more.
  2. Keywords
  3. Custom metadata fields
    • Custom metadata allows Grooper to embed any single instance Data Field's value directly to the PDF.


This gives Grooper a mechanism to create a viewable document with all extracted (single instance) data associated with the document itself, independent of that data being stored elsewhere (such as a database table or content management system).

FYI

This metadata can be accessed in Adobe Acrobat by opening the "Document Properties" window from the File menu.

Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".

  • Consider these names reserved.
  • If you are attempting to export Data Field values as custom PDF metadata, they cannot share any reserved names. You will need to rename the Data Field in Grooper to a unique name.
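
A quick way to spot conflicts before export is to compare your Data Field names against the reserved list. The Python sketch below is only an illustration: the function name is hypothetical, and exact, case-sensitive matching is an assumption about how the conflict occurs.

```python
from typing import List

# Reserved names listed above for the PDF file format.
RESERVED_PDF_METADATA_NAMES = {
    "Title", "Author", "Subject", "Keywords", "Creator",
    "Producer", "CreationDate", "ModDate", "Trapped",
}


def conflicting_field_names(field_names: List[str]) -> List[str]:
    """Return the Data Field names that collide with a reserved name."""
    # Assumption: a conflict means an exact, case-sensitive match.
    return [name for name in field_names if name in RESERVED_PDF_METADATA_NAMES]


print(conflicting_field_names(["Title", "Title of Proposal", "Candidate"]))  # ['Title']
```

Any name the check returns would need to be renamed in the Data Model, for example "Title" to "Title of Proposal".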

How To: Add a PDF Data Mapping Behavior

Like all Behaviors, PDF Data Mapping is configured on a Content Type node, commonly a Content Model or a Document Type.


  1. Here, we have selected a Content Model in the Node Tree.
  2. To add a Behavior, select the Behaviors property and click the ellipsis button at the end.
  3. This will bring up a dialogue window to add various behaviors to the Content Model, including PDF Data Mapping.
  4. Add PDF Data Mapping to the list by clicking on the "+" button.
  5. Select PDF Data Mapping from the listed options.


  1. Once added, you will see a PDF Data Mapping item added to the Behaviors list.
  2. Selecting this Behavior, you will see property options to configure PDF creation.
  3. Press "OK" when finished configuring PDF Data Mapping.
  4. Don't forget to save changes to the Content Model.

About the documents used in these tutorials

The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail a more specific PDF Data Mapping setup. This is a packet of documents from a single applicant containing a cover page and five different kinds of documents.

By the end of this tutorial we will have taken a source application packet, used Grooper to process it, and exported a single PDF with:

  • Metadata collected from Grooper
  • New annotations and widgets
  • Easily navigable bookmarks

Cover Page and Application

This is an application for a traveling abroad scholarship.

Primarily, the cover page and application document will allow us to demonstrate the annotations and widgets PDF Data Mapping can generate. We will use its Annotations settings to add the following annotations:

  • Text Annotation
  • Highlight Annotation
  • Checkbox Widget
  • Radio Group Widget
  • Signature Widget
  • Textbox Widget

Secondarily, data collected from this form will be used to generate and store default and custom metadata. We will use the Metadata settings to do this.

Lastly, we will embed a bookmark that will take the PDF's reader to the signature field on the document. We will use the Bookmarking settings to do this.

Essay

This application also includes an essay from the student.

This document will demonstrate how to add Keywords to the PDF's metadata. Using the Metadata settings we will configure a code expression to insert "long essay", "normal essay", or "short essay" depending on the essay's length.

Other Documents

This packet contains three other kinds of documents as well:

  • a proposal summary
  • the applicant's resume
  • and a letter of recommendation.

For these documents (as well as the rest) we will insert bookmarks into the generated PDF, taking the reader to each document in the larger file. We will use Bookmarking settings to do this.

Notes on PDF Data Mapping, child documents and bookmarking

The original document was imported as a single document into Grooper. We have separated it into child documents which will allow us to insert bookmarks for each separated document.

  1. The PDF Data Mapping Behavior will be applied to the Batch Folders at folder-level one.
    • The attached file is the source application packet.
  2. The Split Pages activity was applied to split the packet into pages. Then, those pages were separated into classified document folders at folder-level two.
  3. PDF Data Mapping can create a bookmark in the generated PDF for each of these five sub documents by enabling the Bookmarking property.


By creating bookmarks for each child document, there is no need to export individual PDFs for each one. Instead, we will use PDF Data Mapping to generate one PDF for the whole application packet and use the bookmarks to navigate between each document.

How To: Configure Annotations

Annotations are native PDF elements used to highlight and comment text in a PDF file. For PDF Data Mapping, "annotations" also refer to interactable "widgets" such as checkbox and text form fields. In this tutorial we will configure at least one example of each Annotation option.

  • Text Annotation - Inserts a text-based comment in the PDF.
  • Highlight Annotation - Highlights text on the PDF.
  • Radio Group Widget - Inserts a group of selectable radio buttons in the PDF.
  • Checkbox Widget - Inserts checkable checkboxes in the PDF.
  • Signature Widget - Inserts a signature block in the PDF.
  • Textbox Widget - Inserts an editable form field in the PDF.

BE AWARE: PDF Data Mapping cannot insert annotations on PDF pages with form fields.

If a PDF page is form-fillable, it is ill advised to insert annotations and widgets on top of these form fields. This can result in a corrupted PDF when it is generated by Merge or Export. PDF Data Mapping will not allow you to insert annotations and widgets on PDF pages with form fields.

Prereqs: Data Fields and extracted data

For PDF Data Mapping to work, Grooper needs to have data to map.

  • For Annotations this means Data Fields.
  • Data must be saved for each Data Field prior to the PDF being generated.
    • The Extract activity must run before Merge or Export generates the PDF.
    • If performing user assisted data review, the Review activity must complete before Merge or Export generates the PDF.


Each of the Annotation Types references a Data Field in a Data Model as part of their configuration. If the Data Field does not collect data during the Extract activity, the PDF Data Mapping won't know where to place the annotation.

About the Data Model used for this tutorial

The Data Model we're working with has several Data Fields that will allow PDF Data Mapping to place annotations and widgets.

The "Last Name", "First Name" and "Middle Initial" Data Fields (in the "Applicant Information" Data Section) will demonstrate the Highlight Annotation.

  • These fields use Labeled Value to extract field values next to a label.
  • Be aware, nearly any kind of Value Extractor can be used to insert a highlight annotation. Grooper just needs a location on the document to draw the highlight boundaries.

The "US Citizen" Data Field will demonstrate the Radio Group Widget.

  • This field uses Labeled OMR to extract a group of checkboxes where only one may be checked.
  • Be aware, any OMR extractor (Labeled OMR, Ordered OMR or Zonal OMR) would be able to insert the radio group widget as long as its Check Mode is set to CheckOne.

The "Checklist" Data Field will demonstrate the Checkbox Widget.

  • This field uses Labeled OMR to extract a group of checkboxes where one or more may be checked.
  • Be aware, any OMR extractor (Labeled OMR, Ordered OMR or Zonal OMR) would be able to insert the checkbox widget.

The "Signature" Data Field will demonstrate the Signature Widget.

  • This field uses Detect Signature to detect whether or not a signature is present on the document.
  • Be aware, any zonal extractor (Read Zone, Highlight Zone or Detect Signature) would be able to insert the signature widget.

The "Signature Date" Data Field will demonstrate the Textbox Widget.

  • Textbox Widget adds a text-editable form field to the PDF to store a field value.
    • Compare this to a Text Annotation which simply adds a text comment to the PDF.
  • This field uses Labeled Value to extract the date the application was signed.
  • Be aware, any extractor that returns a single-instance value would be able to insert the textbox widget.

The "IsProcessed" Data Field will demonstrate the Text Annotation.

  • Text Annotation inserts a text comment in the PDF.
    • Compare this to a Textbox Widget which adds an actual form field to the PDF to store a field value.
    • We will use this field and annotation to print the word "PROCESSED" on the output PDF
  • This field uses Highlight Zone to draw an extraction zone for the field and the Data Field's Default Value to determine what's printed.
    • This is a technique common to Text Annotation use cases and will be explained in further depth below.

Adding Annotations

PDF Data Mapping inserts various types of PDF annotations and widgets by configuring its Annotations property. Users can add one or more Annotation Types to the Annotations list. Adding a new Annotation to the list is simple.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Select the Annotations property and press the ellipsis button at the end.
  3. This will bring up the Annotations editor.
  4. Press the "+" button.
  5. Select the Annotation Type you want to add from the dropdown list.


  1. Once added, you will see the Annotation Type added to the Annotations list.
  2. All Annotation Types will have a set of General properties to configure.
  3. Some Annotation Types have additional properties you can configure.
    • For example, the Highlight Annotation has Appearance properties you can configure to adjust the highlight's color and other appearance properties.
  4. Press "OK" when finished.

Notes on shared properties

All Annotation Types share a set of General properties.

  • Fields
    • Select Data Fields to map the Data Fields to the PDF annotation with this property.
    • The Fields property is required.
      • One or more Data Fields must be selected to generate the annotation.
      • If you don't select any Data Fields or the selected Data Fields are not extracted, PDF Data Mapping will not insert an annotation in the output PDF.
      • Be aware, all Data Fields are selected by default.
  • Padding
    • The Padding property can adjust the size of the annotation.
    • Grooper uses a Data Field's result instance to draw the annotation's boundaries.
      • The size of the Data Field's instance may be too small for what you want to appear on the output PDF.
      • If so, use Padding to increase the annotation's size on the PDF generated by PDF Data Mapping.
  • Allow Edit
    • Allow Edit refers to a reader's ability to edit the annotation as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the annotation (or widget).
    • Enabling this property (turning it True) will allow users to fully adjust the annotation in the PDF, including its size, location and other properties.
    • Be aware, even when False, users will still be able to interact with widgets, such as the Checkbox Widget or Textbox Widget.
  • Print
    • In a PDF viewing application, like Adobe Acrobat, all annotations and widgets PDF Data Mapping generates will be visible. The Print property determines whether or not the annotation is visible when the PDF is printed.
    • Be aware, the default is False.
      • Grooper presumes the "Smart PDF" output by PDF Data Mapping will be opened in a PDF viewer (where all annotations will be visible).
      • Grooper also presumes if you want to print the PDF, you want something more like the original document printed, not the one with additional PDF elements Grooper inserts. If you do want those annotations and widgets visible when the PDF is printed, turn Print to True.
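As a rough illustration of how Padding relates a Data Field's result instance to the annotation drawn on the PDF, the Python sketch below grows a rectangle by a fixed amount on each side. The `Rect` type, the `pad` function and the inch-based coordinates are assumptions made for this example. They are not part of Grooper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle in inches; a hypothetical stand-in for a result instance."""
    left: float
    top: float
    right: float
    bottom: float


def pad(rect: Rect, amount: float) -> Rect:
    """Grow the rectangle by `amount` on every side."""
    return Rect(rect.left - amount, rect.top - amount,
                rect.right + amount, rect.bottom + amount)


# A made-up extraction instance, 2 in wide and 0.25 in tall.
instance = Rect(left=1.0, top=2.0, right=3.0, bottom=2.25)
annotation = pad(instance, 0.1)
print(annotation)
```

With a padding of 0.1, the annotation ends up 0.2 in wider and 0.2 in taller than the extracted instance.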

Annotation Types

There are currently six types of annotations Grooper can add to the PDF it creates:

Highlight Annotation

The Highlight Annotation overlays a colored rectangle with adjustable transparency on a Data Field's extracted location. In other words, it can highlight extraction results.

  • Use this to highlight important values extracted from Grooper.
  • Like all Annotations, this highlight can be printable or not. When the Print property is False, the highlight will show up when viewed in a PDF viewer but not if the PDF is printed.


In this example, we will use the Highlight Annotation to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form. To configure this Annotation we will:

  • Select the Data Fields we wish to highlight.
  • Adjust how we want the highlight to look.

Before Annotation

After Annotation

With a Highlight Annotation added to the Annotations list:

  1. Use the Fields property to select the Data Fields you wish to highlight.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Fields you wish to highlight.
    • In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" Data Fields.
    • Be aware, these fields must be extracted by the Extract activity or nothing will be highlighted.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
    • Adjusting Padding for Highlight Annotations is common. In this example, we increased the highlight's size by 0.1 in on each side.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the annotation and to keep it from being visible when printed.


  1. Adjust the highlight's appearance, as desired, using the Appearance properties.
  2. Most commonly, users will adjust the Fill Color.
    • Use the dropdown to select from a list of system colors.
    • Or, enter an RGB value using the format #, #, #
    • This property defaults to the "Grooper green" highlight seen in Review's Data View. In this example, we've changed it to Yellow.
  3. Press "OK" when finished (or continue adding more Annotations).
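To make the "#, #, #" RGB format concrete, here is a small Python sketch that parses such a string into three color components. The `parse_rgb` function is hypothetical, and the 0 to 255 range check is an assumption about what a valid component looks like. It does not describe how Grooper itself validates the property.

```python
from typing import Tuple


def parse_rgb(text: str) -> Tuple[int, int, int]:
    """Parse a string like "255, 255, 0" into (red, green, blue)."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated numbers")
    red, green, blue = (int(part) for part in parts)
    for value in (red, green, blue):
        # Assumption: each component is an 8-bit value.
        if not 0 <= value <= 255:
            raise ValueError("each component must be between 0 and 255")
    return red, green, blue


print(parse_rgb("255, 255, 0"))  # (255, 255, 0), which is yellow
```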

Radio Group Widget

The Radio Group Widget overlays a group of radio button PDF elements on top of where a Grooper extractor finds OMR checkboxes on a document.

  • Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.
    • Note radio buttons (inserted by Radio Group Widget) differ from checkboxes (inserted by Checkbox Widget). For radio buttons, only one choice out of a group may be selected. For checkboxes, any number of choices may be selected.
  • The Data Field(s) this annotation references must use an OMR extractor to return results: Labeled OMR, Ordered OMR or Zonal OMR.
    • This extractor must also have its Mode set to CheckOne (only one box out of many may be checked/selected).
  • PDF Data Mapping will insert one radio button for each checkbox the extractor locates.
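The difference between a radio group and a set of checkboxes can be sketched in a few lines of Python. The functions below are hypothetical and only model the rule described above: a CheckOne group allows at most one checked box, while checkboxes allow any number.

```python
from typing import List, Optional


def selected_radio(check_states: List[bool]) -> Optional[int]:
    """Return the index of the single checked box, or None if none is checked."""
    checked = [i for i, state in enumerate(check_states) if state]
    if len(checked) > 1:
        raise ValueError("a CheckOne group may have at most one checked box")
    return checked[0] if checked else None


def selected_checkboxes(check_states: List[bool]) -> List[int]:
    """Return the indexes of every checked box; any number is allowed."""
    return [i for i, state in enumerate(check_states) if state]


print(selected_radio([False, True]))             # 1
print(selected_checkboxes([True, False, True]))  # [0, 2]
```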

Before Annotation

After Annotation

With a Radio Group Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the group of radio buttons.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "US Citizen" Data Field.
    • Be aware, this field must (1) use an OMR extractor to return results, (2) have its Mode set to CheckOne, (3) have already been extracted by the Extract activity and (4) have located checkboxes during extraction, or no radio buttons will be placed.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the widget and to keep it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the widget (press a radio button).
  3. Press "OK" when finished (or continue adding more Annotations).

Be Aware: Annotations are overlaid on a page's image

BE AWARE: The Radio Group Widget overlays radio buttons on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the Image Processing activity.

  • Notice the original image for this document used checkboxes, not radio buttons. We see an "X" inside of a square box.

You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).

  • In this case, the boxes were detected by the "detection only" Box Detection IP command and not removed by the "detection and removal" Box Removal command.
  • Box Detection finds and stores the checkbox locations and check states but does not actually alter the image in any way.

Maybe you care about this, and maybe you don't. If you do, use Box Removal instead.

  • Box Removal will also find and store the checkbox locations and their check states, but it will also digitally remove the checkboxes from the document's image. This will allow Grooper to extract the checkboxes and allow PDF Data Mapping to overlay the radio buttons on a field of blank pixels.
  • Run Box Removal in an IP Profile using the Image Processing activity prior to running the Extract activity to do this.

Checkbox Widget

The Checkbox Widget inserts one or more form-fillable checkboxes into the PDF on top of where a Grooper extractor finds OMR checkboxes.

  • Checkboxes are common PDF elements used to indicate a choice from one or many options.
    • Note checkboxes (inserted by Checkbox Widget) differ from radio buttons (inserted by Radio Group Widget). For radio buttons, only one choice out of a group may be selected. For checkboxes, any number of choices may be selected.
  • The Data Field(s) this annotation references must use an OMR extractor to return results: Labeled OMR, Ordered OMR, or Zonal OMR
  • However, this extractor may use any of the OMR Modes (CheckOne, CheckMulti or Boolean).
  • PDF Data Mapping will insert a simple checkbox PDF element for each checkbox the extractor locates.


In this example, we will create a Checkbox Widget for the checkboxes extracted using the "Checklist" Data Field. This is a Labeled OMR extractor that uses the CheckMulti Mode, indicating any number of checkboxes may be checked. Checked or not, the Checkbox Widget will insert a checkbox element into the generated PDF for each checkbox located.

Before Annotation

After Annotation

With a Checkbox Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the checkboxes.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "Checklist" Data Field.
    • Be aware, this field must (1) use an OMR extractor to return results, (2) have already been extracted by the Extract activity and (3) have located checkboxes during extraction, or no checkboxes will be placed.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the widget and to keep it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the widget (check the checkboxes).
  3. Press "OK" when finished (or continue adding more Annotations).

BE AWARE: The Checkbox Widget overlays checkboxes on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the Image Processing activity.

For more information, see the "Be Aware: Annotations are overlaid on a page's image" section above.

Signature Widget

The Signature Widget inserts a signature block into the PDF.

  • Signature blocks allow PDFs to capture digital signatures. This allows you to create a document that can be digitally signed straight from Grooper on export.
  • The Data Field(s) this annotation references will typically use a zonal extractor to define where the signature block should be, most commonly Detect Signature or Highlight Zone.
  • Other Value Extractors may work, but these are most typical. PDF Data Mapping will insert the signature block using the geometric boundaries of the extraction instance. Zonal extractors are well suited to define fixed boundaries of extraction results.


In this example, we will create a Signature Widget annotation for the signature line on the application form, using the "Signature" Data Field of our Data Model. The Signature Widget will insert an interactable signature element into the generated PDF.

Before Annotation

After Annotation

With a Signature Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the signature block.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "Signature" Data Field.
    • Be aware, this field must (1) have already been extracted by the Extract activity and (2) have drawn a zone defining the location and size of the signature block (most commonly, Detect Signature or Highlight Zone is used to do this).
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the widget and to keep it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the element (submit a signature).
  3. Press "OK" when finished (or continue adding more Annotations).

BE AWARE: The Signature Widget overlays a signature block on a page's image. If present, any printed signature on the original page will persist (behind the widget), unless removed by the Image Processing activity.

For more information, see the "Be Aware: Annotations are overlaid on a page's image" section above.

Textbox Widget

The Textbox Widget inserts text-editable form fields into the generated PDF.

  • Form fields allow PDFs to collect and store data entered by a user.
  • Users can configure a Textbox Widget to create blank form fields or form fields with a value Grooper extracts already populated.
    • For blank form fields, the Data Field(s) this annotation references should use Highlight Zone to place a blank zone where the field should be inserted.
    • For populated form fields, the Data Field(s) this annotation references can use any extractor that returns a single-instance value (most typically Labeled Value).
      • This allows Grooper to not only generate a PDF with form fields where they weren't present in the source document, but prefill them with data Grooper collects.
  • Be aware, a Textbox Widget differs from a Text Annotation. Where Textbox Widget will insert a text-editable form field, Text Annotation adds a text comment to the PDF.

Before Annotation

After Annotation

In this example, we will use the Textbox Widget to insert a form field for the "Signature Date" Data Field. This field uses Labeled Value to extract the date. PDF Data Mapping will overlay the form field on top of the extraction result.

  • FYI: We will also adjust the generated widget's size using the Padding property. This is common when configuring Textbox Widgets when the font size you want to use for the form field is larger than the printed typeface on the document.


With a Textbox Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field(s) you wish to use to create text-editable form fields.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Field(s) you wish to select.
    • In our case, we are choosing the "Signature Date" Data Field.
    • Be aware, these fields must be extracted by the Extract activity or no textbox will be generated.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
    • Adjusting Padding for Textbox Widgets is common if the desired font size in the textbox differs from that printed on the source document. In this example, we increased the textbox's size by 0.1 in on each side.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the widget and to keep it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to edit the value inside the textbox. To configure that, use the Read Only property.


  1. Adjust the textbox's other properties as desired.
    • These properties give you the ability to adjust the font and font size inside the textbox.
    • Please note: If you want to prevent a reader from editing the Grooper collected value inside the textbox, turn Read Only to True.
  2. Press "OK" when finished (or continue adding more Annotations).

Text Annotation

The Text Annotation inserts a text comment in the PDF.

  • This has two primary uses:
    • Insert comments into the PDF that are viewable when opening the PDF in a PDF viewer, but not printable.
    • Print a simple text note on a page.
      • Commonly, users will want to print a word like "PROCESSED" on the output PDF. This notes the document has been processed through Grooper.
  • The Data Field(s) may use any kind of extractor as long as it produces a result with (1) a location on the page to place the comment and (2) a text value to add to the comment.
  • Be aware, a Textbox Widget differs from a Text Annotation. Where Textbox Widget will insert a text-editable form field, Text Annotation adds a text comment to the PDF.


In this example, we will use a Text Annotation to print the word "PROCESSED" on the first page of the PDF generated by PDF Data Mapping.

  • We will use the "IsProcessed" Data Field to do this. The extraction logic to make this happen requires a less-than-common technique. We will show you how we build this Data Field in the 'Technique: "IsProcessed" Data Field' section.

Before Annotation

After Annotation

With a Text Annotation added to the Annotations list:

  1. Use the Fields property to select the Data Field(s) you wish to use to insert the text comment.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Fields you wish to select.
    • In our case, we are choosing the "IsProcessed" Data Field.
    • Be aware, these fields must (1) be extracted by the Extract activity and (2) hold a location and value or no comment will be added.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine whether the annotation should be editable or printable. Adjust the Allow Edit or Print properties if so.
    • Use the defaults to prevent users from adjusting the annotation and to keep it from being visible when printed.
    • In our case, we do want this comment printed when the document is printed. So, we've changed Print to True.


  1. Adjust the comment's appearance, as desired, using the Appearance properties.
    • Users may change the comment's font and font size with the Font Name and Font Size properties.
    • Users may select a Fill Color and Text Color in one of two ways:
      • Using the dropdown to select from a list of system colors
      • Or, entering an RGB value using the format #, #, #
      • Be aware, there is no true "transparent" Fill Color option. The selectable Transparent option is a system color that equates to "white".
  2. Press "OK" when finished (or continue adding more Annotations).

Technique: "IsProcessed" Data Field

To print the word "PROCESSED" on the PDF, we used a specific technique. A Text Annotation just needs two things from a Data Field to insert the annotation: (1) a location on the page to place the comment and (2) a text value to add to the comment. The word "PROCESSED" did not exist on the source PDF. So, we had to figure out a way to use a Data Field to generate a result rather than extract it.

We did this in essentially two steps:

  1. Use the Highlight Zone extractor to define where the annotation should be printed.
  2. Use a Calculated Value to define the text we want to print (the word "PROCESSED").


This gives a Text Annotation everything it needs to insert the comment: (1) A location and (2) some text
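The idea that a Text Annotation needs both a location and a text value can be sketched in Python as follows. The `FieldResult` type and the `text_annotation` function are hypothetical illustrations of the rule described above. They do not reflect how Grooper stores field results internally.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldResult:
    """Hypothetical view of a Data Field result: a zone on the page and a text value."""
    zone: Optional[Tuple[float, float, float, float]]  # (left, top, right, bottom)
    text: Optional[str]


def text_annotation(result: FieldResult) -> Optional[dict]:
    """Build the comment only when both a location and a text value exist."""
    if result.zone is None or not result.text:
        return None
    return {"rect": result.zone, "contents": result.text}


# The zone supplies the location; the generated value supplies the text.
is_processed = FieldResult(zone=(0.5, 0.5, 2.5, 1.0), text="PROCESSED")
print(text_annotation(is_processed))

# Without a text value there is nothing to print, so no comment is produced.
print(text_annotation(FieldResult(zone=(0.5, 0.5, 2.5, 1.0), text=None)))
```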

How To: Configure Bookmarks

Bookmarks in PDFs aid readers when navigating through multipage documents. PDF Data Mapping can insert bookmarks into the generated PDF to take advantage of this functionality. This can be done in one of two ways (or both):

  1. Using a document folder's (Batch Folder) child folders (Batch Folder).
  2. Using a document folder's extracted Data Fields.

In this tutorial we take an application packet separated into component child documents and use PDF Data Mapping's Bookmarking property to create bookmarks for each one.

The application packet as a whole consists of five separate and distinguishable documents.

  1. The application itself (and a coversheet)
  2. A proposal summary
  3. The student's resume
  4. A letter of recommendation
  5. An essay

Our goal is to create a bookmark in the generated PDF file for each of these component documents (child documents).

Rather than exporting five separate PDF files, one for each component document, we will export a single PDF for the whole packet with navigable bookmarks.


We will also demonstrate how to use Data Fields for bookmarking. This allows us to insert PDF bookmarks for locations of extracted data.

  • The "Signature" bookmark in this example would take the reader to the signature line of the PDF, using the location extracted by the "Signature" Data Field in our Data Model.

Bookmarking Option 1: Child document/folder bookmarks

There are two ways the Bookmarking feature can insert bookmarks into a PDF generated by PDF Data Mapping.

  1. It can insert a bookmark for each child document/folder.
  2. It can insert a bookmark for selected (single instance) Data Fields.

This section will detail how to insert bookmarks using child documents.

Option 1 Prereqs: Separated child documents

For PDF Data Mapping to work, Grooper needs to have data to map.

If enabled, Bookmarking will automatically add bookmarks to a PDF if a document has child documents in the Batch's folder hierarchy.

  • If a document at folder level 1 is exported and has two child documents, the generated PDF will have two bookmarks.
  • Clicking on the bookmark will take the reader to that child document's page in the PDF.


For this to work:

  • The parent document folder must have separable child pages.
    • Either from scanning pages in with a scanner or using the Split Pages activity to generate pages from an imported PDF.
  • These child pages must then be separated into child folders.
    • Either using a Separation Profile when scanning or using the Separate activity.


Technically speaking, that's all you need. PDF Data Mapping will add PDF bookmarks for every child document and name it using each child folder's name.

  • Be aware, without classifying the child documents these names will just be "Folder (1)" "Folder (2)" "Folder (3)" and so on.

  • Not separated: No child folders
  • Separated: Has child folders
  • Separated and classified: Has child document folders

What about a page that sits directly under the parent document? Is it in a folder? No. Then it won't get a bookmark.
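The child folder rule can be sketched in Python. Everything here is an assumption made for illustration: the `Page` and `Folder` types, the `child_bookmarks` function, and the idea that a loose page still takes up a page in the merged PDF. The sketch only shows that folders receive bookmarks and loose pages do not.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Page:
    """A page sitting directly under the parent document (not in a folder)."""


@dataclass
class Folder:
    """A child folder with a name and a number of pages."""
    name: str
    page_count: int


def child_bookmarks(children: List[Union[Page, Folder]]) -> List[Tuple[str, int]]:
    """Return (bookmark name, 1-based starting page) for each child folder."""
    bookmarks = []
    next_page = 1
    for child in children:
        if isinstance(child, Folder):
            bookmarks.append((child.name, next_page))
            next_page += child.page_count
        else:
            # A loose page advances the page count but gets no bookmark.
            next_page += 1
    return bookmarks


packet = [Page(), Folder("Folder (1)", 2), Folder("Folder (2)", 3)]
print(child_bookmarks(packet))  # [('Folder (1)', 2), ('Folder (2)', 4)]
```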

Adding bookmarks for child documents/folders

PDF Data Mapping will create bookmarks for child documents/folders by default. There is no configuration required besides enabling the Bookmarking property.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Bookmarking property to Enabled.
  3. Press "OK" when finished.


That's it! It's that simple!

As long as the document folder PDF Data Mapping is applied to has child documents/folders, bookmarks will be created for each child document.


Bookmarking Option 2: Data Field bookmarks

There are two ways the Bookmarking feature can insert bookmarks into a PDF generated by PDF Data Mapping.

  1. It can insert a bookmark for each child document/folder.
  2. It can insert a bookmark for selected (single instance) Data Fields.

This section will detail how to insert bookmarks using Data Fields. This allows PDF Data Mapping to bookmark important field value locations extracted by Grooper in the output PDF.

Option 2 Prereqs: Data Fields and extracted data

For PDF Data Mapping to work, Grooper needs to have data to map.

Bookmarking can also insert PDF bookmarks using extracted data and their location. Data Fields collect results using extractors which return results from the source document. Bookmarking will use these results' locations to embed this kind of bookmark.

For this to work:

  • You must have these Data Fields defined in a Data Model and configured to return results.
  • Data must be saved for each Data Field prior to the PDF being generated.
    • The Extract activity must run before Merge or Export generates the PDF.
    • If performing user assisted data review, the Review activity must complete before Merge or Export generates the PDF.

Adding bookmarks for Data Fields

To insert bookmarks for extracted Data Field value locations, simply select which Data Field(s) you want PDF Data Mapping to bookmark.

  • Please note: Only single-instance Data Fields may be bookmarked.


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Bookmarking property to Enabled and expand its sub-properties.
  3. Select Data Elements and press the ellipsis button at the end.
  4. A "Data Elements" selection editor will appear.
  5. Select the Data Field whose location you wish to bookmark.
    • Please note: Only single-instance Data Fields may be bookmarked.
  6. Press "OK" when finished selecting Data Fields.
  7. Press "OK" when finished configuring PDF Data Mapping.


As long as the document folder PDF Data Mapping is applied to has extracted the selected Data Field(s) with an Extract activity, bookmarks will be created for each Data Field selected.

How To: Configure Metadata

The PDF Data Mapping behavior has the ability to create and insert additional metadata into the generated PDF as well, using information collected during Grooper's document processing. The metadata you are able to create falls into one of three categories:

  1. Editing the PDF's default metadata fields, including:
    • Title
    • Author
    • Subject
    • Created Date
    • Modified Date
    • Application
  2. Adding "Keywords" to the PDF metadata
    • This can be done using expression based or extraction based methods.
  3. Creating custom metadata fields and values
    • Custom metadata can be stored for any (single instance) Data Field values collected during the Extract activity.

Notice what's not included in this list is the exported document's filename (e.g. "Im_a_file.pdf"). Filename mappings are always configured using an Export Behavior.

Prereqs: Data extraction

For PDF Data Mapping to work, Grooper needs to have data to map.

For Metadata, data coming from Grooper can be mapped to the PDF in one of two ways:

  1. Using Data Field results
    • To embed custom PDF metadata, the custom fields are generated from Data Fields in the document's Data Model and their collected results.
    • This means the document must be processed by the Extract activity in order to create and populate these custom fields.
    • Or, if performing user assisted data review, the values must be previously recorded during the Review activity.
  2. Using code expressions
    • In the case of the default PDF metadata fields and keywords, expressions can be used to populate the metadata.
    • This gives you access to not only extracted Data Field results but also system data, classification information, and various functions to manipulate it.

Mapping default PDF metadata

PDF Data Mapping's Metadata settings can edit a PDF's default metadata values for its "Title", "Author", "Subject", "Application", "Created" and "Modified" properties.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. Use a code expression to create custom values for the following default PDF metadata:
    • Title for the PDF's "Title" field
    • Author for the PDF's "Author" field
    • Creator for the PDF's "Application" field
    • Subject for the PDF's "Subject" field
    • Creation Date for the PDF's "Created" field
    • Modification Date for the PDF's "Modified" field
  4. Press "OK" when finished.


In our example, we made the following changes to the default PDF metadata:

  • Title:
    • This defaults to the expression CurrentDocument.ContentTypeName. This will make the title whatever the document's Document Type classification is.
    • We did not change Grooper's default.
  • Author:
    • This defaults to the expression LDAP.CurrentUserDisplayName. This will set the author to the Windows username for the Grooper user or service who created the PDF.
    • We changed this to evaluate to the applicant's first name, middle initial, and last name as collected by Grooper using the following expression:
      • $"{Applicant_Information.First_Name} {Applicant_Information.Middle_Initial} {Applicant_Information.Last_Name}"
  • Creator:
    • This will adjust the PDF's "Application" field. This field is left blank by default.
    • We changed this to the simple string "Grooper PDF Data Mapping".
  • Subject:
    • This field is left blank by default.
    • We changed this to use the value of the "Proposal Title" Data Field in the "Proposal Information" Data Section with the expression Proposal_Information.Proposal_Title
  • Creation Date:
    • This sets the PDF's "Created" date value and defaults to the expression DateTime.Now. This returns the current system time of your machine at the time the PDF is generated.
    • We did not change Grooper's default.
  • Modification Date:
    • This sets the PDF's "Modified" date value and defaults to the expression DateTime.Now. This returns the current system time of your machine at the time the PDF is generated.
    • We did not change Grooper's default.

Mapping Keywords

The Metadata settings can add terms to the PDF's "Keywords" field in one of two ways:

  1. Using a code expression
  2. Using an extractor (Data Type, Value Reader or Field Class)


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. To add keyword terms with a code expression, add the expression to the Keywords property.
    • This expression should evaluate to a string value. This string will be added to the PDF's "Keywords" field.
  4. To add keyword terms with an extractor, reference the extractor with the Keywords Extractor property.
    • This extractor should return a string value. This string will be added to the PDF's "Keywords" field.
  5. Press "OK" when finished.


In our example, we used an expression to insert a keyword based on the word count of the "Essay" document in the application packet.

  • "Short Essay" for essays under 400 words
  • "Long Essay" for essays over 600 words
  • "Normal Essay" for essays between 400 and 600 words
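
The thresholds above can be sketched in a few lines. This is a Python illustration of the logic only, not the Grooper code expression used in the example Project. The function names are hypothetical, and two details are assumptions: words are counted as whitespace-separated tokens, and essays of exactly 400 or 600 words count as "Normal Essay".

```python
def count_words(text: str) -> int:
    # Assumption: a "word" is any run of non-whitespace characters.
    return len(text.split())


def essay_keyword(word_count: int) -> str:
    """Map an essay's word count to the keyword added to the PDF."""
    if word_count < 400:
        return "Short Essay"
    if word_count > 600:
        return "Long Essay"
    # Assumption: the 400 and 600 boundaries are inclusive.
    return "Normal Essay"


if __name__ == "__main__":
    sample = "word " * 450
    print(essay_keyword(count_words(sample)))  # Normal Essay
```

Whatever form the real expression takes, it needs to return a single string, since that string is what ends up in the PDF's "Keywords" field.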


We also used an extractor to add a "signed" keyword if the application was signed and "not signed" if the application was not signed.

Mapping custom Metadata

Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".

  • Consider these names reserved.
  • If you are attempting to export Data Field values as custom PDF metadata, they cannot share any reserved names. You will need to rename the Data Field in Grooper to a unique name.

PDF Data Mapping's Metadata feature can store custom metadata as well, exporting Data Field values to custom PDF metadata fields. This is a way for Grooper to save Data Field values directly to the PDF.

  • BE AWARE: Only single-instance data can be exported to a PDF's custom metadata.
    • Data Fields at the root of a Data Model or in single instance Data Sections can be exported.
    • Data Fields in multi-instance Data Sections and Data Column values cannot be exported.


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. Turn Export Data Fields to True.
  4. Use the Field Filter editor to select a specific set of Data Fields to export. Otherwise, all Data Fields will be exported to custom PDF metadata fields.
  5. Press "OK" when finished.


In our example, we exported all Data Fields to the generated PDF's custom fields. Custom metadata can be viewed using Adobe Acrobat: open "Document Properties..." and select the "Custom" tab. All selected Data Fields will be listed under "Custom Properties" in the PDF.

  • FYI: Spaces and other special characters in a Data Field's name will be replaced with underscores (e.g. "Field_Name").
  • FYI: Data Fields in single instance Data Sections will be named using dot notation (e.g. "Section_Name.Field_Name").
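
As a rough illustration of the naming rules and reserved names described in this section, the following Python sketch derives a custom property name from a Data Field name. It is not Grooper's implementation. The function names are hypothetical, "special characters" is assumed to mean anything other than letters, digits, and underscores, and the reserved-name check is assumed to be an exact, case-sensitive match.

```python
import re
from typing import Optional

# Metadata names the PDF format already defines (listed earlier in this section).
RESERVED_NAMES = {
    "Title", "Author", "Subject", "Keywords", "Creator",
    "Producer", "CreationDate", "ModDate", "Trapped",
}


def sanitize(name: str) -> str:
    # Assumption: every character that is not a letter, digit, or
    # underscore is replaced with an underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def custom_property_name(field_name: str, section_name: Optional[str] = None) -> str:
    """Build the custom metadata name for a single-instance Data Field."""
    key = sanitize(field_name)
    if section_name is not None:
        # Fields in single instance Data Sections use dot notation.
        key = f"{sanitize(section_name)}.{key}"
    if key in RESERVED_NAMES:
        raise ValueError(f"'{key}' is a reserved PDF metadata name; rename the Data Field")
    return key


if __name__ == "__main__":
    print(custom_property_name("Proposal Title", "Proposal Information"))
    # Proposal_Information.Proposal_Title
```

A root-level Data Field named "Title" would be rejected by this check, which mirrors the advice above to rename such fields in Grooper.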

How To: Configure Piece Info


BE AWARE: PIECE INFO IS STILL UNDER DEVELOPMENT

Please consider the Piece Info feature in "beta" at this time. It will be documented more completely once development is finished.

"PieceInfo" is a PDF dictionary of additional data stored by other applications. For example, when you save a PDF from Adobe Illustrator, PieceInfo will store the original Illustrator file (which allows the PDF to be edited in Illustrator as if it were the original). PieceInfo can be stored at the document level for the whole PDF or at the page level for one or more pages in the PDF.

When the Piece Info settings are enabled and configured, PDF Data Mapping stores extracted Data Field values in a PieceInfo dictionary embedded in the PDF's structure.

  • Contrast this with the Metadata settings, which store Data Field values as custom metadata fields in the document properties.
  • Piece Info is unique in that it can export data from a Data Table in very specific scenarios. Using the Key Column property, it can build the dictionary from only two columns in a table, and only if one of those columns acts as a "key" with unique values for each extracted row.
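
To make the Key Column idea concrete, here is a minimal Python sketch of turning two table columns into a key/value dictionary. It only illustrates the concept and makes no claim about how Grooper builds the dictionary. The function name, column names, and sample rows are hypothetical, and because this article does not describe what happens when key values repeat, the sketch simply rejects duplicates.

```python
from typing import Dict, List


def build_key_value_dictionary(
    rows: List[Dict[str, str]], key_column: str, value_column: str
) -> Dict[str, str]:
    """Collapse table rows into a dictionary using one column as the key."""
    result: Dict[str, str] = {}
    for row in rows:
        key = row[key_column]
        if key in result:
            # The key column must hold a unique value for each extracted row.
            raise ValueError(f"Duplicate key {key!r} in column {key_column!r}")
        result[key] = row[value_column]
    return result


if __name__ == "__main__":
    # Hypothetical two-column table extracted from a document.
    budget_rows = [
        {"Category": "Travel", "Amount": "1200.00"},
        {"Category": "Equipment", "Amount": "3500.00"},
    ]
    print(build_key_value_dictionary(budget_rows, "Category", "Amount"))
    # {'Travel': '1200.00', 'Equipment': '3500.00'}
```

The uniqueness requirement follows from the dictionary structure itself: each key can appear only once, so a repeated key would leave no place for the second row's value.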

PieceInfo at document level vs PieceInfo at page level

With Piece Info enabled and configured, PDF Data Mapping will store the dictionary at either the document level or on a page, depending on the Batch's folder structure.


Imagine a Batch Folder that looks like this:


If PDF Data Mapping with Piece Info is configured for a parent document's Document Type, the PieceInfo dictionary is stored at the document level in the PDF.



If PDF Data Mapping with Piece Info is configured for a child document's Document Type, the PieceInfo dictionary is stored at the page level, on the first page of that child document in the PDF.

  • In this example PDF Data Mapping with Piece Info was configured for the "Green" Document Type.
    • With a PDF generated for the parent document folder, the output PDF will be 5 pages long total (because there are a total of five pages in the three child document folders).
    • Page 1 of the child document folder "Green (2)" will be page 2 in the output PDF.
    • The PieceInfo dictionary will therefore be stored in page 2 of the output PDF.
  • Be aware, it doesn't matter if the child document is a multipage document with extracted results on multiple pages. The PieceInfo dictionary is only stored once, on the first page only.
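
The page arithmetic above can be sketched as follows. This Python snippet is only an illustration of how a child document's first page maps to a page number in the merged PDF. The function name is hypothetical, and so are the "Blue" and "Red" folders and all page counts, which were chosen to match the five-page example. Treating every child of the configured Document Type as receiving its own dictionary is also an assumption.

```python
from typing import List, Tuple


def piece_info_pages(children: List[Tuple[str, str, int]], target_type: str) -> List[int]:
    """Return the 1-based output page numbers that would hold a page-level dictionary.

    children: (folder name, Document Type, page count) in Batch order.
    """
    pages: List[int] = []
    next_page = 1  # output page number of the next child's first page
    for _name, doc_type, page_count in children:
        if doc_type == target_type and page_count > 0:
            # Stored once per child document, on its first page only.
            pages.append(next_page)
        next_page += page_count
    return pages


if __name__ == "__main__":
    # Hypothetical layout: three child folders, five pages in total.
    children = [
        ("Blue (1)", "Blue", 1),
        ("Green (2)", "Green", 2),
        ("Red (3)", "Red", 2),
    ]
    print(piece_info_pages(children, "Green"))  # [2]
```

Note that the two-page "Green (2)" folder contributes a single entry, consistent with the dictionary being stored only on the child document's first page.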



FYI

You can inspect PieceInfo with Adobe Acrobat Pro.

To inspect PieceInfo at the document level:

  1. Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").
  2. Select "Options" > "Browse Internal PDF Structure...".
  3. Click the Lightbulb icon.
  4. Expand "The document root" and look for "PieceInfo".
  5. Expand "PieceInfo" and look for whatever you named your dictionary in the Piece Info configuration.

To inspect PieceInfo at the page level:

  1. Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").
  2. Select "Options" > "Browse Internal PDF Structure...".
  3. Click the Page icon.
  4. Expand a Page and look for "PieceInfo".
  5. Expand "PieceInfo" and look for whatever you named your dictionary in the Piece Info configuration.

Known Piece Info Issues

Issue #1: The Elements Property

The Elements property does nothing. Its original intent was to be a kind of filter that allowed for simpler configuration of the Fields property. However, it was never fully implemented. It has been deemed an unnecessary property and will be removed in future versions.

Issue #2: Page Level Classification

When separating and classifying documents using ESP Auto Separation, Grooper performs page-level classification. This can cause Piece Info to create a blank PDF PieceInfo dictionary for every page in certain PDF Data Mapping configurations.

How To: Generate the PDF using Merge or Export

A PDF Data Mapping configuration is applied when Grooper builds a PDF. This happens when either of two activities is applied to a Batch Folder:

  • The Export activity
  • The Merge activity


In either case, three conditions must be met for Grooper to create a PDF with the additional PDF Data Mapping settings.

  1. The Batch Folder being processed must be assigned a Document Type that inherits the PDF Data Mapping behavior.
    • PDF Data Mapping will need to be configured for that Document Type, its parent Content Category or its parent Content Model.
  2. A PDF Format must be added.
    • For the Export activity: To the Export Formats configuration in the Export Behavior.
    • For the Merge activity: To the Merge Format configuration.
  3. The PDF Format's Always Build property should be set to True.
    • This will ensure a new output file will be generated in cases where an imported PDF is already attached to the Batch Folder in Grooper.