2023.1:PDF Data Mapping (Behavior): Difference between revisions

From Grooper Wiki
No edit summary
 
(242 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{|cellpadding="10" cellspacing="5"
{{AutoVersion}}
|-style="background-color:#ed2330; color:white"
 
|style="font-size:14pt"|
<blockquote>{{#lst:Glossary|PDF Data Mapping}}</blockquote>
'''2021'''
 
|This article is in development for the upcoming version of Grooper, '''Grooper 2021'''.  ''PDF Generate'' is a new '''Content Type''' '''''Behavior''''' option in 2021. This information is incomplete and/or may change by the time of release.
'''''PDF Data Mapping''''' builds a data rich "Smart PDF" from a document folder's content.  Classification results, extracted data, and more can be used to insert native PDF elements into the generated PDF.
 
PDF elements that can be mapped from Grooper generated results include:
* Bookmarks
* Metadata
* PDF Annotations (such as text highlighting, checkbox widgets and signature widgets)
 
{|class="download-box"
|
[[File:Asset 22@4x.png]]
|
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1).  The first contains a '''Project''' with resources used in examples throughout this article.  The second contains one or more '''Batches''' of sample documents.
* [[Media:2023.1 Wiki PDF-Data-Mapping Batches.zip]]
* [[Media:2023.1 Wiki PDF-Data-Mapping Project.zip]]
|}
|}
<blockquote style="font-size:14pt">
The ''PDF Generate Behavior'' is a '''Content Type''' '''''Behavior''''' designed to create an exportable PDF file with additional native PDF elements, using the classification and extraction content of a '''Batch Folder'''.  This includes exporting extracted data as PDF metadata, inserting bookmarks, and creating PDF annotations, such as highlighting, checkbox and signature widgets.
</blockquote>


== About ==
== About ==
The '''''PDF Data Mapping''''' behavior allows Grooper users to more fully leverage the capabilities of the PDF file type.  The standard PDF '''''Export Format''''' (and '''''Merge Format''''') in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon '''''Export''''' (or '''''Merge'''''). 


The ''PDF Generate Behavior'' (or ''PDF Generate'' for short) allows Grooper users to more fully leverage the capabilities of the PDF file type.  The standard PDF '''''Export Format''''' in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon '''Export'''.  However, this is just the "display information" required to open and read the document.  There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text.  PDF content can also include metadata, keywords, bookmarks, annotations, and more!   
However, this is just the "display information" required to open and read the document.  There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text.  PDF content can also include metadata, keywords, bookmarks, annotations, and more!   


The ''PDF Generate Behavior'' creates an exportable PDF file that includes some of this additional content available to the PDF format.  This is part of Grooper's evolving "Smart PDF Architecture".  This is a design philosophy striving to more fully utilize the capabilities of the PDF file type and merge them with Grooper's own document processing capabilities.
'''''PDF Data Mapping''''' expands Grooper's standard PDF generation capabilities.  It creates an exportable PDF file that includes additional content available to the PDF file type.  '''''PDF Data Mapping''''' merges data collected by Grooper into the PDF by mapping these values to native PDF elements like bookmarks and annotations.


The expanded ''PDF Generate Behavior'' functionality can be divided into three categories:
The expanded '''''PDF Data Mapping''''' functionality can be divided into three categories:
* '''''Annotations'''''
* '''''Annotations''''': Highlight important text, insert comments, and embed interactive widgets like editable form fields and checkboxes.
* '''''Bookmarks'''''
* '''''Bookmarks''''':  Organize complex documents with bookmarks linking to child documents and/or extracted '''Data Fields'''.
* '''''Metadata'''''
* '''''Metadata''''':  Alter the PDF's default metadata, add searchable keywords and export custom metadata using data collected by Grooper.


<tabs style="margin:20px">
<tab name="Annotations" style="margin:20px">
=== Annotations ===
=== Annotations ===
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
Annotations are additional objects you can add to PDF documents. Grooper uses information from '''Data Elements''' in a '''Data Model''' collected during the '''Extract''' activity to add these annotations (also called "widgets"). These annotations can increase the readability and add components for the reader to interact with the document, such as checkboxes and signature boxes.
Annotations are native PDF elements used to highlight and comment on text in a PDF file. For '''''PDF Data Mapping''''', "annotations" also refer to interactive "widgets" such as checkbox and text form fields.  The '''''Annotations''''' functionality allows you to embed many of these native PDF annotations and widgets into Grooper generated PDFs.
 
'''''Annotations''''' can serve many purposes:
* '''''Annotations''''' can increase readability, such as using a highlight annotation to call out important information.
* '''''Annotations''''' can add components for the reader to interact with the document, such as checkboxes and signature widgets.


The kinds of annotations you can add are:


'''''PDF Data Mapping''''' can add the following kinds of annotations/widgets:
# Highlighting
# Highlighting
# Radio group buttons
# Radio group buttons
Line 36: Line 49:
# Editable text boxes
# Editable text boxes


Grooper uses the data instance information from extracted '''Data Fields''' to insert these annotations.  For example, here we set up a '''Content Model''' with a '''Data Field''' named "Last Name".  After the document's data was collected during the '''Extract''' activity, Grooper has a data instance it can associate with the "Last Name" '''Data Field''', including its size and location coordinates on the document.  We then used the ''Highlight Annotation'' to highlight the extracted last name on the document in yellow.


Grooper uses information from '''Data Elements''' in a '''Data Model''' collected during the '''Extract''' activity to add these annotations.
* For example, if Grooper extracts a "Name" field and you want that highlighted on the output PDF, you can use the "'''''Highlight Annotation'''''" to highlight the name Grooper extracted on the document.
{|class="fyi-box"
|
'''FYI'''
|
The size of all these annotations can also be adjusted using a '''''Padding''''' property if the size of the extracted data instance is too small for your needs.
The size of all these annotations can also be adjusted using a '''''Padding''''' property if the size of the extracted data instance is too small for your needs.
|
|}
|valign=top|
[[File:Pdf-generate-about-05.png]]
[[File:Pdf-generate-about-05.png]]
|}
|}
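The '''''Padding''''' idea described above can be pictured as growing the extracted value's rectangle before the annotation is drawn. The following is only a minimal sketch of that geometry, not Grooper's implementation: the <code>Rect</code> type, the <code>pad_rect</code> function, the point-based units and the clamping to the page edges are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page units (hypothetical type, not a Grooper class)."""
    left: float
    top: float
    right: float
    bottom: float


def pad_rect(rect: Rect, padding: float, page_width: float, page_height: float) -> Rect:
    """Grow the rectangle by `padding` on every side, clamped to the page bounds."""
    return Rect(
        left=max(0.0, rect.left - padding),
        top=max(0.0, rect.top - padding),
        right=min(page_width, rect.right + padding),
        bottom=min(page_height, rect.bottom + padding),
    )


# Made-up location of an extracted "Last Name" value on a 612 x 792 point page.
last_name = Rect(left=72.0, top=144.0, right=180.0, bottom=158.0)
highlight = pad_rect(last_name, padding=4.0, page_width=612.0, page_height=792.0)
```

Here the highlight rectangle ends up 4 points larger than the extracted value on each side, which is the effect the '''''Padding''''' property is described as providing when the data instance is too small.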
</tab>
 
<tab name="Bookmarks" style="margin:20px">
=== Bookmarks ===
=== Bookmarks ===
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
Bookmarks allow easy navigation for multipage PDF documents.  When exporting a single PDF comprised of multiple child sub-documents, you can create bookmarks for each child document. This way, you can keep all the documents together in a single PDF file, easily navigating from one section of the document to another.
Bookmarks provide easy navigation for multipage PDF documents.  '''''PDF Data Mapping''''' can generate bookmarks in one of two ways:
# Bookmarks can be generated for extracted '''Data Field''' locations.
# When exporting a document folder that has child document folders, bookmarks can be generated for each "sub-document".
#* This is the default bookmarking behavior and requires no configuration.  Bookmarks will be named however the child document folders are named.


For example, this document is an application packet for a study abroad program.  Each document in the packet was separated and classified as a child document folder of one '''Document Type''' or another.  The ''PDF Generate Behavior'' was used to export the packet as a single PDF and a bookmark was inserted for each sub-document and named after its '''Document Type'''.


Grooper can create bookmarks from extracted '''Data Fields''' in the document as well.
In this example, this document is an application packet for a study abroad program.  It has both kinds of bookmarks.
* The "Signature" bookmark is from an extracted '''Data Field'''.  It will take the reader to a signature location on the PDF.
* The rest were generated for each child document in the document folder ('''Batch Folder''') that was exported. '''''PDF Data Mapping''''' inserted a bookmark for each sub-document.  The selected "Resume (4)" bookmark in the image took the reader to the resume page in the PDF.
 
{|class="fyi-box"
|
'''FYI'''
|
|
[[File:Pdf-generate-about-06.png]]
Bookmarks generated for child document folders will be named whatever the documents are named.
* A document folder's ('''Batch Folder''') name defaults to its classified '''Document Type''' and document number. Here, "Application (2)", "Proposal Summary (3)", "Resume (3)", and so on.
* A document folder's name can be changed if you edit the '''Document Type's''' '''''Caption''''' property. This will then change the bookmark's name.
** Be aware, the document must be ''extracted'' for the '''''Caption''''' to be applied and its name changed.
|}
|valign=top|
[[File:2023.1 PDF-Data-Mapping 01 02 About-Bookmarks-01.png]]
|}
|}
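The naming rule for child document bookmarks can be summarized in a few lines. This is a hedged sketch of the rule as described in this article, not Grooper code: the function names are hypothetical, and <code>caption</code> stands in for whatever text a '''Document Type's''' '''''Caption''''' property produces once the document has been extracted.

```python
from typing import Optional


def default_folder_name(document_type: str, number: int) -> str:
    """Default document folder name: classified Document Type plus document number."""
    return f"{document_type} ({number})"


def bookmark_title(document_type: str, number: int,
                   caption: Optional[str] = None, extracted: bool = False) -> str:
    """Bookmark title for a child document folder.

    Assumption: a caption only replaces the default name once the
    document has been extracted, as noted in the FYI above.
    """
    if caption and extracted:
        return caption
    return default_folder_name(document_type, number)


titles = [
    bookmark_title("Application", 2),
    bookmark_title("Proposal Summary", 3),
    bookmark_title("Resume", 4, caption="Applicant Resume", extracted=True),
]
```

The first two titles fall back to the default "Document Type (number)" form, while the third uses the (made-up) caption text because the document was extracted.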
</tab>
 
<tab name="Metadata" style="margin:20px">
=== Metadata ===
=== Metadata ===
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
Metadata refers to a PDF file's content beyond the information required to display the document (the page images and encoded text data).  Prior to implementing the ''PDF Generate Behavior'' functionality, Grooper only had access to edit minimal PDF metadata, notably the file's name upon export.  The ''PDF Generate Behavior'' allows Grooper to alter and store additional collected metadata as well, including '''Data Field''' values collected during the '''Extract''' activity.  This means Grooper can now create a viewable document with all the extracted data associated with the document itself, independent of that data being stored elsewhere (such as a database table or content management system).
Metadata refers to a PDF file's content beyond the information required to display the document (the page images and encoded text data).  Prior to implementing the '''''PDF Data Mapping''''' functionality, Grooper only had access to edit minimal PDF metadata upon export (notably the PDF's file name).   


This metadata can be accessed by opening a PDF in a PDF viewer application, such as Adobe Acrobat, and opening the "Document Properties" window from the File menu.
'''''PDF Data Mapping''''' allows Grooper to alter and store additional metadata, including:
|
# The PDF's default metadata fields, including its "Title", "Author", "Subject" and more.
[[File:Pdf-generate-about-07.png]]
# Keywords
|-
# Custom metadata fields
|valign=top|
#* Custom metadata allows Grooper to embed any single instance '''Data Field's''' value directly to the PDF.
There are several pieces of metadata Grooper has access to.


# All of the fields highlighted here can be created from Grooper, using an expression based syntax to access data extracted from the document and system information.
# Note this gives Grooper the capability to generate and insert keywords into the PDF's "Keywords" field.
#* In this case, Grooper has created a keyword based on the word count length of the essay in this study abroad application packet.
# Extracted '''Data Field''' values can also be exported as PDF metadata.  This information can be viewed either using the "Custom" tab or the "Additional Metadata..." window.
|
[[File:Pdf-generate-about-08.png]]
|-
|valign=top|
# In the "Custom" tab...
# You can see all the '''Data Fields''' Grooper extracted and their values as custom metadata for this document.


This gives Grooper a mechanism to create a viewable document with all extracted (single instance) data associated with the document itself, independent of that data being stored elsewhere (such as a database table or content management system).


{|cellpadding="10" cellspacing="5"
{|class="fyi-box"
|-style="background-color:#f89420; color:white"
|style="font-size:22pt"|'''&#9888;'''
|
|
Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".
'''FYI'''
 
You may run into an issue upon export if you have '''Data Fields''' in your '''Data Model''' that share one of these names.  If using the '''''Metadata''''' creation capabilities of the ''PDF Generate Behavior'', consider these names "taken" and adjust the name of the '''Data Field''' to be something different.  For example, in this case a '''Data Field''' returning the title of the proposal listed on the application was changed from "Title" to "Title of Proposal".
|}
|
|
[[File:Pdf-generate-about-09.png]]
This metadata can be accessed in Adobe Acrobat by opening the "Document Properties" window from the File menu.
|}
|}
</tab>
</tabs>


{|cellpadding=10 cellspacing=5
{|class="attn-box"
|style="width:40%" valign=top|
|
As a '''''Behavior''''', ''PDF Generate'' is configured on a '''Content Type''' object, commonly a '''Content Model''' or a '''Document Type'''.
&#9888;
 
# Here, we have selected a '''Content Model''' in the Node Tree.
# To add a '''''Behavior''''', select the '''''Behaviors''''' property and press the ellipsis button at the end.
# This will bring up a dialogue window to add various behaviors to the '''Content Model''', including the ''PDF Generate Behavior''.
# Add the ''PDF Generate Behavior'' to the list using the "Add" button.
# Select ''PDF Generate Behavior'' from the listed options.
|
|
[[File:Pdf-generate-about-01.png]]
Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".
|-
* Consider these names reserved.
* If you are attempting to export '''Data Field''' values as custom PDF metadata, they ''cannot'' share any reserved names.  You will need to rename the '''Data Field''' in Grooper to a unique name.
|}
|valign=top|
|valign=top|
# Once added, you will see a ''PDF Generate Behavior'' item added to the '''''Behaviors''''' list.
[[File:2023.1 PDF-Data-Mapping 01 02 About-Metadata-01.png]]
# Selecting this '''''Behavior''''', you will see property options to configure PDF creation.
|}
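Before exporting '''Data Field''' values as custom metadata, it can help to check field names against the reserved list above. The following is a small illustrative sketch, not part of Grooper: the function name is hypothetical, and the exact, case-sensitive comparison is an assumption (how Grooper itself compares names is not stated here).

```python
# Metadata names the PDF format already defines, as listed above.
RESERVED_PDF_METADATA_NAMES = frozenset({
    "Title", "Author", "Subject", "Keywords", "Creator",
    "Producer", "CreationDate", "ModDate", "Trapped",
})


def find_reserved_conflicts(field_names):
    """Return the Data Field names that collide with a reserved PDF metadata name.

    Assumption: names are compared exactly (case-sensitive).
    """
    return [name for name in field_names if name in RESERVED_PDF_METADATA_NAMES]


# Example Data Field names; "Title" would need renaming before export.
conflicts = find_reserved_conflicts(["Last Name", "Title", "Country of Travel"])
```

Renaming the offending field, for example from "Title" to "Title of Proposal", clears the conflict.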


== How To: Add a PDF Data Mapping Behavior ==


The expanded ''PDF Generate Behavior'' functionality can be divided into three categories:
Like all '''''Behaviors''''', '''''PDF Data Mapping''''' is configured on a '''Content Type''' node, commonly a '''Content Model''' or a '''Document Type'''.
* '''''Metadata'''''
* '''''Bookmarks'''''
* '''''Annotations'''''




Before we get into what these properties do, how to configure them, and how they affect the exported PDF, there's one key thing to keep in mind when using the ''PDF Generate Behavior''.
# Here, we have selected a '''Content Model''' in the Node Tree.
|
# To add a '''''Behavior''''', select the '''''Behaviors''''' property and click the ellipsis button at the end.
[[File:Pdf-generate-about-02.png]]
# This will bring up a dialogue window to add various behaviors to the '''Content Model''', including '''''PDF Data Mapping'''''.
|-
# Add '''''PDF Data Mapping''''' to the list by clicking on the "+" button.
|valign=top|
# Select ''PDF Data Mapping'' from the listed options.
Along with the ''PDF Generate Behavior'', you will also need an ''Export Behavior'' configured to export a PDF formatted file.  The ''PDF Generate Behavior'' does the job of configuring all the extra content (metadata, bookmarks and/or annotations) you want to add to the exported PDF. The ''Export Behavior'' does the job of actually creating the PDF (with the content configuration information supplied by the ''PDF Generate Behavior'') and sending it off to an external storage platform.


''Export Behaviors'' can be added to '''Content Types''', such as the '''Content Model''' here.
[[File:2023.1 PDF-Data-Mapping 03-01.png]]


# To add an ''Export Behavior'', press the "Add" button in a '''''Behaviors''''' list collector.
# Select ''Export Behavior''.


#<li value=6> Once added, you will see a '''''PDF Data Mapping''''' item added to the '''''Behaviors''''' list.
# Selecting this '''''Behavior''''', you will see property options to configure PDF creation.
#* How to configure each of these properties will be discussed in the [[#How To: Configure PDF Data Mapping]] section.
# Press "OK" when finished configuring '''''PDF Data Mapping'''''.
# Don't forget to save changes to the '''Content Model'''.


{|cellpadding="10" cellspacing="5"
[[File:2023.1 PDF-Data-Mapping 03-02.png]]
|-style="background-color:#36b0a7; color:white"
|style="font-size:14pt"|'''FYI'''
|
''Export Behaviors'' can also be configured on the '''Export''' activity as local ''Export Behaviors'' to the activity configuration.


The benefit to adding it to a '''Content Model''' is you will often use information collected from a '''Content Model''' upon exporting your documents, such as a document folder's classified '''Document Type''' or collected data from a '''Data Model''' for field mapping purposes.  You might as well do it now, adding it to the '''Content Model''' while you're adding the ''PDF Generate Behavior''.
== About the documents used in these tutorials ==
|}
The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail a more specific setup for '''''PDF Data Mapping'''''.  This is a packet of documents from a single applicant containing a cover page and five different kinds of documents.
|
[[File:Pdf-generate-about-03.png]]
|-
|valign=top|
Once the ''Export Behavior'' is added, you will need to add an '''''Export Definition'''''.  This will control how the file is exported, most notably where the file is exported.  Whether exporting to a Windows file system, or an IMAP email mailbox, or a CMIS content management system, Grooper needs to know where to put the file.  An '''''Export Definition''''' is how Grooper knows where the file goes.


'''Importantly for the ''PDF Generate Behavior''''', you will also use an '''''Export Definition''''' to define what type(s) of file you want to export.  For whichever '''''Export Definition''''' you choose, you will need to ensure you've configured an '''''Export Format''''' for a PDF formatted file in order to export the generated PDF.
By the end of this tutorial we will have taken a source application packet, used Grooper to process it, and exported a single PDF with:
* Metadata collected from Grooper
* New annotations and widgets
* Easily navigable bookmarks


# To add an '''''Export Definition''''', select the property and press the ellipsis button at the end.
{|class="how-to-table"
# This will bring up an '''''Export Definitions''''' list collector window.
|
# Here, we've added a ''CMIS Export'' definition, using a '''CMIS Connection''' to a local NTFS folder.
'''Cover Page and Application'''
#* The '''''Export Definition''''' is up to you and your needs.  There are many different external storage platforms Grooper can export to.
# Note, we've added a ''PDF Format'' configuration to the '''''Export Formats''''' property.


We will review some specifics of the ''PDF Format'' option's configuration later.  For now, just be aware adding a PDF '''''Export Format''''' is a ''necessary'' step to export the PDF file generated by the ''PDF Generate Behavior''.
This is an application for a traveling abroad scholarship.
|
[[File:Pdf-generate-about-04.png]]
|}


== How To ==
Primarily, the cover page and application document will allow us to demonstrate the annotations and widgets '''''PDF Data Mapping''''' can generate.  We will use its '''''Annotations''''' settings to add the following annotations:
* '''''Text Annotation'''''
* '''''Highlight Annotation'''''
* '''''Checkbox Widget'''''
* '''''Radio Group Widget'''''
* '''''Signature Widget'''''
* '''''Textbox Widget'''''


The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail more specific set up for a ''PDF Generate Behavior''.  This is a packet of documents from a single applicant containing five different kinds of documents.
Secondarily, data collected from this form will be used to generate and store default and custom metadata. We will use the '''''Metadata''''' settings to do this.


<tabs style="margin:20px">
Lastly, we will embed a bookmark that will take the PDF's reader to the signature field on the document.  We will use the '''''Bookmarking''''' settings to do this.
<tab name="Application" style="margin:20px">
|valign=top|
=== Application ===
[[File:2023.1 PDF-Data-Mapping 04 01 About-the-docs-01.png]]
{|cellpadding=10 cellspacing=5
|valign=top|
[[File:Pdf-generate-howto-docset-02.png]]
|-
|valign=top style="width:40%"|
|valign=top style="width:40%"|
This document consists of two pages.  The first is a coversheet for the whole application packet.  The second is the application form itself.
'''Essay'''
 
Primarily, this document will allow us to demonstrate the different kinds of annotations available when using a ''PDF Generate Behavior'' to generate a PDF file (using its '''''Annotations''''' property configuration).  We will see how to set up one example of each of the following annotation types available in Grooper:
* Highlight Annotation
* Checkbox Widget
* Radio Group Widget
* Signature Widget
* Textbox Widget


Importantly for any annotation type, a '''Data Field''' must be extracted in order to place the annotation.  How does Grooper know what you want to highlight?  It uses the extraction result of a '''Data Field''', which includes information about where that value is located on the page.  Even if the extraction result is just a blank zone without returning any actual information, Grooper needs some kind of coordinates to know where to place the annotation.
This application also includes an essay from the student.   


Since we're going to end up extracting some data in order to place these annotations, this will also give us the opportunity to see some of the collected data inserted as PDF metadata as well.
This document will demonstrate how to add Keywords to the PDF's metadata.  Using the '''''Metadata''''' settings we will configure a code expression to insert "long essay", "normal essay", or "short essay" depending on the essay's length.
|
|
[[File:Pdf-generate-howto-docset-01.png]]
[[File:Pdf-generate-howto-docset-03.png|400px]]
|-
|
|
[[File:Pdf-generate-howto-docset-02.png]]
'''Other Documents'''
|}
</tab>
<tab name="Essay" style="margin:20px">
=== Essay ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
This application also includes an essay from the student.  This document will demonstrate how to add keywords to the PDF's metadata. 


We will use an extractor to count the number of words in the essay and configure the ''PDF Generate Behavior's'' '''''Metadata''''' properties to insert a keyword of "long essay", "medium essay", or "short essay" depending on the essay's length.
|
[[File:Pdf-generate-howto-docset-03.png|400px]]
|}
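The keyword logic for the essay can be sketched as a simple threshold check on the word count. This is only an illustration in Python, not the expression syntax Grooper uses in its '''''Metadata''''' settings. The function names and both thresholds (300 and 800 words) are made up for the example, and the middle label is written as "normal essay" (one revision of this article calls it "medium essay").

```python
def count_words(text: str) -> int:
    """Count whitespace-separated words in the essay text."""
    return len(text.split())


def essay_keyword(word_count: int, short_below: int = 300, long_above: int = 800) -> str:
    """Pick a keyword from the essay's word count.

    The thresholds are hypothetical; choose values that suit your documents.
    """
    if word_count < short_below:
        return "short essay"
    if word_count > long_above:
        return "long essay"
    return "normal essay"


# A 4-word "essay" falls well under the short threshold.
keyword = essay_keyword(count_words("Who's a Good Boy?"))
```

The resulting string is the kind of value that would be written to the PDF's "Keywords" metadata field.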
</tab>
<tab name="Other Documents" style="margin:20px">
=== Other Documents ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
This packet contains three other kinds of documents as well:  
This packet contains three other kinds of documents as well:  
* a proposal summary
* a proposal summary
Line 205: Line 191:
* and a letter of recommendation.
* and a letter of recommendation.


These documents (as well as the rest) will allow us to see how to insert bookmarks into the generated PDF, using the ''PDF Generate Behavior's'' '''Bookmarking''' property configuration.
For these documents (as well as the rest) we will insert bookmarks into the generated PDF, taking the reader to each document in the larger file.  We will use '''''Bookmarking''''' settings to do this.
|
|valign=top|
[[File:Pdf-generate-howto-docset-04.png]]
[[File:Pdf-generate-howto-docset-04.png]]
|
|valign=top|
[[File:Pdf-generate-howto-docset-05.png]]
[[File:Pdf-generate-howto-docset-05.png]]
|valign=top|
[[File:Pdf-generate-howto-docset-06.png]]
|}
=== Notes on PDF Data Mapping, child documents and bookmarking ===
{|class="how-to-table"
|
|
[[File:Pdf-generate-howto-docset-06.png]]
The original document was imported as a single document into Grooper.  We have separated it into child documents which will allow us to insert bookmarks for each separated document.
|-
 
|style="width:40%" valign=top|
# The '''''PDF Data Mapping''''' behavior will be applied to the '''Batch Folders''' at folder-level one.
The original document, imported as a single multipage PDF file, has been processed a bit to facilitate this.
#* The attached file is the source application packet.
# The '''''Split Pages''''' activity was applied to split the packet into pages. Then, those pages were separated into classified document folders at folder-level two.
# '''''PDF Data Mapping''''' can create a bookmark in the generated PDF for each of these five sub documents by enabling the '''''Bookmarking''''' property.


# Here, this document folder in the '''Batch''' is classified as a "UNESCO Application Packet" '''Document Type'''.  This '''Batch Folder''' was created upon importing the original application packet file, named "UNESCO Packet.pdf".
# The PDF document's pages were split out using the '''Split Pages''' activity to create child '''Batch Page''' objects.  This allowed us to separate the pages into child document folders for each of the documents inside the imported application packet.
# The ''PDF Generate Behavior'' can create a bookmark in the generated PDF for each of these five sub documents using the '''''Bookmarking''''' property.  Each bookmark will be named after its classified '''Document Type''' (i.e. "Application", "Proposal Summary", "Resume", etc.). 


This means we can process the full imported application packet document, and export a single file with easily navigable bookmarks for its component documents.  There's no need to export individual documents for each component document and figure out a way to index them, or put them in their own folder, or any other method you may come up with to relate them to each other in their final storage location.  With the ''PDF Generate Behavior's'' bookmarking capabilities, you can export just one file with each child '''Document Type''' bookmarked.
By creating bookmarks for each child document, there is no need to export individual PDFs for each one.  Instead, we will use '''''PDF Data Mapping''''' to generate one PDF for the whole application packet and use the bookmarks to navigate between each document.
|colspan=3|
|colspan=3 valign=top|
[[File:Pdf-generate-howto-02.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 About-the-docs-07.png]]
|}
|}
</tab>
</tabs>


=== Configure PDF Generation for Annotations ===
== How To: Configure Annotations ==
Annotations are native PDF elements used to highlight and comment on text in a PDF file.  For '''''PDF Data Mapping''''', "annotations" also refer to interactive "widgets" such as checkbox and text form fields.  In this tutorial we will configure at least one example of each '''''Annotation''''' option.
* '''''Text Annotation''''' - Inserts a text-based comment in the PDF.
* '''''Highlight Annotation''''' - Highlights text on the PDF.
* '''''Radio Group Widget''''' - Inserts a group of selectable radio buttons in the PDF.
* '''''Checkbox Widget''''' - Inserts checkable checkboxes in the PDF.
* '''''Signature Widget''''' - Inserts a signature block in the PDF.
* '''''Textbox Widget''''' - Inserts an editable form field in the PDF.


<tabs style="margin:20px">
{|class="attn-box"
<tab name="About" style="margin:20px">
|
=== About ===
&#9888;
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
The ''PDF Generate Behavior'' has the capability of inserting various annotations and native pdf widgets into the generated PDF.  This increases the document's readability and adds functionality for the reader to interact with the document through widgets such as radio group buttons, checkboxes and signature fields.
 
We will demonstrate how to configure one example for each of the '''''Annotation Types'''''.
# ''Highlight Annotation''
#* We will use Grooper to highlight the extraction result for the applicant's name on the document.
# ''Radio Group Widget''
#* Radio buttons are useful for documents when you have a collection of choices listed and can only select one option.  Such is the case for the "US Citizen" field on this document.  You either are or are not a US Citizen and can answer "Yes" or "No".  We will insert a radio group widget into this document to allow the user to toggle between these choices.
# ''Checkbox Widget''
#* It seems every standard form uses checkboxes for one thing or another.  This annotation will allow us to insert checkable checkboxes into the PDF file if located using OMR based extraction techniques.  For example, the checkboxes here next to each checklist item for the application packet.
# ''Signature Widget''
#* With the ''Signature Widget'' we can create a form-fillable signature box for the generated PDF.  Notice the document as imported is not signed.  With the ''PDF Generate Behavior'' we can add a signature box to the processed file.  This way you could send the application back to the applicant and have them sign the document digitally.
|
|
[[File:Pdf-generate-howto-03.png]]
'''''BE AWARE: PDF Data Mapping cannot insert annotations on PDF pages with form fields.'''''
|-
|valign=top|
We will also use the ''Textbox Widget'' to insert editable text boxes into the document's coversheet. These text boxes will also be populated with some corresponding information from the rest of the document.


# A textbox will be created for the "Candidate" on the coversheet and populated with the applicant's first name, middle initial and last name (Dog O Doggerson).
If a PDF page is form-fillable, it is ill-advised to insert annotations and widgets on top of these form fields.  This can result in a corrupted PDF when it is generated by '''''Merge''''' or '''''Export'''''. '''''PDF Data Mapping''''' will not allow you to insert annotations and widgets on PDF pages with form fields.
# A textbox will be created for the "Title" on the coversheet and populated with the proposal title (Who's a Good Boy?)
# A textbox will be created for the "Country of Travel" on the coversheet and populated with the proposed travel country for the study abroad program (Japan).
|
[[File:Pdf-generate-howto-04.png]]
|}
|}
</tab>
<tab name="Prereqs - Data Fields & Extracted Data" style="margin:20px">
=== Prereqs - Data Fields & Extracted Data ===
Before a PDF annotation can be generated, a document's data must be extracted.  Put another way, the '''Extract''' activity must run ''before'' the '''Export''' activity (when the ''PDF Generate Behavior'' ultimately builds the PDF and exports it).


Each of the '''''Annotation Types''''' points to a '''Data Field''' in a '''Data Model''' as part of its configuration.  If the '''Data Field''' does not collect data during the '''Extract''' activity, the ''PDF Generate Behavior'' won't know where to place the annotation.
=== Prereqs: Data Fields and extracted data ===
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
* For '''''Annotations''''' this means '''Data Fields'''.
* Data must be saved for each '''Data Field''' prior to the PDF being generated.
** The '''''Extract''''' activity must run ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
** If performing user assisted data review, the '''''Review''''' activity must complete ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
 
 
Each of the '''''Annotation Types''''' references a '''Data Field''' in a '''Data Model''' as part of its configuration.  If the '''Data Field''' does not collect data during the '''Extract''' activity, '''''PDF Data Mapping''''' won't know where to place the annotation.
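
As a rough illustration of this dependency (not Grooper's actual implementation), the following Python sketch models the rule that a field with no extracted location cannot produce an annotation.  The names <code>FieldResult</code> and <code>mappable_fields</code> are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (left, top, right, bottom) in inches; a hypothetical representation
Rect = Tuple[float, float, float, float]


@dataclass
class FieldResult:
    """Hypothetical stand-in for a Data Field's extraction result."""
    name: str
    value: Optional[str]
    location: Optional[Rect]  # None when extraction found nothing


def mappable_fields(results: List[FieldResult]) -> List[str]:
    """Return the names of fields that have a location to anchor an annotation."""
    return [r.name for r in results if r.location is not None]


results = [
    FieldResult("Last Name", "Doggerson", (1.0, 2.0, 2.5, 2.3)),
    FieldResult("Signature", None, None),  # never located during extraction
]
print(mappable_fields(results))
```

In this sketch, only "Last Name" would receive an annotation, because "Signature" was never located during extraction.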
 
==== About the Data Model used for this tutorial ====


{|cellpadding=10 cellspacing=5
The '''Data Model''' we're working with has several '''Data Fields''' that will allow '''''PDF Data Mapping''''' to place annotations and widgets.
|style="width:40%" valign=top|
 
# We will ultimately configure the ''PDF Generate Behavior'' using the '''''Behaviors''''' property of this '''Content Model''' which we've named "PDF Generate - UNESCO Packet"
{|class="how-to-table"
#* Before we do that, we will need to ensure we have '''Data Fields''' that correspond to the annotations we want to place.
|style="width:33% !important"|
# We've added the necessary '''Data Fields''' to the '''Content Model's''' '''Data Model'''.
The "Last Name", "First Name" and "Middle Initial" '''Data Fields''' (in the "Applicant Information" '''Data Section''') will demonstrate the '''''Highlight Annotation'''''.
# The "Candidate", "Title of Proposal", and "Country of Travel" '''Data Fields''' will be used to place the ''Textbox Widget'' annotations.
* These fields use '''''Labeled Value''''' to extract field values next to a label.
# The "Last Name", "First Name", and "Middle Initial" '''Data Fields''' will be used to place the ''Highlight Annotation'' annotations.
* Be aware, nearly any kind of Value Extractor can be used to insert a highlight annotation.  Grooper just needs a location on the document to draw the highlight boundaries.
# The "US Citizen" '''Data Field''' will be used to place the ''Radio Group Widget'' annotation.
|style="width:20% !important"|
# The "Application", "Proposal Summary", "Essay", "Resume" and "Recommendation Letter" '''Data Fields''' will be used to place the ''Checkbox Widget'' annotations.
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-02.png]]
# The "Signature" '''Data Field''' will be used to place the ''Signature Widget'' annotation.
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-08.png]]
|
|-
|
The "US Citizen" '''Data Field''' will demonstrate the '''''Radio Group Widget'''''.
* This field uses '''''Labeled OMR''''' to extract a group of checkboxes where ''only one'' may be checked.
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the radio group widget as long as its '''''Check Mode''''' is set to ''CheckOne''.
|
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-03.png]]
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-09.png]]
|-
|
The "Checklist" '''Data Field''' will demonstrate the '''''Checkbox Widget'''''.
* This field uses '''''Labeled OMR''''' to extract a group of checkboxes where ''one or more'' may be checked.
* Be aware, any OMR extractor ('''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR''''') would be able to insert the checkbox widget.
|
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-04.png]]
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-10.png]]
|-
|
|
[[File:Pdf-generate-howto-05.png]]
The "Signature" '''Data Field''' will demonstrate the '''''Signature Widget'''''.
|}
* This field uses '''''Detect Signature''''' to detect whether or not a signature is present on the document.
</tab>
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
<tab name="Add the Behavior" style="margin:20px">
=== Add the Behavior ===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
Annotations are one of the configuration options for the ''PDF Generate Behavior''.  They are one way a '''Content Type''' '''''Behavior''''' can tell an activity (specifically the '''Export''' activity) how to use the '''Content Type''' to do something (specifically how to use the '''Content Model's''' collected '''Data Fields''' to insert additional content when generating a PDF upon export).
 
# All '''''Behaviors''''' are added to a '''Content Type''' object.
#* We will add the ''PDF Generate Behavior'' to this '''Content Model''' named "PDF Generate - UNESCO Packet".
# All '''''Behaviors''''' are added using the '''''Behaviors''''' property.  Select the '''''Behaviors''''' property and press the ellipsis button at the end to add the ''PDF Generate Behavior''.
# This will bring up the '''''Behaviors''''' editor window.
# Press the "Add" button to add a '''''Behavior'''''.
# Choose "PDF Generate Behavior" from the list.
|
|
[[File:Pdf-generate-howto-06.png]]
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-05.png]]
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-11.png]]
|-
|-
|valign=top|
# Once added, you will see the ''PDF Generate Behavior'' added to the list on the left.  Select it to add an '''''Annotation'''''.
# In the right panel, select the '''''Annotations''''' property and press the ellipsis button at the end.
# This will bring up an '''''Annotations''''' collection editor.
We will detail collection and configuration of the various '''''Annotation Types''''' in the next tabs of this tutorial.
|
|
[[File:Pdf-generate-howto-07.png]]
The "Signature Date" '''Data Field''' will demonstrate the '''''Textbox Widget'''''.
* '''''Textbox Widget''''' adds a text-editable form field to the PDF to store a field value.
** Compare this to a '''''Text Annotation''''' which simply adds a text comment to the PDF.
* This field uses '''''Labeled Value''''' to extract the date the application was signed.
* Be aware, any zonal extractor ('''''Read Zone''''', '''''Highlight Zone''''' or '''''Detect Signature''''') would be able to insert the signature widget.
|
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-06.png]]
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-12.png]]
|-
|style="width:33% !important"|
The "IsProcessed" '''Data Field''' will demonstrate the '''''Text Annotation'''''.
* '''''Text Annotation''''' inserts a text comment in the PDF.
** Compare this to a '''''Textbox Widget''''' which adds an actual form field to the PDF to store a field value.
** We will use this field and annotation to print the word "PROCESSED" on the output PDF
* This field uses '''''Highlight Zone''''' to draw an extraction zone for the field and the '''Data Field's''' '''''Default Value''''' to determine what's printed.
** This is a technique common to '''''Text Annotation''''' use cases and will be explained in further depth below.
|style="width:20% !important"|
[[File:2023.1 PDF-Data-Mapping 04 01 Prereqs-01.png]]
|style="text-align: center;"|
[[File:2023.1 PDF-Data-Mapping 04 02 01 Prereqs-07.png]]
|}
|}
</tab>
 
<tab name="Highlight Annotation" style="margin:20px">
=== Adding Annotations ===
=== Highlight Annotation ===
 
'''''PDF Data Mapping''''' inserts various types of PDF annotations and widgets by configuring its '''''Annotations''''' property.  Users can add one or more '''''Annotation Types''''' to the '''''Annotations''''' list.  Adding a new '''''Annotation''''' to the list is simple.
 
With a '''''PDF Data Mapping''''' behavior added to a '''Content Type''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Select the '''''Annotations''''' property and press the ellipsis button at the end.
# This will bring up the '''''Annotations''''' editor.
# Press the "+" button.
# Select the '''''Annotation Type''''' you want to add from the dropdown list.
 
[[File:2023.1 PDF-Data-Mapping 03 02 02 Adding-Annotations-01.png]]
 
 
# <li value=6> Once added, you will see the '''''Annotation Type''''' added to the '''''Annotations''''' list.
# All '''''Annotation Types''''' will have a set of '''''General''''' properties to configure.
# Some '''''Annotation Types''''' have additional properties you can configure.
#* For example, the '''''Highlight Annotation''''' has '''''Appearance''''' properties you can configure to adjust the highlight's color and other appearance properties.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 03 02 02 Adding-Annotations-02.png]]
 
==== Notes on shared properties ====
 
All '''''Annotation Types''''' share a set of '''''General''''' properties.
* '''''Fields'''''
** Use this property to select the '''Data Fields''' you want to map to the PDF annotation.
** The '''''Fields''''' property is ''required''.
*** One or more '''Data Fields''' must be selected to generate the annotation.
*** If you don't select any '''Data Fields''' ''or'' the selected '''Data Fields''' are not extracted, '''''PDF Data Mapping''''' will not insert an annotation in the output PDF.
*** Be aware, all '''Data Fields''' are selected by default.
* '''''Padding'''''
** The '''''Padding''''' property can adjust the size of the annotation.
** Grooper uses a '''Data Field's''' result instance to draw the annotation's boundaries.
*** The size of the '''Data Field's''' instance may be too small for what you want to appear on the output PDF.
*** If so, use '''''Padding''''' to increase the annotation's size on the PDF generated by '''''PDF Data Mapping'''''.
* '''''Allow Edit'''''
** '''''Allow Edit''''' refers to a reader's ability to edit the annotation as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the annotation (or widget).
** Enabling this property (turning it ''True'') will allow users to fully adjust the annotation in the PDF, including its size, location and other properties.
** Be aware, even when ''False'', users will still be able to interact with widgets, such as the '''''Checkbox Widget''''' or '''''Textbox Widget'''''.
* '''''Print'''''
** In a PDF viewing application, like Adobe Acrobat, all annotations and widgets '''''PDF Data Mapping''''' generates will be visible. The '''''Print''''' property determines whether or not the annotation is visible when the PDF is printed.
** Be aware, the default is ''False''.
*** Grooper presumes the "Smart PDF" output by '''''PDF Data Mapping''''' will be opened in a PDF viewer (where all annotations will be visible).
*** Grooper also presumes if you want to print the PDF, you want something more like the original document printed, not the one with additional PDF elements Grooper inserts.  If you ''do'' want those annotations and widgets visible when the PDF is printed, set '''''Print''''' to ''True''.
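
To make the effect of '''''Padding''''' concrete, here is a minimal Python sketch of growing an annotation rectangle outward from a field's extracted boundaries.  This is an illustration only, assuming coordinates in inches with the origin at the page's top-left corner.  <code>Rect</code> and <code>pad</code> are hypothetical names, not part of Grooper.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Hypothetical rectangle in inches, origin at the page's top-left corner."""
    left: float
    top: float
    right: float
    bottom: float


def pad(rect: Rect, left: float = 0.0, top: float = 0.0,
        right: float = 0.0, bottom: float = 0.0) -> Rect:
    """Grow a rectangle outward by the given amount on each edge."""
    return Rect(rect.left - left, rect.top - top,
                rect.right + right, rect.bottom + bottom)


# Boundaries of an extracted field instance (made-up numbers)
field_box = Rect(left=1.0, top=2.0, right=2.5, bottom=2.25)

# Pad every edge by 0.1 inches
highlight_box = pad(field_box, left=0.1, top=0.1, right=0.1, bottom=0.1)
print(highlight_box)
```

Padding every edge by the same amount widens and heightens the annotation by twice that amount, while its center stays on the extracted value.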
 
=== Annotation Types ===
 
There are currently six types of annotations Grooper can add to the PDF it creates:
* '''''[[#Highlight Annotation|Highlight Annotation]]'''''
* '''''[[#Radio Group Widget|Radio Group Widget]]'''''
* '''''[[#Checkbox Widget|Checkbox Widget]]'''''
* '''''[[#Signature Widget|Signature Widget]]'''''
* '''''[[#Textbox Widget|Textbox Widget]]'''''
* '''''[[#Text Annotation|Text Annotation]]'''''
 
==== Highlight Annotation ====
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|


The '''''Highlight Annotation''''' overlays a colored rectangle with adjustable transparency on a '''Data Field's''' extracted location.  In other words, it can highlight extraction results.
* Use this to highlight important values extracted by Grooper.
* Like all '''''Annotations''''', this highlight can be printable or not.  When the '''''Print''''' property is ''False'', the highlight will show up when viewed in a PDF viewer but not if the PDF is printed.




 
In this example, we will use the '''''Highlight Annotation''''' to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form.  To configure this '''''Annotation''''' we will:
We will look at the ''Highlight Annotation'' first.  This annotation is what it sounds like.  You can use it to highlight portions of a PDF. 
* Select the '''Data Fields''' we wish to highlight.
 
* Adjust how we want the highlight to look.
In this example, we will use the ''Highlight Annotation'' to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form.
|
|
{|
{|
Line 319: Line 382:
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-08.png]]
[[File:Pdf-generate-howto-08.png]]
|-
|-
Line 325: Line 388:
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-09.png]]
[[File:Pdf-generate-howto-09.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Highlight Annotation''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
 
# In the '''''Annotations''''' collection editor, press the "Add" button to add the ''Highlight Annotation'' annotation.
# Use the '''''Fields''''' property to select the '''Data Fields''' you wish to highlight.
#* Refer to the previous tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# Select ''Highlight Annotation'' from the list.
# In the window that pops up, mark the checkboxes next to the '''Data Fields''' you wish to highlight.
|valign=top|
#* In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''.
[[File:Pdf-generate-howto-10.png]]
#* Be aware, these fields must be extracted by the '''''Extract''''' activity or nothing will be highlighted.
|-
# Press "OK" when finished.
|valign=top|
 
# This will add a ''Highlight Annotation'' to the '''''Annotations''''' list.
[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-01.png]]
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to highlight.  Use the '''''Fields''''' property to select which '''Data Fields''' you wish to highlight.
 
#* Whatever result is returned by the selected '''Data Fields''' will be used to create the highlighted annotation.
# Using the dropdown list, select the '''Data Fields''' you wish to highlight.
#* In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''. Once collected by the '''Extract''' activity, Grooper will know where these results are located on the document.  The ''Highlight Annotation'' annotation will then highlight the document as seen in the "''After Annotation''" image above.
|
[[File:Pdf-generate-howto-11.png]]
|-
|valign=top|
Optionally, you can control how the highlight looks: its color, size, opacity and whether or not there's a stroke around the highlighted rectangle.


# For instance, we set the '''''Padding''''' property to ''0.1in''
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
#* This will increase the size of the highlight rectangle by 0.1 inches on all sides.
#* Adjusting '''''Padding''''' for '''''Highlight Annotations''''' is common.  In this example, we increased the highlight's size by 0.1 in on each side.
#* All annotations have the ability to be padded to increase their size, not just ''Highlight Annotation''.
# Determine if you need to adjust whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* You can also expand the '''''Padding''''' property's sub-properties to adjust specific configurations for padding the '''''Left''''', '''''Top''''', '''''Right''''', and '''''Bottom''''' edges.
#* Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.
# While we did not choose to do so, you can add a colored border around the highlighted rectangle by choosing a '''''Border Style''''' (such as ''Solid'' for a solid border or ''Dashed'' for a dashed line border).
#* The '''''Border Color''''' and '''''Border Width''''' properties will further help you configure the border produced.
#* Note: While the '''''Border Color''''' and '''''Border Width''''' properties are configured to ''64, 64, 64'' and ''1pt'' by default, the '''''Border Style''''' is set to ''None'' by default.  With no border produced, these properties are ignored.  They will not be used to create a border until you choose a '''''Border Style'''''.
# We also set the '''''Fill Color''''' to ''Yellow''.
#* Grooper defaults to green.  This is the same green used to highlight extraction results when you're testing out extractors in '''Grooper Design Studio'''.
#* You can select colors using a dropdown list or use comma-separated values in the RGB color space.  For example, "yellow" is also ''255, 255, 128'' in the RGB color space.
|valign=top|
[[File:Pdf-generate-howto-12.png]]
|}
</tab>
<tab name="Radio Group Widget" style="margin:20px">
=== Radio Group Widget ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|


[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-02.png]]




#<li value=6> Adjust the highlight's appearance, as desired, using the '''''Appearance''''' properties.
# Most commonly, users will adjust the '''''Fill Color'''''.
#* Use the dropdown to select from a list of system colors.
#* Or, enter an RGB value using the format <code>#, #, #</code>
#* This property defaults to the "Grooper green" highlight seen in '''Review's''' '''''Data View'''''.  In this example, we've changed it to ''Yellow''.
# Press "OK" when finished (or continue adding more '''''Annotations''''').
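
The comma-separated color format can be illustrated with a short Python sketch.  This is not Grooper code: <code>parse_rgb</code> is a hypothetical helper, and the 0 to 255 channel range is an assumption based on standard 8-bit RGB.

```python
from typing import Tuple


def parse_rgb(text: str) -> Tuple[int, int, int]:
    """Parse a '#, #, #' string into a (red, green, blue) tuple.

    Assumes three integer channels, each in the standard 8-bit range 0-255.
    """
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values")
    red, green, blue = (int(p) for p in parts)
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return (red, green, blue)


print(parse_rgb("255, 255, 128"))
```

Spaces around the commas are ignored in this sketch, so <code>255,255,128</code> and <code>255, 255, 128</code> are treated the same way.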


The ''Radio Group Widget'' annotation allows you to add radio buttons to the document. Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.  This ''PDF Generate'' '''''Annotation Type''''' uses OMR extraction techniques (such as ''Labeled OMR'' and ''Zonal OMR'') to find existing checkboxes on the document.  A group of radio buttons is then overlaid on top of the checkboxes when the ''PDF Generate Behavior'' builds the PDF file.
[[File:2023.1 PDF-Data-Mapping 03 04 01 Highlight-Annotation-03.png]]


For example, we will create a ''Radio Group Widget'' annotation from the "US Citizen" '''Data Field's''' result.  We have two choices, either "Yes" or "No".  Only one or the other can be chosen.  So, this is well suited for a radio button group.
==== Radio Group Widget ====
|
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Radio Group Widget''''' overlays a group of radio button PDF elements on top of where a Grooper extractor finds OMR checkboxes on a document. 
* Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.
** Note radio buttons (inserted by '''''Radio Group Widget''''') differ from checkboxes (inserted by '''''Checkbox Widget''''').  For radio buttons, only one choice out of a group may be selected.  For checkboxes, any number of choices may be selected.
* The '''Data Field(s)''' this annotation references ''must'' use an OMR extractor to return results: '''''Labeled OMR''''', '''''Ordered OMR''''' or '''''Zonal OMR'''''.
** This extractor ''must also'' have its '''''Mode''''' set to ''CheckOne'' (only one box out of many may be checked/selected).
* '''''PDF Data Mapping''''' will insert one radio button for each checkbox the extractor locates.
|valign=top|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-13.png]]
[[File:Pdf-generate-howto-13.png]]
|-
|-
Line 386: Line 442:
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-14.png]]
[[File:Pdf-generate-howto-14.png]]
|}
|}
|}
|}
With a '''''Radio Group Widget''''' added to the '''''Annotations''''' list:
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the group of radio buttons.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
#* In our case, we are choosing the "US Citizen" '''Data Field'''.
#* Be aware, this field must (1) use an OMR extractor to return results, (2) have its '''''Mode''''' set to ''CheckOne'', (3) have already been extracted by the '''''Extract''''' activity and (4) have located checkboxes during extraction, or no radio buttons will be placed.
# Press "OK" when finished.
[[File:2023.1 PDF-Data-Mapping 03 04 02 Radio-Group-Widget-01.png]]
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine if you need to adjust whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the widget (press a radio button).
# Press "OK" when finished (or continue adding more '''''Annotations''''').
[[File:2023.1 PDF-Data-Mapping 03 04 02 Radio-Group-Widget-02.png]]
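
The relationship between detected checkboxes and the resulting radio buttons can be sketched in a few lines of Python.  This is only a conceptual model, not Grooper's implementation: <code>build_radio_group</code> and the dictionary layout are hypothetical.  It assumes one button is produced per detected box and that a ''CheckOne'' result never has more than one checked box.

```python
from typing import Dict, List


def build_radio_group(boxes: Dict[str, bool]) -> List[dict]:
    """Build one radio button description per detected checkbox.

    `boxes` maps a checkbox label to its detected check state.
    """
    checked = [label for label, is_checked in boxes.items() if is_checked]
    if len(checked) > 1:
        # A radio group can only have a single selected option
        raise ValueError("only one box may be checked in a radio group")
    return [
        {"label": label, "selected": is_checked}
        for label, is_checked in boxes.items()
    ]


# "Yes" was detected as checked, "No" as unchecked
buttons = build_radio_group({"Yes": True, "No": False})
for button in buttons:
    print(button)
```

Both options become buttons in the group, but only the one detected as checked starts out selected.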
===== Be Aware:  Annotations are overlaid on a page's image =====


{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
|valign=top style="width:50%"|
# In the '''''Annotations''''' collection editor, press the "Add" button to add the ''Radio Group Widget'' annotation.
'''''BE AWARE:''''' The '''''Radio Group Widget''''' overlays radio buttons on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
* Notice the original image for this document used checkboxes, not radio buttons.  We see an "X" inside of a square box.
# Select ''Radio Group Widget'' from the list.
|valign=top|
|valign=top|
[[File:Pdf-generate-howto-15.png]]
[[File:Pdf-generate-howto-18.png]]
|-
|-
|valign=top|
|valign=top|
# This will add a ''Radio Group Widget'' to the '''''Annotations''''' list.
You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the radio buttons.  Use the '''''Fields''''' property to select these '''Data Fields'''.
* In this case, the boxes were detected by the "detection only" '''''Box Detection''''' IP command and not removed by the "detection and removal" '''''Box Removal''''' command.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the radio buttons.
* '''''Box Detection''''' finds and stores the checkbox locations and check states but does not actually alter the image in any way.
#* You may use the '''''Padding''''' property to adjust the size of the radio button if you desire.
#* These '''Data Fields''' ''must'' use an OMR based extraction method (''Labeled OMR'', ''Ordered OMR'', or ''Zonal OMR'') to insert the radio buttons.
# Using the dropdown list, select the '''Data Fields''' you wish to use to create the group of radio buttons.
#* In our case, we are choosing the "US Citizen" '''Data Field'''.  Once collected by the '''Extract''' activity, Grooper will know which results you want to use to create the radio buttons.  This will include the checkbox locations and check states stored in the document's layout data.  The ''Radio Group Widget'' annotation will then insert radio buttons into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
|valign=top|
[[File:Pdf-generate-howto-16.png]]
[[File:Pdf-generate-howto-19.png]]
|-
|-
|valign=top|
|valign=top|
Let's briefly look at this "US Citizen" '''Data Field''' and see what's happening behind the scenes when the ''PDF Generate Behavior'' creates the radio buttons.
Maybe you care about this, and maybe you don't.  If you do, use '''''Box Removal''''' instead.
 
* '''''Box Removal''''' will also find and store the checkbox locations and their check states, but it will ''also'' digitally remove the checkboxes from the document's image.  This will allow Grooper to extract the checkboxes and allow '''''PDF Data Mapping''''' to overlay the radio buttons on a field of blank pixels.
# We have selected the "US Citizen" '''Data Field''' in the '''Grooper''' Node Tree.
* Run '''''Box Removal''''' in an '''IP Profile''' using the '''''Image Processing''''' activity prior to running the '''''Extract''''' activity to do this.
# This '''Data Field''' uses the ''Labeled OMR'' extractor to return its result, looking for checkboxes next to the labels "Yes" and "No" on the document.
|valign=top|
# The box next to "Yes" is checked.  This is ultimately the result returned to the "US Citizen" '''Data Field'''.
[[File:Pdf-generate-howto-20.png]]
#* This is how the ''Radio Group Widget'' annotation knows where to place the radio button.  The data instance used to insert the PDF radio button is drawn around the detected box (in this case highlighted in green in the Document Viewer).
#* Since this is the detected checked result, the radio button is configured as "pressed" upon outputting the generated PDF.
# The box next to "No" is not checked.  The ''Radio Group Widget'' will also create radio buttons for the unchecked boxes next to labels on the document as well.
#* The alternate candidate data instances are used to insert the other PDF radio buttons in the group (in this case highlighted in red in the Document Viewer).
#* The unchecked boxes ''must'' be detected from a '''Box Detection''' or '''Box Removal''' '''IP Command''' in order to be inserted in the generated PDF.  They ''must'' be present in the document's layout data file ''before'' the '''Extract''' activity runs.
#* Since this is detected as an unchecked result, the radio button is not pressed upon outputting the generated PDF.
|
[[File:Pdf-generate-howto-17.png]]
|}
|}


{|cellpadding="10" cellspacing="5"
==== Checkbox Widget ====
|-style="background-color:#36b0a7; color:white"
|style="font-size:14pt"|'''FYI'''
|
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
In the case of every '''''Annotation Type''''', the ''PDF Generate Behavior'' inserts the annotation by overlaying it on top of the document.  This can be important to keep in mind for all annotations but is often particularly relevant when inserting radio buttons using the ''Radio Group Widget''.
The '''''Checkbox Widget''''' inserts one or more form-fillable checkboxes into the PDF on top of where a Grooper extractor finds OMR checkboxes.
* Checkboxes are common PDF elements used to indicate a choice from one or many options.
** Note checkboxes (inserted by '''''Checkbox Widget''''') differ from radio buttons (inserted by '''''Radio Group Widget''''').  For radio buttons, only one choice out of a group may be selected.  For checkboxes, any number of choices may be selected.
* The '''Data Field(s)''' this annotation references ''must'' use an OMR extractor to return results:  '''''Labeled OMR''''', '''''Ordered OMR''''', or '''''Zonal OMR'''''
* However, this extractor may use any of the OMR '''''Modes''''' (''CheckOne'', ''CheckMulti'' or ''Boolean'').
* '''''PDF Data Mapping''''' will insert a simple checkbox PDF element for each checkbox the extractor locates.


Notice the original image for this document used checkboxes, not radio buttons. We see an "X" inside of a square box.
 
|
In this example, we will create a '''''Checkbox Widget''''' for the checkboxes extracted using the "Checklist" '''Data Field'''.  This is a '''''Labeled OMR''''' extractor that uses the ''CheckMulti'' '''''Mode''''', indicating any number of the checkboxes may be checked.  Checked or not, the ''Checkbox Widget'' will insert a checkbox element into the generated PDF for each one.
[[File:Pdf-generate-howto-18.png]]
|valign=top|
{|
|style="text-align:center"|
''Before Annotation''
|-
|style="text-align:center"|
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-03.png]]
|-
|style="text-align:center"|
''After Annotation''
|-
|-
|valign=top|
|style="text-align:center"|
The radio button annotations are simply overlaid on the page's image.  You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-04.png]]
|}
|}
 
With a '''''Checkbox Widget''''' added to the '''''Annotations''''' list:
 
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the checkboxes.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
#* In our case, we are choosing the "Checklist" '''Data Field'''.
#* Be aware, this field must (1) use an OMR extractor to return results, (2) have already been extracted by the '''''Extract''''' activity and (3) have located checkboxes during extraction, or no checkboxes will be placed.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-01.png]]
 
 
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine if you need to adjust whether the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the widget (check the checkboxes).
# Press "OK" when finished (or continue adding more '''''Annotations''''').
 
[[File:2023.1 PDF-Data-Mapping 04 03 03 Checkbox-Widget-02.png]]


In this case, the boxes were stored in the layout data using the '''Box Detection''' '''IP Command'''.  This will find and store the checkbox locations and check states, but not actually alter the image in any way.
{|class="attn-box"
|
|
[[File:Pdf-generate-howto-19.png]]
&#9888;
|-
|
|valign=top|
'''''BE AWARE:''''' The '''''Checkbox Widget''''' overlays checkboxes on a page's image.  Any printed checkbox on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
Maybe you care about this, and maybe you don't.  If you do, you may consider using the '''Box Removal''' '''IP Command''' instead.  '''Box Removal''' will also find and store the checkbox locations and their check states, but it will ''also'' digitally remove the checkboxes from the document's image.


In this case, the boxes were stored in the layout data using the '''Box Removal''' '''IP Command'''.  Since the boxes are removed before the '''Export''' activity, the edges of the boxes are not present on the final image.  The radio button annotations are placed on blank pixels.
For more information, [[#Be Aware: Annotations are overlaid on a page's image|see above.]]
|
[[File:Pdf-generate-howto-20.png]]
|}
</tab>
<tab name="Checkbox Widget" style="margin:20px">
=== Checkbox Widget ===
{|cellpadding="10" cellspacing="5"
|-style="background-color:#ed2330; color:white"
|style="font-size:14pt"|'''WIP'''||The ''Checkbox Widget'' documentation needs to be finalized after getting some guidance from dev.  If it seems incomplete or images don't match up with text, that is why.
|}
|}


==== Signature Widget ====
{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
The '''''Signature Widget''''' inserts a signature block into the PDF.
* Signature blocks allow PDFs to capture digital signatures.  This allows you to create a document that can be digitally signed straight from Grooper on export.
* The '''Data Field(s)''' this annotation references ''will typically'' use a zonal extractor to define where the signature block should be, most commonly '''''Detect Signature''''' or '''''Highlight Zone'''''.
* Other Value Extractors may work, but these are most typical.  '''''PDF Data Mapping''''' will insert the signature block using the geometric boundaries of the extraction instance.  Zonal extractors are well suited to define fixed boundaries of extraction results.




 
In this example, we will create a '''''Signature Widget''''' annotation for the signature line on the application form, using the "Signature" '''Data Field''' of our '''Data Model'''.  The '''''Signature Widget''''' will insert an interactable signature element into the generated PDF.
 
The ''PDF Generate Behavior'' can also insert form-fillable checkboxes, using the ''Checkbox Widget'' '''''Annotation Type'''''.  This '''''Annotation Type''''' also uses OMR extraction techniques (such as ''Labeled OMR'' and ''Zonal OMR'') to find existing checkboxes on the document.  It works a lot like the ''Radio Group Widget'' annotation, except that editable checkboxes, rather than radio buttons, are overlaid on the document.
 
For example, we will create a ''Checkbox Widget'' annotation for the checkboxes in the "Checklist" section of this document, the "Application", "Proposal Summary", "Essay", "Resume" and "Recommendation Letter" '''Data Fields'''.  These are Boolean OMR checkboxes, returning "true" if the box next to the corresponding label is checked, and "false" if unchecked.  In either case, checked or not, the ''Checkbox Widget'' will insert an editable checkbox element into the generated PDF.
|
|
{|
{|
Line 474: Line 563:
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|[[File:Pdf-generate-howto-23.png]]
[[File:Pdf-generate-howto-13.png]]
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-14.png]]
[[File:Pdf-generate-howto-24.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Signature Widget''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
 
# In the '''''Annotations''''' collection editor, press the "Add" button to add the ''Checkbox Widget'' annotation.
# Use the '''''Fields''''' property to select the '''Data Field''' you wish to use to insert the signature block.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# Select ''Checkbox Widget'' from the list.
# In the window that pops up, mark the checkbox next to the '''Data Field''' you wish to select.
|valign=top|
#* In our case, we are choosing the "Signature" '''Data Field'''.
[[File:Pdf-generate-howto-21.png]]
#* Be aware, this field must (1) have already been extracted by the '''''Extract''''' activity and (2) have drawn a zone defining the location and size of the signature block (most commonly, '''''Detect Signature''''' or '''''Highlight Zone''''' is used to do this).
|-
# Press "OK" when finished.
|valign=top|
# This will add a ''Checkbox Widget'' to the '''''Annotations''''' list.
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the checkboxes.  Use the '''''Fields''''' property to select these '''Data Fields'''.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the checkboxes.
#* You may use the '''''Padding''''' property to adjust the size of the checkboxes if you desire.
#* These '''Data Fields''' ''must'' use an OMR based extraction method (''Labeled OMR'', ''Ordered OMR'', or ''Zonal OMR'') to insert the checkboxes.
# Using the dropdown list, select the '''Data Fields''' you wish to use to create the checkboxes.
#* In our case, we are choosing the "Application", "Proposal Summary", "Essay", "Resume" and "Recommendation Letter" '''Data Fields'''. Once collected by the '''Extract''' activity, Grooper will know which results you want to use to create the checkboxes.  This will include the checkbox locations and check states stored in the document's layout data.  The ''Checkbox Widget'' annotation will then insert checkboxes into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
[[File:Pdf-generate-howto-22.png]]
|}
</tab>
<tab name="Signature Widget" style="margin:20px">
=== Signature Widget ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|


[[File:2023.1 PDF-Data-Mapping 04 03 04 Signature-Widget-01.png]]




#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
# Determine if you need to adjust if the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to interact with the element (submit a signature).
# Press "OK" when finished (or continue adding more '''''Annotations''''').


Form-fillable signature boxes can be inserted using the ''Signature Widget'' annotation.  This '''''Annotation Type''''' uses a zonal extraction type (such as ''Detect Signature'' or ''Highlight Zone'') to draw the boundaries of the inserted signature widget.  This allows you to create a document that can be digitally signed straight from Grooper upon exporting the generated PDF.
[[File:2023.1 PDF-Data-Mapping 04 03 04 Signature-Widget-02.png]]


For example, we will create a ''Signature Widget'' annotation for the signature line on the application form, using the "Signature" '''Data Field''' of our '''Data Model'''.  The ''Signature Widget'' will insert an interactable signature element into the generated PDF.
{|class="attn-box"
|
&#9888;
|
|
'''''BE AWARE:''''' The '''''Signature Widget''''' overlays a signature block on a page's image.  If present, any printed signature on the original page will persist (behind the widget), unless removed by the '''''Image Processing''''' activity.
For more information, [[#Be Aware: Annotations are overlaid on a page's image|see above.]]
|}
==== Textbox Widget ====
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Textbox Widget''''' inserts text-editable form fields into the generated PDF.
* Form fields allow PDFs to collect and store data entered by a user.
* Users can configure a '''''Textbox Widget''''' to create blank form fields or form fields with a value Grooper extracts already populated.
** For blank form fields, the '''Data Field(s)''' this annotation references should use '''''Highlight Zone''''' to place a blank zone where the field should be inserted.
** For populated form fields, the '''Data Field(s)''' this annotation references can use any extractor that returns a single-instance value (most typically '''''Labeled Value''''').
*** This allows Grooper to not only generate a PDF with form fields where they weren't present in the source document, but prefill them with data Grooper collects.
* Be aware, a '''''Textbox Widget''''' differs from a '''''Text Annotation'''''.  Where '''''Textbox Widget''''' will insert a text-editable form field, '''''Text Annotation''''' adds a text comment to the PDF.
|valign=top|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-23.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-04.png]]
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-24.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-05.png]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
In this example, we will use the '''''Textbox Widget''''' to insert a form field for the "Signature Date" '''Data Field'''. This used '''''Labeled Value''''' to extract the date.  '''''PDF Data Mapping''''' will overlay the form field on top of the extraction result.
|valign=top style="width:40%"|
* FYI: We will also adjust the generated widget's size using the '''''Padding''''' property. This is common when configuring '''''Textbox Widgets''''' when the font size you want to use for the form field is larger than the printed typeface on the document.
# In the '''''Annotations''''' collection editor, press the "Add" button to add the ''Signature Widget'' annotation.
 
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
 
# Select ''Signature Widget'' from the list.
With a '''''Textbox Widget''''' added to the '''''Annotations''''' list:
|valign=top|
 
[[File:Pdf-generate-howto-25.png]]
# Use the '''''Fields''''' property to select the '''Data Field(s)''' you wish to use to create text-editable form fields.
|-
#* Press the ellipsis button at the end of the '''''Fields''''' property.
|valign=top|
# In the window that pops up, mark the checkboxes next to the '''Data Field(s)''' you wish to select.
# This will add a ''Signature Widget'' to the '''''Annotations''''' list.
#* In our case, we are choosing the "Signature Date" '''Data Field'''.
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the signature box. Use the '''''Fields''''' property to select these '''Data Fields'''.
#* Be aware, these fields must be extracted by the '''''Extract''''' activity or no textbox will be generated.
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the signature box widget.
# Press "OK" when finished.
#* You may use the '''''Padding''''' property to adjust the size of the signature box if you desire.
 
#* Zonal based extraction methods (such as ''Detect Signature'' and ''Highlight Zone'') are typically used as the '''Data Field's''' extractor type.
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-01.png]]
# Using the dropdown list, select the '''Data Fields''' you wish to use to create the signature box.
#* In our case, we are choosing the "Signature" '''Data Field'''. Once collected by the '''Extract''' activity, Grooper will be supplied the size and location of the '''Data Field's''' extraction zone, which will form the size and location of the PDF signature widget. The ''Signature Widget'' annotation will then insert the form-fillable signature box into the generated PDF as seen in the "''After Annotation''" image above.
|valign=top|
[[File:Pdf-generate-howto-26.png]]
|}


Just like any '''''Annotation Type''''', the extraction result from the '''Data Field''' is critical for placing the signature annotation on the generated PDF.  Let's look at the "Signature" '''Data Field's''' result to understand a little better how these results are used to create the signature widget.


In our case, we're using the ''Detect Signature'' extractor type to supply these results. The ''Detect Signature'' extractor is perfectly suited for the ''Signature Widget'' '''''Annotation Type'''''.   
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
* It actually combines both Zonal and OMR based extraction techniques to determine if a signature is present in the zone.  It sets the boundaries of where you expect to find a signature using Zonal based methods and detects if the signature is present by counting the percentage of filled pixels in the zone, which is the basis of OMR based extraction methods.  You can then output different values if the zone is filled above or below a certain percentage.  In this case, the extractor returns "Not Signed" because there aren't enough pixels present in the extraction zone to count as filled.  If there were a signature present, there'd be more pixels present, accounting for a higher filled percentage.
#* Adjusting '''''Padding''''' for '''''Textbox Widgets''''' is common if the desired font size in the textbox differs from that printed on the source document.  In this example, we increased the textbox's size by 0.1 in on each side.
# Determine if you need to adjust if the annotation is editable or printable.  Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
#* Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
#* Please note: '''''Allow Edit''''' refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size.  It does not refer to a reader's ability to edit the value ''inside'' the textbox.  To configure that, use the '''''Read Only''''' property.
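The effect of padding a textbox "by 0.1 in on each side" can be pictured as simple rectangle arithmetic.  The following is a minimal Python sketch, not Grooper code: the <code>Rect</code> class, the <code>pad</code> function and the sample coordinates are hypothetical, and it assumes the padding value is applied equally to all four edges and measured in inches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned zone, in inches, origin at the page's top-left."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def pad(rect: Rect, amount: float) -> Rect:
    """Grow the rectangle outward by `amount` on every side (assumed behavior)."""
    return Rect(
        left=rect.left - amount,
        top=rect.top - amount,
        right=rect.right + amount,
        bottom=rect.bottom + amount,
    )


# Illustrative extraction zone: 1.5 in wide and 0.25 in tall.
zone = Rect(left=1.0, top=2.0, right=2.5, bottom=2.25)
padded = pad(zone, 0.1)
print(round(padded.width, 2), round(padded.height, 2))
```

Under these assumptions, padding each side by 0.1 in adds 0.2 in to both the width and the height, which is why a small value is usually enough to fit a larger font.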


This is great for our purposes because it gives us the exact information we need for the ''Signature Widget'', which is an extraction zone.  Grooper needs a data instance indicating the size and location for the generated signature widget.
[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-02.png]]
* But wait there's more!  We also get some bonus information about whether or not there's a signature present. Does the ''Signature Widget'' '''''Annotation Type''''' need to know if there's a signature present?  No.  It does not.  It will place the widget no matter what the result is.  But might that information be otherwise useful to you?  Probably.
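The "percentage of filled pixels" idea described above can be sketched in a few lines of Python.  This is only an illustration of the general technique, not how the ''Detect Signature'' extractor is implemented: the function names, the 0/1 pixel grid and the 5% threshold are all assumptions made for the example.

```python
def fill_percentage(zone):
    """Return the percentage of dark pixels in a zone.

    `zone` is a list of rows; each row is a list of 0 (blank) or 1 (dark).
    """
    total = sum(len(row) for row in zone)
    if total == 0:
        return 0.0
    filled = sum(sum(row) for row in zone)
    return 100.0 * filled / total


def signature_status(zone, threshold=5.0):
    """Map the fill percentage to an output value (threshold is hypothetical)."""
    return "Signed" if fill_percentage(zone) >= threshold else "Not Signed"


# A blank signature line: no dark pixels in the zone.
blank_zone = [[0] * 10 for _ in range(4)]

# A zone with some pen strokes: 5 of 40 pixels are dark.
inked_zone = [
    [0] * 10,
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    [0] * 10,
    [0] * 10,
]

print(signature_status(blank_zone))
print(signature_status(inked_zone))
```

Either way, the zone's boundaries are the same, which is why the ''Signature Widget'' can be placed regardless of which value is returned.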


{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
# We have selected the "Signature" '''Data Field''' in our '''Data Model'''.
# This '''Data Field''' uses the ''Detect Signature'' extractor to draw the extraction zone used to insert the signature widget.
# This extractor uses the ''Text Region'' '''''Location''''' option.
# This gives us the ability to anchor the extraction zone to an extractable text anchor, using the '''''Text Extractor''''' property.
#* In this case we've anchored the zone to the word "Signature" outlined in blue in the document viewer.  Where do we want to place the extraction zone (and ultimately the signature widget)?  On the signature line.  How do we know where that line is?  It's above the text label "Signature".
# The extraction zone itself is drawn using the '''''Translation''''' and '''''Adjustment''''' properties.
#* This allows us to set the size ('''''Adjustment''''') and location ('''''Translation''''') of the extraction zone (and ultimately the signature widget) relative to the '''''Text Extractor's''''' result. 
#* The extraction zone is the green rectangle in the document viewer.
# When the ''PDF Generate Behavior'' builds the PDF, using the ''Signature Widget'' annotation, the extraction zone's size and location forms the inserted signature widget.
|valign=top|
[[File:Pdf-generate-howto-27.png]]
|}
</tab>
<tab name="Textbox Widget" style="margin:20px">
=== Textbox Widget ===
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|


#<li value=6> Adjust the textbox's other properties as desired.
#* These properties give you the ability to adjust the font and font size inside the textbox.
#* Please note: If you want to prevent a reader from editing the Grooper collected value inside the textbox, turn '''''Read Only''''' to ''True''.
# Press "OK" when finished (or continue adding more '''''Annotations''''').


[[File:2023.1 PDF-Data-Mapping 04 03 05 Textbox-Widget-03.png]]


==== Text Annotation ====
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
The '''''Text Annotation''''' inserts a text comment in the PDF. 
* This has two primary uses:
** Insert comments into the PDF that are viewable when opening the PDF in a PDF viewer, but not printable.
** Print a simple text note on a page.
*** Commonly, users will want to print a word like "PROCESSED" on the output PDF.  This notes the document has been processed through Grooper.
* The '''Data Field(s)''' may use any kind of extractor as long as it produces a result with (1) a location on the page to place the comment and (2) a text value to add to the comment.
* Be aware, a '''''Textbox Widget''''' differs from a '''''Text Annotation'''''.  Where '''''Textbox Widget''''' will insert a text-editable form field, '''''Text Annotation''''' adds a text comment to the PDF.




The ''Textbox Widget'' '''''Annotation Type''''' will insert editable text boxes into the generated PDF.  One simple way to use this functionality is to use the ''Highlight Zone'' extractor type to place a blank zone where you want to place an empty text box on the PDF.  However, any extractor type can be used to define the textbox's location.  Furthermore, if the '''Data Field''' used to create the annotation collects a value during the '''Extract''' activity, not only will a textbox be inserted into the generated PDF, but it will be prefilled with the '''Data Field's''' extracted value upon export.
In this example, we will use a '''''Text Annotation''''' to print the word "PROCESSED" on the first page of the PDF generated by '''''PDF Data Mapping'''''.
* We will use the "IsProcessed" '''Data Field''' to do this. The extraction logic to make this happen requires a less-than-common technique.  We will show you how we build this '''Data Field''' in the [[#Technique: "IsProcessed" Data Field]] section.


For example, we will use the ''Textbox Widget'' functionality to fill out the blank coversheet on the first page of our application packet.  We will end up using a ''Highlight Zone'' extractor to define the size and location of the text box.  However, we're going to go one step further and populate the '''Data Fields''' used with some information from other '''Data Fields''' in our '''Data Model'''.  By the end of it, the ''PDF Generate Behavior'' will not only insert editable textboxes into the generated PDF, but fill them in with text, leaving this blank coversheet automatically populated with some information collected during the '''Extract''' activity.
|valign=top|
|
{|
{|
|style="text-align:center"|
|style="text-align:center"|
''Before Annotation''
''Before Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-28.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-04.png|500px]]
|-
|-
|style="text-align:center"|
|style="text-align:center"|
''After Annotation''
''After Annotation''
|-
|-
|
|style="text-align:center"|
[[File:Pdf-generate-howto-29.png]]
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-05.png|500px]]
|}
|}
|}
|}


{|cellpadding=10 cellspacing=5
With a '''''Text Annotation''''' added to the '''''Annotations''''' list:
|valign=top style="width:40%"|
 
# In the '''''Annotations''''' collection editor, press the "Add" button to add the ''Textbox Widget'' annotation.
# Use the '''''Fields''''' property to select the '''Data Field(s)''' you wish to use to insert the text comment.
#* Refer to the "Add the Behavior" tab if you are unclear how we got to this window in '''Grooper Design Studio'''.
#* Press the ellipsis button at the end of the '''''Fields''''' property.
# Select ''Textbox Widget'' from the list.
# In the window that pops up, mark the checkboxes next to the '''Data Fields''' you wish to select.
|valign=top|
#* In our case, we are choosing the "IsProcessed" '''Data Field'''.
[[File:Pdf-generate-howto-30.png]]
#* Be aware, these fields must (1) be extracted by the '''''Extract''''' activity and (2) hold a location and value or no comment will be added.
|-
# Press "OK" when finished.
|valign=top|
 
# This will add a ''Textbox Widget'' to the '''''Annotations''''' list.
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-02.png]]
# The only configuration that is ''strictly required'' is to indicate which '''Data Fields''' you wish to use to create the textbox.  Use the '''''Fields''''' property to select these '''Data Fields'''.
 
#* Whatever result is returned by the selected '''Data Fields''' will be used to draw and insert the textbox widget.  If that '''Data Field''' collected a value during the '''Extract''' activity, it will also be filled with the returned value.
 
# Using the dropdown list, select the '''Data Fields''' you wish to use to create the textboxes.
#<li value=4>  Determine if you need to adjust the annotation's padding.  Adjust the '''''Padding''''' property if you do.
#* In our case, we are choosing the "Candidate", "Title of Proposal" and "Country of Travel" '''Data Fields'''.  Once collected by the '''Extract''' activity, Grooper will be supplied the sizes and locations of the '''Data Field's''' data instances for each result. This will form the size and location of the textbox widget. The ''Textbox Widget'' annotation will then insert the form-fillable textbox into the generated PDF as seen in the "''After Annotation''" image above.  These boxes will also be prefilled with the extraction results from each '''Data Field'''.
# Determine if you need to adjust if the annotation is editable or printable. Adjust the '''''Allow Edit''''' or '''''Print''''' properties if you do.
|valign=top|
#* Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.
[[File:Pdf-generate-howto-31.png]]
#* In our case, we ''do'' want this comment printed when the document is printed. So, we've changed '''''Print''''' to ''True''.
|-
 
|valign=top|
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-02.png]]
The ''Textbox Widget'' annotation has some additional configuration options as well.
 


# As with all '''''Annotation Types''''', you can optionally adjust the size of the annotation using the '''''Padding''''' property.
#<li value=6> Adjust the comment's appearance, as desired, using the '''''Appearance''''' properties.
# You can also change the font and font size of the editable text in the textbox using the '''''Font Name''''' and '''''Font Size'''''.
#* Users may change the comment's font and font size with the '''''Font Name''''' and '''''Font Size''''' properties.
|
#* Users may select a '''''Fill Color''''' and '''''Text Color''''' in one of two ways:
[[File:Pdf-generate-howto-32.png]]
#** Using the dropdown to select from a list of system colors
|-
#** Or, entering an RGB value using the format <code>#, #, #</code>
|valign=top|
#** Be aware, there is no true "transparent" '''''Fill Color''''' option.  The selectable ''Transparent'' option is a system color that equates to "white".
Looking behind the scenes, there are at least two things going on in how we've set up extraction for these '''Data Fields''', ultimately supplying the result used to insert the ''Textbox Widget'' annotation.
# Press "OK" when finished (or continue adding more '''''Annotations''''').
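To make the <code>#, #, #</code> color format concrete, here is a small Python sketch that validates a value of that shape.  It is an illustration only: the <code>parse_rgb</code> function is hypothetical, and it assumes each component is an integer from 0 to 255, which is the usual RGB convention but is not stated in this article.

```python
def parse_rgb(text):
    """Parse a string like "255, 0, 0" into an (R, G, B) tuple.

    Assumes three comma-separated integers, each between 0 and 255.
    """
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three components, got {len(parts)}: {text!r}")
    values = tuple(int(part) for part in parts)
    for value in values:
        if not 0 <= value <= 255:
            raise ValueError(f"component out of range 0-255: {value}")
    return values


red = parse_rgb("255, 0, 0")
print(red)
```

A value such as <code>255, 0, 0</code> would therefore describe pure red under this convention.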


First, we used the ''Highlight Zone'' extractor type to draw the textbox, defining the size and location of the annotation upon generating the PDF.
[[File:2023.1 PDF-Data-Mapping 04 03 06 Text-Annotation-03.png]]


# We have selected the "Candidate" '''Data Field''' in our '''Data Model'''.
===== Technique:  "IsProcessed" Data Field =====
# Each '''Data Field's''' '''''Value Extractor''''' is set to ''Highlight Zone''.
# We used the ''Relative Region'' '''''Location''''' option to anchor an extraction zone to the box next to the label "Candidate".
#* This will form the size and location of the inserted textbox annotation.


Second, we used an expression to return a value, using the results of other '''Data Fields''' in our '''Data Model'''.
To print the word "PROCESSED" on the PDF, we used a specific technique.  A '''''Text Annotation''''' just needs two things from a '''Data Field''' to insert the annotation:  (1) a location on the page to place the comment and (2) a text value to add to the comment.  The word "PROCESSED" did not exist on the source PDF.  So, we had to figure out a way to use a '''Data Field''' to ''generate'' a result rather than extract it.


#<li value = 4> We've used the '''''Calculated Value''''' property (in '''''Calculate Mode''''' ''Always Set'') to return the full name of the candidate extracted by the "Last Name", "First Name", and "Middle Initial" '''Data Fields'''.
We did this in essentially two steps:
#* The full expression is as follows: <code>Applicant_Information.First_Name + " " + Applicant_Information.Middle_Initial + " " + Applicant_Information.Last_Name</code>
# Use the '''''Highlight Zone''''' extractor to define where the annotation should be printed.
# This will take the extraction results of these three '''Data Fields''' and jam them together with space characters in between them.
# Use a '''''Calculated Value''''' to define the text we want to print (the word "PROCESSED").
# However, if we test extraction at this point, we're going to get an error.
#* We're in the wrong scope!  We need to go up to the '''Data Model's''' level and test extraction there.  We need the full '''Data Model's''' results to do what we're trying to do here.  When testing extraction on this "Candidate" '''Data Field''' alone, Grooper can't "see" the results of the "Last Name", "First Name" and "Middle Initial" '''Data Fields''' to combine them.
|
[[File:Pdf-generate-howto-33.png]]
|-
|
# Once we test extraction on the '''Data Model''' you'll see what results are actually collected by the '''Extract''' activity.
# The '''''Calculated Value''''' expression we configured forms one result for the "Candidate"...
# ...using the results of the "Last Name", "First Name" and "Middle Initial" '''Data Fields'''.
# With a result returned and zone drawn upon extract, the ''Textbox Widget'' annotation has all the information it needs to place the form-fillable textbox and fill it with the results.
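The string concatenation performed by the "Candidate" expression can be mirrored in Python to see what it produces.  This is a sketch for illustration, not Grooper's expression engine: the function names are hypothetical, and the second function is a variation that is not part of the article's configuration.

```python
def candidate_name(first, middle, last):
    """Mirror of the expression: first + " " + middle + " " + last."""
    return first + " " + middle + " " + last


def candidate_name_compact(first, middle, last):
    """Variation (not from the article): skip empty parts to avoid double spaces."""
    return " ".join(part for part in (first, middle, last) if part)


print(candidate_name("Jane", "Q", "Doe"))
print(repr(candidate_name("Jane", "", "Doe")))
print(candidate_name_compact("Jane", "", "Doe"))
```

Note that plain concatenation leaves two consecutive spaces when the middle initial is empty, so you may want to account for that case if some applicants leave the field blank.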




{|cellpadding="10" cellspacing="5"
This gives a '''''Text Annotation''''' everything it needs to insert the comment: (1) a location and (2) some text.
|-style="background-color:#36b0a7; color:white"
|style="font-size:14pt"|'''FYI'''
|
This certainly isn't the '''only''' way to set up a '''Data Field''' for a ''Textbox Widget''.  This is just how we did it for the point of illustrating the ''Textbox Widget'' functionality.  You are not '''required''' to use the ''Highlight Zone'' extractor type.  You can use whatever extractor type best suits your document's needs.  Often Grooper users will use the ''Reference'' extractor to point to a '''Data Type's''' results and adjust the size of the ''Textbox Widget'' using its '''''Padding''''' property.
|}
|
[[File:Pdf-generate-howto-34.png]]
|}
</tab>
</tabs>


=== Configure PDF Generation for Bookmarks ===
== How To: Configure Bookmarks ==


<tabs style="margin:20px">
Bookmarks in PDFs aid readers when navigating through multipage documents.  '''''PDF Data Mapping''''' can insert bookmarks into the generated PDF to take advantage of this functionality.  This can be done in one of two ways (or both):
<tab name="About" style="margin:20px">
=== About ===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
Bookmarks in PDFs aid readers when navigating through multipage documents.  The ''PDF Generate Behavior'' can insert bookmarks into the generated PDF to take advantage of this functionality.  This can be done in one of two ways (or both):


# Using a '''Batch Folder's''' child document folders.
# Using a document folder's ('''Batch Folder''') child folders ('''Batch Folder''').
# Using the document's extracted '''Data Fields'''.
# Using a document folder's extracted '''Data Fields'''.


{|cellpadding=10 cellspacing=5
{|cellpadding=10 cellspacing=5
|valign=top style="width:50%"|
|valign=top style="width:50%"|
We will focus on the first bookmarking method (as it is more common).  Often, you will import a file into Grooper that contains multiple documents you want to separate and classify, but that otherwise all belong together in one way or another.
In this tutorial we take an application packet separated into component child documents and use '''''PDF Data Mapping's''''' '''''Bookmarking''''' property to create bookmarks for each one.


Such is the case with our study abroad application packet.  The application packet as a whole consists of five separate and distinguishable documents.
The application packet as a whole consists of five separate and distinguishable documents.
# The application itself (and a coversheet)
# The application itself (and a coversheet)
# A proposal summary
# A proposal summary
Line 693: Line 753:
|-
|-
|valign=top|
|valign=top|
Our goal is to create a bookmark in the generated PDF file for each of these component documents (or child documents as we will come to call them).   
Our goal is to create a bookmark in the generated PDF file for each of these component documents (child documents).   


Rather than exporting five separate PDF files for each component document, we will export a single PDF for the whole packet with navigable bookmarks corresponding to each component document.
Rather than exporting five separate PDF files for each component document, we will export a single PDF for the whole packet with navigable bookmarks.
# Application - For the application itself (and its coversheet)
 
# Proposal Summary - For the proposal summary
 
# Resume - For the student's resume
We will also demonstrate how to use '''Data Fields''' for bookmarking.  This allows us to insert PDF bookmarks for locations of extracted data.
# Rec Letter - For the letter of recommendation
* The "Signature" bookmark in this example would take the reader to the signature line of the PDF, using the location extracted by the "Signature" '''Data Field''' in our '''Data Model'''.
# Essay - For the essay
|
|
[[File:Pdf-generate-about-06.png]]
[[File:2023.1 PDF-Data-Mapping 01 02 About-Bookmarks-01.png]]
|}
|}


</tab>
=== Bookmarking Option 1:  Child document/folder bookmarks ===
<tab name="Prereqs - Split Pages, Separation and Classification" style="margin:20px">
 
=== Prereqs - Split Pages, Separation and Classification ===
There are two ways the '''''Bookmarking''''' feature can insert bookmarks into a PDF generated by '''''PDF Data Mapping'''''.
# '''It can insert a bookmark for each child document/folder.'''
# It can insert a bookmark for selected (single instance) '''Data Fields'''.
 
This section will detail how to insert bookmarks using child documents.
 
==== Option 1 Prereqs: Separated child documents ====
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
 
If enabled, '''''Bookmarking''''' will automatically add bookmarks to a PDF if a document has ''child'' documents in the '''Batch's''' folder hierarchy.
* If a document at folder level 1 is exported and has two child documents, the generated PDF will have two bookmarks.
* Clicking on the bookmark will take the reader to that child document's page in the PDF.
 
 
For this to work:
* The parent document folder must have separable child pages.
** Either from scanning pages in with a scanner or using the '''''Split Pages''''' activity to generate pages from an imported PDF.
* These child pages must then be separated into child folders.
** Either using a '''Separation Profile''' when scanning or using the '''''Separate''''' activity.


{|cellpadding=10 cellspacing=5
|valign=top style="width:40%"|
In order to accomplish this goal, we're going to have to do some things to this application packet before we configure the ''PDF Generate Behavior''.


By the end of it, we're looking for a '''Batch''' whose documents have a structure like this.  The documents in this batch consist of two '''Batch Folder''' levels.
Technically speaking, that's all you need.  '''''PDF Data Mapping''''' will add PDF bookmarks for every child document and name it using each child folder's name.
# '''Folder Level 1''':  This is the parent document folder. It is the container for the full document.  All seven pages of the application packet in this case.
* Be aware, without classifying the child documents these names will just be "Folder (1)" "Folder (2)" "Folder (3)" and so on.
# '''Folder Level 2''':  These are the child document folders for the parent document.  They are the containers for each component document of the full application packet.


This is what we want to end up with.  How did we get there?  Long story short, we have some document separation and classification requirements before we can insert bookmarks in the generated PDF.  The bookmarks are inserted for each child document folder and named after their classified '''Document Type's''' name.  In order to do that, we need to split out the pages of the imported document, separate them into child document folders, and classify them first.
{|style="text-align:center;"
|
|
[[File:Pdf-generate-howto-36.png]]
''Not separated''
|-
|valign=top|
The full application document came into Grooper like this. A 7 page PDF file with each of these 5 component documents was imported into a new '''Batch'''.  This is now the parent document folder at '''Folder Level 1'''.


But there's documents in them there document!  How do we get them out?
''No child folders''
|
|
[[File:Pdf-generate-howto-37.png]]
''Separated''
|-
|valign=top|
First, we need to use the '''Split Pages''' activity to create child '''Batch Page''' objects. 


This will split out the pages of the imported PDF file, creating one child '''Batch Page''' for each page of the PDF on the parent document folder.  Now we have page objects we can manipulate in our '''Batch'''.
''Has child folders''
|
|
[[File:Pdf-generate-howto-38.png]]
''Separated and classified''
 
''Has child document folders.''
 
''
|-
|-
|valign=top|
|valign=top|
Now that we have '''Batch Page''' objects in our '''Batch''', we can use the '''Separate''' activity to insert the second folder level.  This is the first step in organizing these pages into child documents.  We need to distinguish between one collection of pages as a document and another collection of pages as a document.  Creating folders is the first part of that equation.
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-01.png]]
|valign=top|
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-02.png]]
|valign=top|
[[File:2023.1 PDF-Data-Mapping 05 01 Prereqs-Separation-03.png]]
 
''What about Page 1 there?''
 
''Is it in a folder?  No.  Then it won't get a bookmark.''
|}
 
==== Adding bookmarks for child documents/folders ====
'''''PDF Data Mapping''''' will create bookmarks for child documents/folders by default.  There is no configuration required besides enabling the '''''Bookmarking''''' property.
 
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Bookmarking''''' property to ''Enabled''.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 05 04 Bookmarks-for-Child-Docs-01.png]]
 
 
That's it!  It's that simple! 
 
As long as the document folder '''''PDF Data Mapping''''' is applied to has child documents/folders, bookmarks will be created for each child document.
 
 
[[File:2023.1 PDF-Data-Mapping 05 04 Bookmarks-for-Child-Docs-02.png|900px]]
 
=== Bookmarking Option 2:  Data Field bookmarks ===
 
There are two ways the '''''Bookmarking''''' feature can insert bookmarks into a PDF generated by '''''PDF Data Mapping'''''.
# It can insert a bookmark for each child document/folder.
# '''It can insert a bookmark for selected (single instance)''' '''Data Fields'''.
 
This section will detail how to insert bookmarks using '''Data Fields'''.  This allows '''''PDF Data Mapping''''' to bookmark important field value locations extracted by Grooper in the output PDF.
 
==== Option 2 Prereqs: Data Fields and extracted data ====
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
 
'''''Bookmarking''''' can also insert PDF bookmarks using extracted data and their location.  '''Data Fields''' collect results using extractors which return results from the source document.  '''''Bookmarking''''' will use these results' locations to embed this kind of bookmark.
 
For this to work:
* You must have these '''Data Fields''' defined in a '''Data Model''' and configured to return results.
* Data must be saved for each '''Data Field''' prior to the PDF being generated.
** The '''''Extract''''' activity must run ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
** If performing user assisted data review, the '''''Review''''' activity must complete ''<u>before</u>'' '''''Merge''''' or '''''Export''''' generates the PDF.
 
==== Adding bookmarks for Data Fields ====
 
To insert bookmarks at extracted '''Data Field''' value locations, simply select which '''Data Field(s)''' you want '''''PDF Data Mapping''''' to bookmark.
* Please note: Only single-instance '''Data Fields''' may be bookmarked.
 
 
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Bookmarking''''' property to ''Enabled'' and expand its sub-properties.
# Select '''''Data Elements''''' and press the ellipsis button at the end.
# A "Data Elements" selection editor will appear.
# Select the '''Data Field''' whose location you wish to bookmark.
#* Please note: Only single-instance '''Data Fields''' may be bookmarked.
# Press "OK" when finished selecting '''Data Fields'''.
# Press "OK" when finished configuring '''''PDF Data Mapping'''''.
 
[[File:2023.1 PDF-Data-Mapping 05 02 02 Bookmarks-for-Data-Fields-01.png]]
 
 
As long as the document folder '''''PDF Data Mapping''''' is applied to has extracted the selected '''Data Field(s)''' with an '''''Extract''''' activity, bookmarks will be created for each '''Data Field''' selected.
 
[[File:2023.1 PDF-Data-Mapping 05 02 02 Bookmarks-for-Data-Fields-02.png|center]]
 
== How To: Configure Metadata ==
 
{|cellpadding=10 cellspacing=5
|style="width:50%" valign=top|
The '''''PDF Data Mapping''''' behavior has the ability to create and insert additional metadata into the generated PDF as well, using information collected during Grooper's document processing.  The metadata you are able to create falls into one of three categories:
 
# Editing the PDF's default metadata fields, including:
#* Title
#* Author
#* Subject
#* Created Date
#* Modified Date
#* Application
# Adding "Keywords" to the PDF metadata
#* This can be done using expression based or extraction based methods.
# Creating custom metadata fields and values
#* Custom metadata can be stored for any (single instance) '''Data Field''' values collected during the '''Extract''' activity.


Now, we have child document folders for this parent document folder, but they are just blank folders.  There is nothing to distinguish one folder from the next.
{|class="attn-box"
|
|
[[File:Pdf-generate-howto-39.png]]
&#9888;
|-
|
Notice what's not included in this list is the exported document's ''filename'' (e.g. "Im_a_file.pdf").  Filename mappings are always configured using an '''''Export Behavior'''''.
|}
|valign=top|
|valign=top|
And, that's the second part of the organization equation, classification. Next, these folders will be assigned a '''Document Type''' from our '''Content Model''' using the '''Classify''' activity.   
[[File:2023.1 PDF-Data-Mapping 01 02 About-Metadata-01.png]]
|}
 
=== Prereqs: Data extraction ===
For '''''PDF Data Mapping''''' to work, Grooper needs to have ''data to map''.
 
For '''''Metadata''''', data coming from Grooper can be mapped to the PDF in one of two ways:
# Using '''Data Field''' results
#* To embed custom PDF metadata, the custom fields are generated from '''Data Fields''' in the document's '''Data Model''' and their collected results.
#* This means the document ''must'' be processed by the '''''Extract''''' activity in order to create and populate these custom fields.
#* Or, if performing user assisted data review, the values must be previously recorded during the '''''Review''''' activity.
# Using code expressions
#* In the case of the default PDF metadata fields and keywords, expressions can be used to populate the metadata.
#* This gives you access to not only extracted '''Data Field''' results but also system data, classification information, and various functions to manipulate it.
 
=== Mapping default PDF metadata ===
 
'''''PDF Data Mapping's''''' '''''Metadata''''' settings can edit a PDF's default metadata values for its "Title", "Author", "Subject", "Application", "Created" and "Modified" properties.
 
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# Use a code expression to create custom values for the following default PDF metadata:
#* '''''Title''''' for the PDF's "Title" field
#* '''''Author''''' for the PDF's "Author" field
#* '''''Creator''''' for the PDF's "Application" field
#* '''''Subject''''' for the PDF's "Subject" field
#* '''''Creation Date''''' for the PDF's "Created" field
#* '''''Modification Date''''' for the PDF's "Modified" field
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 06 01 Default-Metadata-01.png]]
 
 
In our example, we made the following changes to the default PDF metadata:
* '''''Title''''':
** This defaults to the expression <code>CurrentDocument.ContentTypeName</code>. This will make the title whatever the document's '''Document Type''' classification is.
** We did not change Grooper's default.
* '''''Author''''':
** This defaults to the expression <code>LDAP.CurrentUserDisplayName</code>.  This will set the author to the Windows username for the Grooper user or service who created the PDF.
** We changed this to evaluate to the applicant's first name, middle initial, and last name as collected by Grooper using the following expression:
*** <code>$"{Applicant_Information.First_Name} {Applicant_Information.Middle_Initial} {Applicant_Information.Last_Name}"</code>
* '''''Creator''''':
** This will adjust the PDF's "Application" field.  This field is left blank by default.
** We changed this to the simple string <code>"Grooper PDF Data Mapping"</code>.
* '''''Subject''''':
** This field is left blank by default.
** We changed this to use the value of the "Proposal Title" '''Data Field''' in the "Proposal Information" '''Data Section''' with the expression <code>Proposal_Information.Proposal_Title</code>
* '''''Creation Date''''':
** This sets the PDF's "Created" date value and defaults to the expression <code>DateTime.Now</code>.  This returns the current system time of your machine at the time the PDF is generated. 
** We did not change Grooper's default.
* '''''Modification Date''''':
** This sets the PDF's "Modified" date value and defaults to the expression <code>DateTime.Now</code>.  This returns the current system time of your machine at the time the PDF is generated.   
** We did not change Grooper's default.
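
As a rough illustration of what these settings amount to, the sketch below assembles the same six values in plain Python.  It is not Grooper's expression language or API.  The sample field values, the "Application Packet" document type name, and the <code>build_info</code> helper are hypothetical.  The dictionary keys are the PDF format's own names for these metadata fields.

```python
from datetime import datetime

# Hypothetical extracted values, keyed like the example Data Model.
fields = {
    "Applicant_Information.First_Name": "Jane",
    "Applicant_Information.Middle_Initial": "Q",
    "Applicant_Information.Last_Name": "Doe",
    "Proposal_Information.Proposal_Title": "Sample Proposal",
}


def build_info(fields, document_type, now):
    """Return a dict shaped like a PDF document information dictionary."""
    # Mirrors the Author expression: first name, middle initial, last name.
    author = " ".join([
        fields["Applicant_Information.First_Name"],
        fields["Applicant_Information.Middle_Initial"],
        fields["Applicant_Information.Last_Name"],
    ])
    return {
        "Title": document_type,                # Title: the Document Type's name
        "Author": author,                      # Author: built from Data Fields
        "Creator": "Grooper PDF Data Mapping", # shown as "Application" in viewers
        "Subject": fields["Proposal_Information.Proposal_Title"],
        "CreationDate": now,                   # Creation Date: time of generation
        "ModDate": now,                        # Modification Date: same default
    }


info = build_info(fields, "Application Packet", datetime(2023, 1, 1, 12, 0))
```

Here <code>info["Author"]</code> evaluates to "Jane Q Doe", in the same way the '''''Author''''' expression joins the three applicant name fields.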


Now, we have everything we need to configure the bookmarking functionality of the ''PDF Generate Behavior''.  Bookmarks will be created every time a new child document is encountered and named after the '''Document Type''' assigned to that folder. 
=== Mapping Keywords ===


When the full PDF is generated, a bookmark named "Application" will be inserted at the first page of the PDF.  That child document is two pages long.  The third page of the full PDF will be the proposal summary.  So a bookmark named "Proposal Summary" will be inserted at page three.  A "Resume" bookmark will be inserted at page four.  And so on.
The '''''Metadata''''' settings can add terms to the PDF's "Keywords" field in one of two ways:
# Using a code expression
# Using an extractor ('''Data Type''', '''Value Reader''' or '''Field Class''')


Writer's Note: There are many ways to separate and classify documents, including ''[[ESP Auto Separation]]'', which both separates and classifies documents at the same time with a single activity (just '''Separate''').  But this is the general idea to get us where we need to go.  One way or another, create child document folders out of a parent document folder.  That way, when we generate the PDF for the parent document folder upon export, bookmarks will be created for the classified child document folders.
 
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# To add keyword terms with a code expression, add the expression to the '''''Keywords''''' property.
#* This expression should evaluate to a string value.  This string will be added to the PDF's "Keywords" field.
# To add keyword terms with an extractor, reference the extractor with the '''''Keywords Extractor''''' property.
#* This extractor should return a string value.  This string will be added to the PDF's "Keywords" field.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 06 02 Keywords-01.png]]
 
 
In our example, we used an expression to insert a keyword based on the word count of the "Essay" document in the application packet.
* "Short Essay" for essays under 400 words
* "Long Essay" for essays over 600 words
* "Normal Essay" for essays between 400 and 600 words
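
The thresholds above can be sketched in Python as follows.  This only illustrates the decision logic and is not the expression used in the example project.  The <code>essay_keyword</code> function and the whitespace-based word count are assumptions.

```python
def essay_keyword(text: str) -> str:
    """Pick a keyword from the essay's word count (thresholds from the list above)."""
    # Assumption: a "word" is any run of non-whitespace characters.
    word_count = len(text.split())
    if word_count < 400:
        return "Short Essay"
    if word_count > 600:
        return "Long Essay"
    # 400 to 600 words, inclusive.
    return "Normal Essay"
```

An essay of exactly 400 or exactly 600 words falls into the "Normal Essay" range in this sketch.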
 
 
We also used an extractor to add a "signed" keyword if the application was signed and "not signed" if the application was not signed.
 
=== Mapping custom Metadata ===
 
{|class="attn-box"
|
&#9888;
|
|
[[File:Pdf-generate-howto-40.png]]
Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".
* Consider these names reserved.
* If you are attempting to export '''Data Field''' values as custom PDF metadata, they ''cannot'' share any reserved names.  You will need to rename the '''Data Field''' in Grooper to a unique name.
|}
|}
</tab>
<tab name="Add the Behavior" style="margin:20px">
=== Add the Behavior ===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
Annotations are one of the configuration options for the ''PDF Generate Behavior''.  They are one way a '''Content Type''' '''''Behavior''''' can tell an activity (specifically the '''Export''' activity) how to use the '''Content Type''' to do something (specifically how to use the '''Content Model's''' collected '''Data Fields''' to insert additional content when generating a PDF upon export).


# All '''''Behaviors''''' are added to a '''Content Type''' object.
'''''PDF Data Mapping's''''' '''''Metadata''''' feature can store custom metadata as well, exporting '''Data Field''' values to custom PDF metadata fields.  This is a way for Grooper to save '''Data Field''' values directly to the PDF.
#* We will add the ''PDF Generate Behavior'' to this '''Content Model''' named "PDF Generate - UNESCO Packet".
* '''''BE AWARE:''''' Only single-instance data can be exported to a PDF's custom metadata.
# All '''''Behaviors''''' are added using the '''''Behaviors''''' property. Select the '''''Behaviors''''' property and press the ellipsis button at the end to add the ''PDF Generate Behavior''.
** '''Data Fields''' at the root of a '''Data Model''' or in single instance '''Data Sections''' can be exported.
# This will bring up the '''''Behaviors''''' editor window.
** '''Data Fields''' in multi-instance '''Data Sections''' and '''Data Column''' values ''cannot'' be exported.
# Press the "Add" button to add a '''''Behavior'''''.
 
# Choose "PDF Generate Behavior" from the list.
 
With a '''''PDF Data Mapping''''' behavior added to a '''''Content Type''''':
# Select the '''''PDF Data Mapping''''' behavior in the '''''Behaviors''''' editor.
# Change the '''''Metadata''''' property to ''Enabled'' and expand its sub-properties.
# Turn '''''Export Data Fields''''' to ''True''.
# Use the '''''Field Filter''''' editor to select a specific set of '''Data Fields''' to export.  Otherwise, all '''Data Fields''' will be exported to custom PDF metadata fields.
# Press "OK" when finished.
 
[[File:2023.1 PDF-Data-Mapping 06 03 Custom-Metadata-01.png]]
 
 
In our example, we exported all '''Data Fields''' to the generated PDF's custom fields.  Custom metadata can be viewed using Adobe Acrobat. Go to "Document Properties...".  Then select the "Custom" tab.  All selected '''Data Fields''' will be exported to this "Custom Properties" list in the PDF.
* FYI: Spaces and other special characters in a '''Data Field's''' name will be replaced with underscores (i.e. "Field_Name")
* FYI: '''Data Fields''' in single instance '''Data Sections''' will be named using dot notation (i.e. "Section_Name.Field_Name")
 
[[File:2023.1 PDF-Data-Mapping 06 03 Custom-Metadata-02.png|center]]
 
== How To: Configure Piece Info ==
 
{|class="important-box"
|
'''!!'''
|
|
[[File:Pdf-generate-howto-06.png]]
'''''BE AWARE: PIECE INFO IS STILL UNDER DEVELOPMENT'''''
|-
 
|valign=top|
Please consider the '''''Piece Info''''' feature in "beta" at this time.  This feature will be more fully documented once fully developed.
# Once added, you will see the ''PDF Generate Behavior'' added to the list on the left.  Select it to add an '''''Annotation'''''.
|}
# In the right panel, select the '''''Annotations''''' property and press the ellipsis button at the end.
 
# This will bring up an '''''Annotations''''' collection editor.
"PieceInfo" is a PDF dictionary of additional data stored by other applications.  For example, when you save a PDF from Adobe Illustrator, PieceInfo will store the original Illustrator file (which allows the PDF to be edited in Illustrator as if it were the original).  PieceInfo can be stored at the document level for the whole PDF or at the page level for one or more pages in the PDF.
 
'''''PDF Data Mapping''''' uses PieceInfo dictionaries to store extracted '''Data Field''' values as a PDF dictionary embedded in the document's structure by enabling and configuring the '''''Piece Info''''' settings.
* Contrast this with the '''''Metadata''''' settings which store '''Data Field''' values as custom metadata fields in the document properties.
* '''''Piece Info''''' is unique in that it can export data from a '''Data Table''' ''<u>in very specific scenarios</u>''.  Using the '''''Key Column''''' property, it can build the dictionary from ''<u>only two</u>'' columns in a table, ''<u>and only if</u>'' one of those columns acts as a "key" with unique values for each extracted row.
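
The key column constraint can be pictured with a small Python sketch.  This is not how Grooper builds the dictionary internally.  The <code>build_piece_info</code> helper, the column names, and the sample rows are all hypothetical.  It only shows why the key column must hold a unique value for every row.

```python
def build_piece_info(rows, key_column, value_column):
    """Collapse two table columns into one dictionary, keyed by key_column."""
    result = {}
    for row in rows:
        key = row[key_column]
        if key in result:
            # A repeated key cannot be stored twice in one dictionary.
            raise ValueError(f"duplicate key {key!r} in column {key_column!r}")
        result[key] = row[value_column]
    return result


# Hypothetical table rows: "Code" acts as the key column.
rows = [
    {"Code": "A1", "Amount": "100.00"},
    {"Code": "B2", "Amount": "250.00"},
]
piece_info = build_piece_info(rows, "Code", "Amount")
```

With these rows, <code>piece_info</code> holds one entry per row.  If two rows shared the same "Code" value, there would be no unambiguous way to store both amounts, which is why the sketch raises an error.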
 
=== PieceInfo at document level vs PieceInfo at page level ===
 
With '''''Piece Info''''' enabled and configured, '''''PDF Data Mapping''''' will store the dictionary at either the document level or on a page, depending on the '''Batch's''' folder structure.
 
 
Imagine a '''Batch Folder''' that looks like this:
 
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-01.png|center]]
 
 
If '''''PDF Data Mapping''''' with '''''Piece Info''''' is configured for a parent document's '''Document Type''', the PieceInfo dictionary is stored at the document level in the PDF.
 
 
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-02.png|center]]
 
 
If '''''PDF Data Mapping''''' with '''''Piece Info''''' is configured for a child document's '''Document Type''', the PieceInfo dictionary is stored at the page level, on the first page of that child document in the PDF.
* In this example '''''PDF Data Mapping''''' with '''''Piece Info''''' was configured for the "Green" '''Document Type'''.
** With a PDF generated for the parent document folder, the output PDF will be 5 pages long total (because there are a total of five pages in the three child document folders).
** Page 1 of the child document folder "Green (2)" will be page 2 in the output PDF.
** The PieceInfo dictionary will therefore be stored in page 2 of the output PDF.
* Be aware, it doesn't matter if the child document is a multipage document with extracted results on multiple pages.  The PieceInfo dictionary is only stored once, on the first page only.
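
The page arithmetic in this example can be sketched in Python.  The <code>first_output_page</code> helper is hypothetical, as are the "Red (1)" and "Blue (3)" folder names and the way the remaining four pages are divided between "Green (2)" and the last folder.  Only the five page total and the position of "Green (2)" come from the example.

```python
def first_output_page(child_folders, target_name):
    """Return the 1-based output PDF page where a child document starts."""
    page = 1
    for name, page_count in child_folders:
        if name == target_name:
            return page
        # Skip past every page of the preceding child document.
        page += page_count
    raise KeyError(target_name)


# (folder name, page count) in Batch order; five pages in total.
folders = [("Red (1)", 1), ("Green (2)", 2), ("Blue (3)", 2)]
green_page = first_output_page(folders, "Green (2)")
```

Here <code>green_page</code> is 2, which is the only page that would carry the PieceInfo dictionary for "Green (2)", regardless of how many pages that child document has.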
 
 
[[File:2023.1 PDF-Data-Mapping 07 Piece-Info-03.png|center]]
 


We will detail collection and configuration of the various '''''Annotation Types''''' in the next tabs of this tutorial.
{|class="fyi-box"
|
'''FYI'''
|
|
[[File:Pdf-generate-howto-07.png]]
You can inspect PieceInfo with Adobe Acrobat Pro.
 
For inspecting PieceInfo at the document level:
* Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").  Select "Options" > "Browse Internal PDF Structure...".  Click the Lightbulb icon.  Expand "The document root" and look for "PieceInfo".  Expand "PieceInfo" and look for whatever you named your dictionary in the '''''Piece Info''''' configuration.
 
For inspecting PieceInfo at the page level:
* Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight").  Select "Options" > "Browse Internal PDF Structure...".  Click the Page icon.  Expand a Page and look for "PieceInfo".  Expand "PieceInfo" and look for whatever you named your dictionary in the '''''Piece Info''''' configuration.
|}
|}
</tab>
</tabs>


=== Configure PDF Generation for Metadata ===
=== Known Piece Info Issues ===
 
'''Issue #1:  The Elements Property'''
 
The '''''Elements''''' property does nothing.  Its original intent was to be a kind of filter that allowed for simpler configuration of the '''''Fields''''' property.  However, it was never fully implemented.  It has been deemed an unnecessary property and will be removed in future versions.
 
'''Issue #2:  Page Level Classification'''
 
When separating and classifying documents using '''''ESP Auto Separation''''', Grooper performs page-level classification.  This can cause '''''Piece Info''''' to create a blank PDF PieceInfo dictionary for every page in certain '''''PDF Data Mapping''''' configurations.
 
== How To: Generate the PDF using Merge or Export ==
 
A '''''PDF Data Mapping''''' configuration is applied when Grooper builds a PDF.  This will happen when one of two activities is applied to a '''Batch Folder''':
* Either the '''''Export''''' activity
* Or the '''''Merge''''' activity


== Version Differences ==


'''''Behaviors''''' are a new functionality in '''Grooper 2021'''.  Much of the ''PDF Generate Behavior'' functionality was not available in previous versions. Prior to version '''2021''', only annotation creation was possible using the '''[[Generate PDF]]''' activity. In version '''2021''', this activity has been replaced by the ''PDF Generate Behavior'', expanding its capabilities to generate bookmarks and document metadata as well.
In either case, three conditions must be met for Grooper to create a PDF with the additional '''''PDF Data Mapping''''' settings.
# The '''Batch Folder''' being processed ''must'' be assigned a '''Document Type''' that inherits the '''''PDF Data Mapping''''' behavior.
#* '''''PDF Data Mapping''''' will need to be configured for that '''Document Type''', its parent '''Content Category''' or its parent '''Content Model'''.
# A '''''PDF Format''''' must be added.
#* For the '''''Export''''' activity: To the '''''Export Formats''''' configuration in the '''''Export Behavior'''''.
#* For the '''''Merge''''' activity:  To the '''''Merge Format''''' configuration.
# The '''''PDF Format's''''' '''''Always Build''''' property should be set to ''True''.
#* This will ensure a new output file will be generated in cases where an imported PDF is already attached to the '''Batch Folder''' in Grooper.

Latest revision as of 10:59, 2 September 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


PDF Data Mapping is a Behavior that enhances PDF files generated by the Merge or Export activities with metadata, bookmarks, annotations and/or different kinds of widgets.

PDF Data Mapping builds a data rich "Smart PDF" from a document folder's content. Classification results, extracted data, and more can be used to insert native PDF elements into the generated PDF.

PDF elements that can be mapped from Grooper generated results include:

  • Bookmarks
  • Metadata
  • PDF Annotations (such as text highlighting, checkbox widgets and signature widgets)

You may download the ZIP(s) below and upload it into your own Grooper environment (version 2023.1). The first contains a Project with resources used in examples throughout this article. The second contains one or more Batches of sample documents.

About

The PDF Data Mapping behavior allows Grooper users to more fully leverage the capabilities of the PDF file type. The standard PDF Export Format (and Merge Format) in Grooper will use the page image files and their text data to create a multipage PDF file for each document folder upon Export (or Merge).

However, this is just the "display information" required to open and read the document. There's a lot more to what a PDF can be than just a multipage document with page images and machine readable text. PDF content can also include metadata, keywords, bookmarks, annotations, and more!

PDF Data Mapping expands Grooper's standard PDF generation capabilities. It creates an exportable PDF file that includes additional content available to the PDF file type. PDF Data Mapping merges data collected by Grooper into the PDF by mapping these values to native PDF elements like bookmarks and annotations.

The expanded PDF Data Mapping functionality can be divided into three categories:

  • Annotations: Highlight important text, insert comments, and embed interactive widgets like editable form fields and checkboxes.
  • Bookmarks: Organize complex documents with bookmarks linking to child documents and/or extracted Data Fields.
  • Metadata: Alter the PDF's default metadata, add searchable keywords and export custom metadata using data collected by Grooper.

Annotations

Annotations are native PDF elements used to highlight and comment text in a PDF file. For PDF Data Mapping, "annotations" also refer to interactable "widgets" such as checkbox and text form fields. The Annotations functionality allows you to embed many of these native PDF annotations and widgets into Grooper generated PDFs.

Annotations can serve many purposes:

  • Annotations can increase readability, such as using a highlight annotation to call out important information.
  • Annotations can add components for the reader to interact with the document, such as checkboxes and signature widgets.


PDF Data Mapping can add the following kinds of annotations/widgets:

  1. Highlighting
  2. Radio group buttons
  3. Checkboxes
  4. Signature boxes
  5. Editable text boxes


Grooper uses information from Data Elements in a Data Model collected during the Extract activity to add these annotations.

  • For example, if Grooper extracts a "Name" field and you want that highlighted on the output PDF, you can use the "Highlight Annotation" to highlight the name Grooper extracted on the document.

FYI

The size of all these annotations can also be adjusted using a Padding property if the size of the extracted data instance is too small for your needs.

Bookmarks

Bookmarks provide easy navigation for multipage PDF documents. PDF Data Mapping can generate bookmarks in one of two ways:

  1. Bookmarks can be generated for extracted Data Field locations.
  2. When exporting a document folder that has child document folders, bookmarks can be generated for each "sub-document".
    • This is the default bookmarking behavior and requires no configuration. Bookmarks will be named however the child document folders are named.


In this example, this document is an application packet for a study abroad program. It has both kinds of bookmarks.

  • The "Signature" bookmark is from an extracted Data Field. It will take the reader to a signature location on the PDF.
  • The rest were generated for each child document in the document folder (Batch Folder) that was exported. PDF Data Mapping inserted a bookmark for each sub-document. The selected "Resume (4)" bookmark in the image took the reader to the resume page in the PDF.

FYI

Bookmarks generated for child document folders will be named whatever the documents are named.

  • A document folder's (Batch Folder) name defaults to its classified Document Type and document number. Here, "Application (2)", "Proposal Summary (3)", "Resume (4)", and so on.
  • A document folder's name can be changed if you edit the Document Type's Caption property. This will then change the bookmark's name.
    • Be aware, the document must be extracted for the Caption to be applied and its name changed.

Metadata

Metadata refers to a PDF file's content beyond the information required to display the document (the page images and encoded text data). Prior to implementing the PDF Data Mapping functionality, Grooper only had access to edit minimal PDF metadata upon export (notably the PDF's file name).

PDF Data Mapping allows Grooper to alter and store additional metadata, including:

  1. The PDF's default metadata fields, including its "Title", "Author", "Subject" and more.
  2. Keywords
  3. Custom metadata fields
    • Custom metadata allows Grooper to embed any single instance Data Field's value directly to the PDF.


This gives Grooper a mechanism to create a viewable document with all extracted (single instance) data associated with the document itself, independent of that data being stored elsewhere (such as a database table or content management system).

FYI

This metadata can be accessed in Adobe Acrobat by opening the "Document Properties" window from the File menu.

Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".

  • Consider these names reserved.
  • If you are attempting to export Data Field values as custom PDF metadata, they cannot share any reserved names. You will need to rename the Data Field in Grooper to a unique name.

How To: Add a PDF Data Mapping Behavior

Like all Behaviors, PDF Data Mapping is configured on a Content Type node, commonly a Content Model or a Document Type.


  1. Here, we have selected a Content Model in the Node Tree.
  2. To add a Behavior, select the Behaviors property and click the ellipsis button at the end.
  3. This will bring up a dialogue window to add various behaviors to the Content Model, including PDF Data Mapping.
  4. Add PDF Data Mapping to the list by clicking on the "+" button.
  5. Select PDF Data Mapping from the listed options.


  1. Once added, you will see a PDF Data Mapping item added to the Behaviors list.
  2. Selecting this Behavior, you will see property options to configure PDF creation.
  3. Press "OK" when finished configuring PDF Data Mapping.
  4. Don't forget to save changes to the Content Model.

About the documents used in these tutorials

The following tutorials use a mock UNESCO Laura W. Bush Traveling Fellowship application to detail a more specific set up for a PDF Data Mapping. This is a packet of documents from a single applicant containing a cover page and five different kinds of documents.

By the end of this tutorial we will have taken a source application packet, used Grooper to process it, and exported a single PDF with:

  • Metadata collected from Grooper
  • New annotations and widgets
  • Easily navigable bookmarks

Cover Page and Application

This is an application for a traveling abroad scholarship.

Primarily, the cover page and application document will allow us to demonstrate the annotations and widgets PDF Data Mapping can generate. We will use its Annotations settings to add the following annotations:

  • Text Annotation
  • Highlight Annotation
  • Checkbox Widget
  • Radio Group Widget
  • Signature Widget
  • Textbox Widget

Secondarily, data collected from this form will be used to generate and store default and custom metadata. We will use the Metadata settings to do this.

Lastly, we will embed a bookmark that will take the PDF's reader to the signature field on the document. We will use the Bookmarking settings to do this.

Essay

This application also includes an essay from the student.

This document will demonstrate how to add Keywords to the PDF's metadata. Using the Metadata settings we will configure a code expression to insert "long essay", "normal essay", or "short essay" depending on the essay's length.
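
The decision logic behind such a keyword can be sketched as follows. This is an illustration in Python only, not the expression syntax Grooper uses. Measuring length by word count and the two thresholds (300 and 800 words) are assumptions made for the example; the article does not state how length is measured.

```python
def essay_length_keyword(essay_text, short_max=300, long_min=800):
    """Pick one of the three keywords from the essay's word count.

    short_max and long_min are hypothetical thresholds chosen for
    illustration; adjust them to whatever "short" and "long" mean
    for your documents.
    """
    word_count = len(essay_text.split())
    if word_count < short_max:
        return "short essay"
    if word_count > long_min:
        return "long essay"
    return "normal essay"


if __name__ == "__main__":
    sample = "travel " * 500  # stand-in for the extracted essay text
    print(essay_length_keyword(sample))
```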

Other Documents

This packet contains three other kinds of documents as well:

  • a proposal summary
  • the applicant's resume
  • and a letter of recommendation.

For these documents (as well as the rest) we will insert bookmarks into the generated PDF, taking the reader to each document in the larger file. We will use Bookmarking settings to do this.

Notes on PDF Data Mapping, child documents and bookmarking

The original document was imported as a single document into Grooper. We have separated it into child documents which will allow us to insert bookmarks for each separated document.

  1. The PDF Data Mapping behavior will be applied to the Batch Folders at folder-level one.
    • The attached file is the source application packet.
  2. The Split Pages activity was applied to split the packet into pages. Then, those pages were separated into classified document folders at folder-level two.
  3. PDF Data Mapping can create a bookmark in the generated PDF for each of these five sub documents by enabling the Bookmarking property.


By creating bookmarks for each child document, there is no need to export individual PDFs for each one. Instead, we will use PDF Data Mapping to generate one PDF for the whole application packet and use the bookmarks to navigate between each document.

How To: Configure Annotations

Annotations are native PDF elements used to highlight and comment on text in a PDF file. For PDF Data Mapping, "annotations" also refers to interactable "widgets" such as checkbox and text form fields. In this tutorial we will configure at least one example of each Annotation option.

  • Text Annotation - Inserts a text-based comment in the PDF.
  • Highlight Annotation - Highlights text on the PDF.
  • Radio Group Widget - Inserts a group of selectable radio buttons in the PDF.
  • Checkbox Widget - Inserts checkable checkboxes in the PDF.
  • Signature Widget - Inserts a signature block in the PDF.
  • Textbox Widget - Inserts an editable form field in the PDF.

BE AWARE: PDF Data Mapping cannot insert annotations on PDF pages with form fields.

If a PDF page is form-fillable, it is ill advised to insert annotations and widgets on top of these form fields. This can result in a corrupted PDF when it is generated by Merge or Export. PDF Data Mapping will not allow you to insert annotations and widgets on PDF pages with form fields.

Prereqs: Data Fields and extracted data

For PDF Data Mapping to work, Grooper needs to have data to map.

  • For Annotations this means Data Fields.
  • Data must be saved for each Data Field prior to the PDF being generated.
    • The Extract activity must run before Merge or Export generates the PDF.
    • If performing user assisted data review, the Review activity must complete before Merge or Export generates the PDF.


Each of the Annotation Types references a Data Field in a Data Model as part of their configuration. If the Data Field does not collect data during the Extract activity, the PDF Data Mapping won't know where to place the annotation.

About the Data Model used for this tutorial

The Data Model we're working with has several Data Fields that will allow PDF Data Mapping to place annotations and widgets.

The "Last Name", "First Name" and "Middle Initial" Data Fields (in the "Applicant Information" Data Section) will demonstrate the Highlight Annotation.

  • These fields use Labeled Value to extract field values next to a label.
  • Be aware, nearly any kind of Value Extractor can be used to insert a highlight annotation. Grooper just needs a location on the document to draw the highlight boundaries.

The "US Citizen" Data Field will demonstrate the Radio Group Widget.

  • This field uses Labeled OMR to extract a group of checkboxes where only one may be checked.
  • Be aware, any OMR extractor (Labeled OMR, Ordered OMR or Zonal OMR) would be able to insert the radio group widget as long as its Check Mode is set to CheckOne.

The "Checklist" Data Field will demonstrate the Checkbox Widget.

  • This field uses Labeled OMR to extract a group of checkboxes where one or more may be checked.
  • Be aware, any OMR extractor (Labeled OMR, Ordered OMR or Zonal OMR) would be able to insert the checkbox widget.

The "Signature" Data Field will demonstrate the Signature Widget.

  • This field uses Detect Signature to detect whether or not a signature is present on the document.
  • Be aware, any zonal extractor (Read Zone, Highlight Zone or Detect Signature) would be able to insert the signature widget.

The "Signature Date" Data Field will demonstrate the Textbox Widget.

  • Textbox Widget adds a text-editable form field to the PDF to store a field value.
    • Compare this to a Text Annotation which simply adds a text comment to the PDF.
  • This field uses Labeled Value to extract the date the application was signed.
  • Be aware, any extractor that returns a single-instance value would be able to insert a populated textbox widget. For a blank textbox, Highlight Zone can be used to place a blank zone where the field should be inserted.

The "IsProcessed" Data Field will demonstrate the Text Annotation.

  • Text Annotation inserts a text comment in the PDF.
    • Compare this to a Textbox Widget which adds an actual form field to the PDF to store a field value.
    • We will use this field and annotation to print the word "PROCESSED" on the output PDF
  • This field uses Highlight Zone to draw an extraction zone for the field and the Data Field's Default Value to determine what's printed.
    • This is a technique common to Text Annotation use cases and will be explained in further depth below.

Adding Annotations

PDF Data Mapping inserts various types of PDF annotations and widgets by configuring its Annotations property. Users can add one or more Annotation Types to the Annotations list. Adding a new Annotation to the list is simple.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Select the Annotations property and press the ellipsis button at the end.
  3. This will bring up the Annotations editor.
  4. Press the "+" button.
  5. Select the Annotation Type you want to add from the dropdown list.


  1. Once added, you will see the Annotation Type added to the Annotations list.
  2. All Annotation Types will have a set of General properties to configure.
  3. Some Annotation Types have additional properties you can configure.
    • For example, the Highlight Annotation has Appearance properties you can configure to adjust the highlight's color and other appearance properties.
  4. Press "OK" when finished.

Notes on shared properties

All Annotation Types share a set of General properties.

  • Fields
    • Use this property to select which Data Fields are mapped to the PDF annotation.
    • The Fields property is required.
      • One or more Data Fields must be selected to generate the annotation.
      • If you don't select any Data Fields or the selected Data Fields are not extracted, PDF Data Mapping will not insert an annotation in the output PDF.
      • Be aware, all Data Fields are selected by default.
  • Padding
    • The Padding property can adjust the size of the annotation.
    • Grooper uses a Data Field's result instance to draw the annotation's boundaries.
      • The size of the Data Field's instance may be too small for what you want to appear on the output PDF.
      • If so, use Padding to increase the annotation's size on the PDF generated by PDF Data Mapping.
  • Allow Edit
    • Allow Edit refers to a reader's ability to edit the annotation as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the annotation (or widget).
    • Enabling this property (turning it True) will allow users to fully adjust the annotation in the PDF, including its size, location and other properties.
    • Be aware, even when False, users will still be able to interact with widgets, such as the Checkbox Widget or Textbox Widget.
  • Print
    • In a PDF viewing application, like Adobe Acrobat, all annotations and widgets PDF Data Mapping generates will be visible. The Print property determines whether or not the annotation is visible when the PDF is printed.
    • Be aware, the default is False.
      • Grooper presumes the "Smart PDF" output by PDF Data Mapping will be opened in a PDF viewer (where all annotations will be visible).
      • Grooper also presumes if you want to print the PDF, you want something more like the original document printed, not the one with additional PDF elements Grooper inserts. If you do want those annotations and widgets visible when the PDF is printed, turn Print to True.
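
For readers curious how settings like these can be expressed inside a PDF file, the PDF specification defines a set of annotation flag bits. The Python sketch below uses the flag values from the specification, but the mapping from Allow Edit and Print to those flags is an assumption made for illustration; this article does not document which flags Grooper actually writes. The function name annotation_flags is hypothetical.

```python
from enum import IntFlag


class AnnotationFlag(IntFlag):
    """A subset of the annotation flags defined by the PDF specification."""
    PRINT = 4        # bit 3: the annotation is printed with the page
    READ_ONLY = 64   # bit 7: the user cannot interact with the annotation
    LOCKED = 128     # bit 8: cannot be moved, resized or deleted; contents may still change


def annotation_flags(allow_edit=False, print_annotation=False):
    """Translate the two shared properties into a flag value.

    Assumed mapping (not confirmed by the article):
      Print = True       -> PRINT is set
      Allow Edit = False -> LOCKED is set
    """
    flags = AnnotationFlag(0)
    if print_annotation:
        flags |= AnnotationFlag.PRINT
    if not allow_edit:
        flags |= AnnotationFlag.LOCKED
    return flags


if __name__ == "__main__":
    # The defaults described above: not editable, not printed.
    print(annotation_flags())
```

Note that the LOCKED flag, as the specification describes it, does not stop a reader from changing a form field's value. That mirrors the distinction drawn above between editing an annotation and interacting with it.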

Annotation Types

There are currently six types of annotations Grooper can add to the PDF it creates:

Highlight Annotation

The Highlight Annotation overlays a colored rectangle with adjustable transparency on a Data Field's extracted location. In other words, it can highlight extraction results.

  • Use this to highlight important values extracted by Grooper.
  • Like all Annotations, this highlight can be printable or not. When the Print property is False, the highlight will show up when viewed in a PDF viewer but not if the PDF is printed.


In this example, we will use the Highlight Annotation to highlight the extracted "Last Name", "First Name" and "Middle Initial" fields from the application form. To configure this Annotation we will:

  • Select the Data Fields we wish to highlight.
  • Adjust how we want the highlight to look.

Before Annotation

After Annotation

With a Highlight Annotation added to the Annotations list:

  1. Use the Fields property to select the Data Fields you wish to highlight.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Fields you wish to highlight.
    • In our case, we are choosing the "Last Name", "First Name", and "Middle Initial" Data Fields.
    • Be aware, these fields must be extracted by the Extract activity or nothing will be highlighted.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
    • Adjusting Padding for Highlight Annotations is common. In this example, we increased the highlight's size by 0.1 in on each side.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.
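
The effect of Padding on an annotation's boundaries can be pictured with a little arithmetic. The Python sketch below is only an illustration: the Rect class, the coordinates in inches and the sample values are hypothetical, and Grooper's internal representation of a result instance is not described in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle in inches, measured from the page's top-left corner."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top

    def padded(self, amount):
        """Return a copy grown by `amount` inches on every side."""
        return Rect(
            self.left - amount,
            self.top - amount,
            self.right + amount,
            self.bottom + amount,
        )


if __name__ == "__main__":
    # Hypothetical location of an extracted "Last Name" value.
    extracted = Rect(left=1.0, top=2.0, right=3.0, bottom=2.25)
    highlight = extracted.padded(0.1)
    print(f"{highlight.width:.2f} in wide, {highlight.height:.2f} in tall")
```

Padding every side by 0.1 in makes the annotation 0.2 in wider and 0.2 in taller than the extracted result.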


  1. Adjust the highlight's appearance, as desired, using the Appearance properties.
  2. Most commonly, users will adjust the Fill Color.
    • Use the dropdown to select from a list of system colors.
    • Or, enter an RGB value using the format #, #, #
    • This property defaults to the "Grooper green" highlight seen in Review's Data View. In this example, we've changed it to Yellow.
  3. Press "OK" when finished (or continue adding more Annotations).
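
The "#, #, #" format is three comma-separated numbers for red, green and blue. The Python sketch below shows how such a string could be read and validated. The 0 to 255 range per channel is the usual RGB convention and is assumed here; parse_rgb is a hypothetical helper, not a Grooper function.

```python
def parse_rgb(text):
    """Parse a string such as "255, 255, 0" into an (r, g, b) tuple.

    Raises ValueError when the text does not hold exactly three
    whole numbers between 0 and 255 (the assumed valid range).
    """
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three values, got {len(parts)}")
    values = tuple(int(part) for part in parts)
    if any(value < 0 or value > 255 for value in values):
        raise ValueError("each value must be between 0 and 255")
    return values


if __name__ == "__main__":
    print(parse_rgb("255, 255, 0"))  # the components of pure yellow
```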

Radio Group Widget

The Radio Group Widget overlays a group of radio button PDF elements on top of where a Grooper extractor finds OMR checkboxes on a document.

  • Radio buttons are common PDF elements used to indicate a single choice from multiple options in a list.
    • Note radio buttons (inserted by Radio Group Widget) differ from checkboxes (inserted by Checkbox Widget). For radio buttons, only one choice out of a group may be selected. For checkboxes, any number of choices may be selected.
  • The Data Field(s) this annotation references must use an OMR extractor to return results: Labeled OMR, Ordered OMR or Zonal OMR
    • This extractor must also have its Mode set to CheckOne (only one box out of many may be checked/selected).
  • PDF Data Mapping will insert one radio button for each checkbox the extractor locates.

Before Annotation

After Annotation

With a Radio Group Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the group of radio buttons.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "US Citizen" Data Field.
    • Be aware, this field must (1) use an OMR extractor to return results, (2) have its Mode set to CheckOne, (3) have already been extracted by the Extract activity and (4) have located checkboxes during extraction, or no radio buttons will be placed.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the widget (press a radio button).
  3. Press "OK" when finished (or continue adding more Annotations).

Be Aware: Annotations are overlaid on a page's image

BE AWARE: The Radio Group Widget overlays radio buttons on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the Image Processing activity.

  • Notice the original image for this document used checkboxes, not radio buttons. We see an "X" inside of a square box.

You can actually see the edges of the square box persist in the generated PDF (Here, highlighted in yellow for your viewing pleasure).

  • In this case, the boxes were detected by the "detection only" Box Detection IP command and not removed by the "detection and removal" Box Removal command.
  • Box Detection finds and stores the checkbox locations and check states but does not actually alter the image in any way.

Maybe you care about this, and maybe you don't. If you do, use Box Removal instead.

  • Box Removal will also find and store the checkbox locations and their check states, but it will also digitally remove the checkboxes from the document's image. This will allow Grooper to extract the checkboxes and allow PDF Data Mapping to overlay the radio buttons on a field of blank pixels.
  • Run Box Removal in an IP Profile using the Image Processing activity prior to running the Extract activity to do this.

Checkbox Widget

The Checkbox Widget inserts one or more form-fillable checkboxes into the PDF on top of where a Grooper extractor finds OMR checkboxes.

  • Checkboxes are common PDF elements used to indicate a choice from one or many options.
    • Note checkboxes (inserted by Checkbox Widget) differ from radio buttons (inserted by Radio Group Widget). For radio buttons, only one choice out of a group may be selected. For checkboxes, any number of choices may be selected.
  • The Data Field(s) this annotation references must use an OMR extractor to return results: Labeled OMR, Ordered OMR, or Zonal OMR
  • However, this extractor may use any of the OMR Modes (CheckOne, CheckMulti or Boolean).
  • PDF Data Mapping will insert a simple checkbox PDF element for each checkbox the extractor locates.
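
The compatibility rules for the two OMR-based widgets can be summarized in a few lines. This Python sketch simply restates the rules given in this article (Radio Group Widget needs CheckOne, Checkbox Widget accepts any Mode). The function name allowed_widgets is hypothetical and nothing here calls Grooper itself.

```python
OMR_MODES = {"CheckOne", "CheckMulti", "Boolean"}


def allowed_widgets(omr_mode):
    """Return the widget types an OMR-extracted Data Field can feed.

    Per the rules above: any Mode works for the Checkbox Widget,
    but only CheckOne works for the Radio Group Widget.
    """
    if omr_mode not in OMR_MODES:
        raise ValueError(f"unknown OMR Mode: {omr_mode}")
    widgets = ["Checkbox Widget"]
    if omr_mode == "CheckOne":
        widgets.append("Radio Group Widget")
    return widgets


if __name__ == "__main__":
    for mode in ("CheckOne", "CheckMulti", "Boolean"):
        print(mode, "->", ", ".join(allowed_widgets(mode)))
```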


In this example, we will create a Checkbox Widget for the checkboxes extracted using the "Checklist" Data Field. This field uses a Labeled OMR extractor with its Mode set to CheckMulti, meaning any number of the checkboxes may be checked. Checked or not, the Checkbox Widget will insert a checkbox element into the generated PDF.

Before Annotation

After Annotation

With a Checkbox Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the checkboxes.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "Checklist" Data Field.
    • Be aware, this field must (1) use an OMR extractor to return results, (2) have already been extracted by the Extract activity and (3) have located checkboxes during extraction, or no checkboxes will be placed.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the widget (check the checkboxes).
  3. Press "OK" when finished (or continue adding more Annotations).

BE AWARE: The Checkbox Widget overlays checkboxes on a page's image. Any printed checkbox on the original page will persist (behind the widget), unless removed by the Image Processing activity.

For more information, see above.

Signature Widget

The Signature Widget inserts a signature block into the PDF.

  • Signature blocks allow PDFs to capture digital signatures. This allows you to create a document that can be digitally signed straight from Grooper on export.
  • The Data Field(s) this annotation references will typically use a zonal extractor to define where the signature block should be: Detect Signature or Highlight Zone most commonly
  • Other Value Extractors may work, but these are most typical. PDF Data Mapping will insert the signature block using the geometric boundaries of the extraction instance. Zonal extractors are well suited to define fixed boundaries of extraction results.


In this example, we will create a Signature Widget annotation for the signature line on the application form, using the "Signature" Data Field of our Data Model. The Signature Widget will insert an interactable signature element into the generated PDF.

Before Annotation

After Annotation

With a Signature Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field you wish to use to insert the signature block.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkbox next to the Data Field you wish to select.
    • In our case, we are choosing the "Signature" Data Field.
    • Be aware, this field must (1) have already been extracted by the Extract activity and (2) have drawn a zone defining the location and size of the signature block (most commonly, Detect Signature or Highlight Zone is used to do this).
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to interact with the element (submit a signature).
  3. Press "OK" when finished (or continue adding more Annotations).

BE AWARE: The Signature Widget overlays a signature block on a page's image. If present, any printed signature on the original page will persist (behind the widget), unless removed by the Image Processing activity.

For more information, see above.

Textbox Widget

The Textbox Widget inserts text-editable form fields into the generated PDF.

  • Form fields allow PDFs to collect and store data entered by a user.
  • Users can configure a Textbox Widget to create blank form fields or form fields with a value Grooper extracts already populated.
    • For blank form fields, the Data Field(s) this annotation references should use Highlight Zone to place a blank zone where the field should be inserted.
    • For populated form fields, the Data Field(s) this annotation references can use any extractor that returns a single-instance value (most typically Labeled Value).
      • This allows Grooper to not only generate a PDF with form fields where they weren't present in the source document, but prefill them with data Grooper collects.
  • Be aware, a Textbox Widget differs from a Text Annotation. Where Textbox Widget will insert a text-editable form field, Text Annotation adds a text comment to the PDF.

Before Annotation

After Annotation

In this example, we will use the Textbox Widget to insert a form field for the "Signature Date" Data Field. This used Labeled Value to extract the date. PDF Data Mapping will overlay the form field on top of the extraction result.

  • FYI: We will also adjust the generated widget's size using the Padding property. This is common when configuring Textbox Widgets when the font size you want to use for the form field is larger than the printed typeface on the document.


With a Textbox Widget added to the Annotations list:

  1. Use the Fields property to select the Data Field(s) you wish to use to create text-editable form fields.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Field(s) you wish to select.
    • In our case, we are choosing the "Signature Date" Data Field.
    • Be aware, these fields must be extracted by the Extract activity or no textbox will be generated.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
    • Adjusting Padding for Textbox Widgets is common if the desired font size in the textbox differs from that printed on the source document. In this example, we increased the textbox's size by 0.1 in on each side.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the widget and prevent it from being visible when printed.
    • Please note: Allow Edit refers to a reader's ability to edit the widget as a PDF element, such as moving its location on the PDF or adjusting its size. It does not refer to a reader's ability to edit the value inside the textbox. To configure that, use the Read Only property.


  1. Adjust the textbox's other properties as desired.
    • These properties give you the ability to adjust the font and font size inside the textbox.
    • Please note: If you want to prevent a reader from editing the Grooper collected value inside the textbox, turn Read Only to True.
  2. Press "OK" when finished (or continue adding more Annotations).

Text Annotation

The Text Annotation inserts a text comment in the PDF.

  • This has two primary uses:
    • Insert comments into the PDF that are viewable when opening the PDF in a PDF viewer, but not printable.
    • Print a simple text note on a page.
      • Commonly, users will want to print a word like "PROCESSED" on the output PDF. This notes the document has been processed through Grooper.
  • The Data Field(s) may use any kind of extractor as long as it produces a result with (1) a location on the page to place the comment and (2) a text value to add to the comment.
  • Be aware, a Textbox Widget differs from a Text Annotation. Where Textbox Widget will insert a text-editable form field, Text Annotation adds a text comment to the PDF.


In this example, we will use a Text Annotation to print the word "PROCESSED" on the first page of the PDF generated by PDF Data Mapping.

  • We will use the "IsProcessed" Data Field to do this. The extraction logic to make this happen requires a less-than-common technique. We will show you how we build this Data Field in the #Technique: "IsProcessed" Data Field section.

Before Annotation

After Annotation

With a Text Annotation added to the Annotations list:

  1. Use the Fields property to select the Data Field(s) you wish to use to insert the text comment.
    • Press the ellipsis button at the end of the Fields property.
  2. In the window that pops up, mark the checkboxes next to the Data Fields you wish to select.
    • In our case, we are choosing the "IsProcessed" Data Field.
    • Be aware, these fields must (1) be extracted by the Extract activity and (2) hold a location and value or no comment will be added.
  3. Press "OK" when finished.


  1. Determine if you need to adjust the annotation's padding. Adjust the Padding property if you do.
  2. Determine if you need to change whether the annotation is editable or printable. Adjust the Allow Edit or Print properties if you do.
    • Use the defaults to prevent the users from being able to adjust the annotation and prevent it from being visible when printed.
    • In our case, we do want this comment printed when the document is printed. So, we've changed Print to True.


  1. Adjust the comment's appearance, as desired, using the Appearance properties.
    • Users may change the comment's font and font size with the Font Name and Font Size properties.
    • Users may select a Fill Color and Text Color in one of two ways:
      • Using the dropdown to select from a list of system colors
      • Or, entering an RGB value using the format #, #, #
      • Be aware, there is no true "transparent" Fill Color option. The selectable Transparent option is a system color that equates to "white".
  2. Press "OK" when finished (or continue adding more Annotations).

Technique: "IsProcessed" Data Field

To print the word "PROCESSED" on the PDF, we used a specific technique. A Text Annotation just needs two things from a Data Field to insert the annotation: (1) a location on the page to place the comment and (2) a text value to add to the comment. The word "PROCESSED" did not exist on the source PDF. So, we had to figure out a way to use a Data Field to generate a result rather than extract it.

We did this in essentially two steps:

  1. Use the Highlight Zone extractor to define where the annotation should be printed.
  2. Use a Calculated Value to define the text we want to print (the word "PROCESSED").


This gives a Text Annotation everything it needs to insert the comment: (1) A location and (2) some text
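
The idea of pairing a fixed zone with generated text can be modeled as below. This Python sketch is conceptual: FieldResult, build_text_annotation and can_annotate are hypothetical names, the zone coordinates are made up, and the code does not reflect how Grooper stores field results internally.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (left, top, right, bottom) in inches; a hypothetical representation.
Zone = Tuple[float, float, float, float]


@dataclass
class FieldResult:
    location: Optional[Zone]  # where the annotation goes, or None if no zone was drawn
    value: str                # the text to print, possibly empty


def build_text_annotation(zone, extracted_text, default_text):
    """Combine a zone with text, falling back to a default value.

    The zone supplies the location. When nothing is read from the
    page (as with "PROCESSED", which is not on the source document),
    the default text supplies the value.
    """
    text = extracted_text.strip() or default_text
    return FieldResult(location=zone, value=text)


def can_annotate(result):
    """A Text Annotation needs both a location and some text."""
    return result.location is not None and bool(result.value)


if __name__ == "__main__":
    stamp = build_text_annotation((0.5, 0.5, 2.5, 1.0), "", "PROCESSED")
    print(stamp.value, can_annotate(stamp))
```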

How To: Configure Bookmarks

Bookmarks in PDFs aid readers when navigating through multipage documents. PDF Data Mapping can insert bookmarks into the generated PDF to take advantage of this functionality. This can be done in one of two ways (or both):

  1. Using a document folder's (Batch Folder) child folders (Batch Folder).
  2. Using a document folder's extracted Data Fields.

In this tutorial we take an application packet separated into component child documents and use PDF Data Mapping's Bookmarking property to create bookmarks for each one.

The application packet as a whole consists of five separate and distinguishable documents.

  1. The application itself (and a coversheet)
  2. A proposal summary
  3. The student's resume
  4. A letter of recommendation
  5. An essay

Our goal is to create a bookmark in the generated PDF file for each of these component documents (child documents).

Rather than exporting five separate PDF files, one for each component document, we will export a single PDF for the whole packet with navigable bookmarks.


We will also demonstrate how to use Data Fields for bookmarking. This allows us to insert PDF bookmarks for locations of extracted data.

  • The "Signature" bookmark in this example would take the reader to the signature line of the PDF, using the location extracted by the "Signature" Data Field in our Data Model.

Bookmarking Option 1: Child document/folder bookmarks

There are two ways the Bookmarking feature can insert bookmarks into a PDF generated by PDF Data Mapping.

  1. It can insert a bookmark for each child document/folder.
  2. It can insert a bookmark for selected (single instance) Data Fields.

This section will detail how to insert bookmarks using child documents.

Option 1 Prereqs: Separated child documents

For PDF Data Mapping to work, Grooper needs to have data to map.

If enabled, Bookmarking will automatically add bookmarks to a PDF if a document has child documents in the Batch's folder hierarchy.

  • If a document at folder level 1 is exported and has two child documents, the generated PDF will have two bookmarks.
  • Clicking on the bookmark will take the reader to that child document's page in the PDF.


For this to work:

  • The parent document folder must have separable child pages.
    • Either from scanning pages in with a scanner or using the Split Pages activity to generate pages from an imported PDF.
  • These child pages must then be separated into child folders.
    • Either using a Separation Profile when scanning or using the Separate activity.


Technically speaking, that's all you need. PDF Data Mapping will add PDF bookmarks for every child document and name it using each child folder's name.

  • Be aware, without classifying the child documents these names will just be "Folder (1)" "Folder (2)" "Folder (3)" and so on.
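
The relationship between child folders and bookmarks can be sketched as follows. This Python example is only a model of the behavior described here: the tuple layout, the function name child_folder_bookmarks and the sample packet are hypothetical, and it assumes the generated PDF keeps pages in the same order as the Batch hierarchy.

```python
def child_folder_bookmarks(children):
    """Build (title, first page number) pairs for each child folder.

    `children` is an ordered list of (kind, name, page_count) tuples,
    where kind is "folder" or "page". Loose pages advance the page
    counter but do not receive a bookmark of their own.
    """
    bookmarks = []
    next_page = 1  # 1-based page number in the generated PDF
    for kind, name, page_count in children:
        if kind == "folder":
            bookmarks.append((name, next_page))
        next_page += page_count
    return bookmarks


if __name__ == "__main__":
    # A hypothetical packet: one loose page, then two unclassified folders.
    packet = [
        ("page", "Page 1", 1),
        ("folder", "Folder (1)", 3),
        ("folder", "Folder (2)", 2),
    ]
    for title, page in child_folder_bookmarks(packet):
        print(f"{title} -> page {page}")
```

Classifying the child folders would change only the titles in this model, since each bookmark takes its name from the folder.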

Not separated

No child folders

Separated

Has child folders

Separated and classified

Has child document folders.

What about Page 1 there?

Is it in a folder? No. Then it won't get a bookmark.

Adding bookmarks for child documents/folders

PDF Data Mapping will create bookmarks for child documents/folders by default. There is no configuration required besides enabling the Bookmarking property.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Bookmarking property to Enabled.
  3. Press "OK" when finished.


That's it! It's that simple!

As long as the document folder that PDF Data Mapping is applied to has child documents/folders, a bookmark will be created for each child document.


Bookmarking Option 2: Data Field bookmarks

There are two ways the Bookmarking feature can insert bookmarks into a PDF generated by PDF Data Mapping.

  1. It can insert a bookmark for each child document/folder.
  2. It can insert a bookmark for selected (single instance) Data Fields.

This section will detail how to insert bookmarks using Data Fields. This allows PDF Data Mapping to bookmark important field value locations extracted by Grooper in the output PDF.

Option 2 Prereqs: Data Fields and extracted data

For PDF Data Mapping to work, Grooper needs to have data to map.

Bookmarking can also insert PDF bookmarks using extracted data and its location. Data Fields collect results using extractors, which return results from the source document. Bookmarking uses the locations of these results to embed this kind of bookmark.

For this to work:

  • You must have these Data Fields defined in a Data Model and configured to return results.
  • Data must be saved for each Data Field prior to the PDF being generated.
    • The Extract activity must run before Merge or Export generates the PDF.
    • If performing user assisted data review, the Review activity must complete before Merge or Export generates the PDF.

Adding bookmarks for Data Fields

To have PDF Data Mapping insert bookmarks at extracted Data Field value locations, simply select which Data Field(s) you want to bookmark.

  • Please note: Only single-instance Data Fields may be bookmarked.


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Bookmarking property to Enabled and expand its sub-properties.
  3. Select Data Elements and press the ellipsis button at the end.
  4. A "Data Elements" selection editor will appear.
  5. Select the Data Field whose location you wish to bookmark.
    • Please note: Only single-instance Data Fields may be bookmarked.
  6. Press "OK" when finished selecting Data Fields.
  7. Press "OK" when finished configuring PDF Data Mapping.


As long as an Extract activity has extracted the selected Data Field(s) for the document folder that PDF Data Mapping is applied to, a bookmark will be created for each Data Field selected.

How To: Configure Metadata

The PDF Data Mapping behavior has the ability to create and insert additional metadata into the generated PDF as well, using information collected during Grooper's document processing. The metadata you are able to create falls into one of three categories:

  1. Editing the PDF's default metadata fields, including:
    • Title
    • Author
    • Subject
    • Created Date
    • Modified Date
    • Application
  2. Adding "Keywords" to the PDF metadata
    • This can be done using expression based or extraction based methods.
  3. Creating custom metadata fields and values
    • Custom metadata can be stored for any (single instance) Data Field values collected during the Extract activity.

Notice that the exported document's filename (e.g. "Im_a_file.pdf") is not included in this list. Filename mappings are always configured using an Export Behavior.

Prereqs: Data extraction

For PDF Data Mapping to work, Grooper needs to have data to map.

For Metadata, data coming from Grooper can be mapped to the PDF in one of two ways:

  1. Using Data Field results
    • To embed custom PDF metadata, the custom fields are generated from Data Fields in the document's Data Model and their collected results.
    • This means the document must be processed by the Extract activity in order to create and populate these custom fields.
    • Or, if performing user assisted data review, the values must be previously recorded during the Review activity.
  2. Using code expressions
    • In the case of the default PDF metadata fields and keywords, expressions can be used to populate the metadata.
    • This gives you access to not only extracted Data Field results but also system data, classification information, and various functions to manipulate it.

Mapping default PDF metadata

PDF Data Mapping's Metadata settings can edit a PDF's default metadata values for its "Title", "Author", "Subject", "Application", "Created" and "Modified" properties.

With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. Use a code expression to create custom values for the following default PDF metadata:
    • Title for the PDF's "Title" field
    • Author for the PDF's "Author" field
    • Creator for the PDF's "Application" field
    • Subject for the PDF's "Subject" field
    • Creation Date for the PDF's "Created" field
    • Modification Date for the PDF's "Modified" field
  4. Press "OK" when finished.


In our example, we made the following changes to the default PDF metadata:

  • Title:
    • This defaults to the expression CurrentDocument.ContentTypeName. This will make the title whatever the document's Document Type classification is.
    • We did not change Grooper's default.
  • Author:
    • This defaults to the expression LDAP.CurrentUserDisplayName. This will set the author to the Windows username for the Grooper user or service who created the PDF.
    • We changed this to evaluate to the applicant's first name, middle initial, and last name as collected by Grooper using the following expression:
      • $"{Applicant_Information.First_Name} {Applicant_Information.Middle_Initial} {Applicant_Information.Last_Name}"
  • Creator:
    • This will adjust the PDF's "Application" field. This field is left blank by default.
    • We changed this to the simple string "Grooper PDF Data Mapping".
  • Subject:
    • This field is left blank by default.
    • We changed this to use the value of the "Proposal Title" Data Field in the "Proposal Information" Data Section with the expression Proposal_Information.Proposal_Title.
  • Creation Date:
    • This sets the PDF's "Created" date value and defaults to the expression DateTime.Now. This returns the current system time of your machine at the time the PDF is generated.
    • We did not change Grooper's default.
  • Modification Date:
    • This sets the PDF's "Modified" date value and defaults to the expression DateTime.Now. This returns the current system time of your machine at the time the PDF is generated.
    • We did not change Grooper's default.
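To make the result of these settings concrete, the sketch below models in Python what the example's default metadata would evaluate to. It is not a Grooper expression and does not call any Grooper API. The `build_default_metadata` function, the sample field values and the "Application Packet" Document Type name are all hypothetical stand-ins.

```python
from datetime import datetime

# Hypothetical extracted values standing in for the Data Model's results.
fields = {
    "Applicant_Information.First_Name": "Jane",
    "Applicant_Information.Middle_Initial": "Q",
    "Applicant_Information.Last_Name": "Public",
    "Proposal_Information.Proposal_Title": "Community Garden Expansion",
}


def build_default_metadata(fields: dict, content_type_name: str, now: datetime) -> dict:
    """Model the values produced by the example's metadata configuration."""
    first = fields["Applicant_Information.First_Name"]
    middle = fields["Applicant_Information.Middle_Initial"]
    last = fields["Applicant_Information.Last_Name"]
    return {
        "Title": content_type_name,                 # default: the Document Type name
        "Author": f"{first} {middle} {last}",       # mirrors the interpolated string
        "Application": "Grooper PDF Data Mapping",  # set by the Creator property
        "Subject": fields["Proposal_Information.Proposal_Title"],
        "Created": now,                             # default: time the PDF is generated
        "Modified": now,
    }


metadata = build_default_metadata(fields, "Application Packet", datetime.now())
print(metadata["Author"])  # Jane Q Public
```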

Mapping Keywords

The Metadata settings can add terms to the PDF's "Keywords" field in one of two ways:

  1. Using a code expression
  2. Using an extractor (Data Type, Value Reader or Field Class)


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. To add keyword terms with a code expression, add the expression to the Keywords property.
    • This expression should evaluate to a string value. This string will be added to the PDF's "Keywords" field.
  4. To add keyword terms with an extractor, reference the extractor with the Keywords Extractor property.
    • This extractor should return a string value. This string will be added to the PDF's "Keywords" field.
  5. Press "OK" when finished.


In our example, we used an expression to insert a keyword based on the word count of the "Essay" document in the application packet.

  • "Short Essay" for essays under 400 words
  • "Long Essay" for essays over 600 words
  • "Normal Essay" for essays between 400 and 600 words


We also used an extractor to add a "signed" keyword if the application was signed and "not signed" if the application was not signed.
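The keyword rules in this example can be sketched as plain Python. This illustrates the logic only and is not the Grooper expression or extractor used in the sample Project. The function names are hypothetical, and the sketch assumes that essays of exactly 400 or 600 words count as "Normal Essay".

```python
def essay_keyword(word_count: int) -> str:
    """Pick the essay-length keyword from a word count."""
    if word_count < 400:
        return "Short Essay"
    if word_count > 600:
        return "Long Essay"
    # Assumption: the 400 and 600 boundaries are inclusive.
    return "Normal Essay"


def signature_keyword(is_signed: bool) -> str:
    """Pick the signature keyword, as the example's extractor would report it."""
    return "signed" if is_signed else "not signed"


print(essay_keyword(350))        # Short Essay
print(essay_keyword(500))        # Normal Essay
print(essay_keyword(725))        # Long Essay
print(signature_keyword(True))   # signed
```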

Mapping custom Metadata

Be aware the PDF file format has metadata fields already named "Title", "Author", "Subject", "Keywords", "Creator", "Producer", "CreationDate", "ModDate" and "Trapped".

  • Consider these names reserved.
  • If you are attempting to export Data Field values as custom PDF metadata, they cannot share any reserved names. You will need to rename the Data Field in Grooper to a unique name.

PDF Data Mapping's Metadata feature can store custom metadata as well, exporting Data Field values to custom PDF metadata fields. This is a way for Grooper to save Data Field values directly to the PDF.

  • BE AWARE: Only single-instance data can be exported to a PDF's custom metadata.
    • Data Fields at the root of a Data Model or in single instance Data Sections can be exported.
    • Data Fields in multi-instance Data Sections and Data Column values cannot be exported.


With a PDF Data Mapping behavior added to a Content Type:

  1. Select the PDF Data Mapping behavior in the Behaviors editor.
  2. Change the Metadata property to Enabled and expand its sub-properties.
  3. Turn Export Data Fields to True.
  4. Use the Field Filter editor to select a specific set of Data Fields to export. Otherwise, all Data Fields will be exported to custom PDF metadata fields.
  5. Press "OK" when finished.


In our example, we exported all Data Fields to the generated PDF's custom fields. Custom metadata can be viewed using Adobe Acrobat. Go to "Document Properties...". Then select the "Custom" tab. All selected Data Fields will be exported to this "Custom Properties" list in the PDF.

  • FYI: Spaces and other special characters in a Data Field's name will be replaced with underscores (e.g. "Field_Name")
  • FYI: Data Fields in single instance Data Sections will be named using dot notation (e.g. "Section_Name.Field_Name")
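The naming rules above can be sketched as follows. This is a rough Python model, not Grooper's actual implementation. The `sanitize` and `custom_field_name` functions are hypothetical, and treating every non-alphanumeric character as "special" is an assumption, since the exact character set is not listed here.

```python
import re
from typing import Optional

# Names the PDF format already uses, as listed earlier in this section.
RESERVED = {
    "Title", "Author", "Subject", "Keywords", "Creator",
    "Producer", "CreationDate", "ModDate", "Trapped",
}


def sanitize(name: str) -> str:
    """Replace spaces and special characters with underscores.

    Assumption: anything that is not a letter or digit counts as special.
    """
    return re.sub(r"[^A-Za-z0-9]", "_", name)


def custom_field_name(field_name: str, section_name: Optional[str] = None) -> str:
    """Build the custom metadata field name for a single-instance Data Field."""
    key = sanitize(field_name)
    if section_name is not None:
        key = f"{sanitize(section_name)}.{key}"  # dot notation for Data Sections
    if key in RESERVED:
        raise ValueError(f"'{key}' is a reserved PDF metadata name; rename the Data Field")
    return key


print(custom_field_name("Field Name"))                              # Field_Name
print(custom_field_name("Proposal Title", "Proposal Information"))  # Proposal_Information.Proposal_Title
```

A Data Field at the root of the Data Model named "Title" would raise the error here, which mirrors the advice to rename any Data Field that shares a reserved name.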

How To: Configure Piece Info

BE AWARE: PIECE INFO IS STILL UNDER DEVELOPMENT

Please consider the Piece Info feature in "beta" at this time. This feature will be more fully documented once fully developed.

"PieceInfo" is a PDF dictionary of additional data stored by other applications. For example, when you save a PDF from Adobe Illustrator, PieceInfo will store the original Illustrator file (which allows the PDF to be edited in Illustrator as if it were the original). PieceInfo can be stored at the document level for the whole PDF or at the page level for one or more pages in the PDF.

When the Piece Info settings are enabled and configured, PDF Data Mapping stores extracted Data Field values in a PieceInfo dictionary embedded in the document's structure.

  • Contrast this with the Metadata settings, which store Data Field values as custom metadata fields in the document properties.
  • Piece Info is unique in that it can export data from a Data Table in very specific scenarios. Using the Key Column property, it can build the dictionary from only two columns in a table, and only if one of those columns acts as a "key" with unique values for each extracted row.
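The key column idea can be sketched in Python: one column supplies the dictionary keys, a second supplies the values, and the keys must be unique across rows. This is only an illustration of the concept. The `table_to_dictionary` function, the column names and the sample rows are hypothetical. How Grooper itself reacts to duplicate keys is not documented here, so raising an error is an assumption made for the sketch.

```python
from typing import Dict, List


def table_to_dictionary(rows: List[Dict[str, str]], key_column: str, value_column: str) -> Dict[str, str]:
    """Build a key/value dictionary from two columns of extracted table rows."""
    result: Dict[str, str] = {}
    for row in rows:
        key = row[key_column]
        if key in result:
            # Assumption: a repeated key makes the table unusable as a dictionary.
            raise ValueError(f"Duplicate key {key!r} in column {key_column!r}")
        result[key] = row[value_column]
    return result


# Hypothetical extracted rows: "Item" acts as the key column.
rows = [
    {"Item": "Tuition", "Amount": "1200.00"},
    {"Item": "Books", "Amount": "150.00"},
    {"Item": "Housing", "Amount": "800.00"},
]
print(table_to_dictionary(rows, key_column="Item", value_column="Amount"))
# {'Tuition': '1200.00', 'Books': '150.00', 'Housing': '800.00'}
```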

PieceInfo at document level vs PieceInfo at page level

With Piece Info enabled and configured, PDF Data Mapping will store the dictionary at either the document level or on a page, depending on the Batch's folder structure.


Imagine a Batch Folder that looks like this:


If PDF Data Mapping with Piece Info is configured for a parent document's Document Type, the PieceInfo dictionary is stored at the document level in the PDF.



If PDF Data Mapping with Piece Info is configured for a child document's Document Type, the PieceInfo dictionary is stored at the page level, on the first page of that child document in the PDF.

  • In this example PDF Data Mapping with Piece Info was configured for the "Green" Document Type.
    • With a PDF generated for the parent document folder, the output PDF will be 5 pages long total (because there are a total of five pages in the three child document folders).
    • Page 1 of the child document folder "Green (2)" will be page 2 in the output PDF.
    • The PieceInfo dictionary will therefore be stored in page 2 of the output PDF.
  • Be aware, it doesn't matter if the child document is a multipage document with extracted results on multiple pages. The PieceInfo dictionary is only stored once, on the first page only.
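The page arithmetic in this example can be checked with a short Python sketch. The function name is hypothetical. The page counts are partly assumed: the example implies the first child folder holds one page and that all three folders total five pages, but the split between the last two folders is a guess.

```python
from typing import List


def first_page_in_output(child_page_counts: List[int], child_index: int) -> int:
    """Return the 1-based output page on which a child document starts.

    child_index is the 0-based position of the child folder in the parent.
    """
    return sum(child_page_counts[:child_index]) + 1


# Three child folders, five pages total. "Green (2)" is the second child.
page_counts = [1, 2, 2]  # assumption: only the first count is implied by the example
print(sum(page_counts))                      # 5
print(first_page_in_output(page_counts, 1))  # 2
```

Because the dictionary is stored once per child document, only this first page number matters, regardless of how many pages the child document spans.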



FYI

You can inspect PieceInfo with Adobe Acrobat Pro.

For inspecting PieceInfo at the document level:

  • Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight"). Select "Options" > "Browse Internal PDF Structure...". Click the Lightbulb icon. Expand "The document root" and look for "PieceInfo". Expand "PieceInfo" and look for whatever you named your dictionary in the Piece Info configuration.

For inspecting PieceInfo at the page level:

  • Open the Preflight tool (Go to "All tools" > "Use Print Production" > "Preflight"). Select "Options" > "Browse Internal PDF Structure...". Click the Page icon. Expand a Page and look for "PieceInfo". Expand "PieceInfo" and look for whatever you named your dictionary in the Piece Info configuration.

Known Piece Info Issues

Issue #1: The Elements Property

The Elements property does nothing. Its original intent was to be a kind of filter that allowed for simpler configuration of the Fields property. However, it was never fully implemented. It has been deemed an unnecessary property and will be removed in future versions.

Issue #2: Page Level Classification

When separating and classifying documents using ESP Auto Separation, Grooper performs page-level classification. This can cause Piece Info to create a blank PDF PieceInfo dictionary for every page in certain PDF Data Mapping configurations.

How To: Generate the PDF using Merge or Export

A PDF Data Mapping configuration is applied when Grooper builds a PDF. This will happen when one of two activities is applied to a Batch Folder:

  • Either the Export activity
  • Or the Merge activity


In either case, three conditions must be met for Grooper to create a PDF with the additional PDF Data Mapping settings.

  1. The Batch Folder being processed must be assigned a Document Type that inherits the PDF Data Mapping behavior.
    • PDF Data Mapping will need to be configured for that Document Type, its parent Content Category or its parent Content Model.
  2. A PDF Format must be added.
    • For the Export activity: To the Export Formats configuration in the Export Behavior.
    • For the Merge activity: To the Merge Format configuration.
  3. The PDF Format's Always Build property should be set to True.
    • This will ensure a new output file will be generated in cases where an imported PDF is already attached to the Batch Folder in Grooper.