2023:Behaviors (Property): Difference between revisions

From Grooper Wiki

Revision as of 09:02, 26 August 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

Behaviors are like telling the mind what to let in and out.

Behaviors refer to a group of functionality configured using a Content Type's Behaviors property. Behaviors enable different features for how documents of a specific Content Type are processed and define their settings. This includes how they are exported, whether Label Sets are used for the Document Type, and more.

You may download and import the file(s) below into your own Grooper environment (version 2023). There is a Batch with the example document(s) discussed in this tutorial, as well as a Project configured according to its instructions. Given the proprietary nature of SharePoint and Database connections, the connection objects and their configurations cannot be shared.
Please upload the Project to your Grooper environment before uploading the Batch. This will allow the documents within the Batch to maintain their classification status.

About

Behaviors are born of the idea that consolidating control of the flow of data onto the objects most relevant to its collection and delivery creates a more streamlined and effective Grooper experience. Those objects are Content Types (Content Models, Content Categories, and Document Types) and the Export activity. If the Content Type controls all the relevant information about how documents are organized and what is to be collected from them, it is logical that how those documents come into Grooper, and how they get out, should also be defined by the same object. It is also a much cleaner experience to have a single Export activity that leverages the Behaviors of the Content Type, Behaviors defined on the Export activity, or both, instead of multiple steps in a Batch Process to cover each type of Export activity (Legacy Export, Mapped Export, Database Export).

Take CMIS Connections, for example. Grooper previously created an unnecessary number of objects to get to the CMIS Content Type. The mappings for importing and exporting were also set there, which added more places a person needed to go to configure the flow of data into and out of Grooper. It also created a bottleneck of control: if you wanted disparate Content Types to use a similar CMIS connection, but the import/export mappings needed to be different, you had to create a separate CMIS Content Type. Now, one CMIS Repository object can exist, and the import/export mechanisms are controlled via Behaviors (again, defined on Content Types or Export activities).

Types of Behaviors

This article covers five types of Behaviors that can be added to a Content Type or Export activity.

Import

Import Behaviors control the flow of documents into Grooper by establishing communication between a CMIS Repository object and a Grooper Content Type. That communication is defined by an Import Definition, of which multiple can exist.

CMIS Content Type objects no longer exist in Grooper 2021, therefore Import mappings are no longer set there. They are instead set on an Import Behavior of a Content Type. This eliminates the need for multiple CMIS Content Type objects to exist for the individual needs of discrete Content Types, and instead consolidates the control to the object most pertinent to the import functionality.

Export

Export Behaviors control the flow of documents out of Grooper by defining export connectivity to external systems such as file systems (directly through a File Export definition, or a CMIS Repository object), content management systems (via CMIS Repository objects), Database Tables, mail servers (directly through an IMAP Export definition, or a CMIS Repository object), or FTP/SFTP servers (directly through FTP/SFTP Export definitions, or a CMIS Repository object).

CMIS Content Type objects no longer exist in Grooper 2021, therefore Export mappings are no longer set there. They are instead set on the Export Behavior of a Content Type, or on a Batch Process Step set to Export on its Activity Type property. This eliminates the need for multiple CMIS Content Type objects to exist for the individual needs of discrete Content Types, and instead consolidates the control to the object most pertinent to the export functionality.

Text Rendering

Text Rendering Behaviors control the pagination, or lack thereof, of text based documents (like a .txt file) in Grooper. In Grooper 2.9 and before, text based documents were rendered, by default, as documents, and viewed as such in the document viewer. The rendered document was made, or paginated, to be as wide as the longest line of text.

In Grooper 2021, by default, text documents are handled natively and viewed in the document viewer as you would the text based document in any other normal text document handling software, like Notepad. In order to view a text based file as a paginated document you must add a Text Rendering behavior to a Content Type and classify it appropriately.

Labeling

The Labeling Behavior is a Content Type Behavior designed to collect and utilize a document's field labels in a variety of ways. This includes functionality for classification and data extraction.

The Labeling Behavior functionality allows Grooper users to quickly onboard new Document Types for structured and semi-structured forms, utilizing labels as a thumbprint for classification and data extraction purposes. Once the Labeling Behavior is enabled, labels are identified and collected using the "Labels" tab of Document Types. These "Label Sets" can then be used for the following purposes:

  • Document classification - Using the Labelset-Based Classification Method
  • Field based data extraction - Using the Labeled Value Extractor Type
  • Tabular data extraction - Using a Data Table object's Tabular Layout Extract Method
  • Sectional data extraction - Using a Data Section object's Transaction Detection Extract Method

Understanding the full functionality of the Labeling Behavior is worthy of its own article, which can be found by clicking this link.

PDF Data Mapping

The PDF Data Mapping Behavior is a Content Type Behavior designed to create an exportable PDF file with additional native PDF elements, using the classification and extraction content of a Batch Folder. This includes capabilities for exporting extracted data as PDF metadata, inserting bookmarks, and creating PDF annotations, such as highlighting, checkbox and signature widgets.

Understanding the full functionality of PDF Data Mapping is worthy of its own article, which can be found by clicking this link.

How To: Import and Export Behaviors

The following walkthrough demonstrates the setup of Behaviors using a simplified use case built on SharePoint and a database table. Because those systems are self-contained and cannot be shared, this walkthrough will require some translation on the reader's part.

Foundations

Understanding the infrastructure of what's being used for the following walkthrough will help the example make sense.

SharePoint

This test SharePoint environment has three documents in it. The First Name and Last Name fields are left intentionally blank. The Employee ID fields are filled with the intention of reading those values into Grooper upon import. Two of the documents have the DocType fields filled, while the third does not. This will be used to demonstrate how documents can be classified with results from the CMIS repository, while also showing the default behavior otherwise. The Imported fields are set to No. The query used to import the documents will look for the No result to know what to import. Upon importing into Grooper this value will be read and simultaneously changed to InProcess to prevent further pickup. Finally, on export the fields will be written to with Yes to signify they have been processed.
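
The life cycle of the Imported column described above can be pictured as a small state machine. The sketch below is an illustration only and is not Grooper code: the function names and the dictionary standing in for a SharePoint item are assumptions made for the example.

```python
# Hypothetical sketch of the Imported column life cycle; not Grooper's API.
NOT_IMPORTED = "No"
IN_PROCESS = "InProcess"
DONE = "Yes"


def select_for_import(items):
    """Mimic the import query: pick only items still marked "No"."""
    return [item for item in items if item["Imported"] == NOT_IMPORTED]


def mark_in_process(item):
    """Mimic the import write mapping: flag the item so it is not picked up again."""
    item["Imported"] = IN_PROCESS


def mark_exported(item):
    """Mimic the export write mapping: flag the item as fully processed."""
    item["Imported"] = DONE


# Illustrative items only; the file names are made up.
library = [
    {"Name": "doc1.pdf", "Imported": "No"},
    {"Name": "doc2.pdf", "Imported": "No"},
    {"Name": "doc3.pdf", "Imported": "Yes"},
]

picked = select_for_import(library)
for item in picked:
    mark_in_process(item)

# A second query run finds nothing, because the items are now "InProcess".
second_run = select_for_import(library)
```

Changing the value at import time, rather than waiting for export, is what keeps a second import query from picking up a document that is still moving through the Batch Process.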

Document Understanding

The documents are populated with the first and last names of the employee, which will be extracted and later exported back to the SharePoint to populate the First Name and Last Name columns. The Earnings table will be extracted by Grooper and sent to a database table. The information will be flattened out with the Employee ID read from the SharePoint and sent with the table information so that the SharePoint and database information can be related.
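
The flattening described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Grooper's implementation: the function name, the column names, and the sample values are all hypothetical.

```python
# Hypothetical sketch: flatten document-level fields into each table row.
def flatten_rows(parent_fields, table_rows):
    """Return one flat record per table row, each carrying the parent fields."""
    return [{**parent_fields, **row} for row in table_rows]


# Illustrative values only; the column names are assumptions.
document_fields = {"Employee_ID": "1001"}
earnings_rows = [
    {"Description": "Regular", "Amount": "1200.00"},
    {"Description": "Overtime", "Amount": "150.00"},
]

flat = flatten_rows(document_fields, earnings_rows)
```

Each exported row then carries the Employee ID, which is what allows a database row to be related back to its SharePoint item.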

CMIS Repository

  1. Here the imported CMIS Repository is selected in Grooper Design Studio.
  2. On the Type Definitions tab...
  3. ...you can see that because this particular subsite of the SharePoint environment is referenced by a Behavior that the icon has a green dot and...
  4. ...with it selected you can see the structure of the SharePoint subsite, such as the column names and their properties.

Database

  1. An imported database table is selected here in Grooper Design Studio.
  2. You can view the structure of the table and see that it includes what we saw from the Earnings table from the document, along with the Employee ID to marry rows from this table back to the SharePoint. In the Data Preview below, you’ll notice there is no data yet.

Data Model & Extraction

  1. Here is the sample Data Model selected with three Data Fields and the accompanying Data Table.
  2. It is configured properly as you can see the successfully extracted information displayed in the Data Model Test Results area.
  3. You can also see the document structure of the extracted information displayed in the Document Viewer.

Batch Process

  1. The Batch Process highlighted is a very simple model for this example setup.
  2. The main purpose of this setup is to demonstrate the Export activity.

Import Behavior Setup

In this section we will cover how to configure an Import Behavior with both read and write properties.

Behaviors

  1. Here we have selected a Content Type, specifically the Content Model.
  2. Select the Behaviors property and click the ellipsis button.
  3. This will bring up the Behaviors - List of Behaviors menu from which you can click the Add drop-down and select Import Behavior.

Import Definitions

  1. Select the Import Definitions property and click the ellipsis button.
  2. This will bring up the Import Definitions - List of Import Definitions menu from which you can click the Add drop-down and select CMIS Import Definition.


  1. With the CMIS Import Definition added you can select the CMIS Repository property and choose an imported CMIS Repository object.


  1. After selecting a CMIS Repository you can select the CMIS Content Type property and choose, in this case of SharePoint, a subsite document library.

Read Mappings

A critical aspect to understand here is document classification upon import. By setting the Read Mappings property you are telling Grooper to populate a Data Element with information. In doing so, you are forcing Grooper to associate the incoming document with the Content Type related to the Data Model that owns that Data Element. In short, the incoming documents will be classified as BehavioursContentModel by default.


  1. Select Read Mappings property and click the ellipsis button.
  2. In the CMIS Import Map window you can set which fields you want to map. Here we set the Employee ID field to match the Employee_ID column from SharePoint.



The Document Type Name property allows us to read a value from the CMIS Repository and classify documents as a result. This is very useful, and something completely new to this version of Grooper.

If you recall, there are two documents in the SharePoint with the DocType column populated: one test, the other test2. There are two Document Types in the sample Content Model called test and test2. As a result, this read mapping will allow those two documents to come in classified as those Document Types. The third document, having no DocType value from SharePoint, will come in classified as BehavioursContentModel.
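
The outcome described above can be sketched as a lookup with a fallback. This is an illustration only, not Grooper's API. The source does not say what happens when the DocType column holds a value that matches no Document Type, so falling back to the Content Model in that case is an assumption of this sketch.

```python
# Hypothetical sketch of classification on import; not Grooper's API.
CONTENT_MODEL = "BehavioursContentModel"
DOCUMENT_TYPES = {"test", "test2"}


def classify_on_import(doc_type_value):
    """Use the DocType column when it names a known Document Type,
    otherwise fall back to the Content Model that owns the Import Behavior."""
    if doc_type_value in DOCUMENT_TYPES:
        return doc_type_value
    # Assumption: empty or unrecognized values fall back to the Content Model.
    return CONTENT_MODEL


# The three SharePoint documents: two with DocType set, one without.
incoming = ["test", "test2", None]
assigned = [classify_on_import(value) for value in incoming]
```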

A word of caution: If the Document Type Name property is ever set, any and all documents that are brought in via Import Behaviors that leverage the same CMIS Repository will be classified as the Content Type whose Import Behavior has this property set.


  1. Here we are setting the Document Type Name field to match the DocType column from SharePoint.

Write Mappings

We will be looking for documents from the SharePoint that have the Imported value of No. This lets us know the documents have not been processed by Grooper yet. Immediately upon import we will use the Import Write Mappings property to change the values of the Imported column from No to InProcess.


  1. Select the Write Mappings property and click the ellipsis button to bring up the CMIS Update Map window. Here we are setting the Imported field to the expression "InProcess", which will insert it as a string in SharePoint upon import.

Export Behaviors Setup

In this section we will cover how to configure Export Behaviors for both a Content Type and a Batch Process Step configured for Export. While all the Export Mappings in this sample use case could be set on the Content Model, this walkthrough includes an Export Mapping in the Batch Process to show how both can be leveraged.

Not all types of Export Behaviors are covered (File, SFTP, FTP, and IMAP are not included.) If you feel, after reading this article, that one of the other types should be covered because it is complex enough to warrant a walkthrough, please contact the Training and Education team.

Content Model Export Behavior

The Export Behavior used here is concerned with exporting the document to the CMIS Repository.


  1. With the Content Model selected, click the Behaviors property then click the ellipsis button.
  2. In the Behaviors - List of Behaviors window click the Add drop-down and select Export Behavior.


  1. Select the Export Definitions property and click the ellipsis button.
  2. In the Export Definitions - List of Export Definitions window select the Add drop-down and select CMIS Export.


  1. First we set our CMIS Repository property, then select the Target Folder property and click the ellipsis button.
  2. In the Select CMIS Folder window select the CMIS Content Type that represents where the documents will be exported. In this case we choose a SharePoint sub-site document library.


  1. Select the Write Mappings property and click the ellipsis button.
  2. In the CMIS Export Map window you can map your fields to corresponding value containers of the CMIS connection. Here we auto-mapped to the columns of the SharePoint document library, since they are named the same.
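
Name-based auto-mapping, as mentioned in the last step, can be pictured with the sketch below. It is a hypothetical illustration and not Grooper's implementation: the function name and the sample field and column names are assumptions.

```python
# Hypothetical sketch of name-based auto-mapping; not Grooper's API.
def auto_map(field_names, column_names):
    """Pair each field with the column of the same name; skip fields with no match."""
    columns = set(column_names)
    return {name: name for name in field_names if name in columns}


# Illustrative names only.
fields = ["First Name", "Last Name"]
columns = ["First Name", "Last Name", "Imported"]

mapping = auto_map(fields, columns)
# Columns with no same-named field are left over after auto-mapping.
unmapped_columns = [c for c in columns if c not in mapping.values()]
```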

Batch Process Export Behavior

The Export Behavior used here is concerned with exporting tabular information to a Database Table. Again, this is an established, if simple, Batch Process with an Export step that is not yet configured.


  1. Select the Batch Process Step called Export.
  2. Select the Export Behaviors property and click the ellipsis button.
  3. In the Export Behaviors - List of Export Behaviors window click the Add button.
  4. With the Content Type property selected, click the drop-down and select the Content Type desired, in our case BehavioursContentModel.


  1. Select the Export Definitions property and click the ellipsis button.
  2. In the Export Definitions - List of Export Definitions window click the Add drop-down and select the desired export type, in our case Data Export.


  1. Select the Connection property and in the drop-down menu select the desired Database Connection object, in our case BehavioursDataConnection.


  1. Select the Table Mappings property and click the ellipsis button.
  2. In the Table Mappings - List of Table Mappings window select the Source Scope property and in the drop-down menu select the desired scope, in our case the Earnings table.
    Keep in mind, setting the scope to this object will not only allow the contents of the table to be collected, it will also flatten the fields above it into the exported rows.


  1. Select the Target Table property and in the drop-down menu select the desired Database Table object, in our case dbo.BehavioursExport.


  1. Select the Column Mappings property and click the ellipsis button.
  2. In the Column Map window you can map your columns and fields to the corresponding value containers of the data connection. Here we auto-mapped to the columns of the database table, since they are named the same.


The following property is important because it determines priority of Behaviors when sharing between a Content Type and Batch Process Step. Shared refers to the Behaviors of the Content Type, while Local refers to the Behaviors of the Batch Process Step.

  1. Select the Shared Behavior Mode property and select the desired share behavior, in our case SharedOrLocal.
  2. View the help file to understand which option you may select.

Execution and Results

In this section we'll take a look at the results by running a manual import with a CMIS Query and associating it with the sample Batch Process.


  1. To take advantage of automated Activity Processing services, select the Production branch of the Batches folder.
  2. In the Batch drop-down select the desired CMIS Import property, in our case Import Query Results....

Import Query

  1. In the Import Query Results window select the CMIS Query property and click the ellipsis button.


  1. In the CMIS Query Editor window select the Primary Content Type property and select the desired CMIS Content Type from the drop-down menu.


  1. Using either the Select Elements and Where Elements properties, or writing it manually, you can establish a query.
  2. Here we are selecting all documents from the SharePoint document library that have the Imported column set to No.
  3. Execute Query to see the results in the List View.
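
A query of the kind described in these steps might look like the string built below. This is only a sketch: the type name "BehaviorsLibrary" is a placeholder rather than a name taken from this walkthrough, the helper function is hypothetical, and it does no escaping of quotes in the value.

```python
# Hypothetical sketch: compose a CMISQL statement like the one described above.
def build_import_query(content_type, column, value):
    """Build a simple SELECT ... WHERE query string (no quote escaping)."""
    return f"SELECT * FROM {content_type} WHERE {column} = '{value}'"


# "BehaviorsLibrary" is a placeholder type name, not one from the walkthrough.
query = build_import_query("BehaviorsLibrary", "Imported", "No")
print(query)
```

In the CMIS Query Editor the same statement can be assembled with the Select Elements and Where Elements properties instead of being typed by hand.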


  1. In the Import Query Results window select the Disable Select Clause Optimization property and set it to the desired setting, in our case True.
  2. This will allow the query to run more quickly/smoothly if set to True.
  3. Be sure to set any remaining properties like the Start Step in our case so it’s associated with a Batch Process.
  4. Analyze first if you want, otherwise Start Import.
  5. Observe the given results and close the window.

Reviewing Classification

  1. Given the simple nature of the Batch Process, we are jumping to a point that has something pertinent to show. Here we are observing that the Document Type Name property, set previously in our Import Mappings, has successfully classified these two documents.


  1. The third document did not have its DocType column set in SharePoint, so it came in with the default Content Type of the associated Content Model.

Viewing Results

Here in the Data Review area you can see that all the data was successfully extracted.


Viewing the SharePoint site we can see that the export finished and successfully updated all the empty columns, and set the Imported column values to Yes, meaning they have completed processing and will not be picked up by another import query.


Back in Grooper you can select the associated Database Table object and see the newly exported results in the Data Preview area.

Text Rendering Setup

In this section we will look at the steps required to establish a Text Rendering behavior.

Default Text Document View

By default, when text based documents are viewed in Grooper you will see them as you would in any other text editing application, which is most obvious when lines do not word wrap. No steps are required to view the text based document in this way.

Non Word-Wrapped Text Document

In previous versions of Grooper native text files were rendered as paginated documents without word wrapping (the rendered document would be as wide as the longest line of text.) To mimic this style, use the following steps.


  1. Select the Content Type to which you will be applying the Text Rendering behavior. In this case, the Content Model.
  2. Select the Behaviors property and click the ellipsis button.


  1. In the Behaviors - List of Behaviors collection editor click Add and select Text Rendering.


  1. Select the newly added Text Rendering behavior and...
  2. ...set the Page Width property to be blank and click OK.


  1. Select your Batch.
  2. Go to the Batch Viewer tab.
  3. Select the text based document and right-click to bring up the object command menu.
  4. Select Assign Document Type.


  1. Select the desired Content Type.


  1. In the Document Viewer notice that line one is not word wrapping.
  2. Notice here in the top right of the Document Viewer we are viewing an image that has been rendered, not the native text file.

Word-Wrapped Document

With a Text Rendering behavior you can not only view a native text document as a non-standard, paginated image, but you can force a document dimension and cause lines of text to wrap accordingly.
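
The difference between the two rendering styles can be sketched with Python's standard textwrap module. This is an analogy rather than Grooper's rendering code: widths here are measured in characters, whereas the Page Width property is a physical size, and the function names are hypothetical.

```python
import textwrap


# Hypothetical sketch; widths are in characters, not a physical page size.
def render_unwrapped(text):
    """Page is as wide as the longest line; no line is broken."""
    lines = text.splitlines()
    width = max((len(line) for line in lines), default=0)
    return width, lines


def render_wrapped(text, width):
    """Page has a fixed width; longer lines wrap onto following lines."""
    wrapped = []
    for line in text.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blank lines explicitly.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return width, wrapped


sample = "short line\nthis is a considerably longer line of text that will not fit"
unwrapped_width, unwrapped_lines = render_unwrapped(sample)
wrapped_width, wrapped_lines = render_wrapped(sample, width=30)
```

With no fixed width the page grows to fit the longest line. With a fixed width the long line is split across several lines.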


  1. Select the Content Type to which you will be applying the Text Rendering behavior. In this case, the Content Model.
  2. Select the Behaviors property and click the ellipsis button.


  1. In the Behaviors - List of Behaviors collection editor select the Text Rendering behavior and...
  2. ...set the Page Width property back to its default of 8.5in.


  1. Select your Batch.
  2. Go to the Batch Viewer tab.
  3. Select the text based document.
  4. Notice in the Document Viewer line 1 is word-wrapping.
  5. Notice at the top right of the Document Viewer we are viewing an image that has been rendered, not the native text document.

Glossary

Activity Processing: Activity Processing is a Grooper Service that executes Activities assigned to Batch Process Steps in a Batch Process. This allows Grooper to automate Batch Steps that do not require a human operator.

Activity Processing: Activity Processing is the execution of a sequence of configured tasks which are performed within a Batch Process to transform raw data from documents into structured and actionable information. Tasks are defined by Grooper Activities, configured to perform document classification, extraction, or data enhancement.

Activity: Activity is a property on Batch Process Steps. Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page. Batch Process Steps configured with specific Activities are frequently referred to by the name of the Activity followed by the word "step". For example: Classify step.

AND: AND is a Collation Provider option for Data Type extractors. AND returns results only when each of its referenced or child extractors gets at least one hit, thus acting as a logical “AND” operator across multiple extractors.

Batch Folder: Batch Folder objects are defined as container objects within a Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.

  • Batch Folders are frequently referred to simply as "documents".

Batch Process Step: Batch Process Step objects are specific actions within the sequence defined by a Batch Process. A Batch Process Step plays a critical role in automating and managing the flow of documents through the various stages of processing within Grooper.

  • Batch Process Steps are frequently referred to as simply "steps".
  • Because a single Batch Process Step executes a single Activity configuration, they are often referred to by their referenced Activity as well. For example, a "Recognize step".

Batch Process: Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each Batch of documents is managed systematically and efficiently.

  • Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child Batch Process Steps.
  • A Batch Process is often referred to as simply a "process".

Batch: Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms known as Batch Processes.

Behavior: Behaviors refer to a group of functionality configured using a Content Type's Behaviors property. Behaviors enable different features for how documents of a specific Content Type are processed and define their settings. This includes how they are exported, whether Label Sets are used for the Document Type, and more.

Classification Method: A Content Model's Classification Method property determines the technique used for document classification. Classification sorts Batch Folders into categories (called "Document Types"). Grooper's various Classification Methods can utilize text-based pattern matching, machine learning models, or other methodologies to identify and organize documents accurately.

Classification: Classification is the process of identifying and organizing documents into categorical types based on their content or layout. Classification is key for efficient document management and data extraction workflows. Grooper has different methods for classifying documents. These include methods that use machine learning and text pattern recognition. In a Grooper Batch Process, the Classify Activity will assign a Content Type to a Batch Folder.

CMIS Connection: CMIS Connection node objects provide a standardized way of connecting to various content management systems (CMS). These objects allow Grooper to communicate with multiple external storage platforms, enabling access to documents and content that reside outside of Grooper's immediate environment.

  • For systems that support the CMIS standard, the CMIS Connection connects to the CMS using that standard.
  • For systems that do not, the CMIS Connection normalizes the connection and transfer protocols so that Grooper can treat them as if they were CMIS platforms.

CMIS Export: CMIS Export is an Export Definition available when configuring an Export Behavior. It exports content over a cloud CMIS Connection, allowing users to export documents and their metadata to various on-premise and cloud-based storage platforms.

CMIS Import: CMIS Import refers to two Import Providers used to import content over a cloud CMIS Connection: Import Descendants and Import Query Results. CMIS Imports allow users to import from various on-premise and cloud-based storage platforms.

CMIS Query: A CMIS Query (aka CMISQL Query) is Grooper's way of searching for documents in CMIS Repositories and filtering them upon import when using the Import Query Results Import Provider. CMIS queries are based on a subset of the SQL-92 syntax for querying databases, with some specialized extensions added to support querying CMIS sources.
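As a rough illustration of what such a query looks like, the sketch below assembles a CMISQL string in Python. The helper name build_cmis_query is hypothetical and is not part of Grooper. The identifiers cmis:document and cmis:name come from the CMIS standard itself. Which clauses a given repository accepts depends on the connection type, so treat this as an assumption-laden example rather than a query guaranteed to run against every CMIS Repository.

```python
from typing import Tuple


def build_cmis_query(type_name: str, name_pattern: str,
                     columns: Tuple[str, ...] = ("*",)) -> str:
    """Build a CMISQL SELECT that filters objects by cmis:name using LIKE.

    Hypothetical helper for illustration only; Grooper does not expose it.
    """
    # Keep the sketch simple: refuse patterns that would need quote escaping.
    if "'" in name_pattern:
        raise ValueError("name_pattern must not contain single quotes")
    column_list = ", ".join(columns)
    return (f"SELECT {column_list} FROM {type_name} "
            f"WHERE cmis:name LIKE '{name_pattern}'")


# Example: match every document whose name ends in ".pdf".
pdf_query = build_cmis_query("cmis:document", "%.pdf")
print(pdf_query)
```

The resulting text is the kind of statement that would be supplied when configuring the Import Query Results Import Provider.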

CMIS Repository: settings_system_daydream CMIS Repository node objects in Grooper allow access to external documents through a cloud CMIS Connection. They allow managing and interacting with those documents within Grooper's framework as if they were local. They are created as a child object of a CMIS Connection and used for various Activities.

CMIS: CMIS (Content Management Interoperability Services) is an open standard that allows different content management systems to "interoperate" by sharing files, folders, and their metadata, and by exposing programmatic control of the platform over the internet.

Content Category: collections_bookmark Content Category node objects are containers within a stacks Content Model that hold other Content Categories and description Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases Content Categories are used as organizational buckets to group like Document Types together.

Content Model: stacks Content Model node objects define the taxonomy of document sets in terms of the description Document Type they contain. They also house the Data Elements that appear on each collections_bookmark Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.

Content Type: Content Type refers to objects in Grooper used to classify folder Batch Folders. These include: stacks Content Models, collections_bookmark Content Categories, and description Document Types.
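The relationship between the three kinds of Content Type can be pictured as a simple tree. The Python sketch below is a conceptual model only: the ContentType class, its attributes, and the sample names ("Accounting", "Payables", "Invoice") are all hypothetical and do not reflect Grooper's internal object model. It assumes that Data Elements defined higher in the hierarchy also apply to the Content Types beneath them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a node in a Content Type hierarchy."""
    name: str
    kind: str  # "Content Model", "Content Category", or "Document Type"
    fields: List[str] = field(default_factory=list)
    parent: Optional["ContentType"] = None

    def add_child(self, child: "ContentType") -> "ContentType":
        # Attach the child beneath this node and return it for chaining.
        child.parent = self
        return child

    def effective_fields(self) -> List[str]:
        # Assumption: fields defined on ancestors also apply to descendants.
        inherited = self.parent.effective_fields() if self.parent else []
        return inherited + self.fields


model = ContentType("Accounting", "Content Model", ["Document Date"])
category = model.add_child(
    ContentType("Payables", "Content Category", ["Vendor Name"]))
invoice = category.add_child(
    ContentType("Invoice", "Document Type", ["Invoice Number"]))

print(invoice.effective_fields())
```

In this model, the "Invoice" Document Type collects the fields defined on the Content Model and the Content Category above it, followed by its own.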

Data Element: Data Element refers to the objects in Grooper used to collect data from a document. These include: data_table Data Models, insert_page_break Data Sections, variables Data Fields, table Data Tables, and view_column Data Columns.

Data Export: Data Export is an Export Definition available when configuring an Export Behavior. It exports extracted document data over a database Data Connection, allowing users to export data to a Microsoft SQL Server or ODBC compliant database.

Data Field: variables Data Field node objects are created as child objects of a data_table Data Model. A Data Field is a representation of a single piece of data targeted for extraction on a document.

  • Data Fields are frequently referred to simply as "fields".

Data Model: data_table Data Model node objects serve as the top-tier structure defining the taxonomy for Data Elements and are leveraged during the Extract Activity to extract data from folder Batch Folders. They are a hierarchy of Data Elements that sets the stage for the extraction logic and review of data collected from documents.

Data Section: insert_page_break Data Section objects are grouping mechanisms for related variables Data Fields. Data Sections organize and segment child Data Elements into logical divisions of a document based on the structure and semantics of the information the documents contain.

Data Table: table Data Table objects are utilized for extracting repeating data that's formatted in rows and columns, allowing for complex multi-instance data organization that would be present in table-formatted content.

Document Type: description Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a stacks Content Model or a collections_bookmark Content Category and are used to classify individual folder Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.

Document Viewer: The Grooper Document Viewer is the portal to your documents. It is the UI that allows you to see a folder Batch Folder's (or a contract Batch Page's) image, text content, and more.

Execute: tv_options_edit_channels Execute is an Activity that runs one or more specified object commands. This gives access to a variety of Grooper commands in a settings Batch Process for which there is no Activity, such as the "Sort Children" command for Batch Folders or the "Expand Attachments" command for email attachments.

Export Behavior: An Export Behavior defines the parameters for exporting classified folder Batch Folder content from Grooper to other systems. This includes where the content is exported to (what content management system, file system, database, etc.), what content is exported (attached files, images, and/or data), how it is formatted (PDF, CSV, XML, etc.), folder pathing, file naming, and data mappings (for Data Export and CMIS Export).

Export Definition: Export Behaviors are defined by adding and configuring one or more Export Definitions. An Export Definition defines export parameters to external systems, such as file systems, content management repositories, databases, or mail servers.

Export: output Export is an Activity that transfers documents and extracted information to external file systems and content management systems, completing the data processing workflow.

Extract: export_notes Extract is an Activity that retrieves information from folder Batch Folder documents, as defined by Data Elements in a data_table Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

Extractor Type: An Extractor Type (shorthand for Value Extractor Type) is configured for numerous properties on a wide array of Grooper objects. They are used to return "data instances" from documents for one purpose or another. The Extractor Type defines an operation that reads data from the text or visual content of a document and returns one or more results. Each different Extractor Type uses a specialized logic to return results. Extractor Types are consumed by higher-level objects such as Data Elements, extractor objects, Content Types and more.

FTP: FTP is a CMIS Connection Type that connects Grooper to FTP directories for import and export operations.

IMAP: IMAP is a CMIS Connection Type that connects Grooper to email messages and folders through an IMAP email server for import and export operations.

Import Query Results: Import Query Results is one of two Import Providers that use cloud CMIS Connections to import document content into Grooper. Import Query Results imports files or folders in a settings_system_daydream CMIS Repository that match a "CMISQL query" (a specialized query language based on SQL database queries).

Labeled Value: Labeled Value is an Extractor Type that identifies and extracts a value next to a label. This is one of the most commonly used extractors to extract data from structured documents (such as a standardized form) and static values on semi-structured documents (such as the header details on an invoice).
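The general idea of pairing a label with the value beside it can be sketched with a regular expression. The Python below is a simplified illustration, not Grooper's Labeled Value implementation: the function labeled_value and its parameters are hypothetical, and it only looks for a value to the right of the label on the same line.

```python
import re
from typing import Optional


def labeled_value(text: str, label: str,
                  value_pattern: str = r"\S+") -> Optional[str]:
    """Return the first value found to the right of `label`, or None.

    Hypothetical sketch: matches the label, an optional colon, and then
    a value matching `value_pattern` on the same line.
    """
    pattern = re.compile(
        re.escape(label) + r"[ \t]*:?[ \t]*(" + value_pattern + r")",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    return match.group(1) if match else None


sample = "Invoice No: INV-1042\nDate: 2023-05-01\n"

print(labeled_value(sample, "Invoice No"))
print(labeled_value(sample, "Date", r"\d{4}-\d{2}-\d{2}"))
```

Narrowing value_pattern (as in the date example) plays the role of telling the extractor what a valid value looks like, which reduces false matches when a label appears near unrelated text.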

Labeling Behavior: A Labeling Behavior is a Content Type Behavior designed to collect and utilize a document's field labels in a variety of ways. This includes functionality for classification, field extraction, table extraction, and section extraction.

Labelset-Based: Labelset-Based is a Classification Method that leverages the labels defined via a Labeling Behavior to classify folder Batch Folders.

Project: package_2 Project node objects are the primary containers for configuration nodes within Grooper. The Project is where various processing objects such as stacks Content Models, settings Batch Processes, profile objects, and more are organized and managed. It allows for the encapsulation and modularization of these resources for easier management and reusability.

Render: print Render is an Activity that converts files of various formats to PDF. It does this by digitally printing the file to PDF using the Grooper Render Printer. This normalizes electronic document content from file formats Grooper cannot read natively to PDF (which it can read natively), allowing Grooper to extract the text via the format_letter_spacing_wide Recognize Activity.

Repository: A "repository" is a general term in computer science referring to where files and/or data are stored and managed. In Grooper, the term "repository" may refer to:

Review: person_search Review is an Activity that allows user-attended review of Grooper's results. This allows human operators to validate processed contract Batch Page and folder Batch Folder content using specialized user interfaces called "Viewers". Different kinds of Viewers assist users in reviewing Grooper's image processing, document classification, and data extraction, and in operating document scanners.

Scope: The Scope property of a edit_document Batch Process Step, as it relates to an Activity, determines at which level in a inventory_2 Batch hierarchy the Activity runs.

SFTP: SFTP is a CMIS Connection Type that connects Grooper to SFTP directories for import and export operations.

SharePoint: SharePoint is a CMIS Connection Type that connects Grooper to Microsoft SharePoint, providing access to content stored in "document libraries" and "picture libraries" for import and export operations.

Tabular Layout: The Tabular Layout Table Extract Method models a table's structure and returns its values using two inputs: column header values, determined by each view_column Data Column's Header Extractor results (or by the labels collected for the Data Columns when a Labeling Behavior is enabled), and each Data Column's Value Extractor results.

Transaction Detection: Transaction Detection is a insert_page_break Data Section Extract Method. This extraction method produces section instances by detecting repeating patterns of text around the Data Section's child variables Data Fields.
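To give a feel for how repeating text can mark section boundaries, the Python sketch below splits a list of lines wherever a recurring anchor pattern appears. This is a simplified stand-in that splits on a single anchor pattern, not Grooper's Transaction Detection logic. The function detect_sections, the anchor pattern, and the sample lines are all hypothetical.

```python
import re
from typing import List


def detect_sections(lines: List[str], anchor_pattern: str) -> List[List[str]]:
    """Group lines into sections, starting a new one at each anchor match.

    Hypothetical sketch: lines before the first anchor are ignored.
    """
    anchor = re.compile(anchor_pattern)
    sections: List[List[str]] = []
    for line in lines:
        if anchor.search(line):
            # A recurring anchor (e.g. a repeated field label) opens a section.
            sections.append([line])
        elif sections:
            # Any other line belongs to the most recently opened section.
            sections[-1].append(line)
    return sections


report = [
    "Payroll Report",
    "Employee: Ann",
    "Hours: 40",
    "Employee: Bob",
    "Hours: 32",
]

for section in detect_sections(report, r"^Employee:"):
    print(section)
```

Each returned group corresponds loosely to one "section instance", inside which the child Data Fields would then be extracted.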