What's New in Grooper 2021

From Grooper Wiki

Revision as of 12:20, 30 August 2021


Welcome to Grooper 2021!

Introducing... Behaviors!

Behaviors are a new set of features designed to make the Content Model the main hub controlling various aspects of document processing. Behaviors are born of the idea that consolidating the flow of document data to the objects most relevant to its collection and delivery makes for a more streamlined and effective Grooper experience.

This allows a Content Model (and its component Content Types) to wrest control from various other disparate Activities, centralizing command of how documents and their data are modeled and what happens to that data once collected. The result is more focused control around how document data is imported, organized, collected, and exported by a Content Model. In other words, how it "behaves".

The following Behavior Types are introduced in 2021:

  • Import Behavior
  • Export Behavior
  • Labeling Behavior
  • PDF Data Mapping
  • Text Rendering

Introducing... Label Sets!

The Labeling Behavior functionality represents a huge change in how document content can be modeled and collected for structured and semi-structured document sets. It capitalizes on the utility labels provide to understand a document and its data. Grooper collects and uses "Label Sets" for each Document Type for a variety of document processing purposes, including:

  • Document classification - Using the Labelset-Based Classification Method
  • Field-based data extraction - Primarily using the Labeled Value Extractor Type
  • Tabular data extraction - Primarily using a Data Table object's Tabular Layout Extract Method
  • Sectional data extraction - Primarily using a Data Section object's Transaction Detection Extract Method

"Label Sets" offer vast improvements to these areas, both simplifying setup and allowing for quicker onboarding of new Document Types for structured and semi-structured forms.

Introducing... PDF Data Mapping!

The PDF Data Mapping functionality is part of the foundation for Grooper's "Smart PDF" architecture. The goal of the "Smart PDF" architecture is to unify document content into a single source. Too often, document content is split in two: the image and text content live in a PDF file, while the data content lives in a database or other content management platform.

PDF Data Mapping allows Grooper to store data content directly in the PDF, including separation and classification data as well as Data Fields from a Data Model. This way, even if you also store document data in a database, the PDF retains all the information Grooper collected.

The PDF Data Mapping functionality includes the ability to embed the following data in PDFs:

  • Metadata
  • Bookmarks
  • Annotations

Introducing... Data Rules!

The Data Rule is a new object available in Grooper 2021. Data Rules allow for complex validation and manipulation of Data Elements in a Data Model. Users can create a conditional hierarchy of actions to take when certain conditions are met, including clearing, copying, appending, parsing, and calculating values based on a series of expression-based conditions. Data Rules expand on the simpler validation and calculation methods available to Data Element objects, offering simpler setup and net new capabilities for more complicated data normalization projects.
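To make the idea of a "conditional hierarchy of actions" concrete, here is a minimal Python sketch. It is not Grooper's implementation or configuration syntax: the `Rule` class, the `apply_rule` function, and the field names ("Subtotal", "Tax", "Total", "Amount Due") are all hypothetical and exist only to illustrate the concept. A parent rule calculates a missing value, and a child rule copies it, running only if the parent's condition was met.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Field name -> extracted text (None when the field is blank).
Record = Dict[str, Optional[str]]


@dataclass
class Rule:
    """Hypothetical stand-in for a Data Rule: a condition, an action, child rules."""
    name: str
    condition: Callable[[Record], bool]
    action: Callable[[Record], None]
    children: List["Rule"] = field(default_factory=list)


def apply_rule(rule: Rule, record: Record) -> List[str]:
    """Run the rule's action if its condition holds, then evaluate its children.

    Children are skipped entirely when the parent condition fails.
    Returns the names of the rules that fired, in order.
    """
    fired: List[str] = []
    if rule.condition(record):
        rule.action(record)
        fired.append(rule.name)
        for child in rule.children:
            fired.extend(apply_rule(child, record))
    return fired


def calc_total(r: Record) -> None:
    # "Calculate" action: Total = Subtotal + Tax, formatted to two decimals.
    r["Total"] = f"{float(r['Subtotal']) + float(r['Tax']):.2f}"


def copy_total_to_amount_due(r: Record) -> None:
    # "Copy" action: reuse the Total value for Amount Due.
    r["Amount Due"] = r["Total"]


fill_total = Rule(
    name="Calculate missing Total",
    condition=lambda r: not r.get("Total")
    and bool(r.get("Subtotal"))
    and bool(r.get("Tax")),
    action=calc_total,
    children=[
        Rule(
            name="Copy Total to Amount Due",
            condition=lambda r: not r.get("Amount Due"),
            action=copy_total_to_amount_due,
        )
    ],
)

record: Record = {"Subtotal": "100.00", "Tax": "8.25", "Total": None, "Amount Due": None}
fired = apply_rule(fill_total, record)
print(fired, record)
```

In this sketch a record that already has a Total is left untouched, because the parent condition fails and the child rule is never evaluated.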

Two new Batch Processing Activities also apply Data Rules:

  • Apply Rules
  • Convert Data

Introducing... API!

Beginning in 2021, Grooper offers a RESTful Document Ingestion API. The API provides the ability to create and populate batches, monitor the status of batch processes, and retrieve results. It allows users to create dashboards or portals that interface with existing processes, such as portals that feed documents into a Grooper process or dashboards that display and change extracted values.

The API has some other capabilities, such as the ability to ingest compressed archives of Grooper notes (which could assist in automating the population of new repositories) and the ability to query certain pieces of information from the repository.
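As a rough illustration of the workflow described above (create a batch, check its status, retrieve results), the following Python sketch builds the corresponding HTTP requests with the standard library. Everything specific in it is an assumption: the base URL, the paths (`/batches`, `/batches/{id}/status`, `/batches/{id}/results`), the JSON payload, and the batch id are placeholders, not Grooper's actual routes or schema. Consult the API documentation for the real endpoints and authentication requirements.

```python
import json
from typing import Optional
from urllib.request import Request

# ASSUMPTION: placeholder host and paths, not Grooper's real API routes.
BASE_URL = "https://grooper.example.com/api"


def build_request(method: str, path: str, payload: Optional[dict] = None) -> Request:
    """Build (but do not send) a JSON request against the hypothetical API."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(f"{BASE_URL}{path}", data=data, headers=headers, method=method)


# Hypothetical batch id, as if returned by the create call.
batch_id = "123"

# 1. Create a batch (payload shape is an assumption).
create_req = build_request("POST", "/batches", {"name": "Invoices"})
# 2. Poll the batch's processing status.
status_req = build_request("GET", f"/batches/{batch_id}/status")
# 3. Retrieve the results once processing is complete.
results_req = build_request("GET", f"/batches/{batch_id}/results")

for req in (create_req, status_req, results_req):
    print(req.get_method(), req.full_url)
```

The requests are only constructed here, not sent. Against a real server, each one could be passed to `urllib.request.urlopen`, and a portal or dashboard would typically repeat the status request until the batch finishes.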

Data Extraction Improvements

  • Constrained Wrap for easier pattern matching for data constrained in a box (think table cells).

Changes to Document Export and Database Export

Goodbye Document Export and Database Export... Hello Export!

In 2021, we heavily reworked Grooper's document and data export functionality to improve the process and allow for new functionality. As part of this process, we unified Document Export and Database Export into a single Activity: Export.

Export is now the single Activity driving all export operations in Grooper. Whether exporting PDFs to a content management system, exporting data to a database, or any content to any external storage platform, Export is your way to go.

Goodbye CMIS Content Types... Hello Import and Export Behaviors!

One big change from versions before 2021 is how data is mapped between its Data Model structure and an external storage platform on document import or export. Previously, these mappings were configured using CMIS Content Type objects, created as children of a CMIS Connection.

In 2021, the CMIS Connection object purely serves the function of integrating Grooper with an external storage platform. Import and export mappings are defined using Import or Export Behaviors. This removes some unnecessary object bloat around the CMIS Connection object and lets the Content Model and Document Types drive their associated Data Model mappings.

Import and Export Behaviors are configurable via:

  • Content Models, Content Categories, or Document Types
  • The Export Activity (in the case of export-related mappings only)

Install and Setup Changes


Miscellaneous

  • Changes to Content Action
  • Document Viewer improvements
  • Text file improvements