What's New in Grooper 2023.1

From Grooper Wiki

Revision as of 15:07, 23 October 2023

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

Welcome to Grooper 2023.1!


Grooper version 2023.1 is here!

Below you will find brief descriptions on new and/or changed features. When available, follow any links to extended articles on a topic.

Expanded Web Client

Light Mode

Let there be light (mode)!

Users can now toggle between dark mode and light mode in the Grooper web client.

  • Use the user information icon to switch from dark to light mode.
  • Custom Data Model styling supported for both dark and light modes via CSS classes.
    • "dark-mode" for dark mode
    • "lite-mode" for light mode

Improved Scripting for Web Client

Scripting for the web client is better than ever. Grooper users can now use the web client as their debug target thanks to the GrooperSDK Visual Studio extension.

  • The GrooperSDK extension is available for download in the Visual Studio Marketplace.
  • This extension allows users to set a web browser as their debug target.
  • Users can now download scripts and work on them independently of Grooper, saving changes directly from Visual Studio using the GrooperSDK extension.


Furthermore, users can edit scripts from any machine, not just the Grooper web server, as long as the following components are installed:

  • Grooper (with a Repository connection made to the Grooper Repository hosted by the web server)
  • Grooper Web Client
  • IIS
  • Visual Studio 2019
  • GrooperSDK

For more information, be sure to check out the "Web Scripting" article (coming soon).

New and Improved Processing Features

Secondary Types

Secondary Types is a new property assignable to Batch Folders. This feature allows a single document to be assigned multiple Document Types. Secondary Types can be used with the following activities:

  • Classify to save a classification result as a Secondary Type.
  • Extract to run extraction for all assigned types.
  • Apply Rules to run Data Rules for all assigned types.
  • Convert Data to save converted data as a Secondary Type.
  • Export to run Export Definitions for all assigned types.


Multiple Document Types means multiple Data Models! You can now collect data using more than one Data Model. The Data Grid will display data for all assigned Types in Review.

  • FYI: If desired, you can configure the Data Viewer's Content Type Filter property to limit display to specific Document Types.


For more information, be sure to check out the "Secondary Types" article (coming soon).

EDI Support

Grooper can create Data Models directly from an X12 EDI schema and load data from an EDI file directly into a Grooper document.

  • Data Elements adhering to an EDI schema can be created by right-clicking a Data Model, selecting Import Schema... and configuring the EDI Schema Importer.
  • Data from an EDI file can be loaded into Grooper using the Execute activity, configured with an EDI File - Load Data command.


Currently, Grooper can process the following EDI schemas natively:

  • X12 835
  • X12 837 Professional
  • X12 837 Dental
  • X12 837 Institutional

FYI: X12 refers to the organization developing and maintaining the EDI standards. X12 is the common EDI standard in the US and North America.
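
To make the shape of an X12 file more concrete, here is a minimal Python sketch. It is not Grooper's EDI importer. It only shows how raw X12 text breaks down into segments and elements. It assumes the common "~" segment terminator and "*" element separator (a real interchange declares its delimiters in the ISA header), and the sample data is made up for illustration.

```python
def parse_x12(text, segment_terminator="~", element_separator="*"):
    """Split raw X12 text into a list of segments, each a list of elements."""
    segments = []
    for raw in text.split(segment_terminator):
        raw = raw.strip()
        if raw:  # skip the empty piece after the final terminator
            segments.append(raw.split(element_separator))
    return segments


# Made-up fragment shaped like the start of an 835 transaction set.
sample = "ST*835*0001~BPR*I*150.00*C~TRN*1*12345*1999999999~SE*4*0001~"

segments = parse_x12(sample)
for segment in segments:
    # The first element is the segment ID; the rest are its data elements.
    print(segment[0], segment[1:])
```

Each segment ID (ST, BPR, TRN and so on) and each element position is the kind of structure an EDI schema describes.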

XML Data Interchange

Grooper users can now ingest and build XML files more effectively with our suite of "XML Data Interchange" capabilities. With this new functionality, users can:

  • Create Data Elements from an XML schema file by right-clicking a Data Model, selecting Import Schema... and configuring the XML Schema Importer.
  • Load data from an XML file that adheres to an XML schema using the Execute activity, configured with an XML File - Load Data command.
  • Validate an XML document against an XML schema using the Execute activity, configured with an XML File - Validate Schema command.
  • Generate an XML document according to an XML schema using a new Merge/Export Format (called XML Format).

New Node Type: Resource Files

Resource Files allow users to store any kind of file in a Project. Simply drag a file from your file system and drop it into a Project (or folder in a Project).


Current uses include:

  • XML schema files
    • The XML Schema Importer can create a Data Model from an XML schema file.
    • The XML Document > Load Data command can load data from an XML file adhering to an XML schema.
    • The XML Format can create an XML file adhering to an XML schema using the Merge or Export activities.
  • CSS files
    • Rather than configuring CSS styling at the root of a Data Model, one or more Data Models can reference styling declared in a single CSS file.
  • Text files
    • Store project notes, instructions, "read me" documentation, or any other information in a text file saved to your Project.

Expression Variables

Simplify expressions with Variables.

Data Models, Data Sections and Data Tables have a new Variables property. Users can add a list of expression variables, each having a name, a value type and an expression.


Variables can be used when:

  • Writing other expressions (such as Calculated Value expressions on Data Fields or a Data Rule's Trigger or Action configuration)
  • Configuring Lookup queries
  • Crafting mapping expressions for exporting data


Variables can break down complex calculations into simpler operations and cut down on the need for hidden fields that act as placeholders for further calculation down the road.
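
The idea can be sketched outside of Grooper in a few lines of Python. This is an illustration only: Grooper expressions are not written in Python, and the Variable class, the field names and the evaluate_variables helper are all hypothetical. The sketch models each variable with the three parts named above (a name, a value type and an expression) and uses the results in a final calculation instead of storing them in hidden fields.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Variable:
    # Hypothetical stand-in for one entry in a Variables list.
    name: str
    value_type: type
    expression: Callable[[dict], object]


def evaluate_variables(fields, variables):
    """Evaluate each variable against the field values and cast to its type."""
    return {v.name: v.value_type(v.expression(fields)) for v in variables}


# Made-up field values for illustration.
fields = {"Quantity": 4, "UnitPrice": 25.0, "TaxRate": 0.1}

variables = [
    Variable("Subtotal", float, lambda f: f["Quantity"] * f["UnitPrice"]),
    Variable("TaxMultiplier", float, lambda f: 1 + f["TaxRate"]),
]

values = evaluate_variables(fields, variables)

# The final expression stays short because the pieces are already named.
total = round(values["Subtotal"] * values["TaxMultiplier"], 2)
print(total)
```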

Lookup Improvements

New Lookup Type: XML Lookup

The new XML Lookup allows users to perform a lookup against data in an XML file. It uses XPath selectors with Grooper variables to select data in an XML hierarchy. This gives Grooper users a mechanism to query data that lies somewhere between a Lexicon Lookup and a Database Lookup. It is more capable than a Lexicon Lookup. XML data is more complex than the simple "key-value" list you can query in a Lexicon. However, XML data is not as complex as a relational database. The XML Lookup gives you an option when you need to query fairly static data that lies between a Lexicon and a database in terms of its complexity.

Please Note! For most lookups, you reference Grooper Data Elements with the "@" symbol. However, "@" is already part of the XPath syntax. Data Elements are instead referenced with the "%" symbol for XML Lookups.
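
As a rough illustration of the concept, the Python sketch below runs an XPath query against a small XML document after substituting a field value into the selector. It uses Python's standard xml.etree.ElementTree module, not Grooper. The "%VendorID%" token format is an assumption made for this example (the exact placeholder syntax should be checked in the XML Lookup documentation), and the vendor data and helper names are made up.

```python
import re
import xml.etree.ElementTree as ET

# Made-up reference data for illustration.
VENDORS_XML = """
<vendors>
  <vendor id="V001"><name>Acme Supply</name><terms>Net 30</terms></vendor>
  <vendor id="V002"><name>Globex</name><terms>Net 45</terms></vendor>
</vendors>
"""


def fill_placeholders(xpath_template, field_values):
    """Replace each %FieldName% token (assumed syntax) with a field value."""
    return re.sub(r"%(\w+)%", lambda m: field_values[m.group(1)], xpath_template)


def xml_lookup(xml_text, xpath_template, field_values):
    """Return the text of every node selected by the filled-in XPath."""
    root = ET.fromstring(xml_text)
    xpath = fill_placeholders(xpath_template, field_values)
    return [node.text for node in root.findall(xpath)]


# "@id" is ordinary XPath attribute syntax, which is why "@" cannot also
# mark a field reference inside the selector.
result = xml_lookup(VENDORS_XML, "./vendor[@id='%VendorID%']/name", {"VendorID": "V001"})
print(result)
```

Note that this sketch does no escaping, so a field value containing a quote character would break the selector.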

New Trigger Mode: Custom

A new Custom option for a lookup's Trigger Mode now allows for conditional lookup execution. Selecting Custom exposes a Trigger for the lookup. This uses an expression to define conditions when to execute the lookup.

This can make lookups more efficient, preventing a query from executing unless certain conditions are met. This can be particularly helpful when performing Database Lookups that query larger databases during Review. If a lookup only needs to run under certain conditions, save yourself and your reviewers time by executing the query conditionally with the new Custom Trigger Mode.

This can also create opportunities for "waterfall" query approaches where if the first (generally more specific) query failed to produce a result, you can fall back on a second (generally looser) query.
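
The "waterfall" approach can be sketched in Python as follows. This is a conceptual illustration, not Grooper configuration: the query helper, the field names and the in-memory vendor list are hypothetical stand-ins for two lookups and their trigger conditions.

```python
def query(records, **criteria):
    """Return every record whose fields match all of the given criteria."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def waterfall_lookup(records, fields):
    # Trigger condition: do not run any query without a vendor name.
    if not fields.get("VendorName"):
        return []
    # First query: specific (name and ZIP code must both match).
    hits = query(records, VendorName=fields["VendorName"], Zip=fields.get("Zip"))
    if hits:
        return hits
    # Fallback query: looser (name only), run only if the first one failed.
    return query(records, VendorName=fields["VendorName"])


# Made-up data for illustration.
VENDORS = [
    {"VendorName": "Acme Supply", "Zip": "73102", "VendorID": "V001"},
    {"VendorName": "Globex", "Zip": "10001", "VendorID": "V002"},
]

exact = waterfall_lookup(VENDORS, {"VendorName": "Acme Supply", "Zip": "73102"})
loose = waterfall_lookup(VENDORS, {"VendorName": "Globex", "Zip": "99999"})
skipped = waterfall_lookup(VENDORS, {"VendorName": "", "Zip": "73102"})
print(exact, loose, skipped)
```

The second call finds nothing with the specific query and falls back to the name-only query; the third never queries at all because the trigger condition is not met.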

Changes To Table Extraction

Header-Value Is Out. Tabular Layout Is In

In version 2021, the Tabular Layout table extraction method was created as an improved version of the Header-Value table extraction method. Tabular Layout's logic is similar to Header-Value in that it uses column header labels and value extractors to determine a table's structure and extract cell data. However, Tabular Layout is superior to Header-Value in the following ways:

  • Tabular Layout's initial setup is simpler.
  • Tabular Layout has better built-in column and row detection.
  • Tabular Layout is more fully featured, capable of doing things Header-Value cannot.
  • Tabular Layout is Label Set supported.


For these reasons, Tabular Layout will fully replace Header-Value in version 2023.1. Any Data Table using Header-Value will be converted to Tabular Layout upon upgrade.

In the vast majority of cases, this will greatly improve your table extraction results, particularly if you familiarize yourself with all Tabular Layout can do for you.

Data Element Extensions

"Data Element Extensions" are a new way of handling Data Element properties that depend on a parent Data Element's configuration. Based on how the parent is configured, relevant properties now extend to its child Data Elements.

Currently, we are implementing this functionality to move certain table extraction properties from the Data Table to relevant child Data Columns.

For Tabular Layout:

  • The Column Settings properties no longer exist as a property collection editor on the Data Table's Tabular Layout property panel.
  • Instead, these properties now reside on each individual Data Column.
  • If a Data Table's Extract Method is set to Tabular Layout, each child Data Column will have a set of properties called Tabular Layout Options.
    • Here, you can set the same Row Detection, Secondary Extract and Secondary Extract Mode settings found previously in the Column Settings collection editor.

For Grid Layout:

  • The OCR Columns and OMR Columns properties no longer exist on the Data Table's Grid Layout property panel.
  • Instead, these properties now reside on each individual Data Column.
  • If a Data Table's Extract Method is set to Grid Layout, each child Data Column will have a set of properties called Grid Layout Options.
    • Here, you can configure the Read Method property to define whether a Data Column is an "OCR Column" or an "OMR Column".

New Tabular Layout Property: Find Column Positions

Prescan Threshold for Labelset-Based Classification

Before 2023.1, the Labelset-Based classification method had a problem. It became less efficient as more Document Types (and therefore more Label Sets) were added to the Content Model. The more Document Types, the more Label Sets that must be loaded and checked against each document.

The new Prescan Threshold property available to Labelset-Based classification resolves this by only running the Label Sets that are likely to succeed, rather than running every Label Set on each document. Prescan Threshold determines this based on word statistics. A minimum percentage of a Label Set's words must be present on the document for that Label Set to be considered for classification. This has been demonstrated to speed up classification by up to 15x compared to running without a Prescan Threshold. This extends the Labelset-Based classification method to Content Models with thousands of Document Types.
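
The general idea of a word-based prescan can be sketched in Python. This is a simplified illustration under stated assumptions, not Grooper's actual algorithm: the tokenizer, the prescan function, the 0.5 threshold and the sample Label Sets are all hypothetical, and Grooper's word statistics may be computed differently.

```python
import re


def words(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prescan(document_text, label_sets, threshold=0.5):
    """Return the Document Types whose label words are sufficiently present."""
    doc_words = words(document_text)
    candidates = []
    for doc_type, labels in label_sets.items():
        label_words = set().union(*(words(label) for label in labels))
        if not label_words:
            continue  # nothing to measure for an empty Label Set
        fraction_present = len(label_words & doc_words) / len(label_words)
        if fraction_present >= threshold:
            candidates.append(doc_type)
    return candidates


# Made-up Label Sets and document text for illustration.
LABEL_SETS = {
    "Invoice": ["Invoice Number", "Invoice Date", "Amount Due"],
    "Purchase Order": ["PO Number", "Ship To", "Order Date"],
}
document = "Invoice Number: 1001  Invoice Date: 2023-10-23  Amount Due: $50.00"

candidates = prescan(document, LABEL_SETS)
print(candidates)
```

Only the Document Types returned by the prescan would then go through full Label Set matching, which is where the time savings would come from.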

New "Summary" Tabs

There are new and improved "Summary" tabs on Data Models, Data Elements, and Document Types in the web client. These summaries give Grooper Design users useful "at a glance" information about these objects in your Content Models. Furthermore, these summaries have helpful links to associated objects, allowing for quicker navigation through complicated Content Models with large numbers of Data Elements and Document Types.


Data Model "Summary" tabs now list all expressions configured for any Data Element configured with an expression (Default Value, Calculated Value, Is Required, etc.).

  • Clicking on a listed Data Element will take you directly to that element in the Data Model.

Data Element "Summary" tabs also list any configured expressions for the selected element. This summary also lists any Document Types configured with overrides for the selected element.

  • Clicking on a listed Document Type will take you directly to the Document Type.

Document Type "Summary" tabs also have an "Overrides" section, listing all Data Elements being overridden for the selected Document Type.

  • Clicking on a listed Data Element will take you directly to that element.

Batch Management & Review Enhancements

"FIFO" Batch Priority

Traditionally in Grooper, Batches are created with a set priority (between "1" and "5"). When all Batches have the same priority, processing can bottleneck around processor-intensive activities. As a result, the first Batch imported isn't necessarily the first Batch to finish processing or to reach a Review step.

Batches can now be processed with a "First In, First Out" priority in Grooper. Import Providers have a new Batch Creation option called Increment Priority. Turning this property on increments each Batch's priority by "1" each time a new Batch is created. This causes each Batch to fully run through a Batch Process before the next is started.

Please note! This functionality is intended for large user-directed imports, not Batch imports directed by an Import Watcher service.
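
The effect of incrementing priority can be illustrated with a small Python sketch. It is a toy model, not Grooper's task scheduler. It assumes that a lower priority number is processed first and that tasks with equal priority are handled in the order they became ready. The function names, Batch names and base priority are made up.

```python
def assign_priorities(batch_names, base_priority=1, increment=False):
    """Give each Batch a priority; optionally add 1 for each later Batch."""
    return {name: base_priority + (offset if increment else 0)
            for offset, name in enumerate(batch_names)}


def processing_order(ready_tasks, priorities):
    """Order tasks by Batch priority (assumed: lower number runs first).

    sorted() is stable, so tasks with equal priority keep their ready order.
    """
    return sorted(ready_tasks, key=lambda task: priorities[task[0]])


batches = ["A", "B", "C"]

# Tasks in the order they became ready: every Batch's first step, then
# every Batch's second step.
ready = [("A", "Recognize"), ("B", "Recognize"), ("C", "Recognize"),
         ("A", "Extract"), ("B", "Extract"), ("C", "Extract")]

same = processing_order(ready, assign_priorities(batches))
fifo = processing_order(ready, assign_priorities(batches, increment=True))

print([batch for batch, _ in same])  # Batches interleave
print([batch for batch, _ in fifo])  # Batch A finishes before B starts
```

With equal priorities the Batches interleave step by step. With incremented priorities all of Batch A's tasks sort ahead of Batch B's, which matches the "First In, First Out" behavior described above.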

Dynamic Value Lists

Review Improvements

  • Swap panes
  • Horizontal/Vertical view toggle
  • Double-click capture (alternative to rubberband OCR)
  • Data Grid zoom in/zoom out
  • Sticky field support in web client
  • Display labels (Use “\auto” to use labels in Label Sets!)

Misc Other Changes

The Labeled Value extractor can now function without a Value Extractor configured, whether or not Label Sets are used.

  • Previously, this functionality was only accessible if the extractor's label was collected via the Labeling Behavior.
  • Be aware that how the extractor collects results without a configured Value Extractor is very different from the extractor's typical logic. It is intended as a "dumbed down" way of collecting the simplest data in the simplest layouts.