2023.1:AND (Collation Provider)

From Grooper Wiki
Revision as of 13:27, 7 March 2024 by Dsmith (talk | contribs)

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


AND Collation is a Collation Provider that returns a result only when every one of its child extractors returns at least one result. If all child extractors hit (assuming there is more than one), their results are passed to the parent extractor and displayed as a single result on the parent.
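The all-or-nothing behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Grooper's implementation: the `Extractor` type, the `and_collate` function, the two sample extractors, and the choice to join the first hit of each child into the parent result are all assumptions made for the example.

```python
import re
from typing import Callable, List, Optional

# Hypothetical stand-in for a child extractor: takes text, returns a list of hits.
Extractor = Callable[[str], List[str]]


def and_collate(text: str, children: List[Extractor]) -> Optional[str]:
    """Return one combined result only if every child returns at least one hit."""
    hits_per_child = [child(text) for child in children]
    if not children or any(not hits for hits in hits_per_child):
        return None  # one missing child result means no parent result
    # Assumption: the parent result is built from the first hit of each child.
    return " ".join(hits[0] for hits in hits_per_child)


# Two hypothetical child extractors, each a simple pattern match.
def find_invoice(text: str) -> List[str]:
    return re.findall(r"Invoice", text)


def find_total(text: str) -> List[str]:
    return re.findall(r"Total", text)
```

Calling `and_collate` on text that contains both "Invoice" and "Total" yields a single combined result, while text missing either word yields `None`.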



You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023). The first contains a Project with resources used in examples throughout this article. The second contains one or more Batches of sample documents.

About


Setting Up the Collation Provider

Setting up for Data Extraction

AND Collation can be set up on a Data Type. To do so, select a Data Type in the Node Tree. Under General properties, select Collation, expand the drop-down menu, and select AND. Next, add the child extractors. These can be Value Readers or Data Types; what matters is that each one returns a result.

Minimum Hits

Be aware of the "Minimum Hits" property, which appears when you expand the AND Collation Provider. By default, this property is set to zero, which means every child extractor must produce a hit for the parent to return a result. If Minimum Hits is changed to, say, 2, then only two of the child extractors need to produce a hit, however many children there are. Be cautious: changing the Minimum Hits property could skew results. This is illustrated below.
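As a rough sketch of how the Minimum Hits threshold changes the outcome, the Python function below decides whether the parent would return a result. The function name `and_passes`, its parameters, and the reading of zero as "all children must hit" follow the description above and are assumptions for illustration, not Grooper's actual code.

```python
from typing import List


def and_passes(child_hit_counts: List[int], minimum_hits: int = 0) -> bool:
    """Decide whether the parent returns a result.

    child_hit_counts[i] is the number of results child extractor i returned.
    """
    children_with_hits = sum(1 for count in child_hit_counts if count > 0)
    if minimum_hits == 0:
        # Default: every child extractor must produce at least one hit.
        return bool(child_hit_counts) and children_with_hits == len(child_hit_counts)
    # Otherwise: at least `minimum_hits` children must produce a hit.
    return children_with_hits >= minimum_hits
```

With three children where only two hit, `and_passes([1, 0, 2])` is false under the default, but `and_passes([1, 0, 2], minimum_hits=2)` is true. This is the way a lowered threshold can let through results that the strict setting would reject.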

Setting up for Classification

How does Classification play into the AND Collation Provider? Because AND Collation only returns a result when its child extractors hit, you can use it through the Positive Extractor property on a Document Type. Simply reference the extractor you configured with AND Collation in the Positive Extractor property.

What Does This Mean for Classification?

So, how exactly can an AND Collation Provider assist with Classification? Simply put, it is a tool that can be referenced on a Positive Extractor to help Grooper identify certain Document Types.
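To make the idea concrete, here is a minimal Python sketch of classification driven by an AND-style check. It assumes a simplified model in which each Document Type has one positive rule and the first rule that hits decides the type. The names `all_present`, `classify`, and `rules`, along with the sample keywords, are hypothetical and do not correspond to Grooper objects.

```python
from typing import Callable, Dict, Optional


def all_present(*keywords: str) -> Callable[[str], bool]:
    """Build an AND-style rule: it hits only if every keyword is in the text."""
    def rule(text: str) -> bool:
        lowered = text.lower()
        return all(keyword.lower() in lowered for keyword in keywords)
    return rule


def classify(text: str, positive_rules: Dict[str, Callable[[str], bool]]) -> Optional[str]:
    """Return the first Document Type whose positive rule hits, else None."""
    for doc_type, rule in positive_rules.items():
        if rule(text):
            return doc_type
    return None


# Hypothetical Document Types and the keywords that must all appear.
rules = {
    "Invoice": all_present("invoice", "amount due"),
    "Purchase Order": all_present("purchase order", "ship to"),
}
```

A page containing both "invoice" and "amount due" is assigned to "Invoice", while a page with only one of the two matches nothing, which mirrors how requiring every child to hit makes the positive match more selective.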