2023.1:AND (Collation Provider)

From Grooper Wiki
Revision as of 10:58, 22 April 2024 by Randallkinard

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

AND is a Collation Provider option for Data Type extractors. It returns results only when each of its referenced or child extractors produces at least one hit, acting as a logical “AND” operator across multiple extractors.

You may download the ZIPs below and upload them into your own Grooper environment (version 2023.1). The first contains a Project with resources used in examples throughout this article. The second contains one or more Batches of sample documents.

About

Setting Up the Collation Provider

Setting up for Data Extraction

AND Collation is set up on a Data Type. To do so, select a Data Type in the Node Tree. Under General properties, select Collation, then expand the drop-down menu and select AND. Next, add the child extractors. These can be Value Readers or Data Types. What matters is that each one returns a result.

Minimum Hits

Be aware of the "Minimum Hits" property, found by expanding the AND Collation Provider. It defaults to zero, which requires every extractor to produce a hit before the Data Type returns a result. If Minimum Hits is changed to, say, 2, then only two of the extractors, however many there are, need to produce a hit to return a result. Be cautious, as changing the Minimum Hits property could skew results. This is illustrated below.
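The Minimum Hits behavior described above can be modeled in a few lines of Python. This is only a conceptual sketch, not Grooper's implementation or API: the names `regex_extractor` and `and_collation` are hypothetical, plain regular expressions stand in for child extractors, and returning a flat list of hits is an assumption made for illustration.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for a child extractor: takes text, returns its hits.
Extractor = Callable[[str], List[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Build a simple extractor that returns every match of a pattern."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda text: [m.group(0) for m in compiled.finditer(text)]


def and_collation(text: str, extractors: List[Extractor],
                  minimum_hits: int = 0) -> List[str]:
    """Return hits only if enough child extractors matched.

    minimum_hits == 0 is treated as "all extractors must hit",
    mirroring the default described in this article.
    """
    per_extractor = [extract(text) for extract in extractors]
    matched = sum(1 for hits in per_extractor if hits)
    required = len(extractors) if minimum_hits == 0 else minimum_hits
    if matched < required:
        return []  # not enough extractors hit, so nothing is returned
    # Assumption: flatten the hits of all extractors into one list.
    return [hit for hits in per_extractor for hit in hits]


sample = "Invoice Number: 12345\nTotal Due: $50.00"
children = [
    regex_extractor(r"invoice number"),
    regex_extractor(r"total due"),
    regex_extractor(r"purchase order"),  # never matches the sample text
]

strict = and_collation(sample, children)                   # all three must hit
relaxed = and_collation(sample, children, minimum_hits=2)  # any two suffice
```

With the default, the missing "purchase order" match suppresses the whole result. Lowering the requirement to 2 lets the same document through, which is how a relaxed Minimum Hits value can skew results.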


Setting up for Classification

How does Classification play into the AND Collation Provider? Because AND Collation returns a result only when its extractors produce hits, you can make use of it through the Positive Extractor property on a Document Type. Reference the Data Type configured with AND Collation as the Document Type's Positive Extractor.
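As a rough illustration of the idea, the sketch below assigns a document to a type only when every pattern tied to that type is found. It is a simplified model and not Grooper's classification engine: the `classify` function, the dictionary of Document Types, and the "first matching type wins" rule are all assumptions made for the example.

```python
import re
from typing import Dict, List, Optional


def all_patterns_hit(text: str, patterns: List[str]) -> bool:
    """Logical AND: True only if every pattern matches somewhere in the text."""
    return all(re.search(p, text, re.IGNORECASE) for p in patterns)


def classify(text: str, document_types: Dict[str, List[str]]) -> Optional[str]:
    """Return the first Document Type whose positive patterns all hit.

    Assumption: types are checked in insertion order and the first
    match wins; no scoring or weighting is modeled here.
    """
    for name, patterns in document_types.items():
        if all_patterns_hit(text, patterns):
            return name
    return None


# Hypothetical Document Types, each with the patterns that must all be present.
doc_types = {
    "Invoice": [r"invoice number", r"total due"],
    "Purchase Order": [r"purchase order", r"ship to"],
}

invoice_text = "Invoice Number: 12345\nTotal Due: $50.00"
partial_text = "Invoice Number: 12345"

invoice_result = classify(invoice_text, doc_types)
partial_result = classify(partial_text, doc_types)
```

Here the first text is labeled "Invoice" because both of its patterns are present, while the second matches only one pattern and is left unclassified.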



What Does This Mean for Classification?

So, how exactly can an AND Collation Provider assist with Classification? Simply put, a Data Type using AND Collation can be referenced as a Positive Extractor, so Grooper identifies a Document Type only when the required extractors all produce hits on the document.