2023.1:Key-Value List (Collation Provider)

From Grooper Wiki
Revision as of 16:10, 17 April 2024 by Rpatton (talk | contribs) (update // via Wikitext Extension for VSCode)

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

A Key-Value List is one of many Collation Providers you can use in Grooper to combine or organize extracted data based on the data's layout relationship. A Key-Value List collects a "list" of information with a spatial relationship to a label or a "Key".

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023). The first contains a Project with the resources used in the examples throughout this article. The second contains one or more Batches of sample documents.

About

The Key-Value List Collation Provider collects a list of items that follow a label of some kind. Like the Key-Value Pair Collation Provider, it is configured with two child extractors on a Data Type: one that returns the "Key" and one that returns the "Value". When collated, the parent Data Type first locates the "Key" and then looks either vertically or horizontally for the list of text returned by the "Value" extractor. Only the list of terms from the "Value" extractor is returned as a result.
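The lookup described above can be illustrated with a minimal Python sketch. This is not Grooper's implementation: the `Hit` class, the `collate_key_value_list` function, the overlap rule, and the sample data are all hypothetical, chosen only to show the idea of anchoring on a key and keeping the values that line up with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Extracted text with a simplified bounding box (hypothetical page units)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a_start, a_end, b_start, b_end):
    """True when two 1-D ranges share any extent."""
    return a_start < b_end and b_start < a_end


def collate_key_value_list(key, values, layout="vertical"):
    """Return the text of value hits that line up below or to the right of the key.

    Assumption: a value belongs to the list when it overlaps the key on the
    opposite axis. The real product may apply different or additional rules.
    """
    if layout == "vertical":
        # Keep values that start below the key and overlap it horizontally.
        picked = [v for v in values
                  if v.top >= key.bottom
                  and overlaps(v.left, v.right, key.left, key.right)]
        picked.sort(key=lambda v: v.top)
    elif layout == "horizontal":
        # Keep values that start right of the key and overlap it vertically.
        picked = [v for v in values
                  if v.left >= key.right
                  and overlaps(v.top, v.bottom, key.top, key.bottom)]
        picked.sort(key=lambda v: v.left)
    else:
        raise ValueError("layout must be 'vertical' or 'horizontal'")
    return [v.text for v in picked]


# Hypothetical sample: a label with two items beneath it, plus unrelated text.
key = Hit("Items", 1.0, 1.0, 2.0, 1.2)
hits = [
    Hit("Title", 1.0, 0.2, 3.0, 0.5),   # above the key
    Hit("Items", 1.0, 1.0, 2.0, 1.2),   # the key text itself
    Hit("Widget", 1.0, 1.3, 1.5, 1.5),  # below and aligned
    Hit("Gasket", 1.0, 1.6, 1.5, 1.8),  # below and aligned
    Hit("Page", 4.0, 1.3, 4.4, 1.5),    # below but not aligned
]

print(collate_key_value_list(key, hits))  # ['Widget', 'Gasket']
```

Note that the "Value" hits can include unrelated text, and even the key text itself. The collation step is what narrows the result to the list.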

The Key-Value List is also useful for extracting table information via Row Match when not every row contains data.

How To

  1. Create a Data Type with two child Value Readers: one for the Key and one for the Value.
  2. Set the Extractor property on the "Key" Value Reader. In this example, it is set to List Match.
  3. Configure the extractor to return the label of your list.
  4. Set the Extractor property on the "Value" Value Reader. In this example, it is set to Pattern Match.
  5. Configure the extractor to return all the text you want to collect near your label.
    • It is fine if the extractor collects more than just the terms you want to return. Here we are collecting all generic text on the page.
  6. Click back on the parent Data Type.
  7. By default, the Collation property is set to Individual.
  8. With this setting, each text segment extracted by the child objects is returned individually.
  9. Change the Collation property to Key-Value List.
  10. Choose the appropriate layout for the data you are collecting. Since the list in our example is aligned vertically, we enable the Vertical Layout property.
  11. Now only the desired list is returned.
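To make the difference between the Individual and Key-Value List settings concrete, the steps above can be approximated in plain Python on a small text grid. Everything here is an assumption made for illustration: the sample page, the `KEY_LABELS` list standing in for a List Match, the `\S+` pattern standing in for a Pattern Match, and the use of character columns as a rough substitute for horizontal position on the page. It does not call or reproduce any Grooper API.

```python
import re

# Hypothetical page text; columns are aligned with spaces.
PAGE = (
    "Invoice 1001          Date 2024-01-05\n"
    "Parts                 Notes\n"
    "Widget                fragile\n"
    "Gasket                none\n"
    "Bolt                  none\n"
)

KEY_LABELS = ["Parts"]   # stands in for the "Key" extractor's label list
VALUE_PATTERN = r"\S+"   # stands in for a generic-text "Value" pattern


def find_hits(page, pattern):
    """Return (line_no, start_col, end_col, text) for every match on every line."""
    return [(n, m.start(), m.end(), m.group())
            for n, line in enumerate(page.splitlines())
            for m in re.finditer(pattern, line)]


def individual(key_hits, value_hits):
    """Individual-style result: every hit from both children, in reading order."""
    return [hit[3] for hit in sorted(key_hits + value_hits)]


def key_value_list_vertical(key_hits, value_hits):
    """Keep only values on later lines whose columns overlap the first key hit."""
    if not key_hits:
        return []
    k_line, k_start, k_end, _ = key_hits[0]
    return [text for line, start, end, text in value_hits
            if line > k_line and start < k_end and k_start < end]


key_pattern = r"\b(?:" + "|".join(re.escape(label) for label in KEY_LABELS) + r")\b"
key_hits = find_hits(PAGE, key_pattern)
value_hits = find_hits(PAGE, VALUE_PATTERN)

print(len(individual(key_hits, value_hits)))          # 13
print(key_value_list_vertical(key_hits, value_hits))  # ['Widget', 'Gasket', 'Bolt']
```

With the Individual-style function, all 13 hits come back, including the label and unrelated text. With the vertical key-value function, only the three items under the label remain, which mirrors the result described in the final step.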

Key-Value Lists and Data Tables

  1. We have created a Data Type with multiple child extractors to collect information from a table.
  2. The Data Type is configured with an Ordered Array.
  3. For this page in the Batch, this configuration works just fine. We are collecting all rows in the table.
  4. However, for this table, our current configuration falls short.