2023.1:Key-Value List (Collation Provider): Difference between revisions

From Grooper Wiki
How To start // via Wikitext Extension for VSCode
update // via Wikitext Extension for VSCode
Line 11: Line 11:


<blockquote>
<blockquote>
A '''''Key-Value List''''' is one of many '''''Collation Providers''''' you can use in Grooper to combine or organize extracted data based on the data's layout relationship.  
A '''''Key-Value List''''' is one of many '''''Collation Providers''''' you can use in Grooper to combine or organize extracted data based on the data's layout relationship. A '''''Key-Value List''''' collects a "list" of information with a spatial relationship to a label or a "Key".  
</blockquote>
</blockquote>


Line 24: Line 24:


== About ==
== About ==


== How To ==
== How To ==
Line 48: Line 50:


#<li value=5> Configure your extractor to return all text you are wanting to collect near your label.  
#<li value=5> Configure your extractor to return all text you are wanting to collect near your label.  
#* It is fine if the extractor collects mroe than just the terms you want to return. Here we are collecting all generic text on the page.  
#* It is fine if the extractor collects more than just the terms you want to return. Here we are collecting all generic text on the page.  


[[File:2023.1 Key-Value-List-(Collation-Provider) 02 Key-Value-List-Configuration 05.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 02 Key-Value-List-Configuration 05.png]]
Line 65: Line 67:


[[File:2023.1 Key-Value-List-(Collation-Provider) 02 Key-Value-List-Configuration 07.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 02 Key-Value-List-Configuration 07.png]]
=== Key-Value Lists and Data Tables ===
# We have created a '''Data Type''' with multiple child extractors to collect information on a table.
# The '''Data Type''' is configured with an ''Ordered Array''.
# For this page in the '''Batch''', this configuration works just fine. We are collecting all rows in the table.
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 01.png]]
#<li value=4> However, for this table, our current configuration falls short.
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 02.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 03.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 04.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 05.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 06.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 07.png]]
[[File:2023.1 Key-Value-List-(Collation-Provider) 03 Tables 08.png]]

Revision as of 14:29, 17 April 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

A Key-Value List is one of many Collation Providers you can use in Grooper to combine or organize extracted data based on the data's layout relationship. A Key-Value List collects a "list" of information with a spatial relationship to a label or a "Key".

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023). The first contains a Project with the resources used in the examples throughout this article. The second contains one or more Batches of sample documents.

About

How To

  1. Create a Data Type with two child Value Readers: one for the Key and one for the Value.

  2. Set the Extractor property on the Key's Value Reader. In this example, it is set to List Match.

  3. Configure the extractor to return the label of your list.

  4. Set the Extractor property on the Value's Value Reader. In this example, it is set to Pattern Match.

  5. Configure the extractor to return all the text you want to collect near your label.
    • It is fine if the extractor collects more than just the terms you want to return. Here we are collecting all generic text on the page.

  6. Click back on the parent Data Type.
  7. By default, the Collation property is set to Individual.
  8. With this setting, each text segment extracted by the child Value Readers is returned individually.

  9. Change the Collation property to Key-Value List.
  10. Choose the layout that matches the data you are collecting. Because the list in our example is aligned vertically, we enable the Vertical Layout property.
  11. Now only the desired list is returned.
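
The effect of the steps above can be illustrated with a small conceptual sketch. This is not Grooper's implementation or API. It assumes that a vertical Key-Value List behaves roughly as follows: starting at the Key hit, it collects Value hits stacked beneath it in the same column and stops at the first large vertical gap. The `Hit` class, the `collate_vertical_list` function, the `max_gap` threshold, and the sample data are all hypothetical names and values made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A piece of extracted text with a simplified bounding box (hypothetical)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def overlaps_horizontally(a: Hit, b: Hit) -> bool:
    """True when the two boxes share some horizontal extent (same column)."""
    return a.left < b.right and b.left < a.right


def collate_vertical_list(key: Hit, values: list[Hit], max_gap: float = 0.5) -> list[str]:
    """Collect value hits stacked beneath the key, stopping at the first large gap.

    max_gap is an assumed threshold for this sketch, not a Grooper property.
    """
    # Keep only hits that start below the key and sit in the key's column.
    below = sorted(
        (v for v in values if v.top >= key.bottom and overlaps_horizontally(key, v)),
        key=lambda v: v.top,
    )
    collected = []
    previous_bottom = key.bottom
    for v in below:
        if v.top - previous_bottom > max_gap:
            break  # too far from the previous item: treat the list as ended
        collected.append(v.text)
        previous_bottom = v.bottom
    return collected


# Made-up sample data: the value extractor returns all text on the page,
# which mirrors the "collect more than you need" advice in step 5.
key = Hit("Ingredients", 1.0, 1.0, 2.2, 1.2)
values = [
    Hit("Ingredients", 1.0, 1.0, 2.2, 1.2),  # the label itself, not below the key
    Hit("Flour", 1.0, 1.3, 1.6, 1.5),
    Hit("Sugar", 1.0, 1.6, 1.6, 1.8),
    Hit("Eggs", 1.0, 1.9, 1.5, 2.1),
    Hit("Page 1 of 2", 1.0, 9.0, 2.0, 9.2),  # same column, but far below the list
    Hit("Invoice", 5.0, 1.3, 6.0, 1.5),      # a different column
]
result = collate_vertical_list(key, values)
print(result)  # ['Flour', 'Sugar', 'Eggs']
```

With Individual collation, every entry in `values` would be returned on its own. In this sketch, only the three items stacked directly under the label survive, which is the behavior step 11 describes.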

Key-Value Lists and Data Tables

  1. We have created a Data Type with multiple child extractors to collect information from a table.
  2. The Data Type is configured with an Ordered Array.
  3. For this page in the Batch, this configuration works just fine: it collects every row in the table.

  4. However, for this table, our current configuration falls short.