Key-Value Pair (Collation Provider)

From Grooper Wiki


WIP: This article is a work-in-progress and may abruptly stop in the middle of a section.


Revision as of 10:48, 26 August 2020

Key-Value Pair is a Collation Provider for Data Type extractors. It uses the layout relationship between a key and a value on a document to return a result.

Key-Value Pair collation is one of the most commonly used Collation Providers. It provides an effective way to extract data when a value sits next to a label on a document, whether horizontally, vertically, or in a "left-to-right & top-to-bottom" text flow.


About

The Key-Value Pair Collation Provider uses the spatial relationship between two related extractor results to return a single result.

For structured documents, it is common for a piece of data to be identified by some sort of label, usually to the left of it, or above it.

In these images, the field label, highlighted in blue, identifies the field's value, highlighted in yellow. We use this kind of labeling relationship to identify data on documents all the time. The Key-Value Pair Collation Provider is well suited to take advantage of these labeling relationships.

Key-Value Pair collated Data Types (often just referred to as Key-Value Pairs) collate the results of two extractors, a "key extractor" and a "value extractor". The "key extractor" will locate the label (or whatever context is being used to return the data you want). The "value extractor" will return all possible values matching the data you want to return.
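The division of labor between the two extractors can be illustrated outside of Grooper with a minimal Python sketch. This is not Grooper's implementation: the sample text, the patterns, and the helper `run_extractor` are all hypothetical, and plain regular expressions stand in for the key and value extractors.

```python
import re

# Hypothetical sample text; not taken from any real document.
TEXT = "Invoice Date: 08/26/2020  Due Date: 09/25/2020  Printed 08/27/2020"

# Stand-in for the "key extractor": locates the label.
KEY_PATTERN = re.compile(r"Invoice Date")
# Stand-in for the "value extractor": matches every date on the page.
VALUE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def run_extractor(pattern, text):
    """Return (start, end, matched_text) for every hit of the pattern."""
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


key_hits = run_extractor(KEY_PATTERN, TEXT)
value_hits = run_extractor(VALUE_PATTERN, TEXT)
```

In this sketch the value pattern deliberately matches all three dates. Narrowing those candidates down to the one that belongs to the key is the job of collation.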

Once collated, the Key-Value Pair will return the value closest to the key, according to the assigned Layout Settings. The top image uses a Horizontal Layout because the label and its value are aligned next to each other horizontally; the bottom uses a Vertical Layout.
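The idea of picking the closest aligned value can be sketched in Python using bounding boxes. This is an illustration under stated assumptions, not Grooper's algorithm: the `Hit` class, the `collate` function, and the coordinates are hypothetical, and the sketch assumes the value lies to the right of the key (horizontal) or below it (vertical), which the actual Layout Settings may not require.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A hypothetical extractor result with its bounding box on the page."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a_lo, a_hi, b_lo, b_hi):
    """True when two 1-D intervals share any extent."""
    return a_lo < b_hi and b_lo < a_hi


def collate(key, values, layout):
    """Return the value nearest the key in the given layout, or None."""
    if layout == "horizontal":
        # Assumption: the value is to the right of the key, on the same row.
        candidates = [v for v in values
                      if v.left >= key.right
                      and overlaps(key.top, key.bottom, v.top, v.bottom)]
        return min(candidates, key=lambda v: v.left - key.right, default=None)
    if layout == "vertical":
        # Assumption: the value is below the key, in the same column.
        candidates = [v for v in values
                      if v.top >= key.bottom
                      and overlaps(key.left, key.right, v.left, v.right)]
        return min(candidates, key=lambda v: v.top - key.bottom, default=None)
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical page: one key and three dates matched by the value extractor.
key = Hit("Date", 10, 10, 50, 20)
values = [
    Hit("08/26/2020", 60, 10, 130, 20),  # same row, to the right of the key
    Hit("09/25/2020", 60, 40, 130, 50),  # elsewhere on the page
    Hit("01/01/2019", 10, 30, 80, 40),   # directly below the key
]

horizontal_result = collate(key, values, "horizontal")
vertical_result = collate(key, values, "vertical")
```

Even though the value extractor produced three hits, each layout selects at most one of them, which mirrors the behavior described above.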

Key-Value Pair collation also has applications in unstructured document processing. Unstructured documents convey information in paragraphs and sentences more than they do with structured fields. Because of this, the value may not be horizontally or vertically aligned, but somewhere before or after a labeling key in the text flow.

For these situations, the Flow Layout can be used, which relies on the relationship between the key and the value in the text's left-to-right and top-to-bottom flow.

A Key-Value Pair could be built to extract the driver name (highlighted here in yellow), using the phrase "driver's name" in the text flow before it.
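A flow-based lookup of this kind can be approximated in a few lines of Python. The sketch below is hypothetical throughout: the sentence, the name pattern, and the `flow_collate` function are invented for illustration, a plain string in reading order stands in for the text flow, and only the "value after the key" case is handled.

```python
import re

# Hypothetical sentence; the names are invented for illustration.
TEXT = ("Witness statement by Mary Jones: the driver's name was John Smith, "
        "and the passenger was Jane Doe.")

KEY_PATTERN = r"driver's name"                      # stand-in key extractor
VALUE_PATTERN = r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"      # stand-in value extractor


def flow_collate(text, key_pattern, value_pattern):
    """Return the first value that follows the key in reading order, or None."""
    key = re.search(key_pattern, text)
    if key is None:
        return None
    for match in re.finditer(value_pattern, text):
        # Skip candidates that appear before the end of the key.
        if match.start() >= key.end():
            return match.group()
    return None


driver = flow_collate(TEXT, KEY_PATTERN, VALUE_PATTERN)
```

The value pattern matches three names in the sentence, but only the one that follows the key phrase is returned.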