Key-Value Pair (Collation Provider): Difference between revisions
Dgreenwood (talk | contribs) |
# Configure the "Key Extractor" to return the key result you're looking for.
#* Here, we're looking for the label "home phone". We simply type "home phone" as the value pattern.
# Notice, we get three results. That's OK! The ''Key-Value Pair'' collation settings will help us narrow down the result ultimately returned.
|
[[File:Key-Value Pair - Grooper Screenshots 02.png]]
<tab name="Create the Value Extractor" style="margin:20px"> | <tab name="Create the Value Extractor" style="margin:20px"> | ||
=== Create the Value Extractor ===
For ''Key-Value Pair'' extractors, the "Value Extractor" is the extractor looking for the data you ultimately want to return. Here, we're looking for a home phone number. So, we simply need an extractor that finds phone numbers.
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
# Add the "Value Extractor" to the '''Data Type'''.
#* Here, we've added a child '''Data Format''' to the '''Data Type''' and named it "VALUE". However, the "Value Extractor" could also be a child '''Data Type''' or a referenced extractor.
#* It should ''not'' be the parent '''Data Type's''' internal '''''Pattern''''', however. ''Key-Value Pair collation'' requires two extractors, the "Key Extractor" and the "Value Extractor". The order in which the parent '''Data Type's''' extractors execute (or "fire") matters a great deal. The first one to execute is always the "Key Extractor". The second is always the "Value Extractor". A '''Data Type's''' internal '''''Pattern''''' ''always'' executes first. Hence, if it is used as one of the two extractors, it will ''always'' be the "Key Extractor". Extractor execution order of operations is as follows:
## Internal '''''Pattern'''''
## Child Extractor (in order from top to bottom in the Node Tree)
##* Here, since the child '''Data Format''' named "VALUE" is below the '''Data Format''' named "KEY", the '''Data Format''' named "VALUE" is the "Value Extractor". Its position decides this. The name itself has no effect.
## Referenced Extractor (in order from top to bottom in the Referenced Extractor list)
# Notice, we get six results, all the phone numbers on the page. That's totally fine (indeed, it's what we want). We will narrow down which specific phone number we're looking for using the ''Key-Value Pair'' collation settings.
|
[[File:Key-Value Pair - Grooper Screenshots 03.png]]
|}
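The execution order described in the steps above can be sketched in a few lines of Python. This is only an illustration of the ordering rule, not Grooper code: the function names <code>execution_order</code> and <code>key_and_value</code> and the string labels are hypothetical.

```python
from typing import List, Optional, Tuple


def execution_order(internal_pattern: Optional[str],
                    children: List[str],
                    references: List[str]) -> List[str]:
    """List extractors in the order they fire (hypothetical model).

    1. the internal Pattern, if one is configured
    2. child extractors, top to bottom in the Node Tree
    3. referenced extractors, top to bottom in the reference list
    """
    order: List[str] = []
    if internal_pattern:
        order.append("internal Pattern")
    order.extend(children)
    order.extend(references)
    return order


def key_and_value(order: List[str]) -> Tuple[str, str]:
    """The first extractor to fire is the key, the second is the value."""
    if len(order) != 2:
        raise ValueError("Key-Value Pair collation expects exactly two extractors")
    return order[0], order[1]


# Two child Data Formats, "KEY" above "VALUE" in the Node Tree.
print(key_and_value(execution_order(None, ["KEY", "VALUE"], [])))

# An internal Pattern plus one child: the Pattern is always the key.
print(key_and_value(execution_order(r"home phone", ["PHONE"], [])))
```

Swapping the two children in the list swaps their roles, which mirrors the point that position, not name, decides which extractor is the "Value Extractor".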
</tab> | </tab> | ||
<tab name="Set the Collation Provider" style="margin:20px"> | <tab name="Set the Collation Provider" style="margin:20px"> | ||
=== Set the Collation Provider ===
{|cellpadding=10 cellspacing=5
|style="width:40%" valign=top|
# Navigate back to the parent '''Data Type'''.
# Under '''''Output''''', select the '''''Collation''''' property.
# Using the dropdown list, select ''Key-Value Pair'' from the list of Collation Providers.
Once you select ''Key-Value Pair'', you will ''not'' see the results list change. It will still appear as if the two child extractors' results are being returned one by one (like the ''Individual'' Collation Provider). Some Collation Providers, such as ''Key-Value Pair'', require some configuration before their results are collated. Specifically, you ''must'' choose which '''''Layout''''' method is used.
|
[[File:Key-Value Pair - Grooper Screenshots 03.png]]
|}
</tab> | </tab> | ||
<tab name="Set the Layout Setting" style="margin:20px"> | <tab name="Set the Layout Setting" style="margin:20px"> | ||
=== Configure the Layout Settings ===
The '''''Layout''''' method defines how the "Key Extractor" and "Value Extractor" results are spatially related to each other. Is the key above the value? Is it to the left of it? The right? Configuring these settings dictates where you expect to find the value in relation to the key.
The '''''Layout''''' may be:
# Horizontal
# Vertical
# Flow
==== Horizontal Layout ====
# View the ''Key-Value Pair'' configuration properties by expanding the '''''Collation''''' property.
# Select '''''Horizontal Layout'''''.
# Change the property from ''Disabled'' to ''Enabled''.
</tab> | </tab> | ||
</tabs> | </tabs> | ||
Revision as of 14:38, 26 August 2020

Key-Value Pair is a Collation Provider for Data Type extractors. It uses the layout relationship between a key and a value on a document to return a result.
Key-Value Pair collation is one of the most commonly used Collation Providers. It provides an excellent way to extract data when a value exists next to a label on a document, whether next to it horizontally, vertically, or even in a "left-to-right & top-to-bottom" text flow.
About
The Key-Value Pair Collation Provider utilizes the spatial relationship between two related extractor results to return a single result, typically looking for a piece of data (the value) next to a label (the key).
For structured documents, it is common for a piece of data to be identified by some sort of label, usually to the left of it, or above it. In these images, the field label, highlighted in blue, identifies the field's value, highlighted in yellow. We use this kind of labeling relationship to identify data on documents all the time. The Key-Value Pair Collation Provider is perfectly suited to use these labeling relationships.
Key-Value Pair collated Data Types (often just referred to as Key-Value Pairs) collate the results of two extractors, a "key extractor" and a "value extractor". The "key extractor" will locate the label (or whatever context is being used to return the data you want). The "value extractor" will return all possible values matching the data you want to return. Once collated, the Key-Value Pair will return the closest value to the key, according to the assigned Layout Settings. (The top image uses a Horizontal Layout because the labels are aligned next to each other horizontally. The bottom uses a Vertical Layout.)
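To make the "closest value to the key" idea concrete, here is a minimal Python sketch of a horizontal pairing rule. It is not Grooper's implementation: the `Hit` class, the `rows_overlap` and `horizontal_pair` functions, the coordinates, and the exact definition of "closest" (smallest gap to the right of the key on the same row) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """One extractor result with a bounding box (hypothetical page units)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def rows_overlap(a: Hit, b: Hit) -> bool:
    """True when the two boxes share some vertical extent (same text row)."""
    return a.top < b.bottom and b.top < a.bottom


def horizontal_pair(key: Hit, values: List[Hit]) -> Optional[Hit]:
    """Return the value nearest to the right of the key on the same row."""
    candidates = [v for v in values
                  if rows_overlap(key, v) and v.left >= key.right]
    # Smallest horizontal gap wins; None when nothing qualifies.
    return min(candidates, key=lambda v: v.left - key.right, default=None)


key = Hit("home phone", 10, 100, 80, 112)
values = [
    Hit("405-555-0101", 90, 100, 170, 112),   # same row, just right of the key
    Hit("405-555-0102", 300, 100, 380, 112),  # same row, farther right
    Hit("405-555-0103", 90, 130, 170, 142),   # a different row
]

best = horizontal_pair(key, values)
print(best.text if best else None)
```

A vertical rule would follow the same shape with the axes swapped: require horizontal overlap and pick the smallest gap below the key.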
Key-Value Pair collation also has applications in unstructured document processing. Unstructured documents convey information in paragraphs and sentences more than they do with structured fields. Because of this, the value may not be horizontally or vertically aligned, but somewhere before or after a labeling key in the text flow. For these situations, the Flow Layout can be used, which will use the relationship between the key and the value in the text data's left-to-right and top-to-bottom text flow. A Key-Value Pair could be built to extract the driver name (highlighted here in yellow), using the phrase "driver's name" in the text flow before it.
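A flow-style pairing can be sketched by treating the page text as one string in reading order and taking the first value match that appears after the key. This is a simplified illustration under that assumption, not Grooper's Flow Layout logic. The function name `flow_pair`, the sample sentence, and the name pattern are hypothetical.

```python
import re
from typing import Optional


def flow_pair(text: str, key_pattern: str, value_pattern: str) -> Optional[str]:
    """Return the first value match that follows the key in the text flow."""
    key = re.search(key_pattern, text, flags=re.IGNORECASE)
    if key is None:
        return None  # no key, so there is no anchor to pair a value with
    # Start searching for the value where the key match ends.
    value = re.compile(value_pattern).search(text, key.end())
    return value.group(0) if value else None


text = ("The officer recorded that the driver's name is Jane Smith "
        "and the passenger was John Doe.")

# Naive "Firstname Lastname" pattern, good enough for the sample sentence.
print(flow_pair(text, r"driver's name", r"[A-Z][a-z]+ [A-Z][a-z]+"))
```

Both names in the sentence match the value pattern, but only the one that follows the key in the text flow is returned.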
How To
Create A Key-Value Pair Extractor
Create a Data Type
Data Type extractors use Collation Providers to combine, filter, or otherwise manipulate extraction results. Collation Providers are set using the Data Type's Collation property.
So, the very first thing to do is create a Data Type. Here, we are creating the Data Type in the Local Resources folder of a Content Model.
Create the Key Extractor
Key-Value Pair extractors must have exactly two extractors, a "Key Extractor" and a "Value Extractor". The "Value Extractor" finds the value the Key-Value Pair ultimately returns. The "Key Extractor" is how you find that value: its result is used as a positional anchor. Our goal with the document seen here is to differentiate the "home phone" numbers from the "cell phone" numbers. So, our key extractor simply needs to find the label "home phone".
Create the Value Extractor
For Key-Value Pair extractors, the "Value Extractor" is the extractor looking for the data you ultimately want to return. Here, we're looking for a home phone number. So, we simply need an extractor that finds phone numbers.
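As a rough illustration of what the two extractors return before any collation happens, the sketch below runs a key pattern and a value pattern over a made-up snippet of page text. The sample text, the phone number format, and the variable names are assumptions. Real extractor patterns depend on the document.

```python
import re

# Hypothetical page text; the real document will differ.
page = (
    "Home Phone: 405-555-0101   Cell Phone: 405-555-0199\n"
    "Home Phone: 405-555-0102   Cell Phone: 405-555-0198\n"
)

KEY = re.compile(r"home phone", re.IGNORECASE)  # finds the label
VALUE = re.compile(r"\d{3}-\d{3}-\d{4}")        # finds every phone number

keys = [m.group(0) for m in KEY.finditer(page)]
values = [m.group(0) for m in VALUE.finditer(page)]

print(len(keys), "key results:", keys)
print(len(values), "value results:", values)
```

The value pattern deliberately matches every phone number, home and cell alike. Narrowing those results down to the one next to each key is the job of the collation settings, not of the value extractor.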
Set the Collation Provider
Once you select Key-Value Pair, you will not see the results list change. It will still appear as if the two child extractors' results are being returned one by one (like the Individual Collation Provider). Some Collation Providers, such as Key-Value Pair, require some configuration before their results are collated. Specifically, you must choose which Layout method is used.
Configure the Layout Settings
The Layout method defines how the "Key Extractor" and "Value Extractor" results are spatially related to each other. Is the key above the value? Is it to the left of it? The right? Configuring these settings dictates where you expect to find the value in relation to the key.
The Layout may be:
- Horizontal
- Vertical
- Flow
Horizontal Layout
- View the Key-Value Pair configuration properties by expanding the Collation property.
- Select Horizontal Layout.
- Change the property from Disabled to Enabled.




