2023:List Match (Value Extractor): Difference between revisions
Configuring With Local Entries (Value Readers)
- In your Node Tree, right click and create a Value Reader (if you have not already created one).
- Select the Value Reader tab.
- Click the drop down list next to Extractor and select List Match.
- Save and select the Tester tab, making sure the Expressions sub-tab is selected.
- Under LOCAL ENTRIES, type the desired text to be extracted.
  - Hit Enter after each entry to extract multiple text segments under one List Match.
  - If needed, add a Prefix and Suffix Pattern to anchor your extraction. When using tabs as an anchor (\t), make sure Tab Marking is set to Enabled under Preprocessing in your Properties tab.
- Save and test your extraction.
Revision as of 17:13, 20 January 2023
WIP
This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly. This tag will be removed upon draft completion.
A List Match is an extractor type that can be used when configuring several data extraction tools such as a Value Reader or Data Type. It is designed to return values matching one or more items in a defined list. By default, the List Match extractor does not use or require regular expressions (regex).
About
The List Match is one of the simplest extractors used in Grooper. It is designed to return values matching one or more items in a defined list. This can be used to extract numbers, specific words, or full phrases contained within a document. A List Match extractor returns an exact match, including any spaces, numbers, punctuation, or special characters.
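The idea of an exact, literal match can be illustrated outside of Grooper with a few lines of Python. This is only a conceptual sketch, not Grooper's implementation: the function name `list_match` and the sample text are made up for illustration, and the comparison here is case-sensitive, which is an assumption of the sketch rather than a statement about Grooper's defaults.

```python
def list_match(text, entries):
    """Return (position, entry) for every literal occurrence of each entry.

    Hypothetical helper for illustration only. Entries are compared
    character for character, so spaces and punctuation must match too.
    """
    hits = []
    for entry in entries:
        start = text.find(entry)
        while start != -1:
            hits.append((start, entry))
            start = text.find(entry, start + 1)
    return sorted(hits)


text = "Invoice from Acme Corp. Total: 1,250.00 (Acme Corp.)"
for position, entry in list_match(text, ["Acme Corp.", "1,250.00"]):
    print(position, entry)
```

Because the entries are treated as plain text, the period in `Acme Corp.` matches only a period, and an entry with different spacing (for example two spaces between the words) would not match.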
To configure a List Match, you can input the desired extracted value as a Local Entry or reference a pre-configured Lexicon.
Unlike a Pattern Match, the List Match extractor does not use or require regular expressions by default, but regex can be enabled in the properties menu. As with a Pattern Match, Prefix and Suffix Patterns can be added to help anchor the expression and reduce the number of false positives extracted.
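To see why anchoring reduces false positives, here is a rough Python sketch of the same idea. It is not how Grooper is implemented: `anchored_list_match` is a hypothetical helper, the list entries are escaped with `re.escape` so they stay literal, and the prefix and suffix are assumed to be ordinary regular expressions whose text is not included in the returned value.

```python
import re


def anchored_list_match(text, entries, prefix="", suffix=""):
    """Return list entries found in text, optionally anchored by regex
    prefix/suffix patterns. Hypothetical helper for illustration only."""
    # Escape each entry so characters like "." are matched literally.
    # Longer entries go first so they win over shorter overlapping ones.
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(re.escape(e) for e in ordered)
    pattern = re.compile(f"(?:{prefix})({alternatives})(?:{suffix})")
    return [m.group(1) for m in pattern.finditer(text)]


text = "Ship To: Texas\nBill To: Texas\nTexas Instruments"

# Without anchors, every occurrence of a list entry is returned.
unanchored = anchored_list_match(text, ["Texas", "Ohio"])

# With a prefix, only the value following the "Ship To:" label is returned.
anchored = anchored_list_match(text, ["Texas", "Ohio"], prefix=r"Ship To: ")

print(len(unanchored), anchored)
```

In this sample the unanchored call returns three hits, including the company name on the last line, while the anchored call returns only the state that follows the label.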
How To
A List Match is most commonly used when configuring objects such as Value Readers or Data Types. It is great for extracting text information such as:
- Specific company names
- Field labels
- Headers and Footers
- Full phrases
- Exact numbers
If the information you need to extract follows a specific pattern, such as a date or social security number, then it may be better to consider a different extractor like a Pattern Match.
A List Match can be configured using a Local Entry or a Lexicon. Local Entries are simple and easy to set up, especially if you only need to add a few entries. If you plan to reuse a large list of entries across several different objects, it may be more efficient to set up a Lexicon first and reference it.
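The trade-off between the two options can be sketched in Python. The class below is purely illustrative and does not reflect Grooper's object model: `ListMatchConfig`, `vocabulary`, and the sample entries are assumptions made for this example, with field names borrowed from the Local Entries and Included Lexicons properties mentioned in this article.

```python
from dataclasses import dataclass, field


@dataclass
class ListMatchConfig:
    """Hypothetical stand-in for one List Match configuration."""
    local_entries: list = field(default_factory=list)
    included_lexicons: list = field(default_factory=list)

    def vocabulary(self):
        """Combine local entries with every referenced lexicon, skipping duplicates."""
        combined = []
        for source in [self.local_entries, *self.included_lexicons]:
            for entry in source:
                if entry not in combined:
                    combined.append(entry)
        return combined


# A shared list maintained in one place and referenced by several extractors.
state_lexicon = ["Texas", "Ohio"]

shipping = ListMatchConfig(local_entries=["N/A"], included_lexicons=[state_lexicon])
billing = ListMatchConfig(included_lexicons=[state_lexicon])

# Updating the shared list once is visible to every extractor that references it.
state_lexicon.append("Oklahoma")

print(shipping.vocabulary())
print(billing.vocabulary())
```

Entries typed locally belong to a single extractor, so they would have to be retyped for each object that needs them. A referenced list is edited once and picked up everywhere it is included.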
- In your Node Tree, right-click and create the desired object, such as a Data Type or Value Reader.
- Select the created object to bring up its configuration properties.
- In the Value Reader tab, click the drop-down list next to Extractor and select List Match.
- Save and select the Tester tab. Then make sure the Properties sub-tab is selected.
- Click the arrow next to Vocabulary to access additional properties.
- Click the ellipsis button next to the Included Lexicons property. This should open a new window where you can add pre-configured Lexicons.
- In the new window, click through the Projects and Folders until you find the desired Lexicon. Click the check boxes next to the desired Lexicons.
- Click OK to apply the Lexicon.
- Save and test your extraction.

