2023:List Match (Value Extractor): Difference between revisions
Revision as of 10:32, 13 January 2023
A List Match is an extractor type that can be used when configuring several data extraction tools such as a Value Reader or Data Type. It is designed to return values matching one or more items in a defined list. The List Match extractor does not use or require regular expressions (regex) by default.
About
The List Match extractor is one of the simplest extractors in Grooper. It returns values matching one or more items in a defined list, so it can extract anything from field labels on a form or a list of names to specific words and full phrases within a document. A List Match extractor returns an exact match, including any spaces, numbers, punctuation, or special characters.
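The exact-match behavior can be illustrated with a minimal Python sketch. This is not Grooper's implementation or API: the function name `list_match`, the sample text, and the case-sensitive comparison are all assumptions made for illustration. It only shows the idea of returning every literal occurrence of a list item, with spaces and punctuation included.

```python
def list_match(text, items):
    """Return (start_index, item) for every literal occurrence of a list item.

    Hypothetical helper for illustration only; case-sensitive by assumption.
    """
    hits = []
    for item in items:
        if not item:
            continue  # skip empty entries so the search always advances
        start = text.find(item)
        while start != -1:
            hits.append((start, item))
            start = text.find(item, start + 1)
    return sorted(hits)


text = "Invoice No.: 1001 Invoice Date: 01/13/2023"

# Items match only when every character, including punctuation, lines up.
labels = list_match(text, ["Invoice No.:", "Invoice Date:", "Total"])

# A near miss (missing the period) returns nothing.
near_miss = list_match(text, ["Invoice No:"])
```

Here `labels` holds the two field labels found in the text, while `near_miss` is empty because the list item differs from the document text by one punctuation character.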
Unlike a Pattern Match, the List Match extractor does not use or require regular expressions by default, but regex can be enabled in the properties menu. As with a Pattern Match, prefix and suffix patterns can be added to anchor the expression and limit the number of false positives extracted.
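How a prefix or suffix pattern cuts down false positives can be sketched in Python as follows. This is a conceptual sketch, not Grooper's implementation: the function `anchored_list_match`, its parameters, and the choice to treat the prefix and suffix as regex fragments wrapped around literally matched list items are assumptions.

```python
import re


def anchored_list_match(text, items, prefix="", suffix=""):
    """Return list items found in text, optionally anchored by prefix/suffix regex.

    Hypothetical helper for illustration only. List items are escaped so they
    match literally; only the prefix and suffix are interpreted as regex.
    """
    # Longer items first, so a longer phrase wins over a shorter item it contains.
    literals = sorted((i for i in items if i), key=len, reverse=True)
    if not literals:
        return []
    alternatives = "|".join(re.escape(i) for i in literals)
    pattern = re.compile(f"(?:{prefix})({alternatives})(?:{suffix})")
    # Only the list item itself (group 1) is returned, not the anchors.
    return [m.group(1) for m in pattern.finditer(text)]


text = "Ship to: Texas. The Texas office closed."

# Without an anchor, both mentions of "Texas" are returned.
unanchored = anchored_list_match(text, ["Texas", "Ohio"])

# With a prefix pattern, only the value following the label is returned.
anchored = anchored_list_match(text, ["Texas", "Ohio"], prefix=r"Ship to:\s*")
```

In this sketch the prefix removes the second, unwanted mention of "Texas", which is the kind of false positive an anchor is meant to exclude.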