2023:List Match (Value Extractor)
Revision as of 17:13, 12 January 2023
A List Match is a type of extractor that can be used when configuring several data extraction tools such as a value reader or data type. This extractor allows users to return values from a document by simply typing the desired text in a list format. The List Match extractor does not use or require regular expressions (regex) by default.
About
The List Match extractor is one of the simplest extractors used in Grooper. It is designed to return values matching one or more items in a defined list. This can be used to extract anything from field labels on a form or a list of names, to specific words and full phrases contained within a document. A List Match extractor returns an exact match, including any spaces, numbers, punctuation, or special characters.
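To illustrate what "exact match" means here, the following is a minimal Python sketch of list-based matching. It is not Grooper's implementation: the function name `list_match`, the sample text, and the case-sensitive behavior are all assumptions made for the example. Each list item is escaped so that punctuation such as a period is treated literally rather than as a regular expression.

```python
import re

def list_match(text, items):
    """Return (start, value) for every literal occurrence of any list item.

    Hypothetical helper for illustration only; not part of Grooper.
    """
    # Try longer items first so a longer item is preferred over a shorter
    # item that happens to be its prefix.
    alternatives = sorted(items, key=len, reverse=True)
    # re.escape makes every character literal, including "." and spaces.
    pattern = "|".join(re.escape(item) for item in alternatives)
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]

text = "Invoice No.: 1001  Invoice Number: 1002"
hits = list_match(text, ["Invoice No.", "Invoice Number"])
print(hits)
```

In this sketch, the item `Invoice No.` matches only where the period is present, so text reading `Invoice No 5` would return nothing for that item.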
Unlike a Pattern Match, the List Match extractor does not use or require regular expressions by default, but regex can be enabled in the properties menu. As with a Pattern Match, prefix and suffix patterns can be added to help anchor the expression and limit the number of false positives extracted.
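The effect of a prefix pattern on false positives can be sketched in Python as follows. This is an illustration under stated assumptions, not Grooper's configuration or API: `anchored_list_match`, its `prefix` and `suffix` parameters, and the sample text are hypothetical. The list items stay literal, while the prefix and suffix are ordinary regular expressions that must surround a hit but are not returned with it.

```python
import re

def anchored_list_match(text, items, prefix="", suffix=""):
    """Return list items found in text, optionally anchored by regex patterns.

    Hypothetical helper for illustration only; not part of Grooper.
    """
    alternatives = "|".join(
        re.escape(item) for item in sorted(items, key=len, reverse=True)
    )
    # Only the middle group (the list item) is captured and returned.
    pattern = f"(?:{prefix})({alternatives})(?:{suffix})"
    return [m.group(1) for m in re.finditer(pattern, text)]

text = "State: Texas. The Texas office ships to Ohio."
plain = anchored_list_match(text, ["Texas", "Ohio"])
anchored = anchored_list_match(text, ["Texas", "Ohio"], prefix=r"State:\s")
print(plain)
print(anchored)
```

Without the prefix, every occurrence of a list item is returned. With the prefix, only the value that follows the `State:` label is returned.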
A List Match can reference Lexicons to aid in extraction. This can save the user time and energy when extracting the same information across multiple data extraction tools. Instead of typing out the full desired list for each value reader or data type, one Lexicon can be configured and referenced multiple times across multiple extractors.
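The reuse described above can be sketched in Python as one shared list referenced by several extractors. This is only an analogy for how a Lexicon is shared: `STATE_LEXICON`, `make_extractor`, and the sample labels are hypothetical names and do not correspond to Grooper objects or settings.

```python
import re

# Hypothetical shared word list standing in for a Lexicon.
STATE_LEXICON = ["Texas", "Ohio", "Oklahoma"]

def make_extractor(lexicon, prefix=""):
    """Build an extractor that looks up the shared list each time it runs."""
    def extract(text):
        alternatives = "|".join(
            re.escape(entry) for entry in sorted(lexicon, key=len, reverse=True)
        )
        return [m.group(1) for m in re.finditer(f"(?:{prefix})({alternatives})", text)]
    return extract

# Two extractors reference the same list instead of each retyping it.
ship_to = make_extractor(STATE_LEXICON, prefix=r"Ship to:\s")
bill_to = make_extractor(STATE_LEXICON, prefix=r"Bill to:\s")

# A single edit to the shared list is picked up by both extractors.
STATE_LEXICON.append("Kansas")

text = "Ship to: Kansas  Bill to: Ohio"
print(ship_to(text), bill_to(text))
```

Because both extractors read from the same list, the entry added once is available to each of them without further changes.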