Lexicon Lookup (Lookup Specification)

From Grooper Wiki
Revision as of 13:55, 27 October 2025 by Randallkinard

This article is about the current version of Grooper (2025). Note that some content may still need to be updated.

Lexicon Lookup is a Lookup Specification that performs a lookup against a dictionary Lexicon.

You may download the ZIP files listed below and upload them into your own Grooper environment (version 2025). The first contains a Batch with a sample document. The second contains a Project with resources used in examples throughout this article.

  • 2025_Wiki_Lexicon-Lookup_Batch.zip
  • 2025_Wiki_Lexicon-Lookup_Project.zip

Introduction

A Lexicon Lookup in Grooper is a type of Lookup that validates or populates data by referencing a list of values stored in a Lexicon node. Unlike other Lookup types (such as Database Lookup, CMIS Lookup, Web Service Lookup, or XML Lookup), Lexicon Lookup is designed for fast, local lookups against a static or managed list, rather than querying external databases or services. This makes it ideal for scenarios where you need to cross-reference extracted data against a controlled vocabulary, code list, or set of valid values.

Lexicon Lookup fits into the broader Lookup framework as a specialized option for matching field values to entries in a Lexicon, supporting both exact and fuzzy matching.

When to use

Use Lexicon Lookup when:

  1. You need to validate extracted data against a known list of values (e.g., codes, names, abbreviations).
  2. You want to auto-populate additional fields based on a matched entry in the Lexicon.
  3. The list of valid values is static, managed within Grooper, or does not require a live connection to an external system.

Ideal use cases:

  • Validating codes, abbreviations, or names.
  • Populating descriptions or related fields from a code list.
  • Enforcing data integrity for fields with a finite set of valid values.

Prerequisites:

How to Add and Configure a Lexicon for Lookup Use

  1. Right-click a Project. From the pop-out menu choose "Add", then "Lexicon". You can also add the Lexicon to a Folder within the Project or to the Local Resources of a Content Model.
  2. Name the Lexicon in the "Add" window.
  3. Add a list of key-value pairs to the Lexicon, with each key separated from its values by an equals sign (e.g., 12345=City,State,County). The text to the left of the equals sign is the key, which is the value that triggers the lookup. The delimited list to the right of the equals sign holds the values that populate the target fields, where each delimited value represents a column.
  4. Save the changes made to the Lexicon.
  5. Select the parent node of the new Lexicon, then click the "Refresh Node Tree" button. Refreshing prevents caching issues with Lexicon objects and must be repeated any time the Lexicon is changed.
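
To make the entry format concrete, the following Python sketch parses Lexicon-style text into keys and columns. It is an illustration only and is not Grooper code: the function name parse_lexicon, the one-entry-per-line layout, and the sample entries are assumptions made for this example. Only the key=value1,value2 shape comes from the steps above.

```python
def parse_lexicon(text, column_delimiter=","):
    """Parse 'key=value1,value2' lines into {key: [value1, value2]}.

    Assumption for this sketch: one entry per line.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, values = line.partition("=")
        if not sep:
            # Key with no values: usable for validation, but has no columns.
            entries[key] = []
            continue
        entries[key.strip()] = [v.strip() for v in values.split(column_delimiter)]
    return entries


# Hypothetical sample data: ZIP code = City, State, County
SAMPLE = """\
73102=Oklahoma City,OK,Oklahoma
75201=Dallas,TX,Dallas
"""

lexicon = parse_lexicon(SAMPLE)
```

Here the key 73102 would trigger the lookup, and the three values to its right are the columns available to populate target fields.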

How to Add and Configure a Lexicon Lookup

  1. Select a container element, then click the ellipsis button for the Lookups property.
  2. Click the add button in the "Lookups" collection editor, then choose "Lexicon Lookup" from the drop-down menu.
  3. Expand the Lexicon sub-properties by clicking the drop-down arrow to the left of the property. Click the ellipsis button for the Included Lexicons property.
  4. Select the Lexicon configured for lookups in the "Included Lexicons" window.
  5. Click the drop-down for the Lookup Field property, then select the Data Element that will trigger the lookup from the drop-down menu.
  6. Click the ellipsis button for the Field Mappings property.
  7. Click the "Add" button in the "Field Mappings" window, then click the drop-down for the Target Field property.
  8. Select a target field from the drop-down menu.
  9. Set the Column Number property to the position, within the Lexicon entry's delimited list of values, of the value that should populate the target field.
  10. Repeat this process for the remaining Field Mappings.
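
The relationship between Field Mappings and columns can be sketched in Python as follows. This is a conceptual model, not Grooper's implementation: the FieldMapping class and apply_mappings function are hypothetical names, and the sketch assumes zero-based column numbers purely for convenience. This article does not state whether Grooper's Column Number starts at 0 or 1, so verify that in your own environment.

```python
from dataclasses import dataclass


@dataclass
class FieldMapping:
    target_field: str    # name of the Data Field to populate
    column_number: int   # ASSUMED zero-based in this sketch


def apply_mappings(columns, mappings):
    """Return {target_field: value} for every mapping whose column exists."""
    populated = {}
    for mapping in mappings:
        if 0 <= mapping.column_number < len(columns):
            populated[mapping.target_field] = columns[mapping.column_number]
    return populated


# Columns from a hypothetical Lexicon entry: 73102=Oklahoma City,OK,Oklahoma
columns = ["Oklahoma City", "OK", "Oklahoma"]
mappings = [
    FieldMapping("City", 0),
    FieldMapping("State", 1),
    FieldMapping("County", 2),
]
populated = apply_mappings(columns, mappings)
```

Each mapping picks exactly one column, which is why the steps above are repeated once per target field.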

How to Test a Lexicon Lookup

  1. Select a container element with a configured Lexicon Lookup, then click the Tester tab.
  2. Click the "Select Batch" button in the Batch Viewer, then select an appropriate Batch.
  3. Select a Batch Folder in the Batch Viewer, then click the "Test Extraction" button.
  4. Populate the Lookup Field with appropriate data, then tab out of the field. The Target Fields are then populated with the corresponding values from the delimited list of the matching Lexicon entry.

How fields are populated

Lexicon Lookup uses the "Lookup Field" property to determine which Data Field's value to match against the Lexicon. When a match is found:

  • The "Field Mappings" property defines how columns from the Lexicon entry are mapped to target Data Fields.
  • If the Lexicon entry contains multiple columns, each can be mapped to a different field.
  • If multiple matches are found, Grooper handles them according to the lookup's Conflict Disposition and Field Population settings.
  • Data types are interpreted based on the target field's type; values are converted as needed.
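
The type conversion step can be pictured with a short Python sketch. Lexicon values are text, so each one has to be converted to whatever type the target field expects. The type names, the CONVERTERS table, and the populate function below are all hypothetical and chosen for illustration; Grooper's actual conversion rules are not described in this article.

```python
from datetime import date
from decimal import Decimal

# Hypothetical field type names mapped to Python converters.
CONVERTERS = {
    "string": str,
    "integer": int,
    "decimal": Decimal,
    "date": date.fromisoformat,  # expects YYYY-MM-DD
}


def convert_value(raw, field_type):
    """Convert one raw Lexicon string to the target field's type."""
    return CONVERTERS[field_type](raw.strip())


def populate(row, field_types):
    """row: {target_field: raw string}; field_types: {target_field: type name}.

    Fields without a declared type are treated as strings.
    """
    return {
        name: convert_value(raw, field_types.get(name, "string"))
        for name, raw in row.items()
    }


row = {"Code": "A12", "Quantity": " 42 ", "Rate": "0.075", "Effective": "2025-01-01"}
types = {"Quantity": "integer", "Rate": "decimal", "Effective": "date"}
typed = populate(row, types)
```

In this sketch a value that cannot be converted (for example, "abc" mapped to an integer field) raises an error, which is one reason to keep Lexicon columns consistent with the types of the fields they feed.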

Properties overview

Below is a comprehensive list of Lexicon Lookup properties, including their definitions, remarks, and typical use cases.

General

  • Lexicon
    • Specifies the Lexicon data to be used for the lookup operation.
    • The Lexicon should contain keys which match potential values for the Lookup Field. If the value of the Lookup Field is not present in the Lexicon, then the lookup operation will fail, and the Miss Disposition will be triggered.
  • Lookup Field
    • The field to be used as the lookup key.
    • Specifies the Data Field whose value will be used as the key for the lookup operation.
      • The value of this field is searched in the Lexicon to determine if a match exists.
      • For validation, the lookup succeeds if the value is present in the Lexicon.
      • For field population, the value is used to retrieve associated data (such as a mapped value or columns).
  • Field Mappings
    • Defines one or more mappings which populate fields with values from the Lexicon.
    • Specifies how values from the Lexicon are mapped to Data Fields when a lookup is successful.
      • Each mapping links a target field to a value or column from the Lexicon entry.
      • For single-value lookups, map the value directly to a field.
      • For multi-column lookups, use the 'Column Number' property in each mapping to specify which column to use.
      • If no mappings are defined, the lookup is used for validation only.
  • Column Delimiter
    • When the lexicon contains multi-column values, specifies the separator character used between columns.
    • Defines the character used to separate multiple values within a single Lexicon entry.
      • Required when the Lexicon contains entries with more than one value per key (e.g., 12345=City,State,County).
      • Each value is split using this delimiter, and mapped to fields via 'Field Mappings'.
  • Similarity
    • If set to a value less than 100%, enables fuzzy matching.
    • Controls the similarity threshold for fuzzy matching during lookups.
      • A value of 1.0 (100%) requires an exact match.
      • Lower values (e.g., 0.9 or 0.8) allow approximate matches, useful for handling typos or OCR errors.
      • Fuzzy matching is especially helpful when user-entered data may not exactly match the Lexicon entries.
  • Description
    • An optional description for this lookup operation.
    • Use this property to provide a human-readable explanation of the lookup's purpose, configuration, or any special instructions. This description is displayed in the property grid and can help other users understand the intent and behavior of the lookup.
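
To illustrate how a Similarity threshold changes matching behavior, the Python sketch below scores a field value against every Lexicon key and accepts the best key only if its score meets the threshold. Grooper's own fuzzy matching algorithm is not described in this article, so the sketch substitutes the ratio from Python's standard difflib.SequenceMatcher as a stand-in. The function name best_match and the sample keys are assumptions.

```python
from difflib import SequenceMatcher


def best_match(value, keys, similarity=1.0):
    """Return the key most similar to value, or None if below the threshold.

    similarity=1.0 accepts exact matches only; lower values tolerate
    small differences such as typos or OCR errors.
    """
    best_key, best_score = None, 0.0
    for key in keys:
        score = SequenceMatcher(None, value, key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    return best_key if best_score >= similarity else None


keys = ["Oklahoma", "Texas", "Ohio"]

exact = best_match("Texas", keys, similarity=1.0)       # exact match required
loose = best_match("Oklahorna", keys, similarity=0.8)   # OCR read "m" as "rn"
strict = best_match("Oklahorna", keys, similarity=0.9)  # same input, tighter threshold
```

With these inputs the misread value is accepted at a threshold of 0.8 but rejected at 0.9, which mirrors the trade-off described above: lower thresholds recover more damaged values but also raise the risk of matching the wrong entry.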

Lookup Options

  • Trigger Mode
    • Controls when the lookup is executed.
    • Determines the conditions under which a lookup operation is performed in Grooper. This setting allows you to control whether lookups are executed automatically, only under certain conditions, manually, or based on a custom expression. Choosing the appropriate trigger mode is essential for balancing automation, user control, and data integrity in your solution.
  • Miss Disposition
    • Specifies what happens if the lookup returns no results.
    • Determines how Grooper responds when a lookup operation does not return any matching records from the data source (for a Lexicon Lookup, the Lexicon). This setting controls whether errors are flagged, target fields are cleared, or no action is taken.
  • Conflict Disposition
    • Specifies what happens if the lookup returns multiple results.
    • Controls how Grooper handles situations where a lookup operation returns more than one matching record from the data source (for a Lexicon Lookup, the Lexicon). This setting determines whether errors are flagged, data is cleared, the first result is accepted, or no action is taken.
  • Field Population
    • Specifies how target fields are populated with the lookup results.
    • Controls whether and how the results of a lookup operation are written to the target fields in Grooper. This setting determines if lookup results will overwrite existing values, supplement only empty fields, or leave all fields unchanged.
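
The interaction of Miss Disposition, Conflict Disposition, and Field Population can be sketched as a small decision procedure in Python. This is a conceptual model only. The option names used here ("flag", "clear", "ignore", "first", "overwrite", "fill_empty", "none") are hypothetical labels for the behaviors described above and are not the actual property values in Grooper.

```python
def apply_lookup(fields, target_fields, matches,
                 miss="flag", conflict="flag", population="overwrite"):
    """Return (new_fields, error). All option names are hypothetical.

    fields:        current field values, {name: value}
    target_fields: names of the fields the lookup may populate
    matches:       list of {name: value} rows returned by the lookup
    """
    result = dict(fields)  # never mutate the caller's data

    def clear_targets():
        for name in target_fields:
            result[name] = ""

    if not matches:  # Miss Disposition
        if miss == "flag":
            return result, "no match found"
        if miss == "clear":
            clear_targets()
        return result, None

    if len(matches) > 1:  # Conflict Disposition
        if conflict == "flag":
            return result, "multiple matches found"
        if conflict == "clear":
            clear_targets()
            return result, None
        if conflict == "ignore":
            return result, None
        # conflict == "first": continue with the first row

    row = matches[0]
    for name in target_fields:  # Field Population
        if name not in row:
            continue
        if population == "overwrite":
            result[name] = row[name]
        elif population == "fill_empty" and not result.get(name):
            result[name] = row[name]
        # population == "none": leave the field unchanged
    return result, None


fields = {"Zip": "73102", "City": "OKC", "State": ""}
targets = ["City", "State"]
row = {"City": "Oklahoma City", "State": "OK"}

overwritten, _ = apply_lookup(fields, targets, [row])
filled, _ = apply_lookup(fields, targets, [row], population="fill_empty")
```

In the usage example, the overwrite behavior replaces the existing City value, while the fill-empty behavior keeps "OKC" and populates only the blank State field.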

Lookup Info

  • Lookup Fields
    • The list of fields and variables used as lookup criteria.
    • This property displays a comma-separated list of all Data Fields and Variable Definitions that are used as input for the lookup query. These are the values that will be matched against the data source (for a Lexicon Lookup, the Lexicon) to perform the lookup.
  • Target Fields
    • The list of fields that will be populated by the lookup operation.
    • This property displays a comma-separated list of all Data Fields that will be set or validated by the results of the lookup. These are the target fields that will receive values from the data source (for a Lexicon Lookup, the Lexicon).

Summary

Lexicon Lookup is a powerful tool for validating and populating data in Grooper using managed lists of values. By configuring a Lexicon node and mapping fields appropriately, users can ensure data integrity, automate field population, and streamline document processing workflows. For best results, carefully curate your Lexicon contents, test your Lookup configuration, and review diagnostic output to troubleshoot any issues.