XML Lookup (Lookup Specification)

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

XML Lookup is a Lookup Specification that performs a lookup against an XML file stored as a Resource File in the Project. XML Lookups use XPath expressions to select XML nodes and map XML attributes or an XML element's text to Grooper fields.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2025). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

About

XML Lookup was designed to be a middle ground between Grooper's Lexicon Lookup and Database Lookup, each of which has its own pros and cons.

Lexicon Lookup

  • Pro: Portability - Lexicon Lookups use a Grooper Lexicon to perform a lookup operation. Lexicons can be easily shared between Grooper users (or since a Lexicon is essentially just a text-based list, their text can be easily copied and pasted).
  • Pro: Handle simple data relationships well - Lexicon Lookups are the simplest lookup type. They are essentially key-value lists where the lookup field is the key and the target fields are parsed from comma separated values.
  • Con: Handle complex data relationships poorly - Because they are so simple, they cannot begin to express complex data structures like a relational database can.

Database Lookup

  • Pro: Handle both simple and complex data relationships well - Databases are effective at describing simple data relationships, such as simple key-value pairs, but excel at organizing complex data structures.
  • Con: Portability - Databases require at least some hardware and software infrastructure to support them. Sharing them from one environment to another is not always easy (certainly not as easy as passing a file around).


Thus, the XML Lookup was born. XML can describe more complex data relationships than a Grooper Lexicon through XML node hierarchy and attributes. But, an XML file is just as portable as a Lexicon (if not more so). If you have a fairly complex (but also static) data structure you want to use to perform a lookup, consider using XML Lookup.

The general setup

  1. Import the XML file into a Grooper Project by dragging it onto the Project (or a folder in the Project). This will create a Resource File for the XML file.
  2. Add an XML Lookup to the Data Model (or a Data Section or Data Table if appropriate).
  3. Point to the XML file by configuring the XML Lookup's "Source" property.
  4. Configure the XML Lookup's "Record Selector".
    • This uses an XPath expression to select the XML nodes (records) you want to retrieve.
    • Use the % character to insert field variables in your XPath expression (e.g. %GrooperFieldName).
    • These are the "lookup fields" for XML Lookup. Values from these Grooper fields are inserted into the XPath expression at runtime.
    • Example: /Root/Record[xmlElement='%GrooperFieldName']
  5. Configure the XML Lookup's "Value Selectors".
    • This is how you map data in the XML file to Grooper fields (XML Lookup's "target fields").
    • One or more Value Selector may be added.
    • Each Value Selector specifies an XPath expression and maps the result to a Grooper field.
      • The Value Selector XPath is relative to the record node returned by the Record Selector. The path should start at the root of the record, not the root of the XML itself.
      • If the Record Selector itself selects the XML element you want to map to a target field, there will be no "child element" to select. Instead, enter the dot expression (.) to select it.
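The variable substitution described in step 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not Grooper's implementation: the function name `fill_record_selector` and the sample value "ABC-123" are hypothetical, and how Grooper treats values that contain quote characters is not covered here.

```python
import re


def fill_record_selector(template: str, fields: dict) -> str:
    """Replace each %FieldName token in an XPath template with a field value.

    Hypothetical helper for illustration only.
    """
    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Leave unknown variables untouched so the gap stays visible.
        return fields.get(name, match.group(0))

    return re.sub(r"%([A-Za-z_][A-Za-z0-9_]*)", replace, template)


xpath = fill_record_selector(
    "/Root/Record[xmlElement='%GrooperFieldName']",
    {"GrooperFieldName": "ABC-123"},
)
print(xpath)  # /Root/Record[xmlElement='ABC-123']
```

The resulting string is what gets evaluated against the XML file, so the quotes around the variable in the template matter: they make the inserted value a string literal in the XPath predicate.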

The general execution

XML Lookup (like all lookups) executes when a document's data is collected by the Extract activity. The process goes like this:

  1. The Data Model executes its Data Field/Data Section/Data Table extractors.
  2. XML Lookup loads the "Source" XML file.
  3. The Record Selector XPath is evaluated (with lookup variables replaced by field values). The lookup will either:
    • Hit - If the XPath selects exactly one record node, this results in a successful lookup.
    • Miss - If your XPath does not return any nodes, no data will be populated and the "Miss Disposition" will determine what happens next.
    • Conflict - If multiple record nodes are returned, the "Conflict Disposition" will determine how multiple results are handled.
  4. For successful lookups, the Value Selectors evaluate their XPath relative to the record node and populate the corresponding Grooper fields.
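The three outcomes in step 3 come down to counting the nodes the Record Selector returns. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in XPath engine (an assumption; it is not what Grooper uses internally), with a trimmed copy of a bookstore file and a hypothetical `classify_lookup` function. ElementTree only accepts relative paths on an element, so `//book` is written as `.//book`.

```python
import xml.etree.ElementTree as ET

# Trimmed sample data: three books, all in the same category.
BOOKSTORE_XML = """
<bookstore>
  <book category="Fiction" isbn13="9780307796844"><title>The Complete Stories of Robert Louis Stevenson</title></book>
  <book category="Fiction" isbn13="9781409086284"><title>A Study In Scarlet</title></book>
  <book category="Fiction" isbn13="9789363185579"><title>The Murder on the Links</title></book>
</bookstore>
"""


def classify_lookup(root, record_xpath):
    """Return "Hit", "Miss" or "Conflict" based on how many records match."""
    records = root.findall(record_xpath)
    if len(records) == 1:
        return "Hit"
    if not records:
        return "Miss"
    return "Conflict"


root = ET.fromstring(BOOKSTORE_XML.strip())
print(classify_lookup(root, ".//book[@isbn13='9781409086284']"))  # Hit
print(classify_lookup(root, ".//book[@isbn13='0000000000000']"))  # Miss
print(classify_lookup(root, ".//book[@category='Fiction']"))      # Conflict
```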

Example Source, Record Selector and Value Selector

FYI

New to XPath? Check out w3schools XPath Tutorial for a primer on XPath.

Need to test an XPath expression? There are several XPath testers online. Just copy the XML and paste it into the tester and enter the XPath expression you want to test. These are some popular XPath testers:

Example Source

The XML data below describes data you might find in a bookstore.

  • There are three <book> nodes in this XML. Each node, its attributes and its child XML elements describe a book sold by the bookstore.
  • Each <book> node has a collection of child XML elements. Each XML element contains field data related to the book:
    • <title> and its lang attribute
    • <author>
    • <listPrice>
  • The <book> nodes also have additional data stored as attributes (notably isbn13, which will be used to look up each book in this example).
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="Fiction" isbn13="9780307796844">
    <title lang="en">The Complete Stories of Robert Louis Stevenson</title>
    <author>Robert Louis Stevenson</author>
    <listPrice>13.99</listPrice>
  </book>
  <book category="Fiction" isbn13="9781409086284">
    <title lang="en">A Study In Scarlet</title>
    <author>Arthur Conan Doyle</author>
    <listPrice>6.99</listPrice>
  </book>
  <book category="Fiction" isbn13="9789363185579">
    <title lang="en">The Murder on the Links</title>
    <author>Agatha Christie</author>
    <listPrice>0.23</listPrice>
  </book>
</bookstore>

Example Record Selector

This Record Selector would select a <book> node based on an isbn13 value using a Grooper Data Field named "ISBN": //book[@isbn13='%ISBN']

%ISBN is the lookup variable for our lookup field. If Grooper extracted "9780307796844" for the "ISBN" Data Field, the following node would be selected:

<book category="Fiction" isbn13="9780307796844">
  <title lang="en">The Complete Stories of Robert Louis Stevenson</title>
  <author>Robert Louis Stevenson</author>
  <listPrice>13.99</listPrice>
</book>

Example Value Selectors

Now that we have a record node selected, we can map data to Grooper fields using Value Selectors.


Configuring a Value Selector is done by configuring a Path (an XPath expression to the desired data in the XML file) and a Target Field (the Data Field you want to map that data to).

Imagine we have five target Data Fields. We would add one Value Selector for each:

  • Title
  • Author
  • List Price
  • Category
  • Language

Each Value Selector's XPath expression should select an attribute (i.e. @attributeName) or a child element (i.e. childElementName). If an attribute is selected, its value is returned. If a child element is selected, the element's inner text is returned (i.e. <childElementName>This is inner text</childElementName>). These values are mapped to the Value Selector's Target Field.

The corresponding Value Selectors for the Data Fields in this example would be:

  • title
  • author
  • listPrice
  • @category
    • category is an attribute of the <book> element itself. Attributes are selected using the @ symbol.
  • title/@lang
    • Notice the <title> element's lang attribute was selected by first selecting title then pathing to its attribute with the @ symbol.

FYI

If the Record Selector itself selects the XML element you want to map to a target field, there will be no “child element” to select. Instead, enter the dot expression (.) to select it.
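To make the relative-path behavior concrete, here is a minimal Python sketch that evaluates the five example Value Selectors against the selected <book> record. It is an illustration under stated assumptions: `select_value` is a hypothetical helper that handles only the path shapes used in this article (".", "child", "@attr" and "child/@attr"), and Python's `xml.etree.ElementTree` stands in for a full XPath engine.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """
<bookstore>
  <book category="Fiction" isbn13="9780307796844">
    <title lang="en">The Complete Stories of Robert Louis Stevenson</title>
    <author>Robert Louis Stevenson</author>
    <listPrice>13.99</listPrice>
  </book>
</bookstore>
"""


def select_value(record, path):
    """Evaluate a simple Value Selector path relative to the record node."""
    # Split "child/@attr" into the element part and the attribute name.
    element_path, _, attribute = path.partition("@")
    element_path = element_path.rstrip("/")
    # An empty element part or "." means the record node itself.
    node = record if element_path in ("", ".") else record.find(element_path)
    if node is None:
        return None
    if attribute:
        return node.get(attribute)
    # For elements, return the inner text.
    return "".join(node.itertext()).strip()


# Target Field name -> Value Selector path, as listed in this example.
selectors = {
    "Title": "title",
    "Author": "author",
    "List Price": "listPrice",
    "Category": "@category",
    "Language": "title/@lang",
}

root = ET.fromstring(BOOK_XML.strip())
record = root.find(".//book[@isbn13='9780307796844']")
fields = {target: select_value(record, path) for target, path in selectors.items()}
for target, value in fields.items():
    print(f"{target}: {value}")
```

Every path is resolved starting from the <book> record, which is why `title` works without a leading `/bookstore/book/`.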

Examples

Use XML Lookup to populate target fields

Using a Lookup Specification to populate fields is the most common reason people add lookups to a Data Model.

In this example, Grooper will extract an ISBN 13 barcode and use this to look up information in an XML file using XML Lookup.

The process follows this order:

  1. Grooper executes the "lookup field's" extractor.
  2. XML Lookup runs its "Record Selector" to locate one or more XML nodes.
  3. XML Lookup runs its "Value Selectors" against the record and maps the selected attribute or element's text to a Grooper Data Field.

1. Grooper extracts the lookup field. In this case, it is the "ISBN Lookup" Data Field in this Data Model.

2. XML Lookup's "Record Selector" selects an XML node. In this case, the selector selects a <book> node whose isbn13 attribute matches the value Grooper extracted.

3. XML Lookup's "Value Selectors" map XML values to Data Fields. Data from the <book> node is mapped to corresponding Data Fields, like "Book Title" and "Author Name" in this case.


To add an XML Lookup used to populate target fields, do the following:

  1. Drag the XML file to the Project (or a folder in the Project).
    • If you want to follow along with this tutorial, download and add this file to the Project:
  2. Select the Data Element you want to add XML Lookup to.
    • Commonly, lookups are added to the Data Model. However, lookups may be added to Data Sections and Data Tables as well.
      • If added to the Data Model, XML Lookup will execute once for the document instance.
      • If added to a Data Section, XML Lookup will execute for each section instance.
      • If added to a Data Table, XML Lookup will execute for each row instance.
  3. Open the "Lookups" editor (Press the "..." button).
  4. Press the "Add" button and select "XML Lookup".
  5. Using the "Source" button, select the XML file you added to the Project in Step 1.
  6. Open the "Record Selector" editor (Press the "..." button).
  7. Enter an XPath expression to select an XML node from the source XML file.
    • The Record Selector is crucial to XML Lookup. The lookup will fail if no record is selected.
    • Use field variables (%FieldName) to insert "lookup fields".
    • The lookup field(s) value will be inserted into the XPath when XML Lookup runs.
    • Example: //book[@isbn13='%ISBN_Lookup']
  8. Open the "Value Selectors" editor (Press the "..." button).
  9. Press the "Add" button to add a new Value Selector.
  10. In the "Path" editor, enter an XML selector that selects an attribute or child XML element whose value you want to collect. When XML Lookup runs, the Value Selector's "target field" will be populated with the selected attribute or element's text.
  11. Open the "Target Field" dropdown and select the corresponding Data Field in the Data Model. When XML Lookup runs, the Data Field you select will be populated with the attribute or element's text selected by the Value Selector's Path.
  12. Add as many Value Selectors as needed.
  13. Press "OK" when finished to exit the Value Selectors editor.
  14. When finished configuring the XML Lookup, press "OK" to exit the Lookups editor.
  15. Save changes to the Data Element.

Use XML Lookup to validate lookup fields

Using a Lookup Specification to validate data is the second most common application for lookups in Grooper. Validation lookups are used to ensure data collected from a document matches a value stored in an external data source. Only lookup fields are used in these kinds of lookups.

In this example, Grooper will extract an ISBN 13 barcode and a price listed on a page. It will use XML Lookup to verify there is a corresponding XML record that has both the ISBN 13 and the price. If not, Grooper will flag the lookup fields.

  • Lookups fail whenever they do not produce a hit. When a lookup fails, all lookup fields are flagged. It's important to note Grooper doesn't "know" why the lookup failed. It simply flags the lookup fields if the Record Selector's XPath expression did not return a match.


To add an XML Lookup used to validate lookup fields, do the following:

  1. Drag the XML file to the Project (or a folder in the Project).
  2. Select the Data Element you want to add XML Lookup to.
    • Commonly, lookups are added to the Data Model. However, lookups may be added to Data Sections and Data Tables as well.
      • If added to the Data Model, XML Lookup will execute once for the document instance.
      • If added to a Data Section, XML Lookup will execute for each section instance.
      • If added to a Data Table, XML Lookup will execute for each row instance.
  3. Open the "Lookups" editor (Press the "..." button).
  4. Press the "Add" button and select "XML Lookup".
  5. Using the "Source" button, select the XML file you added to the Project in Step 1.
  6. Open the "Record Selector" editor (Press the "..." button).
  7. Enter an XPath expression to select an XML node from the source XML file.
    • Use field variables (%FieldName) to insert "lookup fields".
    • The lookup field(s) value will be inserted into the XPath when XML Lookup runs.
    • Example: //book[@isbn13='%ISBN_Lookup' and listPrice='%List_Price_Validation']
    • For validation lookups, no data is populated into target fields. Instead, you are only alerted if the lookup fails. All lookup fields will be flagged if the lookup fails.
  8. Set the "Field Population" property to "None".
    • Because we did not configure Value Selectors for this XML Lookup, this is not strictly necessary. But it is considered best practice to always set "Field Population" to "None" if a lookup is used for validation only.
  9. When finished configuring the XML Lookup, press "OK" to exit the Lookups editor.
  10. Save changes to the Data Element.
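A validation lookup with two lookup fields can be sketched as follows. This is a hedged illustration, not Grooper's code: `is_valid` is a hypothetical function, and Python's `xml.etree.ElementTree` is used as a stand-in XPath engine. The key point is that both conditions are tested against the same <book> node. ElementTree's XPath subset has no `and` operator, so the sketch chains two predicates, which has the same effect here.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """
<bookstore>
  <book category="Fiction" isbn13="9780307796844">
    <title lang="en">The Complete Stories of Robert Louis Stevenson</title>
    <listPrice>13.99</listPrice>
  </book>
  <book category="Fiction" isbn13="9781409086284">
    <title lang="en">A Study In Scarlet</title>
    <listPrice>6.99</listPrice>
  </book>
</bookstore>
"""


def is_valid(root, isbn, list_price):
    """Return True only when exactly one <book> has both the ISBN and the price."""
    # Chained predicates: each one filters the same <book> node.
    xpath = f".//book[@isbn13='{isbn}'][listPrice='{list_price}']"
    return len(root.findall(xpath)) == 1


root = ET.fromstring(BOOKSTORE_XML.strip())
print(is_valid(root, "9781409086284", "6.99"))   # True: ISBN and price match one record
print(is_valid(root, "9781409086284", "13.99"))  # False: price belongs to another book
print(is_valid(root, "0000000000000", "6.99"))   # False: unknown ISBN
```

A False result corresponds to a failed lookup, which is the case where the lookup fields would be flagged. As noted above, the result alone does not say which of the two values caused the miss.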

Use XML Lookup to populate List Values

Example "County" list when "State" is "Delaware".
Example "County" list when "State" is "Hawaii".

"List Values" is a set of configurable properties for Data Fields and Data Columns. When enabled, it allows Grooper designers to set a list of values users can pick from a dropdown selection when reviewing the field in a Data Viewer.

When the "Include Lookup Results" option is enabled, the field will use the results of a lookup (such as XML Lookup) to generate the selection list.

  • This is helpful for use cases where the possible entry values for Field B change based on Field A's value.
  • Example: A "State" lookup field could be used to generate list values for a "County" field using this mechanism.

BE AWARE: Set the Lookup Specification's "Conflict Disposition" to "Ignore"

The "Conflict Disposition" determines what happens when the Lookup Specification returns multiple hits (multiple rows for Database Lookup, multiple records for XML Lookup or Web Service Lookup, or multiple lines for Lexicon Lookup).

  • You must set the Lookup Specification's "Conflict Disposition" to "Ignore" in order to use a lookup to generate a selection list for a Data Field/Data Column.
  • In normal lookups (for field population/validation), Grooper is designed to alert you if the lookup produces multiple hits.
  • However, in this case, we want to ignore that warning to include all results in the selection list.


To use XML Lookup to generate a field's list values:

  1. Drag the XML file to the Project (or a folder in the Project).
  2. Select the Data Element you want to add XML Lookup to.
    • Commonly, lookups are added to the Data Model. However, lookups may be added to Data Sections and Data Tables as well.
      • If added to the Data Model, XML Lookup will execute once for the document instance.
      • If added to a Data Section, XML Lookup will execute for each section instance.
      • If added to a Data Table, XML Lookup will execute for each row instance.
  3. Open the "Lookups" editor (Press the "..." button).
  4. Press the "Add" button and select "XML Lookup".
  5. Using the "Source" button, select the XML file you added to the Project in Step 1.
  6. Open the "Record Selector" editor (Press the "..." button).
  7. Enter an XPath expression to select an XML node from the source XML file.
    • The Record Selector is crucial to XML Lookup. The lookup will fail if no record is selected.
    • Use field variables (%FieldName) to insert "lookup fields".
    • The lookup field(s) value will be inserted into the XPath when XML Lookup runs.
    • Example: //state[@name='%State']/counties/county would select <county> XML nodes in a hierarchy of <state><counties><county> nodes where the <state> node's name attribute is the Grooper extracted value for a "State" Data Field.
  8. Open the "Value Selectors" editor (Press the "..." button).
  9. Press the "Add" button to add a new Value Selector.
  10. In the "Path" editor, enter an XML selector that selects an attribute or child XML element whose value you want to collect. When XML Lookup runs, the Value Selector's "target field" will be populated with the selected attribute or element's text.
    • Example: @name would select an XML attribute named "name".
    • In cases where the Record Selector selects XML elements whose inner text you want to use for List Values, enter the dot expression (.) to select the elements' text.
  11. Open the "Target Field" dropdown and select the corresponding Data Field in the Data Model. When XML Lookup runs, the Data Field you select will be populated with the attribute or element's text selected by the Value Selector's Path.
    • Example: "County" Data Field
    • Most typically, one XML Lookup will be used to generate one Data Field's list. This means you'll only need a single Value Selector. However, you could add multiple Value Selectors if multiple fields' selection lists depend on the lookup field's (or fields') value(s).
  12. Press "OK" when finished to exit the Value Selectors editor.
  13. Set the "Conflict Disposition" to "Ignore".
    • The selection list will not be generated if not set to "Ignore". Multiple lookup hits are what we want in this case. Telling Grooper to "ignore" the conflict will allow all lookup hits to be used to populate the list values.
  14. When finished configuring the XML Lookup, press "OK" to exit the Lookups editor.
  15. Save changes to the Data Element.
  16. Navigate to the target Data Field/Data Column selected in step 11.
  17. At the bottom of the property grid, expand the "List Values" settings.
  18. Enable "Include Lookup Results" by changing it to "True".
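The State and County example can be sketched in Python to show why multiple hits are the desired result here. The XML below is a hypothetical source file that follows the <state><counties><county> hierarchy from step 7 (the enclosing <states> root element is an assumption), `county_list` is a hypothetical function, and `xml.etree.ElementTree` stands in for the XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical source file following the <state><counties><county> hierarchy.
STATES_XML = """
<states>
  <state name="Delaware">
    <counties>
      <county name="Kent"/>
      <county name="New Castle"/>
      <county name="Sussex"/>
    </counties>
  </state>
  <state name="Hawaii">
    <counties>
      <county name="Hawaii"/>
      <county name="Honolulu"/>
      <county name="Kalawao"/>
      <county name="Kauai"/>
      <county name="Maui"/>
    </counties>
  </state>
</states>
"""


def county_list(root, state):
    """Collect the "name" attribute of every <county> record for a state."""
    # Record Selector: returns several records on purpose (one per county).
    records = root.findall(f".//state[@name='{state}']/counties/county")
    # Value Selector "@name", evaluated relative to each record.
    return [record.get("name") for record in records]


root = ET.fromstring(STATES_XML.strip())
print(county_list(root, "Delaware"))  # ['Kent', 'New Castle', 'Sussex']
print(county_list(root, "Hawaii"))
```

Each state yields more than one record, which a normal lookup would treat as a conflict. That is why the "Conflict Disposition" has to be "Ignore" for the full set of results to reach the selection list.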