Alignment (Property)

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

Alignment is a group of properties found on Fill Methods and Data Elements. These properties modify the prompt sent to an LLM chatbot so that Grooper can more accurately highlight extracted values in the Document Viewer.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2024). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

About

As data values are copied into the Grooper document, they may optionally be aligned to a location in the document content. The process of mapping AI responses back to the document allows Grooper to highlight regions on the document as users tab into the associated fields, and can be critical for efficient review of the documents. If the documents will not be reviewed by human operators after data extraction, then alignment may be unnecessary.

Keep in mind that changing the Alignment affects the length of the prompt given to the AI. This in turn affects the number of tokens consumed, which affects the "cost" of the prompt in both processing time and money. A setting of None therefore costs the least but produces no highlighting, which makes it the best choice when extraction is fully automated and no human review takes place.

Alignment is controlled by first configuring the alignment properties of the AI Extract Fill Method with defaults for how Data Sections, Data Tables, and Data Fields will be aligned. After the defaults have been established, they can be overridden on individual Data Sections, Data Tables, and Data Fields using the AI Extract Section Options, AI Extract Table Options, and AI Extract Field Options properties on these Data Elements.

It is worth noting that results from LLM AIs are unpredictable. The Alignment properties represent a "best effort" to achieve document highlighting. It is difficult, if not impossible at times, to get exactly the results you may seek for document highlighting.

Alignment Properties of AI Extract

These properties are set within AI Extract on container Data Elements (Data Model, Data Section, Data Table). Think of these as the "global" settings for descendant elements.

Default Field Alignment: This is the default alignment mode for all descendant Data Fields.
- None: Data values captured by the AI will not be aligned to the document and no highlighting will be produced.
- Natural: Data values will be aligned if their value appears verbatim in the document.
  - For example, consider a Data Field set to a Value Type of DateTime with a Format Specifier of d. If the date appears on the document as "01/01/2001", the Data Field returns "01/01/2001" and the value is highlighted on the document. If the date instead appears on the document as "January 1st, 2001", the Data Field still returns "01/01/2001", but there is no highlight because the returned value does not appear verbatim.
  - Token consumption is higher than None.
- Quoted: Data values will be aligned using a quote from the document content.
  - Consider the previous example. If the value on the document was "January 1st, 2001" and the displayed value in the Data Field was "01/01/2001", this setting would still produce a highlight, because the quote from the document is used to locate the value.
  - Token consumption is higher than Natural.
- Labeled: Data values will be aligned using a label and a quote from the document content.
  - If multiple dates with similar values are on a document, this setting allows the specific date related to a label to be highlighted. Without the label, the first occurrence of the value will always be highlighted.
  - Token consumption is higher than Quoted. This is the most expensive option, but may prove to be the most accurate for highlighting.
Default Section Alignment: This is the default alignment mode for all descendant Data Sections.
- None: The section will not be aligned to the document and no highlighting will be produced.
- Split: Ask the AI to quote the first line of each section instance, and then split the input at these positions.
  - Mainly used for Multi-instance Data Sections. Think of it like the Data Section Extract Method of Divider with a Split Position of Begin. The pink bounding box drawn will start on the first line of the instance of the section, and end at the first line of the next repeated section.
  - Token consumption is higher than None.
- BoundingQuotes: Ask the AI to quote the first and last lines of the section instance.
  - Think of it like the Data Section Extract Method of Divider with a Split Position of Between. The pink bounding box drawn will start on the first line of the instance of the section, and end at the last line of the section.
  - Token consumption is higher than Split.
- Quote: Ask the AI to quote the entire section content.
  - Think of it like the Data Section Extract Method of Simple. In this case you would create an extractor that would return an entire section instance.
  - Token consumption is higher than BoundingQuotes. This is the most expensive option, but may prove to be the most accurate for highlighting.
Default Table Alignment: This is the default alignment mode for all descendant Data Tables.
- None: The table will not be aligned to the document and no highlighting will be produced.
- Split: Ask the AI to quote the first line of each row, and then split the input at these positions.
  - Useful when considering multi-line table rows, but could be functional for single line table rows.
  - Token consumption is higher than None.
- BoundingQuotes: Ask the AI to quote the first and last lines of the row.
  - This is only really useful when considering multi-line table rows.
  - Token consumption is higher than Split.
- Quote: Ask the AI to quote the entire row.
  - Token consumption is higher than BoundingQuotes. This is the most expensive option, but may prove to be the most accurate for highlighting.
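
The difference between Split and BoundingQuotes can be pictured with a small Python sketch. This is only an illustration of the idea described above, not Grooper's implementation: the function names, the line-based matching, and the sample text are all assumptions made for the example. It treats the document as a list of text lines and assumes the AI has already returned the quoted first (and last) lines.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int]  # half-open range of line indexes: [start, end)


def find_line(lines: List[str], quote: str, start: int = 0) -> Optional[int]:
    """Return the index of the first line at or after `start` that equals `quote`."""
    target = quote.strip()
    for i in range(start, len(lines)):
        if lines[i].strip() == target:
            return i
    return None


def split_regions(lines: List[str], first_lines: List[str]) -> List[Region]:
    """Split: each instance starts at its quoted first line and ends where the next begins."""
    starts: List[int] = []
    cursor = 0
    for quote in first_lines:
        idx = find_line(lines, quote, cursor)
        if idx is not None:  # quotes that cannot be found are skipped
            starts.append(idx)
            cursor = idx + 1
    # Assumption for this sketch: the last instance runs to the end of the input.
    ends = starts[1:] + [len(lines)]
    return list(zip(starts, ends))


def bounding_region(lines: List[str], first: str, last: str, start: int = 0) -> Optional[Region]:
    """BoundingQuotes: the instance runs from its quoted first line through its quoted last line."""
    begin = find_line(lines, first, start)
    if begin is None:
        return None
    end = find_line(lines, last, begin)
    return None if end is None else (begin, end + 1)


# Hypothetical document text with two section instances and a trailing footer line.
lines = [
    "Provider: Acme Clinic",
    "Visit: 01/01/2001",
    "Amount: $1.00",
    "Provider: Best Care",
    "Visit: 02/01/2001",
    "Amount: $2.00",
    "Thank you for your business",
]

print(split_regions(lines, ["Provider: Acme Clinic", "Provider: Best Care"]))
# [(0, 3), (3, 7)]
print(bounding_region(lines, "Provider: Best Care", "Amount: $2.00"))
# (3, 6)
```

In this sketch the Split result for the second instance also swallows the footer line, while BoundingQuotes stops at the quoted last line. That is the kind of trade-off to weigh against the extra tokens BoundingQuotes consumes.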

Alignment Properties of Data Elements

These properties are set on individual Data Elements. Think of these as "overrides" to the "global" properties set on parent containers.

AI Extract Field Options is a property found on Data Fields that is set by configuring its sub-property Field Alignment. The settings found in the drop-down for Field Alignment mirror those seen in the Default Field Alignment property described above.
AI Extract Section Options is a property found on Data Sections that is set by configuring its sub-properties Section Alignment and Field Alignment. The settings in the drop-down for Section Alignment mirror those seen in the Default Section Alignment property described above. The settings in the drop-down for Field Alignment mirror those seen in the Default Field Alignment property described above, and affect descendant Data Fields.
AI Extract Table Options is a property found on Data Tables that is set by configuring its sub-properties Table Alignment and Field Alignment. The settings in the drop-down for Table Alignment mirror those seen in the Default Table Alignment property described above. The settings in the drop-down for Field Alignment mirror those seen in the Default Field Alignment property described above, and affect descendant Data Columns. Think of the "label" in this case as the column header.
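
One way to picture how defaults and overrides combine is the Python sketch below. It is a conceptual model only: the `Element` class, the `effective_field_alignment` function, and the element names are hypothetical and are not part of Grooper. It also assumes that the nearest override wins, which is a reading of the descriptions above rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a Data Element; not a Grooper type."""
    name: str
    parent: Optional["Element"] = None
    field_alignment: Optional[str] = None  # override set on this element, if any


def effective_field_alignment(element: Element, fill_method_default: str) -> str:
    """Walk up from the element; the nearest override wins, else the Fill Method default."""
    node: Optional[Element] = element
    while node is not None:
        if node.field_alignment is not None:
            return node.field_alignment
        node = node.parent
    return fill_method_default


# Illustrative hierarchy: a Data Model with two Data Fields and one Data Table.
model = Element("Data Model")
date_01 = Element("Document Date 01", parent=model, field_alignment="Quoted")
date_02 = Element("Document Date 02", parent=model)
table = Element("Line Items", parent=model, field_alignment="Labeled")
total = Element("Total", parent=table)

for element in (date_01, date_02, total):
    # "Natural" plays the role of Default Field Alignment on the AI Extract Fill Method.
    print(element.name, "->", effective_field_alignment(element, "Natural"))
```

Here "Document Date 01" uses its own override, "Document Date 02" falls back to the Fill Method default, and the "Total" column picks up the Field Alignment set on its parent Data Table.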

How To

Field Alignment

Select the Data Model from the provided Project.
Click the ellipsis button on the Fill Methods property.
Note that the Alignment properties are set to defaults.
- Think of these as the "global" settings for descendant elements.

Click the "Tester" tab.
In the Batch Viewer, select the document from the provided Batch.
Click the "Test" button.
Click on the "Document Date 01" field in the Data Model Preview.
Note that the value is not highlighted in the Document Viewer.
- This particular field is set to a Value Type of Date with a Format Specifier of d. As a result, the formatted value in the field does not match the text on the document verbatim, so the default setting of Natural for the Default Field Alignment property on the AI Extract Fill Method is not enough to produce highlighting.

Select the "Document Date 01" Data Field.
Expand the AI Extract Field Options property and set the Field Alignment sub-property to Quoted.

Go back to the Data Model and test extraction again.
Note that the value is now highlighted in the Document Viewer when you select the "Document Date 01" field in the Data Model Preview.
- The Quoted setting allows for the context of the result to be leveraged instead of the verbatim string required by Natural.

Select the "Document Date 03" Data Field.
Set the Field Alignment property to Labeled.

Go back to the Data Model and test extraction again.
Note that the value is highlighted in the Document Viewer when you select the "Document Date 03" field in the Data Model Preview.
- The Labeled setting provides further context to be leveraged as part of the prompt to the chatbot. Instead of simply providing the value of the result, the "label" used to distinguish this result from others is provided and used to give more accurate highlighting. This of course consumes more tokens as it is increasing the size of the prompt, but it gives the most accurate result.
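
The behavior seen in these steps can be approximated with a short Python sketch. It is not how Grooper performs alignment internally: the `align_field` function, its parameters, and the sample text are assumptions made to illustrate why Natural fails on a reformatted date, why Quoted succeeds, and why Labeled can pick the right occurrence among several.

```python
from typing import Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the document text


def find_span(text: str, needle: str, start: int = 0) -> Optional[Span]:
    """Locate `needle` in `text` at or after `start`; None when it is absent."""
    if not needle:
        return None
    idx = text.find(needle, start)
    return None if idx < 0 else (idx, idx + len(needle))


def align_field(text: str, mode: str, value: str,
                quote: Optional[str] = None, label: Optional[str] = None) -> Optional[Span]:
    """Return the span to highlight for a field, or None when nothing can be aligned."""
    if mode == "None":
        return None                                   # no alignment requested
    if mode == "Natural":
        return find_span(text, value)                 # the value must appear verbatim
    if mode == "Quoted":
        return find_span(text, quote or value)        # search for the quote instead
    if mode == "Labeled":
        label_span = find_span(text, label) if label else None
        start = label_span[1] if label_span else 0    # begin searching after the label
        return find_span(text, quote or value, start)
    raise ValueError(f"unknown alignment mode: {mode}")


# Hypothetical document text; the same date is written out twice.
text = (
    "Invoice Date: January 1st, 2001\n"
    "Due Date: January 31st, 2001\n"
    "Ship Date: January 1st, 2001\n"
)
value = "01/01/2001"          # formatted value stored in the Data Field
quote = "January 1st, 2001"   # text the AI quotes from the document

natural = align_field(text, "Natural", value)
quoted = align_field(text, "Quoted", value, quote=quote)
labeled = align_field(text, "Labeled", value, quote=quote, label="Ship Date:")

print(natural)  # None
print(quoted, labeled)
```

In this sketch Quoted lands on the first occurrence of the quote (the invoice date), while Labeled lands on the occurrence that follows the "Ship Date:" label.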

Section Alignment

Click on the Data Model and test extraction.
Click the "Provider" field in the Data Model Preview.
Note that the correct value is highlighted, but the bounding box of the Data Section is not being displayed.

Select the Data Section.
Set the Section Alignment property to BoundingQuotes.

Click the Data Model and re-run extraction.
Note that the bounding box for the Data Section is now shown in the Document Viewer when selecting a descendant field from the Data Model Preview.
- The BoundingQuotes setting is providing the first and last lines of the captured Data Section as part of the prompt. This allows for the "top" and "bottom" of the highlight box to be defined which is enough to draw the simple polygon.

Table Alignment

Select the Data Model and run extraction.
Click the "Total" cell of line one of the extracted table in the Data Model Preview.
Note that the wrong value is highlighted in the Document Viewer.
- Note also that the row instance is not being highlighted. While "$1.00" is the correct value for the "Total" cell, the highlighting is inaccurate because there wasn't enough context provided in the prompt to distinguish that specific result.

Select the Data Table.
Set the Table Alignment property to Quote and the Field Alignment property to Labeled.

Select the Data Model and re-run extraction.
Note that the correct value is now highlighted in the Document Viewer when you select the "Total" cell from the extracted table in the Data Model Preview.
- The Quote setting for the Table Alignment property provides the full row instance as part of the prompt, and you can see the row instance is highlighted in the Document Viewer as a result. This will allow highlighting to discern each row instance accurately. The Labeled setting of the Field Alignment property combined with the specific row instance will essentially allow for a cross-section to be formed, where the intersection of the row with the column header will allow a specific value to be distinguished from another within the table.
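
The "cross-section" idea can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the fixed-width layout, the `align_cell` function, and the rule of choosing the occurrence closest to the column header are assumptions for illustration, not a description of Grooper's internals. It shows why a value such as "$1.00" that appears twice in a row is ambiguous without a label.

```python
from typing import List, Optional

WIDTHS = [5, 13, 12, 5]  # made-up column widths for the sample text


def make_line(cells: List[str]) -> str:
    """Lay cells out in fixed-width columns, like a plain-text rendering of a table row."""
    return "".join(cell.ljust(width) for cell, width in zip(cells, WIDTHS)).rstrip()


def occurrences(line: str, value: str) -> List[int]:
    """Return every offset at which `value` occurs in `line`."""
    hits: List[int] = []
    if not value:
        return hits
    i = line.find(value)
    while i >= 0:
        hits.append(i)
        i = line.find(value, i + 1)
    return hits


def align_cell(header: str, row: str, value: str, label: Optional[str] = None) -> Optional[int]:
    """Pick the offset of `value` within the quoted row, using the column label when given."""
    hits = occurrences(row, value)
    if not hits:
        return None
    column = header.find(label) if label else -1
    if column < 0:
        return hits[0]  # no usable label: fall back to the first occurrence
    return min(hits, key=lambda i: abs(i - column))  # occurrence nearest the column header


header = make_line(["Qty", "Description", "Unit Price", "Total"])
row = make_line(["1", "Widget", "$1.00", "$1.00"])

print(align_cell(header, row, "$1.00"))                 # 18
print(align_cell(header, row, "$1.00", label="Total"))  # 30
```

Without a label the sketch returns the first "$1.00" in the row, which sits under "Unit Price". With the "Total" label it returns the occurrence under the "Total" header.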