DI Layout (Quoting Method)

From Grooper Wiki

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

DI Layout is a Quoting Method that sends Azure Document Intelligence layout results into an AI prompt. Instead of quoting only plain text, it can quote document structure and semantics produced by DI Analyze, such as paragraphs, tables, and page layout. This is useful when the AI needs more than OCR text alone to understand the document.

You may download the ZIP files below and upload them into your own Grooper environment (version 2025). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

Introduction

DI Layout is a Grooper Quoting Method that sends Azure Document Intelligence layout results into an AI prompt.

In practical terms, it uses the output created by a previously run DI Analyze Activity to send recognized text, layout information, style information, and semantic document elements to an LLM to aid in extraction. It is useful when an LLM needs document structure, not just plain text, to extract data accurately.

DI Layout is most useful when extraction depends on document structure, such as:

  • Tables
  • Labels near values
  • Page structure
  • Selected document regions
  • Multi-page filtering

Some common examples of when to use DI Layout include:

  • Invoices with line-item tables
  • Bank statements with repeating rows
  • Forms where field meaning depends on nearby labels
  • Packets where only selected pages should be considered
  • Workflows that need region-based quoting from a specific part of the page

Before you use DI Layout, make sure:

  • You have an account with Microsoft Azure and a subscription that allows the use of Azure Document Intelligence
  • The Azure Document Intelligence repository option is configured and available in Grooper
  • DI Analyze has already been run on the same Batch Folder or Batch Page
  • The "Model Name" in DI Layout matches the model used by DI Analyze
  • Your AI workflow is configured to use a Quoting Method

Why use DI Layout?

Quoting Methods let you choose what information about a document is sent to the LLM when you use Grooper's AI functionality, such as the AI Extract Fill Method. The DI Layout Quoting Method specifically sends data that the DI Analyze Activity obtained from Azure Document Intelligence. It has both benefits and drawbacks.

Benefits:

  • preserves layout-aware context better than plain OCR text
  • supports Markdown, JSON, or HTML output depending on the AI task
  • can limit quoting to selected pages or selected regions
  • can help with table extraction and spatially sensitive documents
  • can split large content into chunks with "Max Tokens Per Chunk"
  • can improve how accurately extracted data is located on the document (for example, for highlighting)

Drawbacks:

  • requires a prior DI Analyze step
  • adds Azure processing time and cost
  • may provide more content than needed if page and region filters are not used
  • JSON and HTML formats can be more verbose than plain text
  • if the wrong model name is selected, no matching results will be loaded
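
The verbosity drawback can be illustrated with a small sketch. It renders one hypothetical two-row table as Markdown, JSON, and HTML and compares character counts, which are only a rough proxy for token counts. The table data and the three rendering functions are assumptions made for illustration; they are not Grooper's or Azure Document Intelligence's actual output formats.

```python
import json

# Hypothetical line-item table; not taken from any real document.
header = ["Item", "Qty", "Price"]
rows = [["Widget", "2", "4.50"], ["Gadget", "1", "12.00"]]


def to_markdown(header, rows):
    """Render the table as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(r) + " |" for r in rows]
    return "\n".join(lines)


def to_json(header, rows):
    """Render the table as a JSON list of row objects."""
    return json.dumps([dict(zip(header, r)) for r in rows])


def to_html(header, rows):
    """Render the table as minimal HTML markup."""
    head = "<tr>" + "".join(f"<th>{h}</th>" for h in header) + "</tr>"
    body = "".join(
        "<tr>" + "".join(f"<td>{c}</td>" for c in r) + "</tr>" for r in rows
    )
    return f"<table>{head}{body}</table>"


# Character counts per format for the same content.
sizes = {
    "markdown": len(to_markdown(header, rows)),
    "json": len(to_json(header, rows)),
    "html": len(to_html(header, rows)),
}
print(sizes)
```

For this sample the Markdown rendering is the shortest of the three. Real DI Layout output also carries layout detail, so the actual difference depends on the document and the selected "Quote Format".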

How DI Layout differs from other Quoting Methods

Other Quoting Methods often quote text, extracted values, or simpler document content. DI Layout is different because it uses Azure Document Intelligence output that already describes the page structure. In practice, this makes DI Layout a better choice when layout matters and a less necessary choice when plain extracted text is enough.

When to use DI Layout

Use DI Layout when:

  • you need the AI to understand tables, sections, or page structure
  • you want to quote only certain pages of a document
  • you want to quote only the content inside a scoped Data Element location
  • you need geometry-aware output for downstream AI reasoning
  • you are working with long documents and want chunked quoting

Use a simpler Quoting Method when:

  • the needed text is already available in a single field or region
  • layout does not affect meaning
  • minimizing prompt size is more important than preserving structure

How to

Before you can use the DI Layout Quoting Method, you must run DI Analyze on your documents. For instructions on how to run DI Analyze, take a look at the DI Analyze Wiki Page.

Configure DI Layout

To verify that DI Analyze has been run on a document:

  1. Navigate to a Batch in your Batches --> Test folder.
  2. Expand the Batch in the Node Tree.
  3. Select the node where you would expect the DI Analyze data to be attached.
  4. Click the "Advanced" tab.
  5. Locate the DI Analyze JSON file in the list on the right. If no DI Analyze JSON file is listed, the document has not been run through DI Analyze.

To configure the DI Layout Quoting Method:

  1. On the Grooper Design page, open the AI configuration that uses Quoting Methods.
  2. Set the Document Quoting property to DI Layout.
  3. Expand the Document Quoting sub properties.
  4. The "Model Name" should auto-populate with prebuilt-layout. You should not need to change this.
  5. Set "Quote Format" to the format in which the data should be sent to the LLM. There are three options:
    • Markdown for readable text and general extraction
    • JSON for structured layout data
    • HTML when you want layout-rich markup and location-aware output
      • If you selected HTML, decide whether to enable "Include Row Bounds".
  6. If only certain pages should be quoted, enter a "Page Filter".
  7. If only a certain part of the document should be quoted, set "Scope" and optionally "Region Extractor".
  8. If the document is large and your workflow supports chunking, set "Max Tokens Per Chunk".
  9. Save your changes.
  10. Test the workflow on sample documents and confirm the AI is receiving the expected content.

To configure the DI Layout Quoting Method with improved location and highlighting data, follow these additional instructions:

  1. When configuring the DI Layout Quoting Method, select HTML for your Quote Format.
  2. In the Alignment sub properties for the AI method you are using for extraction, set the Field Alignment, Section Alignment, and Row Alignment properties to Geometric.
  3. Save your changes and test your documents to ensure highlighting is working appropriately.

Property overview

| Property | Purpose | How to use it |
| --- | --- | --- |
| "Name" | Gives the Quoting Method a user-friendly label. | Use a descriptive name such as Invoice Layout or Statement Tables. This helps identify the quote in multi-quote workflows. |
| "Description" | Explains to the AI what the quoted content represents. | Describe the layout being provided and what the AI should do with it. |
| "Type" | Shows the Quoting Method type. | Read-only. Useful for confirming you are using DI Layout. |
| "Settings" | Shows a summary of the current configuration. | Read-only. Useful for quick review. |
| "Model Name" | Selects which Azure Document Intelligence result file to use. | Must match the model used in DI Analyze. If these do not match, the expected result file will not be found. |
| "Quote Format" | Chooses the format of the quoted content. | Use Markdown for general extraction, JSON for structured layout data, and HTML for markup with layout detail. |
| "Include Row Bounds" | Adds row-level bounds in HTML output. | Only relevant when "Quote Format" is HTML. Useful for table-oriented review or location-aware output. |
| "Page Filter" | Limits quoting to specific pages. | Leave blank for all pages. Use page numbers and ranges such as: 1; 1,3,5; 2 to 5; -1; 1 to -2. |
| "Region Extractor" | Finds regions whose content should be quoted. | Use this when you want DI Layout to quote only extractor hits instead of the full document area. |
| "Scope" | Limits quoting to instances of a selected Data Element. | Use this when a parent or sibling instance defines the area that should be quoted. |
| "Max Tokens Per Chunk" | Splits large DI Layout content into smaller chunks. | Use for large documents when your AI workflow processes chunked input. Best for values that can be captured from one chunk at a time. |
| "Filename" | Shows the expected Azure result file name. | Read-only. Confirms which DI Analyze output file DI Layout will load. |

How page and region filtering work together

DI Layout can narrow content in two ways:

  • "Page Filter" limits which pages are considered
  • "Scope" and "Region Extractor" limit which regions on those pages are quoted

This is often the best way to control prompt size. For example:

  • use "Page Filter" to quote only pages 1 to 2 of a packet
  • use "Scope" to quote only a repeating section
  • use "Region Extractor" to quote only a detected table area
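
The two-stage narrowing described above can be pictured with a minimal sketch: keep only elements on the allowed pages, then keep only those that overlap a region on the same page. The Element class, the bounding-box convention, and the select function are hypothetical names invented for this illustration; they do not reflect how Grooper stores or filters DI Analyze results internally.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical layout element: text plus a bounding box on a page."""
    text: str
    page: int  # 1-based page number
    box: tuple[float, float, float, float]  # (left, top, right, bottom)


def overlaps(a, b) -> bool:
    """True when two (left, top, right, bottom) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def select(elements, pages=None, regions=None):
    """Filter by page first, then by (page, box) regions if any are given."""
    kept = [e for e in elements if pages is None or e.page in pages]
    if regions is None:
        return kept
    return [
        e for e in kept
        if any(e.page == p and overlaps(e.box, box) for p, box in regions)
    ]


# Made-up sample data: a header and a table row on page 1, a note on page 3.
elements = [
    Element("Invoice 1001", 1, (1.0, 1.0, 4.0, 1.5)),
    Element("Widget 2 4.50", 1, (1.0, 5.0, 7.0, 5.5)),
    Element("Terms", 3, (1.0, 1.0, 3.0, 1.5)),
]
table_region = [(1, (0.5, 4.5, 8.0, 7.0))]  # a detected table area on page 1

quoted = select(elements, pages={1, 2}, regions=table_region)
print([e.text for e in quoted])
```

In this sketch the page filter drops the page 3 element, and the region then drops the page 1 header, leaving only the table row. Applying the page filter first keeps the region test from ever considering excluded pages.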

Troubleshooting tips

No DI Layout content is returned

Check the following:

  • DI Analyze was run before the AI step
  • "Model Name" matches the model used in DI Analyze
  • the file shown in "Filename" exists on the document
  • the selected "Scope" actually has location data
  • the "Page Filter" is valid and includes the needed pages

The wrong pages are quoted

Review "Page Filter". It uses 1-based page numbering. Negative numbers count backward from the end of the document.

Examples:

1       first page
-1      last page
1 to 3  pages 1 through 3
1,-1    first and last page
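
The numbering rules above can be sketched as a small resolver that turns a filter string into 1-based page numbers. This only illustrates the documented semantics (1-based pages, negative numbers counting from the end, "to" ranges, comma-separated lists, blank meaning all pages). The function name resolve_page_filter is hypothetical, and silently ignoring out-of-range pages is an assumption, not confirmed Grooper behavior.

```python
def resolve_page_filter(page_filter: str, page_count: int) -> list[int]:
    """Resolve a filter such as "1,3,5" or "1 to -2" to 1-based page numbers."""

    def resolve(token: str) -> int:
        n = int(token.strip())
        # Negative numbers count backward from the end: -1 is the last page.
        return page_count + 1 + n if n < 0 else n

    if not page_filter.strip():
        return list(range(1, page_count + 1))  # blank means all pages

    pages: list[int] = []
    for part in page_filter.split(","):
        if " to " in part:
            start, end = (resolve(t) for t in part.split(" to "))
            candidates = range(start, end + 1)
        else:
            candidates = [resolve(part)]
        for p in candidates:
            # Assumption: pages outside the document are ignored, not errors.
            if 1 <= p <= page_count and p not in pages:
                pages.append(p)
    return pages
```

For a five-page document, this sketch resolves "1 to -2" to pages 1 through 4 and "1,-1" to pages 1 and 5, which matches the examples listed above.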

The quote is too large

Try one or more of the following:

  • narrow the "Page Filter"
  • use "Scope" to limit the quote to a specific Data Element
  • use "Region Extractor" to limit the quote to detected regions
  • switch from HTML or JSON to Markdown if structure can be simplified
  • set "Max Tokens Per Chunk" for large documents
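
To show what chunking by a token budget means in principle, here is a minimal sketch that groups consecutive content blocks into chunks that stay within a limit. The four-characters-per-token estimate and the function names are assumptions for illustration only; Grooper's actual tokenizer and splitting rules for "Max Tokens Per Chunk" are not described in this article.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about four characters per token."""
    return max(1, len(text) // 4)


def chunk_blocks(blocks: list[str], max_tokens: int) -> list[str]:
    """Group consecutive blocks into chunks that stay within max_tokens."""
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for block in blocks:
        cost = estimate_tokens(block)
        # Start a new chunk when adding this block would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        # A single block larger than the budget becomes its own chunk.
        current.append(block)
        used += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because each chunk is processed separately, a value whose label sits in one chunk and whose content sits in the next may be missed. This is why chunking suits values that can be captured from one chunk at a time.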

The AI is missing table structure

Try:

  • confirming DI Analyze used the correct model
  • reviewing the DI Analyze HTML and Markdown diagnostics
  • using JSON or HTML instead of Markdown if layout detail matters
  • enabling "Include Row Bounds" when using HTML

Folder-level results are missing

If a folder-level DI file is not present, DI Layout can still work from page-level results when those results exist. If neither folder-level nor page-level results are available, rerun DI Analyze.
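
The fallback order described above can be written as a short decision sketch. The function pick_results and its arguments are hypothetical and only model the stated behavior: prefer folder-level results, fall back to page-level results, otherwise rerun DI Analyze. Whether every page must have results for the fallback to apply is not stated in this article, so the sketch assumes any available page-level result is enough.

```python
def pick_results(folder_result, page_results):
    """Return ("folder", data), ("pages", data), or None if nothing exists."""
    if folder_result is not None:
        return ("folder", folder_result)
    # Assumption: any page-level result that exists can be used.
    available = [r for r in page_results if r is not None]
    if available:
        return ("pages", available)
    return None  # neither level is available: rerun DI Analyze
```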