What's New in Grooper 2025
COMING SOON
Grooper 2025 has not been released yet. This page is a placeholder for internal use only.
AI Assistants and the Chat Page
What is an AI Assistant?
AI Assistants are Grooper's conversational AI personas. They define a role to be used in Grooper Chat sessions. Each AI Assistant has access to a collection of user-defined resources.
- Normally conversational AIs ("chatbots") only have access to whatever they were trained on.
- These user-defined resources extend the AI Assistant's ability to answer questions on domain-specific information contained in documents, stored in databases or retrieved from a web service.
FYI: AI Assistants are a replacement for the "AI Analyst" object. AI Analysts were Grooper's first attempt at a conversational AI. AI Assistants are a substantial improvement. They are able to access document content and data quicker. They are able to answer questions across larger document sets (even an entire Grooper Repository). They have access to more knowledge resources, such as information obtained from a database.
How does a user interact with an AI Assistant?
In Grooper
Users access AI Assistants using the Grooper Chat page. From here, users can select an AI Assistant previously configured in Grooper Design. Users can start new conversations or continue conversations they have previously started.
Outside of Grooper
Users can also extend AI Assistants to external applications, including Teams, Slack or custom-built applications. This allows Grooper assistants to be used in multiple channels.
There are two ways to extend a Grooper AI Assistant:
- Azure Bot Services
- Microsoft's Azure Bot Framework allows us to expose AI Assistants to a multitude of applications they call "channels".
- Channels include Teams, Slack, email, SMS, and more.
- More information on Azure Bot Service's channel support can be found here.
- Communication is secured with OAuth client credentials. Users have further control over if and how documents are linked in AI responses.
- Integrating Grooper with Azure Bot Services requires setup in Grooper, in Azure, and in your own server infrastructure. For a quick reference, visit the Azure Bot Service article in our wiki.
- Grooper Web Services (GWS) REST API
- GWS is a new API set in Grooper.
- The "/assistants" endpoints were specifically created for developers who want to chat with AI Assistants via web calls.
- This gives developers a way to use AI Assistants in their own applications.
- See below for more information on GWS.
What resources can AI Assistants connect to?
AI Assistants can connect to the following resources:
- Search Index References - Allows the AI Assistant to retrieve document text content in an Azure AI Search index. Both metadata search and vector searches are supported.
- Table References - Allows the AI Assistant to retrieve data from database tables using SQL queries (If the user defined in the Grooper Data Connection has write permissions, the AI Assistant may also write data to the database).
- Web Service References - Allows the AI Assistant to retrieve data from APIs using web service calls.
The AI Assistant's "retrieval plan" determines which of these resources should be used to respond to the chat. This allows users to query vast amounts of document text (using vector searches in a Search Index Reference), extracted data (stored as metadata in a Search Index Reference) and supplement information in the Grooper Repository with data from external sources (SQL tables and web services). All of this is done with a natural language prompt. No complex syntax required.
What are some benefits to AI Assistants?
AI Assistants provide users with a new way to interact with documents and other resources the AI Assistant can connect to (like databases).
- Users can search for documents and their data using natural language.
- Provides on-demand access to data inside documents. Users can find information without setting up a Data Model and its extraction logic.
- Provides near instant time-to-value. Minimal processing is required in Grooper before users can start chatting with a single document or across large document sets.
- Reduces the need to extract everything up front. Allows users to gain insights into documents without complicated extraction.
HTTP Import
HTTP Import is a new Import Provider in 2025. It allows users to import website content into Grooper Batches. HTTP Import can be used to import:
- Individual webpages
- Documents hosted on a website accessible from a URL
- Entire websites
Mechanisms to select links using CSS and filter pages using regular expressions are included in the HTTP Import configuration.
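The general idea of selecting links and then filtering them with a regular expression can be sketched in a few lines of Python. This is only an illustration, not how HTTP Import is implemented: the `LinkCollector` class, the `select_links` function, the sample page and the filter pattern are all made up for this example, and matching on the `a` tag stands in for a real CSS selector.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (a stand-in for a CSS selector such as "a")."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def select_links(html, base_url, include_pattern):
    """Return links found in `html` whose absolute URL matches `include_pattern`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    pattern = re.compile(include_pattern)
    return [link for link in collector.links if pattern.search(link)]


page = """
<html><body>
  <a href="/docs/intro.html">Intro</a>
  <a href="/docs/setup.html">Setup</a>
  <a href="/login">Log in</a>
  <a href="https://example.com/files/guide.pdf">Guide</a>
</body></html>
"""

# Hypothetical filter: keep pages under /docs/ and any PDF file.
links = select_links(page, "https://example.com/", r"/docs/|\.pdf$")
print(links)
```

In this sample, the "/login" link is dropped by the filter while the two "/docs/" pages and the PDF are kept.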
Websites are a great resource for AI Assistants. They can serve as one of many different knowledge resources that can be used to answer users' questions from the Chat page.
HTML conditioning commands
There are several new HTTP and HTML commands in Grooper. These commands will condition HTML documents for further processing. These commands are particularly useful for preparing HTML documents for an AI Assistant.
- HTTP Link > Load Content - Allows webpages to be imported into Grooper sparsely then loaded multithreaded.
- HTML Document > Condition HTML - This command has several cleanup and normalization options for webpages.
- The "Body Selector" uses CSS selectors to match an element to replace the HTML's body. This gets rid of unnecessary text content before feeding webpages to an AI Assistant.
- The "Removal Selector" uses CSS selectors to remove HTML elements. This can help remove unnecessary or repetitive content before feeding webpages to an AI Assistant.
- The "Site URL" can be prepended to relative links in the HTML page. This will give users a better viewing experience when the page is loaded in the Document Viewer.
- HTML Document > Convert to PDF - Converts the HTML page to a PDF document. Grooper can then process the PDF just like it processes any PDF.
- HTML Document > Convert to Text - Converts the HTML page to a TXT document. This is useful only for webpages that present as text files (For example this page from the US Code of Federal Regulations hosted on govinfo.gov). It will get rid of unnecessary HTML elements and leave you with just plain text.
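To show why this kind of cleanup matters before handing a webpage to an AI Assistant, here is a rough Python sketch of two ideas from the list above: dropping unwanted elements and reducing the page to plain text. It is not Grooper's implementation. The `TextExtractor` class and `html_to_text` function are invented for this example, and a fixed set of tag names stands in for a CSS "Removal Selector".

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping everything inside the tags in `skip_tags`."""

    def __init__(self, skip_tags=("script", "style", "nav", "footer")):
        super().__init__()
        self.skip_tags = set(skip_tags)
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the page's remaining text, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = """
<html><head><style>body { color: black; }</style></head>
<body>
  <nav><a href="/">Home</a> <a href="/about">About</a></nav>
  <h1>Section 1.1</h1>
  <p>Definitions apply to this part.</p>
  <footer>Contact us</footer>
</body></html>
"""

print(html_to_text(page))
```

Only the heading and paragraph text survive. The navigation links, styles and footer are the kind of repetitive content that adds noise when many pages from the same site are indexed.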
AI productivity helpers
Full article on AI productivity helpers
Grooper introduced two "AI productivity helpers" in version 2024. These features use a large language model (LLM) to assist Grooper Design users in their work building Grooper assets. They can be used for help with regular expressions, SQL queries for Database Lookups, even creating full Data Models.
- You must enable the "LLM Connector" option in your Grooper Repository to use these tools.
List of AI productivity helpers
- AI Generated Schema Importer - This helps create Data Models quickly. This tool generates Data Elements in a Data Model from a natural language prompt. Enter something like "Create a Data Model for invoice processing." and this will create unconfigured Data Sections, Data Fields and Data Tables related to invoice processing.
- AI Query Helper - This helps users search for documents in the Grooper Search page. The Search page uses a powerful set of syntaxes to retrieve documents in a search index. For users unfamiliar with this syntax, the AI Query Helper builds the search query for them from the prompt they enter.
- AI Expression Helper - This helps users craft regular expressions for the Pattern Match extractor.
- Db Lookup Helper - This helps users craft SQL queries for Database Lookups.
- XSLT Helper - This helper can be found in the XML Transform activity's XSLT Tester. This will generate an XSLT transform from the user's prompt.
- AI Helper - This helper shows up all over the place in Grooper, wherever there is a text editor. Potential uses include:
- Lexicon (and Local List) editors - Generate lists for List Match extractors.
- Description editors - Generate field descriptions to assist AI Extract.
- Code expression editors (Calculated Value editors, Default Value editors, Should Submit editors, etc) - Generate expressions based on natural language prompts!
Grooper Web Services (GWS)
Grooper Web Services (GWS) is a new set of Grooper REST API endpoints. GWS is installed as a separate website by the Grooper Web Client installer. It was created to extend our initial API set. New endpoints are included to access AI Assistants and Grooper Search using web calls.
- Eventually GWS will fully replace the initial Grooper REST API offered by API Services. However, API Services will continue to function in this version.
GWS endpoint collections
AI Assistant related:
- /assistants - These endpoints are for development using AI Assistants. Use this API to implement your own chat client that allows users to interact with Grooper's AI Assistants.
- /bot - These endpoints integrate AI Assistants with Microsoft Azure Bot services. These endpoints are called by the Azure Bot service. Do not call these endpoints directly.
Search related:
- /search - These endpoints are for executing document searches. Use this API to query Grooper search indexes using natural language searches, full text searches, or metadata searches.
Document processing related:
- /batches - These endpoints access and manage Batches in Grooper.
- /documents - These endpoints access and manage documents in Grooper.
- /processes - These endpoints are used to get information about published Batch Processes and their steps.
Miscellaneous:
- /nodes - These endpoints manage nodes in the Grooper node tree. This provides low-level access to the Grooper Repository's tree structure (i.e. the node tree in the Design page). Use with caution!!
- /commands - These endpoints can execute commands on Grooper nodes, including Batches, documents (Batch Folders), or other node types.
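For developers planning a custom client, the sketch below shows one way to assemble a request against one of these endpoint collections in Python. Only the collection names come from this article. Everything else is an assumption made for illustration: the base URL, the "my-assistant/messages" route, the JSON body and the bearer token header are hypothetical, so consult the GWS API documentation for the real routes and payloads. The request is built but never sent. The "/bot" collection is deliberately left out because it should only be called by the Azure Bot service.

```python
import json
from urllib.request import Request

# Endpoint collections named in this article that are meant to be called directly.
# "/bot" is excluded on purpose: it is called by the Azure Bot service only.
GWS_COLLECTIONS = {
    "assistants", "search", "batches", "documents", "processes", "nodes", "commands",
}


def build_gws_request(base_url, collection, sub_path="", payload=None, token=None):
    """Build (but do not send) an HTTP request for a GWS endpoint collection."""
    if collection not in GWS_COLLECTIONS:
        raise ValueError(f"Unknown or non-callable GWS collection: {collection}")
    url = f"{base_url.rstrip('/')}/{collection}"
    if sub_path:
        url += "/" + sub_path.strip("/")
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    method = "POST" if data is not None else "GET"
    return Request(url, data=data, headers=headers, method=method)


request = build_gws_request(
    "https://grooper.example.com/gws",                      # hypothetical base URL
    "assistants",
    sub_path="my-assistant/messages",                       # hypothetical route
    payload={"message": "Which invoices are past due?"},    # hypothetical body
    token="<access-token>",                                 # placeholder
)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to log or unit test what a chat client would send before pointing it at a live GWS site.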
OAuth Support
OAuth is an authentication method that allows third-party applications web access without sharing passwords.
- Microsoft Entra ID (formerly Azure Active Directory) is the only supported OAuth provider at this time.
Benefits to OAuth:
- Security - Users do not have to share their passwords with third party applications.
- Simplified logins - Users can log into multiple applications with existing accounts. In the case of Grooper, with a Microsoft Entra ID account (formerly Azure Active Directory).
- Integrations - OAuth is the security standard for app-to-app communication. Securing Grooper with OAuth allows us new integration options, including using Azure Bot Services to extend AI Assistants to external chat channels.
Both the Grooper website and the GWS website can be configured with OAuth authentication.
- Grooper and OAuth - When you configure the Grooper website to use OAuth, users will log into Grooper using their Entra ID credentials. Microsoft will ask you to approve the login and grant access. This allows Grooper to log users in using Microsoft's authentication servers.
- Previous login methods are still supported and Windows remains the default login method for the Grooper web app.
- OAuth is required if you (1) are extending an AI Assistant to an external channel like Teams using Azure Bot Services and (2) want to provide users links in the chat response to download Grooper documents or open documents in Grooper. This mechanism secures the links sent between Grooper, the Azure bot, and the chat channel.
- GWS and OAuth - GWS uses OAuth client credentials to communicate with Azure Bot Services.
- This authentication method is required for users wanting to extend AI Assistants to external channels like Teams using Azure Bot Services.
- For users that want to provide links in the chat response to download Grooper documents or open documents in Grooper, both the Grooper website and the GWS website will need to be secured with OAuth.
There is some additional setup required to configure OAuth authentication. You must register Grooper as an application in Microsoft Entra ID and you must configure each website's web.config file. Full instructions on setting up OAuth are coming soon.
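For readers unfamiliar with the OAuth client credentials flow, the Python sketch below shows the general shape of a token request to Microsoft Entra ID's v2.0 token endpoint. It is background only. How Grooper and Azure Bot Services exchange tokens internally is not described on this page, and the tenant ID, client ID, secret and scope shown are placeholders. The request is built but not sent.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret, scope):
    """Build (but do not send) an OAuth 2.0 client credentials token request for Entra ID."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=form, headers=headers, method="POST")


token_request = build_token_request(
    tenant_id="contoso.onmicrosoft.com",                         # placeholder tenant
    client_id="00000000-0000-0000-0000-000000000000",            # placeholder app registration
    client_secret="<client-secret>",                             # placeholder secret
    scope="api://00000000-0000-0000-0000-000000000000/.default",  # placeholder scope
)
print(token_request.full_url)
print(parse_qs(token_request.data.decode("ascii"))["grant_type"])
```

In this flow the application authenticates as itself with its own ID and secret, so no user password is involved. The token returned by a real request would then be presented as a bearer token on later calls.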
Miscellaneous
New features
New Classification Method: Search Classifier

"Search Classifier" classifies documents using a search index. Document Types are assigned by finding similar documents in a search index.
- This method relies on a search index to classify documents. As such, an Indexing Behavior must be configured for the Content Model.
- This method relies on "vector searches" to compare unclassified documents to documents in the search index. As such, the Indexing Behavior must have its "Vector Search" property enabled to collect and store vector embeddings for each document.
- This method requires documents to be present in the search index before the Classify activity can classify unclassified documents. This means there is some manual effort required to fill the search index with examples of each Document Type.
- The basic process is to add one or more examples to the search index manually. First, assign it a Document Type manually. Then, add it to the search index.
- Then, run Classify on a set of new documents. The Search Classifier method will take vector embeddings for each document and use them to compare against each document in the index. It will find a match for the most similar document based on these embeddings. The unclassified document will be assigned the Document Type of that match.
- The idea is that classification will improve over time as incorrect Document Type assignments are corrected. As the search index fills up with corrected examples of each Document Type, the Search Classifier method has more data to compare against when making a decision.
BE AWARE: Search Classifier has not been tested against real-world document sets. Its efficacy has not been proven in production scenarios.
BE AWARE: It is likely Search Classifier will be a compatible classification method for ESP Auto Separation as well. However, this has not been fully vetted at this time.
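The comparison step described above is essentially a nearest-neighbor lookup over vector embeddings. The Python sketch below illustrates the concept with cosine similarity and toy three-dimensional vectors. It is an assumption-laden illustration, not Grooper's code: the actual similarity measure, embedding model and index queries used by Search Classifier are not documented here, and the `classify` function and sample index are invented for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(embedding, index):
    """Return (document_type, similarity) of the most similar indexed example."""
    best_type, best_score = None, -1.0
    for doc_type, example in index:
        score = cosine_similarity(embedding, example)
        if score > best_score:
            best_type, best_score = doc_type, score
    return best_type, best_score


# Toy "embeddings" for manually classified examples already in the index.
# Real embeddings have hundreds or thousands of dimensions.
index = [
    ("Invoice", [0.9, 0.1, 0.0]),
    ("Invoice", [0.8, 0.2, 0.1]),
    ("Contract", [0.1, 0.9, 0.2]),
]

doc_type, score = classify([0.85, 0.15, 0.05], index)
print(doc_type, round(score, 3))
```

Adding more corrected examples to `index` gives the lookup more neighbors to choose from, which is the intuition behind classification improving as the search index fills up.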
New Fill Method: "Fill Descendants"

The "Fill Descendants" method was created solely to increase Extract efficiency when using AI Extract on multiple Data Sections in a Data Model. Fill Descendants will execute fill methods (i.e. AI Extract) on descendant Data Elements (i.e. Data Sections) in parallel.
- This will use multiple threads to perform "prefetch operations" for the document instead of just one.
- For AI Extract, LLM completion operations are performed during prefetch. Fill Descendants allows multiple LLM completion operations to be performed in parallel, significantly reducing extraction time when dealing with Data Models with multiple Data Sections running AI Extract.
- In a scenario where eleven Data Sections were configured with AI Extract, extraction time went from 5 minutes to 25 seconds!
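The speedup comes from the fact that LLM calls spend most of their time waiting on a remote service, so several can be in flight at once. The Python sketch below demonstrates the effect with simulated calls. It is a generic illustration of parallel I/O-bound work, not Grooper's prefetch code: `fake_llm_extract` and the section names are invented, and a 0.2 second sleep stands in for a real completion request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm_extract(section_name):
    """Stand-in for one LLM completion call; sleeps to simulate network latency."""
    time.sleep(0.2)
    return section_name, f"extracted data for {section_name}"


sections = ["Header", "Line Items", "Totals", "Vendor", "Remittance"]

# One call after another: total time is roughly the sum of all call times.
start = time.perf_counter()
sequential = dict(fake_llm_extract(name) for name in sections)
sequential_seconds = time.perf_counter() - start

# All calls in flight at once: total time is roughly the slowest single call.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(sections)) as pool:
    parallel = dict(pool.map(fake_llm_extract, sections))
parallel_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

Both runs produce the same results. Only the elapsed time differs, and the gap widens as more sections are added.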
Detect Language: New AI-based language detection activity
There is a new and improved "Detect Language" activity in Grooper. It uses large language models to determine what language the text on a document is written in. Because modern LLMs are adept at natural language processing across multiple languages, this activity does an excellent job determining a document's native language with little to no setup.
- The detected language gets stored as the document's (Batch Folder's) "Culture" property.
Note: The older Detect Language activity still exists in 2025. It is named "Detect Language (Legacy)"
User experience improvements
Upload documents from the Batches page

Batches page users will find a new "Upload" button in the context toolbar at the top of the screen.

This button will allow users to upload one or more files and pick a Batch Process. Grooper will create a paused Batch of documents with each file attached.
- This is the easiest way to do "transactional" document processing in Grooper.
- Users wanting to process only a few documents on their local drive no longer have to go through the trouble of configuring an Import Provider from the Imports page.
Search text in any Document Viewer
"Reports" tab replaces Content Type "Summary" tabs
Search page improvements
We've made changes to the way search queries are saved and managed.
LLM fine-tuning improvements
XML processing and transform improvements
- Schema importer improvements
- XML commands
- XML Transform namespace support
Efficiency improvements
Activity Processing improvements
Dispose Batch improvements
Other changes
Test imports in the Design page
ONLY FOR TESTING: Large-scale production imports should still be managed from the Import Jobs page or by Import Watcher schedules.