Batch Archiving Guidance

From Grooper Wiki

You've got your Grooper Repository all set up. You've architected your Content Model. You have a Batch Process up and running. Documents are coming in and being processed accordingly. But now that you've reached the end of the Batch Process, what do you do with the Batch itself?

You have a choice to make here:

  • Do I just delete the Batch?
  • Do I archive the Batch for longer term storage?

Deleting the Batch vs Archiving the Batch

There are pros and cons to both deleting Batches and archiving them for long-term storage. There are also different ways of archiving Batches, which we will consider in this section of the article. Before we get there, you need to decide whether you want to archive them at all. To help you make that decision, consider the pros and cons of each option.

Pros and cons: Deleting Batches

Production Batches can be deleted from Grooper in one of three ways:

  1. They can be manually deleted by a user.
  2. They can be deleted automatically by a Dispose Batch step in a Batch Process.
  3. They can be first moved to a folder in the "Test" branch by a Dispose Batch step in a Batch Process. Then, they can be regularly scheduled for deletion by the System Maintenance Service using its Purge Batches feature.
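
The third option is a two-stage disposal: a Batch is first moved out of production, then removed later on a schedule. The sketch below models that idea in plain Python so the timing is easier to reason about. It is not Grooper's API. The `Batch` class, the `dispose` and `select_for_purge` functions, and the `retention_days` parameter are all hypothetical names chosen for illustration, and the age-based rule is an assumption about how a scheduled purge might be configured.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Batch:
    """Hypothetical stand-in for a Batch; not a Grooper object."""
    name: str
    branch: str                              # "Production" or "Test"
    disposed_at: Optional[datetime] = None   # when it was moved to "Test"


def dispose(batch: Batch, now: datetime) -> None:
    """Stage 1: move a finished Batch to the Test branch instead of deleting it."""
    batch.branch = "Test"
    batch.disposed_at = now


def select_for_purge(batches: List[Batch], now: datetime,
                     retention_days: int) -> List[Batch]:
    """Stage 2: pick Test-branch Batches disposed at least retention_days ago.

    The retention window is an assumed policy, not a documented setting.
    """
    cutoff = now - timedelta(days=retention_days)
    return [
        b for b in batches
        if b.branch == "Test"
        and b.disposed_at is not None
        and b.disposed_at <= cutoff
    ]
```

With a 14-day window, a Batch disposed on June 1 would be selected by a purge run on June 30, while one disposed on June 25 would be kept until a later run. Batches still in production are never selected. The delay gives you a grace period to recover a Batch before it is gone for good.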

What are the pros and cons of deleting Batches from Grooper?

Pros for deleting Batches

Can save on storage

If your files and data have been exported out of Grooper, keeping that content in the Grooper Repository may unnecessarily duplicate files.
  • Sparsely importing documents into Grooper can help mitigate this. Sparse documents are not actually loaded into the Grooper file store. They are instead accessed on demand through a link between Grooper and the external content management system.

Don't have to manage content in two systems

If you've exported content out of Grooper, that external system is likely your "system of record." It is where documents and/or their data live long term. If you keep that content in both Grooper and your external system, you must track each document in both locations to ensure changes made in one system are reflected in the corresponding document in the other.
  • In version 2024, Grooper introduced its first ever document search and retrieval mechanism, AI Search. This makes it possible for Grooper to be your sole system of record. AI Search has robust querying and filtering capabilities, allowing you to locate documents using both their full-text content and data extracted by Grooper.

Keeps the Grooper Repository lean

A repository with fewer Batches (and fewer Grooper nodes in general) is generally more responsive than one brimming with content. This is particularly true when content is poorly organized. An excessive number of Batches in a single folder can slow Grooper down when you navigate viewers that render lists of Batches (such as the Batches Page).
  • This is less true in newer versions of Grooper. Version 2024 started implementing efficiency changes to make it possible to keep Batches in a Grooper Repository long term without dramatically affecting performance.

Cons against deleting Batches

Deleting Batches makes it more difficult to reprocess documents

If you delete a Batch, everything done to those documents in Grooper is gone too. If you need to reprocess previously exported content (say, to extract new fields from it) and it has been deleted from Grooper, you will need to start over at the beginning of the Batch Process. If the exported content is still in Grooper, and is still classified, still has OCR data, still has an extracted Data Model or whatever else the Batch Process did, you can reprocess it without repeating that work.

It's not really necessary in newer Grooper versions

You may have noticed each "Pro" point above had an "FYI" providing a caveat of sorts. Traditionally, Grooper has been an intermediary. Its job was to take unstructured document content, make sense of it, collect the data you want, and export documents and data into a structured end destination. It simply was not designed to hold content long term. However, starting in version 2024, new strides in Grooper's efficiency and features like AI Search make it possible to keep content in Grooper long term.


If you decide to keep Batches in Grooper long term, you will need to develop an archiving strategy. There are a few different ways to archive Batches in Grooper. In the next section, we will discuss these methods and give you some best practice advice.

Ways to archive Batches