File Store (Node Type)

From Grooper Wiki

File Store nodes are a key part of Grooper's "database and file store" architecture. They define a storage location where the file content associated with Grooper nodes is saved. This allows processing tasks to create, store, and manipulate content related to documents, images, and other "files".

  • Not every node in Grooper will have files associated with it, but if it does, those files are stored in the Windows folder location defined by the File Store node.

The File Store is a critical part of a Grooper Repository's infrastructure. A Grooper Repository is composed of two things:

  • A Grooper Database - This stores nodes and their property configurations.
  • A Grooper File Store - This stores any files associated with those nodes.

As a node in the Node Tree, the File Store is mostly just a folder path location in a file share. However, the File Store's importance in Grooper processing cannot be overstated. Any time Grooper needs access to file content associated with a node, it follows the path defined in a File Store node to locate, modify, and return that content as needed.

  • BE AWARE: A File Store can be any folder you have write access to, but it is best practice to use a fully qualified UNC path.


Many Grooper Repositories will only have one File Store node, which is created when the Grooper Repository is first initialized. This is the "Primary" File Store node, and it is created in the File Stores folder node. By default, it is set as the Grooper Repository's "Active File Store" (using the property of the same name on the Grooper Root node).


Circumstances where a second (or more) File Store node needs to be created are rare. Examples include:

  • A new File Store node may be created for the Dispose Batch activity, which can be configured to move content to a different File Store. This allows users to offload processed content to lower-tier archival storage while still accessing it in a Grooper Repository.
  • The current File Store runs out of space. In this scenario, a new File Store could be created and set as the Active File Store. The old File Store would then serve as archival storage.


Files stored in the File Store include:

  • Images for Batch Page nodes.
  • Imported files (PDFs, TIFFs, etc.) attached to Batch Folder nodes.
  • Files Grooper generates for a node, such as a "Grooper.DocumentData.json" file generated for a Batch Folder by the Extract activity.

FYI

If you select a node in the Node Tree, go to the "Advanced" tab, and then go to the "Files" tab underneath, each file listed there is stored in a File Store.

About

The File Store in Grooper is a file share in a Windows environment. It houses the files associated with Grooper nodes whose content would be inefficient to store in a cell of a database table.

The Grooper File Store exists at a user-specified location. This should always be a network path (UNC path). If a File Store is given a local path, computers connecting to that repository remotely will not be able to access it. To set up a Grooper Repository so that other computers can connect to it, make sure you reference the File Store using a UNC path!
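
As a quick sanity check before entering a File Store path, the UNC requirement can be sketched in Python. This is only an illustration using the standard library: the function name `is_unc_path` and the example server and share names are hypothetical and are not part of Grooper.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True if the path starts with a UNC server-and-share prefix."""
    # PureWindowsPath parses Windows-style paths on any operating system.
    # For a UNC path, `drive` holds the leading "\\server\share" portion;
    # for a local path it holds a drive letter such as "C:".
    drive = PureWindowsPath(path).drive
    return drive.startswith("\\\\")


# Hypothetical example paths, for illustration only.
print(is_unc_path(r"\\fileserver01\GrooperFileStore"))  # network (UNC) path
print(is_unc_path(r"C:\GrooperFileStore"))              # local path
```

The check only looks at the shape of the path. It does not confirm that the share exists or that the account running Grooper has write access to it.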

The File Store contains three levels of directories. A File Store entry will exist on disk as a file in the lowest level with a .grp extension (e.g. 00 > 00 > 00 > 00000000-0000-0000-0000-000000000000.grp). Each of the lowest-level folders in the File Store will have a maximum of 256 files, at which point a new folder at that level will be created. If the lowest level contains 256 folders, a new folder will be created at the level above; this gives the Grooper File Store a limit of 256 ^ 4 = 4,294,967,296 files stored on disk.
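
The capacity figure above follows directly from the folder limits, and a short Python sketch makes the arithmetic explicit. The `folder_for_index` helper is hypothetical: it assumes folders are named with two-character hexadecimal values (00 through FF) and are filled in order, which the "00 > 00 > 00" example suggests but this article does not confirm.

```python
FOLDERS_PER_LEVEL = 256   # folders allowed at each directory level
DIRECTORY_LEVELS = 3      # levels of directories in the File Store
FILES_PER_FOLDER = 256    # files allowed in each lowest-level folder

# 256 folders at each of 3 levels, times 256 files per lowest-level folder.
capacity = FOLDERS_PER_LEVEL ** DIRECTORY_LEVELS * FILES_PER_FOLDER
print(f"{capacity:,}")  # 4,294,967,296


def folder_for_index(index: int) -> str:
    """Map a zero-based entry index to a three-level folder path.

    ASSUMPTION: folders use two-character hex names and fill sequentially.
    """
    if not 0 <= index < capacity:
        raise ValueError("index is outside the File Store's capacity")
    folder_number = index // FILES_PER_FOLDER          # lowest-level folder
    level3 = folder_number % FOLDERS_PER_LEVEL
    level2 = (folder_number // FOLDERS_PER_LEVEL) % FOLDERS_PER_LEVEL
    level1 = folder_number // FOLDERS_PER_LEVEL ** 2
    return f"{level1:02X} > {level2:02X} > {level3:02X}"


print(folder_for_index(0))             # 00 > 00 > 00
print(folder_for_index(256))           # 00 > 00 > 01
print(folder_for_index(capacity - 1))  # FF > FF > FF
```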

While the File Store entries are all given .grp extensions, the contents of each file are unaltered from their "actual" form. If you navigate, for example, to the GRP file associated with a PDF imported using full import, you can open it and view it with a PDF viewer. The files in the File Store are intentionally obfuscated to prevent users from interacting with them outside of Grooper, as they are essentially "Grooper-internal" files.
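
Because the contents are unaltered, the original format of a .grp file can usually be recognized from its leading bytes. The sketch below is a minimal illustration and not a Grooper tool: `sniff_format` is a hypothetical helper, it only recognizes the standard PDF and TIFF signatures, and the demo writes a stand-in file to a temporary folder rather than touching a real File Store.

```python
import os
import tempfile

# Standard leading "magic" bytes for two formats mentioned in this article.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
}


def sniff_format(path: str) -> str:
    """Guess a file's format from its first bytes, ignoring its extension."""
    with open(path, "rb") as handle:   # read-only: never modify the entry
        header = handle.read(8)
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


# Demo with a stand-in file that mimics a PDF saved under a .grp name.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "00000000-0000-0000-0000-000000000000.grp")
    with open(sample, "wb") as handle:
        handle.write(b"%PDF-1.7\n")
    print(sniff_format(sample))  # PDF
```

If you need to inspect a real entry, work on a copy so the File Store content that Grooper relies on is left untouched.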

Although the majority of files in the File Store relate to Batch nodes (a page's image or imported files), some files are the result of other "in-Grooper" processes, such as layout data, OCR character data, extracted index data, and more.

Information on migrating a File Store

New files, such as scanned or imported images of Batch Pages, are always written to the "Active File Store". This is a property set on the Grooper Root in the Node Tree. It will default to the File Store created when the Grooper Repository was initialized, unless otherwise specified.
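
The relationship between File Store nodes and the Active File Store can be pictured with a small conceptual model. This is a sketch for illustration only and does not reflect Grooper's internal implementation: the `FileStore` and `Repository` classes, their methods, and the example paths are all hypothetical, and the lookup across every store is an assumption about how archived content remains reachable.

```python
from dataclasses import dataclass, field


@dataclass
class FileStore:
    name: str
    path: str                                   # UNC folder path (example only)
    files: dict = field(default_factory=dict)   # file id -> content


@dataclass
class Repository:
    file_stores: list
    active: FileStore   # stands in for the "Active File Store" property

    def write_new_file(self, file_id: str, content: bytes) -> FileStore:
        # New content always lands in the currently active store.
        self.active.files[file_id] = content
        return self.active

    def read_file(self, file_id: str) -> bytes:
        # ASSUMPTION: existing content is found in whichever store holds it.
        for store in self.file_stores:
            if file_id in store.files:
                return store.files[file_id]
        raise KeyError(file_id)


primary = FileStore("Primary", r"\\fileserver01\GrooperFileStore")
second = FileStore("Second", r"\\fileserver02\GrooperFileStore")
repo = Repository(file_stores=[primary, second], active=primary)

repo.write_new_file("page-1", b"first image")
repo.active = second                         # switch the Active File Store
repo.write_new_file("page-2", b"second image")

print(sorted(primary.files))     # ['page-1']
print(sorted(second.files))      # ['page-2']
print(repo.read_file("page-1"))  # b'first image'
```

In this model, changing the active store affects only where new content is written. Content already written to the earlier store stays where it is, which mirrors the archival scenario described earlier in this article.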

If you need to "migrate" or "back up and restore" a File Store, please visit this article: