Import Provider

From Grooper Wiki

This article is about the current version of Grooper.

Note that some content may still need to be updated.


Import Providers enable Grooper to import file-based content from numerous sources, including Windows file systems, SFTP file systems, mail servers and various content management systems (CMS). An Import Provider is selected and configured when configuring "Import Jobs". Import Jobs are submitted in one of two ways:

  • By a user from the Imports page: Ad-hoc or "user directed" Import Jobs are submitted from the Imports page, using the "Submit Import Job" button.
  • From an Import Watcher service: Automated or "scheduled" Import Jobs are submitted by an Import Watcher service according to its Polling Loop or Specific Times specification.

In both cases, an Import Provider is selected and configured using the "Provider" property.

About

What is an import provider?

Import Providers are a core feature in Grooper, enabling the automated or manual ingestion of documents and folders from a wide variety of external sources into Grooper Batches. They provide a flexible, configurable interface for bringing content into the system, supporting both high-volume production imports and targeted, ad-hoc acquisitions.

An Import Provider defines how content is brought into Grooper from external systems. It acts as a bridge between Grooper and sources such as file systems, FTP/SFTP servers, email servers, content repositories, and more. Each Import Provider is designed to connect to a specific type of source, retrieve content, and create new Batches or add to existing ones.

Purpose and benefits

  • Automates the process of acquiring documents and folders from external sources.
  • Supports both scheduled and on-demand imports.
  • Enables consistent batch creation, naming, and content type assignment.
  • Offers advanced options for filtering, disposition (move, delete, flag), and incremental import.
  • Supports both standard and sparse import modes, optimizing storage and performance.

How import providers are used

Import Providers can be used in several ways:

  • From the Imports page for user-directed, manual imports.
  • As part of the Import Watcher service for scheduled, unattended imports.
  • To import documents from search results or other dynamic queries.

Types of import providers

CMIS Import Providers

The CMIS Import providers are used to import content over a CMIS Connection from CMIS Repositories. This allows users to import from various on-premises and cloud-based storage platforms.

Documents are imported from CMIS Connections using either the Import Descendants or Import Query Results providers.

  • Import Descendants will import all documents within a designated folder location of a CMIS Repository (including documents in sub-folders).
  • Import Query Results allows you to use a query syntax similar to a SQL query (called a CMIS Query) to set conditions for import based on the item's available metadata, such as a document's name, file type, creation date, archive status, or other variables.
  • BE AWARE! Import Descendants is a simplified version of Import Query Results.
    • It is best practice to use Import Query Results when you are able. It is more fully featured and more developed.
    • Import Descendants should only be used for CMIS Connection Types that are not query-able.
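To illustrate what a CMIS Query looks like, the sketch below builds one as a string in Python. The `cmis:document` type and the `cmis:name` and `cmis:creationDate` properties are defined by the CMIS standard, but which properties a given Grooper CMIS Connection exposes depends on its binding. The helper name `build_cmis_query` and the chosen filter conditions are assumptions for illustration, not part of Grooper.

```python
from datetime import datetime, timezone


def build_cmis_query(name_pattern: str, created_after: datetime) -> str:
    """Build a CMIS Query selecting documents by name pattern and creation date.

    Hypothetical helper: the properties are standard CMIS ones, but the set
    a particular repository supports may differ.
    """
    # CMIS string literals are single-quoted; escape embedded single quotes
    # with a backslash.
    safe_pattern = name_pattern.replace("'", "\\'")
    # CMIS datetime literals use the TIMESTAMP keyword with an ISO 8601 value.
    stamp = created_after.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return (
        "SELECT * FROM cmis:document "
        f"WHERE cmis:name LIKE '{safe_pattern}' "
        f"AND cmis:creationDate > TIMESTAMP '{stamp}'"
    )


query = build_cmis_query("%.pdf", datetime(2024, 1, 1, tzinfo=timezone.utc))
print(query)
```

A query of this shape would restrict an import to PDF-named documents created after a cutoff date, which Import Descendants cannot express because it takes every document under a folder.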


For more information on how to set up a CMIS Connection and how to import using a CMIS Import provider, visit the following articles:

Other Import Providers

For most document import scenarios, users will use one of the CMIS Import providers and import files from a CMIS Repository. However, the following sources cannot be reached through a CMIS Repository. Each has a very specialized import mechanism that requires its own Import Provider.

  • HTTP Import
    • This provider is used to import web-based content (web pages and files hosted on an HTTP server). HTTP Import can be used to ingest individual web pages, defined portions of a website or entire websites into Grooper.
  • OPEX Import
    • This provider is necessary to import content using OPEX Scanning Devices (commonly used for mailroom automation).

  • Search Import
    • This provider allows you to import documents from the Search Page. It will create a Batch from documents in the search query's result list.
  • Test Batch
    • This provider allows users to import content from a Grooper Test Batch (a Batch in the "Batches > Test" branch of the Grooper node tree).
    • This provider has no real use in production scenarios. It is typically used by Grooper designers testing a Batch Process.

Legacy Import Providers

These providers exist in Grooper for backwards compatibility.

  • In newer versions of Grooper, it is best practice to use a CMIS Import provider with an analogous "CMIS Binding" (selected when configuring a CMIS Connection).
  • However, they remain supported for installations that have been upgraded from older versions of Grooper.

The Legacy Import Providers are as follows:

  • File System Import
    • This provider imports content from a Windows file system, including over a local network.
    • The analogous CMIS Connection binding type is "NTFS".
  • FTP Import
    • This provider imports content using the FTP protocol.
    • The analogous CMIS Connection binding type is "FTP".
  • SFTP Import
    • This provider imports content using the SFTP protocol.
    • The analogous CMIS Connection binding type is "SFTP".
  • Mail Import
    • This provider imports content from mail servers using the IMAP protocol.
    • The analogous CMIS Connection binding type is "IMAP".
    • FYI: If you are importing from a Microsoft Exchange server (i.e., you are using Outlook), you should use the Exchange binding. It is more fully featured and interoperates better with Exchange email clients.

Common configuration options

All Import Providers share a set of core configuration options:

  • "Sparse Import": When enabled, only a Content Link is created for each imported item, deferring content loading until needed.
  • "Skip Count": Skips a specified number of items at the beginning of the import sequence.
  • "Max Items": Limits the number of items imported in a single operation.
  • "Batch Creation": Controls how Batches are created and organized during import.
  • "Batch Name Options": Configures naming conventions for new Batches.
  • "Content Type": Assigns a Content Type to imported documents or folders, ensuring correct classification and processing.

Sparse import mode

Sparse import allows Grooper to create a Content Link for each imported item rather than copying the file content into the repository. This reduces storage requirements and speeds up imports, especially for large volumes. Content can be loaded on demand later in the process, and all Grooper activities work with sparse documents just as they do with normal documents.

See the Sparse Import article for more information.
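The idea of a link that defers loading can be sketched in a few lines of Python. This is a conceptual model only: the `SparseDocument` class, its `content_link` attribute, and the use of a local file path as the link are all assumptions for illustration and do not reflect Grooper's internal implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional


class SparseDocument:
    """Holds a link to source content and reads the bytes only when asked."""

    def __init__(self, content_link: Path) -> None:
        self.content_link = content_link          # where the content lives
        self._content: Optional[bytes] = None     # nothing copied at import time

    @property
    def is_loaded(self) -> bool:
        return self._content is not None

    def load(self) -> bytes:
        # Read from the source on first use, then reuse the cached bytes.
        if self._content is None:
            self._content = self.content_link.read_bytes()
        return self._content


with tempfile.TemporaryDirectory() as folder:
    source_file = Path(folder) / "invoice.txt"
    source_file.write_bytes(b"example content")

    doc = SparseDocument(source_file)   # "import": only the link is stored
    loaded_before = doc.is_loaded
    content = doc.load()                # content is fetched on demand
    loaded_after = doc.is_loaded
```

In this model the import step costs almost nothing, and the read happens only when a later step actually needs the bytes.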

Disposition options

Many Import Providers offer disposition options to control what happens to source files after import:

  • NoChange: Leave files in place.
  • Flag: Set or clear properties or attributes on import (available properties/attributes will differ based on the Import Provider selected).
  • Move (Not available when using Sparse import mode): Move files to a specified directory after import.
  • Delete (Not available when using Sparse import mode): Delete files after import.
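The rule that Move and Delete are unavailable in Sparse import mode can be expressed as a small validation check. The Python sketch below uses hypothetical names (`Disposition`, `validate_disposition`) and is not Grooper's API. The comment about why the restriction exists is an inference: a Content Link presumably needs the source file to stay where it is.

```python
from enum import Enum


class Disposition(Enum):
    NO_CHANGE = "NoChange"
    FLAG = "Flag"
    MOVE = "Move"
    DELETE = "Delete"


# Options that relocate or remove the source file. Presumably a sparse
# import's Content Link would break if the source moved or disappeared.
_UNAVAILABLE_WHEN_SPARSE = {Disposition.MOVE, Disposition.DELETE}


def validate_disposition(disposition: Disposition, sparse_import: bool) -> Disposition:
    """Return the disposition if it is allowed, otherwise raise ValueError."""
    if sparse_import and disposition in _UNAVAILABLE_WHEN_SPARSE:
        raise ValueError("not available when using Sparse import mode")
    return disposition


for option in Disposition:
    try:
        validate_disposition(option, sparse_import=True)
        print(f"{option.value}: allowed with Sparse Import")
    except ValueError as err:
        print(f"{option.value}: {err}")
```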

Batch creation and naming

Import Providers allow you to control how Batches are created, including Batch size, structure, and naming conventions. This ensures imported content is organized according to your workflow and processing requirements.