2023:Import Mode (Property)

From Grooper Wiki
Revision as of 17:33, 9 November 2023 by Randallkinard
WIP This article is a work-in-progress and may abruptly stop in the middle of a section.


The Import Mode property of an Import Provider controls whether a document's content (i.e. its images and, in the case of PDF documents, its text) and its properties are imported, and whether a link between the document's source location and Grooper is created.

There are three Import Modes in Grooper:

  1. Full - This mode fully imports the document. Both its content and its properties are loaded into a Grooper Batch. Because the files are fully copied from the source into the Grooper environment, this is the slowest of the three import modes.
  2. Sparse - This mode imports a document's properties just as Full mode does. However, instead of fully importing the document's content, it creates a link between Grooper and the content at the import source. Particularly when importing large document sets, this can greatly reduce the time it takes to import documents. If needed, the content can also be loaded in parallel using the Execute activity.
  3. LinkOnly - This mode only creates a link between the document and the source.
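
The differences between the three modes can be summarized in a small sketch. This is not Grooper code: the names ImportMode, ImportBehavior, BEHAVIOR, and describe are hypothetical and exist only for illustration. It also assumes that LinkOnly loads no properties, which is one reading of "only creates a link" above.

```python
from dataclasses import dataclass
from enum import Enum


class ImportMode(Enum):
    """The three mode names listed above (illustrative, not a Grooper type)."""
    FULL = "Full"
    SPARSE = "Sparse"
    LINK_ONLY = "LinkOnly"


@dataclass(frozen=True)
class ImportBehavior:
    content_copied: bool     # content (images, native PDF text) copied into the Batch
    properties_loaded: bool  # document properties loaded into the Batch


# Mapping based on the descriptions in the list above.
# Assumption: LinkOnly loads neither content nor properties.
BEHAVIOR = {
    ImportMode.FULL: ImportBehavior(content_copied=True, properties_loaded=True),
    ImportMode.SPARSE: ImportBehavior(content_copied=False, properties_loaded=True),
    ImportMode.LINK_ONLY: ImportBehavior(content_copied=False, properties_loaded=False),
}


def describe(mode: ImportMode) -> str:
    """Return a one-line summary of what a mode brings into the Batch."""
    b = BEHAVIOR[mode]
    content = "copied" if b.content_copied else "linked at source"
    props = "loaded" if b.properties_loaded else "not loaded"
    return f"{mode.value}: content {content}, properties {props}"


if __name__ == "__main__":
    for mode in ImportMode:
        print(describe(mode))
```

Read this way, Sparse sits between the other two: it skips the expensive content copy that makes Full slow, but still loads the properties that LinkOnly leaves behind.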


Previous Versions

Import Mode and Document Linking - 2.80

About

Where is the Import Mode Set?

Forget about Import Modes for a second. How do you even import documents into Grooper at all? You import documents into a Grooper environment using an Import Provider.

The simplest way to import in Grooper is to "Submit a new import job" from the "Imports" page.

It is considered best practice to use the Import Providers highlighted in red below. These leverage CMIS Repository objects for their connection configuration, which gives them the most functionality and makes them the most developed means of importing in Grooper. Of the two shown, Import Query Results should be your first choice, as it is even more fully featured than the Import Descendants option. However, Import Query Results can only be leveraged by "query-able", indexed content systems. Import Descendants should only be used when the content system your CMIS Connection connects to is not "query-able".

There are a myriad of articles related to "CMIS" here on the Grooper Wiki, but the ones most related to this topic would be:

What is an Import Mode?

For importing, documents contain two important sets of information:

  • Content - Images and native text data
  • Properties - Metadata associated with the file, such as the document's filename, file type, creation date, and more.
As far as its content goes, each page of a document will have a corresponding image, such as this W-4 form.
Native PDFs may also have text data already embedded in the document.
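
To make the split between content and properties concrete, here is a minimal sketch of how a document could be modeled. It is purely illustrative and does not reflect Grooper's internal object model: the Page, Properties, and Document classes, their field names, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Page:
    """Content for one page: an image, plus native text when it exists."""
    image: bytes
    native_text: Optional[str] = None  # e.g. text embedded in a native PDF


@dataclass
class Properties:
    """Metadata about the file (hypothetical field names)."""
    filename: str
    file_type: str
    created: datetime


@dataclass
class Document:
    properties: Properties
    pages: List[Page] = field(default_factory=list)

    def has_native_text(self) -> bool:
        """True if any page carries embedded text."""
        return any(page.native_text for page in self.pages)


# Example: a one-page native PDF with made-up values.
doc = Document(
    properties=Properties("w4.pdf", "application/pdf", datetime(2023, 11, 9)),
    pages=[Page(image=b"<image bytes>", native_text="Form W-4")],
)
```

In this picture, the properties are small and cheap to read, while the pages hold the bulk of the data.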

How Does Sparse Import Save Time?

What Is a Document Link?