Import Descendants (Import Provider): Difference between revisions

From Grooper Wiki
This is a redirect page.
{{AutoVersion}}


<blockquote>{{#lst:Glossary|Import Descendants}}</blockquote>
For information on '''''Import Descendants''''' visit the following resources:
* [[CMIS Import]] - This is an article covering imports using '''CMIS Connections''' in general.
* [[CMIS Import#About CMIS Import]] - This portion of the "CMIS Import" article briefly covers the differences between '''''Import Descendants''''' and '''''Import Query Results'''''.
* [[CMIS Import#Import Descendants]] - This portion of the "CMIS Import" article explains '''''Import Descendants'''''.
<!---


== About ==


'''Import Descendants''' is an [[Import Provider]] used to import objects from a CMIS Repository that are "children" or "descendants" of a base CMIS Folder.  This is done using a SQL-style "filter" to select the items that will be imported.
"Import Descendants" is one of the [[CMIS Import]] providers in Grooper. It is used to import files from '''CMIS Repositories''' for Batch processing in Grooper. It will import files from a folder structure of an on-premise or cloud-based document storage platform.
 
:*<li class="fyi-bullet"> While less common, Import Descendants can also import ''folders'' from CMIS Repositories. However, since importing files is most common, we focus on importing ''files'' in this article.
* <code>AT_LEVEL</code> selects items at a specific level
* <code>MATCHES</code> allows RegEx against property values
 
=== Example ===
 
<tabs>
<tab name="Step 1">
===== Create a Batch =====
The easiest way to use Import Descendants is by creating a new batch.
* In Grooper Design Studio or Grooper Dashboard, create a new batch by pressing "Batch > CMIS Import > Import Descendants...".
 
 
[[image:1556571123332-495.png|center]]
 
 
</tab>
<tab name="Step 2">
===== Choose the Repository =====
 
If you have multiple CMIS repositories available, you'll need to choose the one from which you wish to import.
* Select the appropriate repository from the "Repository" property.
 


[[image:1556571692162-486.png|center|900px]]


Just like any other Import Provider, Import Descendants is used to submit "'''Import Jobs'''". Import Jobs are how Grooper brings in files from a storage location for processing. For example, it's how PDFs from a Windows folder get into Grooper or messages from an email inbox get into Grooper. When an Import Job runs, Grooper first creates a Batch and then creates a Batch Folder for each imported file. A copy of the file is attached to the Batch Folder. This becomes the Batch Folder's "attachment" and is used when applying activities like "Split Pages".
:*<li class="fyi-bullet"> When files are imported into Grooper, a link to that file is stored on the Batch Folder. This link maintains a connection between the file's source location and the document in Grooper. This link also makes "Sparse" imports possible. [[#Import Mode (and "Sparse" imports)|See below for more.]]


[[image:1556632783258-847.png|center|900px]]


Import Jobs are submitted in one of two ways:
* '''By a user from the Imports page''': Ad-hoc or "user directed" Import Jobs are submitted from the [[Imports Page]], using the "Submit Import Job" button.
* '''From an Import Watcher service''': Automated or "scheduled" Import Jobs are submitted by an '''[[Import Watcher]]''' service according to its Polling Loop or Specific Times specification.
In both cases, the "Import Descendants" provider can be selected and configured using the "Provider" property.


</tab>
{{#lst:CMIS Import|import_query_results_and_descendants_similarities}}
<tab name="Step 3">
===== Analyze and Import =====


Now we can test it out.
== Prereqs: CMIS Repository ==
* Pressing "Analyze" will run a search against your repository with your configured settings.
* Pressing "Import" will create a batch of documents that were imported using your configured settings.


Using the "Starting Step" property, you can choose from your published Batch Processes the step where your batch should be created. For example, in the screenshot below, the batch will be created at the "Image Review" step of a Batch Process.
A CMIS Repository allows Grooper access to files and folders within a storage platform.


Because Import Descendants imports from a CMIS Repository, you can import from numerous storage platforms determined by the "CMIS Binding" used. These CMIS Bindings include:
* [[NTFS]] to connect to Windows folders
* [[FTP]] to connect to FTP directories
* [[SFTP]] to connect to SFTP directories
* [[Exchange]] to connect to Outlook inboxes
* [[SharePoint]] to connect to SharePoint sites (and document libraries)
* [[OneDrive]] to connect to OneDrive drives
* [[Box]] to connect to Box accounts
* [[AppXtender]] to connect to AppEnhancer applications


[[image:1556632834217-872.png|center|900px]]
Before you can import files from these platforms using Import Descendants or Import Query Results, there's some setup required in the Grooper Design page. You must:
# Create and configure a CMIS Connection.
# Import a folder location as a CMIS Repository.


This will allow you to import files from folders accessed by the CMIS Repository. For information on CMIS Connections and CMIS Repositories, including how to create them in Grooper, visit the [[CMIS Connection]] page.


</tab>
[https://app.supademo.com/demo/cm8ddjcr91rb92ugqp7k9phqo Click here for an interactive walkthrough.]
</tabs>


=== Example Queries ===

These are samples of what you could type into the "Import Filter" to narrow which objects are imported. They take the following general form.

<code>SELECT * FROM <ContentType> WHERE <Criteria></code>

Below are a few examples of this syntax in action.

{|
|-valign="top"
|style="width:50%"|'''Filter'''||'''Description'''
|-valign="top"
|<code>SELECT * FROM File</code>||Import all descendant files.  This will import all files in the repository without any foldering.
|-valign="top"
|<code>SELECT * FROM File WHERE AT_LEVEL(1)</code>||Import files which are immediate children.  This will only import files at that level, not from subsequent levels.
|-valign="top"
|<code>SELECT * FROM Folder</code>||Import folders which are immediate children.  This will import both files and their foldering.
|-valign="top"
|<code>SELECT * FROM File WHERE cmis:name MATCHES '^\d{4}-\d{2}-\d{2}'</code>||Import files with a specific naming pattern, using regular expressions.
|-valign="top"
|<code>SELECT * FROM File WHERE cmis:name LIKE 'ca%'</code>||Import files with a name starting with ca.
|-valign="top"
|<code>SELECT * FROM File WHERE cmis:contentStreamLength > 10000</code>||Import files larger than 10,000 bytes.
|}

== Example Import Descendants configuration ==

Regardless of the platform you're accessing with a CMIS Repository, you configure Import Descendants largely the same. Just pick a CMIS Repository and configure the rest of Import Descendants as needed.
* Import Descendants will import all files in the base folder, ''including all descendant files in any subfolders if present.''
* Unless you configure the "Base Folder" property, Import Descendants will start at the root of the CMIS Repository and continue down the folder structure.
* When using Import Descendants, setting the "Base Folder" to a terminal branch in the folder structure (a folder with no subfolders) is the only way to import files from a folder without importing descendant files (because there are none in this case).
**<li class="fyi-bullet"> A more technical way of saying this is that Import Descendants has no equivalent of the <code>IN_FOLDER</code> CMIS search predicate (direct children only). It always behaves like the <code>IN_TREE</code> predicate (all descendants).

<big>Example: Submitting Import Descendants from the Imports Page</big>

[https://app.supademo.com/demo/cm8d4b2wt1f812ugqwtbi17bu Click here for a step by step walkthrough.]

# Go to the Imports Page.
# Press the "New Import Job" button.
# This brings up the "Submit Import Job" editor.
# Enter a description in the Description property (This is required).
# Open the "Provider" dropdown (Press the "☰" button).
# Select "Import Descendants" from the dropdown list.
# Expand the Provider settings to configure it.
# Open the "Repository" node selector (Press the "☰" button).
# Select the CMIS Repository you wish to import from.
# Select a Base Folder, as needed.
# Configure the Import Mode property, as needed.
# Configure the Batch Creation settings, as needed.
# Configure the file disposition options (Delete Item, Move To Folder, or Update Properties), as needed.
# Configure any remaining Import Descendants properties, as needed.
# Press the "Submit" button when finished.
# Your Import Watcher service will pick up and execute the Import Job.
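
As a rough illustration of the "Import Filter" criteria quoted on this page (<code>MATCHES</code>, <code>LIKE</code>, and a size comparison), the Python sketch below applies equivalent checks to a small in-memory list. It is not Grooper code: the <code>FileInfo</code> class, the helper functions, and the sample file names are hypothetical, and the case-sensitive matching is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical stand-in for a repository file (not a Grooper type)."""
    name: str  # plays the role of cmis:name
    size: int  # plays the role of cmis:contentStreamLength, in bytes


def matches(value: str, pattern: str) -> bool:
    """Assumed MATCHES behavior: a regular expression searched in the value."""
    return re.search(pattern, value) is not None


def like(value: str, pattern: str) -> bool:
    """SQL-style LIKE: '%' is any run of characters, '_' is one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value) is not None


# Made-up sample data for illustration only.
files = [
    FileInfo("2024-01-15 invoice.pdf", 52_000),
    FileInfo("cats.tif", 8_000),
    FileInfo("notes.txt", 12_500),
]

# WHERE cmis:name MATCHES '^\d{4}-\d{2}-\d{2}'
date_named = [f.name for f in files if matches(f.name, r"^\d{4}-\d{2}-\d{2}")]
# WHERE cmis:name LIKE 'ca%'
starts_with_ca = [f.name for f in files if like(f.name, "ca%")]
# WHERE cmis:contentStreamLength > 10000
large = [f.name for f in files if f.size > 10000]
```

Each list holds the names that the corresponding filter would keep under these assumptions, which can help when checking a pattern before typing it into the filter.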

Latest revision as of 15:56, 28 May 2025

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

Import Descendants is one of two Import Providers that use CMIS Connections to import document content into Grooper. Import Descendants imports files from a CMIS Repository folder location, including any files in any sub-folders (i.e. all "descendant" files).

About

"Import Descendants" is one of the CMIS Import providers in Grooper. It is used to import files from CMIS Repositories for Batch processing in Grooper. It will import files from a folder structure of an on-premise or cloud-based document storage platform.

  • While less common, Import Descendants can also import folders from CMIS Repositories. However, since importing files is most common, we focus on importing files in this article.


Just like any other Import Provider, Import Descendants is used to submit "Import Jobs". Import Jobs are how Grooper brings in files from a storage location for processing. For example, it's how PDFs from a Windows folder get into Grooper or messages from an email inbox get into Grooper. When an Import Job runs, Grooper first creates a Batch and then creates a Batch Folder for each imported file. A copy of the file is attached to the Batch Folder. This becomes the Batch Folder's "attachment" and is used when applying activities like "Split Pages".

  • When files are imported into Grooper, a link to that file is stored on the Batch Folder. This link maintains a connection between the file's source location and the document in Grooper. This link also makes "Sparse" imports possible. See below for more.
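
The relationship described above (one Batch per Import Job, one Batch Folder per imported file, each holding a copy of the file plus a link back to its source) can be pictured with a small Python model. This is only a sketch for orientation: the class names, field names, and the run_import_job function are hypothetical and are not Grooper's object model or API.

```python
from dataclasses import dataclass, field


@dataclass
class BatchFolder:
    """One imported document (hypothetical model)."""
    source_link: str   # link back to the file's source location
    attachment: bytes  # copy of the file's content


@dataclass
class Batch:
    """Container created once per Import Job (hypothetical model)."""
    name: str
    folders: list[BatchFolder] = field(default_factory=list)


def run_import_job(batch_name: str, source_files: dict[str, bytes]) -> Batch:
    """Create the Batch first, then one Batch Folder per source file."""
    batch = Batch(batch_name)
    for link, content in source_files.items():
        # bytes(content) makes the attachment a copy, independent of the source.
        batch.folders.append(BatchFolder(source_link=link, attachment=bytes(content)))
    return batch


# Made-up source files keyed by their location.
demo = run_import_job("Demo import", {"inbox/a.pdf": b"%PDF-a", "inbox/b.pdf": b"%PDF-b"})
```

In this model, demo.folders has one entry per source file, and each entry keeps both the copied content and the location it came from.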


Import Jobs are submitted in one of two ways:

  • By a user from the Imports page: Ad-hoc or "user directed" Import Jobs are submitted from the Imports Page, using the "Submit Import Job" button.
  • From an Import Watcher service: Automated or "scheduled" Import Jobs are submitted by an Import Watcher service according to its Polling Loop or Specific Times specification.

In both cases, the "Import Descendants" provider can be selected and configured using the "Provider" property.


Similarities and differences between Import Query Results and Import Descendants

Overall, "Import Descendants" is a "simpler" version of "Import Query Results".

  • We advise using Import Query Results over Import Descendants, when possible.
    • Import Query Results can do everything Import Descendants can do and more.
    • Import Query Results has more robust file filtering capabilities. This allows for more targeted, selective imports.
    • Import Query Results is newer (and better maintained) than Import Descendants.
    • There are only a handful of scenarios where Import Descendants must be used over Import Query Results.


Similarities

  • Both providers import files from a CMIS Repository.
  • Both providers have the same Batch Creation settings.
  • Both providers are capable of "Sparse" imports by changing the "Import Mode" to "Sparse".
  • Both providers can dispose of files on import (using the "Delete Item", "Move Item", or "Update Properties" options).

Differences

The biggest difference is in how the providers determine which files are imported (import criteria).

  • Import Descendants will import all files from a target location. This includes all files in all subfolders if present. You can, however, set a "Base Folder" within the CMIS Repository.
  • Import Query Results will import files that match a CMIS Query. This is a specialized query language based on SQL syntax. This gives you many more options for import conditions, using a "WHERE" clause in the query. CMIS Queries also give you the capability to restrict imports to a folder location without importing files in subfolders (This is something Import Descendants cannot do).
  • Import Descendants does have an "Import Filter" it can use to set import conditions. It also uses a SQL-like syntax. However, it is not as advanced as the CMIS Queries that Import Query Results uses.


CMIS Repositories that can only use Import Descendants

Certain CMIS Bindings are not queryable using CMIS Queries. Because of this, certain CMIS Repositories cannot utilize Import Query Results. The following CMIS Repositories must use Import Descendants to import file content:

  • FTP
  • SFTP
  • NTFS (only if the directory has not been indexed by the Windows Search service or the Windows Search service is not running)
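
The rule in the list above can be written as a small decision function. The Python sketch below is just a restatement of that list: the function name and its windows_search_indexed parameter are hypothetical, and the parameter stands for "the directory is indexed and the Windows Search service is running".

```python
def must_use_import_descendants(binding: str, windows_search_indexed: bool = True) -> bool:
    """Return True when Import Query Results is not an option for this binding.

    Hypothetical helper that mirrors the list above; not part of Grooper.
    """
    name = binding.strip().upper()
    if name in {"FTP", "SFTP"}:
        # These bindings are not queryable with CMIS Queries.
        return True
    if name == "NTFS":
        # NTFS is only queryable when Windows Search has indexed the directory.
        return not windows_search_indexed
    # Other bindings can use Import Query Results, which is the preferred provider.
    return False
```

For example, must_use_import_descendants("NTFS", windows_search_indexed=False) returns True, while the same call for a SharePoint repository returns False.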


Prereqs: CMIS Repository

A CMIS Repository allows Grooper access to files and folders within a storage platform.

Because Import Descendants imports from a CMIS Repository, you can import from numerous storage platforms determined by the "CMIS Binding" used. These CMIS Bindings include:

  • NTFS to connect to Windows folders
  • FTP to connect to FTP directories
  • SFTP to connect to SFTP directories
  • Exchange to connect to Outlook inboxes
  • SharePoint to connect to SharePoint sites (and document libraries)
  • OneDrive to connect to OneDrive drives
  • Box to connect to Box accounts
  • AppXtender to connect to AppEnhancer applications

Before you can import files from these platforms using Import Descendants or Import Query Results, there's some setup required in the Grooper Design page. You must:

  1. Create and configure a CMIS Connection.
  2. Import a folder location as a CMIS Repository.

This will allow you to import files from folders accessed by the CMIS Repository. For information on CMIS Connections and CMIS Repositories, including how to create them in Grooper, visit the CMIS Connection page.

Click here for an interactive walkthrough.

Example Import Descendants configuration

Regardless of the platform you're accessing with a CMIS Repository, you configure Import Descendants largely the same. Just pick a CMIS Repository and configure the rest of Import Descendants as needed.

  • Import Descendants will import all files in the base folder, including all descendant files in any subfolders if present.
  • Unless you configure the "Base Folder" property, Import Descendants will start at the root of the CMIS Repository and continue down the folder structure.
  • When using Import Descendants, setting the "Base Folder" to a terminal branch in the folder structure (a folder with no subfolders) is the only way to import files from a folder without importing descendant files (because there are none in this case).
    • A more technical way of saying this is that Import Descendants has no equivalent of the IN_FOLDER CMIS search predicate (direct children only). It always behaves like the IN_TREE predicate (all descendants).
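
To make the difference between "files in a folder" and "all descendant files" concrete, the Python sketch below builds a throwaway folder tree on the local disk and lists it both ways. It uses only the standard library and an ordinary file system as an analogy. It does not call Grooper or a CMIS Repository, and the folder and file names are made up.

```python
import tempfile
from pathlib import Path


def direct_child_files(folder: Path) -> list[str]:
    """Files located directly in the folder (no subfolders visited)."""
    return sorted(p.name for p in folder.iterdir() if p.is_file())


def descendant_files(folder: Path) -> list[str]:
    """Files in the folder and in every subfolder below it."""
    return sorted(
        p.relative_to(folder).as_posix() for p in folder.rglob("*") if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "2024").mkdir()
    (base / "a.pdf").write_text("a")
    (base / "2024" / "b.pdf").write_text("b")

    children = direct_child_files(base)           # only the top-level file
    descendants = descendant_files(base)          # top-level and nested files
    leaf_only = descendant_files(base / "2024")   # a folder with no subfolders
```

Import Descendants corresponds to the descendant_files behavior. Pointing it at a folder with no subfolders, as in leaf_only, is the case where the two listings coincide.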

Example: Submitting Import Descendants from the Imports Page

Click here for a step by step walkthrough.

  1. Go to the Imports Page.
  2. Press the "New Import Job" button.
  3. This brings up the "Submit Import Job" editor.
  4. Enter a description in the Description property (This is required).
  5. Open the "Provider" dropdown (Press the "☰" button).
  6. Select "Import Descendants" from the dropdown list.
  7. Expand the Provider settings to configure it.
  8. Open the "Repository" node selector (Press the "☰" button).
  9. Select the CMIS Repository you wish to import from.
  10. Select a Base Folder, as needed.
  11. Configure the Import Mode property, as needed.
  12. Configure the Batch Creation settings, as needed.
  13. Configure the file disposition options (Delete Item, Move To Folder, or Update Properties), as needed.
  14. Configure any remaining Import Descendants properties, as needed.
  15. Press the "Submit" button when finished.
  16. Your Import Watcher service will pick up and execute the Import Job.
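
The settings chosen in the steps above can be summarized as a small record with a pre-submit check. The Python sketch below is only a checklist in code form: ImportJobDraft, its field names, and validate are hypothetical and do not correspond to a Grooper API. Treating the Repository as mandatory is an assumption based on the instruction to pick a CMIS Repository, while the required Description comes from step 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportJobDraft:
    """Hypothetical summary of the values entered in the editor."""
    description: str
    repository: str
    provider: str = "Import Descendants"
    base_folder: Optional[str] = None  # None means start at the repository root


def validate(job: ImportJobDraft) -> list[str]:
    """Return a list of problems to fix before pressing Submit."""
    problems = []
    if not job.description.strip():
        problems.append("Description is required")
    if not job.repository.strip():
        # Assumption: a repository must be selected for the provider to run.
        problems.append("Repository must be selected")
    return problems


# Made-up example values.
ok_job = ImportJobDraft(description="Nightly invoices", repository="Invoices Share")
bad_job = ImportJobDraft(description=" ", repository="")
```

Here validate(ok_job) returns an empty list, and validate(bad_job) reports both missing values.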