2021:CMIS Export (Export Definition): Difference between revisions

From Grooper Wiki

Revision as of 11:46, 29 September 2021

CMIS Export is one of the Export Types available when configuring an Export Behavior. It exports content over a CMIS Connection, allowing users to export documents and their metadata to various on-premise and cloud-based storage platforms.

CMIS Connections allow Grooper to standardize most, if not all, export configuration for a variety of storage platforms. This object can connect Grooper to both cloud-based storage platforms (such as true CMIS content management systems, a Microsoft OneDrive account, or an online Exchange email server) and on-premise platforms (such as a Windows file system or an on-premise Exchange server). It standardizes access to these platforms by exposing them as if they were CMIS endpoints using the CMIS standard.

The CMIS Connection connects to an individual platform using a CMIS Binding, which defines the logic required for document interchange between Grooper and the storage platform. For example, the NTFS binding is used to connect to a Windows file system for import and export operations.

CMIS Export allows for the most advanced types of document export. It lets you use document metadata and the data Grooper extracts in a variety of ways. Many content management systems store metadata in fields alongside the documents themselves. For CMIS Bindings that support it, CMIS Export can map document metadata and extracted data to those fields, connecting objects or properties in a Grooper Content Model (such as Data Fields in a Data Model) to their corresponding locations in the content management system (such as a column in a SharePoint site). Even for simpler platforms (like an NTFS file system), metadata can be used for file naming and folder indexing.


About CMIS and CMIS Connections

CMIS stands for "Content Management Interoperability Services". It is an open standard that allows different content management systems to interoperate over the Internet. This standard protocol allows Grooper to use many different platforms for importing and exporting documents and their contents. Once a CMIS Connection is created, Grooper can exchange documents with these platforms. "Interoperability" means Grooper has the same access to control the system as a human being does. It is a "one-to-one" connection to the platform, allowing full and total control.

Upon connecting to an external content management system, Grooper will be able to see the "repositories" associated with it.  A repository, in computer science, is a general term for a location where data lives. Different systems refer to "repositories" in different ways.  An email inbox could be a repository. A folder in Windows could be a repository. A cabinet in ApplicationXtender could be a repository. It's a place to put things. We standardize the various terms used by various storage platforms to simply "repository".

These repositories are "imported" into Grooper as a CMIS Repository object, as a child of the CMIS Connection object. This doesn't import data into Grooper in the traditional sense of importing documents into a new Batch. "Importing" here is more like bringing the repository into a framework Grooper can use. Upon importing the repository, Grooper has full file access to that location in the storage platform.

For our purposes, repositories are like filing cabinets full of documents. Once a connection is established, it's like giving Grooper a key to that cabinet. You can open the various drawers of that cabinet. You can pull files out and put files in. The storage platform or content management system is like the cabinet.

  • The CMIS Connection object is like the key.
  • The CMIS Repository object is like a drawer in the cabinet.
  • You "connect" to the cabinet by turning the key. You "import" the repository by opening the drawer. Now you can see there are documents in there! You can take them out. You can read them and put them back in. You can put new ones in. You can use this "open" connection to the "drawer" however you need.
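The key-and-drawer analogy above can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's object model or API: the class names `Connection` and `Repository`, their methods, and the sample file name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """The 'drawer': a named location that holds documents (hypothetical model)."""
    name: str
    documents: dict = field(default_factory=dict)

    def put(self, filename: str, content: bytes) -> None:
        self.documents[filename] = content

    def get(self, filename: str) -> bytes:
        return self.documents[filename]


@dataclass
class Connection:
    """The 'key': knows which repositories a storage platform exposes."""
    platform: str
    available: list
    imported: dict = field(default_factory=dict)

    def import_repository(self, name: str) -> Repository:
        # "Importing" only registers the repository for later use.
        # It does not copy any documents into a Batch.
        if name not in self.available:
            raise KeyError(f"{self.platform} has no repository named {name!r}")
        return self.imported.setdefault(name, Repository(name))


connection = Connection(platform="NTFS", available=["Grooper Import Export"])
drawer = connection.import_repository("Grooper Import Export")
drawer.put("invoice-001.pdf", b"%PDF-")  # put a new document in the drawer
print(sorted(drawer.documents))
```

Importing the same repository twice returns the same object in this sketch, which mirrors the idea that the repository is a standing reference to a location rather than a copy of its contents.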

CMIS+ Architecture

Grooper expanded on this idea in version 2.72 to create our CMIS+ architecture. CMIS+ unifies all content platforms under a single framework as if they were traditional CMIS endpoints. Prior to version 2.72, there was only one type of CMIS Connection, a true CMIS connection using CMIS 1.0 or CMIS 1.1 servers. Now, connections to additional non-CMIS document storage platforms can be made via "CMIS Bindings". This provides standardized access to document content and metadata across a variety of external storage platforms.

Using this architecture, Grooper is able to create a simpler and more efficient import and export workflow, using a variety of storage platforms. You now use the CMIS Import Import Provider and the CMIS Export Export Type, regardless of the storage platform. They connect to a CMIS Repository imported from a CMIS Connection and use that as Grooper's import or export path.

How you create a CMIS Connection differs only from one CMIS Binding to the next, because each platform is connected to in a different way. You don't connect to an Outlook inbox the same way you connect to a Windows file folder, for example. Thus, the property configuration for the Exchange binding is different from that of the NTFS binding.

CMIS Bindings

A CMIS Binding provides connectivity to external storage platforms for content import and export. Grooper's CMIS+ architecture expands connectivity from traditional CMIS servers to a variety of on-premise and cloud-based storage platforms by exposing connections to these platforms as CMIS Bindings.

Each individual CMIS Binding contains the settings and logic required to exchange documents between Grooper and each distinct platform. For example, the AppXtender Binding contains all the information Grooper uses to connect to the ApplicationXtender content management system.

CMIS Bindings are used when creating a CMIS Connection object. The first step in creating a CMIS Connection is to configure the Connection Type property. Here, the user selects which CMIS Binding to use, and therefore which storage platform to connect to. The second step is to enter the connection settings for that binding, such as login information for many bindings.

Current CMIS Bindings

Grooper can connect to the following storage platforms using CMIS Bindings:

  • AppXtender - Defining connection to the ApplicationXtender document management platform.
  • Box - Defining connection to the Box.com cloud storage platform.
  • FileBound - Defining connection to the FileBound document management platform.
  • IBM FileNet Connector - Defining connection to the FileNet content management platform.
  • CMIS - Defining connection to any content management systems using CMIS 1.0 or CMIS 1.1 servers.
  • The following Microsoft content platforms
    • Exchange - Defining connection to the Microsoft Exchange mail server platform (i.e. Outlook mailboxes).
    • OneDrive - Defining connection to the OneDrive cloud storage platform.
    • SharePoint - Defining connection to Microsoft SharePoint sites.
  • FTP - Defining connection to an FTP (File Transfer Protocol) server.
  • SFTP - Defining connection to an SFTP (SSH File Transfer Protocol) server.
  • IMAP - Defining connection to IMAP mail servers.
  • NTFS - Defining connection to the Microsoft Windows file system.

How To

Prereqs: Understanding the Content Model and Documents Used in These Tutorials

You may download and import the file below into your own Grooper environment (version 2021). This contains a Batch with the example document(s) discussed in this tutorial and a Content Model configured according to its instructions.

  • [[]]

In the following "how to" tutorials, we will use a simple Content Model used for purchase order and invoice processing. First, we should familiarize ourselves with the Content Model and some of the documents. Understanding our content, both in terms of the documents themselves as well as the Content Type and Data Model hierarchy of the Content Model will make it easier to follow along with the subsequent tutorials.

The Documents

In our sample test Batch, we have a series of documents we ultimately want to export in one way or another, using CMIS Export. In this Batch you will find the following kinds of documents:

  1. If you've imported the zip for this tutorial, you will find a sample test Batch navigating the following path in the Node Tree:
    • Root Node > Batch Processing > Batches > Test > Export Activity > Sample Export Batch - POs and Invoices


The Content Model

These documents are modeled by our example Content Model named "Export Example Model - POs and Invoices".

  1. If you've imported the zip for this tutorial, you will find this Content Model by navigating the following path in the Node Tree:
    • Root Node > Content Model > Export Activity > Export Example Model - POs and Invoices

Our document set is represented by the Content Type hierarchy of our Content Model.

  1. The invoices from various vendors are modeled by the "Invoice" Document Type.
  2. The purchase orders are modeled by the "Purchase Order" Document Type.
  3. All the different pricing letters, which notify a vendor of a price increase, a price decrease, or a promotional price, are child Document Types of the "Price Letters" Content Category.
  4. The "Price Decrease Letter", "Price Increase Letter", and "Promo Price Letter" Document Types model these corresponding kinds of price letters.


The Data Model Hierarchy

Part of modeling a document set with a Content Model and its component Document Types is modeling the data elements you wish to extract. This is done with one or more Data Models in the Content Model's hierarchy.

Any Content Type can have a child Data Model. Data you wish to extract is defined by adding child Data Elements to the Data Model, such as Data Field and Data Table objects. These objects are then configured with extractors to parse a Batch Folder's text data and return a value, stored as the document's index data when the Extract activity is executed.

Ultimately, understanding a Document Type's Data Model and how it inherits Data Elements from parent Content Types will be critical for configuring CMIS Export (and truly, any Export Type). We can use extracted data in a variety of ways from document folder pathing and naming to mapping extracted data to storage locations in content management systems (for those that support it). Understanding how the data flows through a Content Model's Content Type and thus Data Model hierarchy is necessary to understand how to call it out later on down the line during export.

  1. In our case the parent Content Model has a Data Model with a single child Data Field named "Document Date"
    • Extraction logic is already configured for this Data Field to return a date for any of our documents, such as an invoice date for our invoices or the letter date for our pricing letters.

All child Document Types will inherit the Data Elements of their parent Content Type's Data Model. This means all Document Types will extract the "Document Date" Data Field when the Extract activity runs.

  1. For example, the "Invoice" Document Type has its own Data Model, with its own Data Elements.
    • These are various Data Fields and a Data Table (the one named "Invoice Line Items") that only relate to invoices.
  2. This kind of data is specific to invoices and will only be extracted if a Batch Folder is assigned the "Invoice" Document Type during extraction.
    • We can see these Data Elements in the Data Model preview panel with the Data Model selected in the Node Tree.
  3. However, the "Invoice" Document Type's Data Model inherits any Data Elements from any parent Content Type.
    • In this case, the "Document Date" Data Field is inherited from the parent Content Model's Data Model.
    • This Data Field also shows up in the "Invoice" Data Model's preview panel. In essence, it becomes a part of the "Invoice" Document Type's Data Model.

  1. Similarly, the "Purchase Order" Document Type has its own child Data Model with its own Data Elements relating just to purchase orders.
    • We only want these Data Elements extracted if the Batch Folder is classified as the "Purchase Order" Document Type.
  2. However, it too is a child of the parent Content Model. As such, it inherits the Content Model's Data Model as well.
    • So, it too has a "Document Date" Data Field as part of its Data Model.
FYI You may have noticed the "Invoice" Document Type and "Purchase Order" Document Type both have a "PO Number" and "Vendor" Data Field in their Data Models.

Be aware, these are two separate objects in the Node Tree. They have different extractors extracting their data. These Data Fields extract data in different ways depending on the Batch Folder's Document Type. They just happen to share the same name.

However, they are in totally different locations in the Content Model's hierarchy, and thus are distinct objects.

  1. For the three different kinds of pricing letters, a Data Model is added to the "Price Letters" Content Category.
    • This Data Model has its own pricing letter related Data Elements.
  2. Remember, Data Elements flow through a Content Model's Content Type hierarchy. The "Price Letters" Content Category is the parent Content Type of the three pricing letter Document Types ("Price Decrease Letter", "Price Increase Letter", and "Promo Price Letter").
  3. As such, any Batch Folder assigned one of these three Document Types will inherit the Content Category's Data Elements for its Data Model.
    • Furthermore, for our made-up use case here, the individual pricing letter Document Types don't have their own Data Models. We don't need them! For each of these three Document Types we want to extract the same set of data. The parent "Price Letters" Content Category's Data Model will apply to all three Document Types. Creating a unique Data Model for each Document Type would be a waste of time, in our case.
  4. Not only that, any grandparent Data Elements are inherited as well, such as the top-level Content Model's "Document Date" Data Field.
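The inheritance rules described in this section can be sketched as a small tree walk. This is a minimal illustration in Python, not Grooper code: the `ContentType` class is hypothetical, only some of the Data Elements named in this article are listed, and "Effective Date" is an invented stand-in for the pricing letter elements, which the article does not name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category, or Document Type."""
    name: str
    parent: Optional["ContentType"] = None
    own_elements: list = field(default_factory=list)

    def data_elements(self) -> list:
        # Inherited elements come first, from the top of the hierarchy down.
        inherited = self.parent.data_elements() if self.parent else []
        return inherited + self.own_elements


model = ContentType("Export Example Model - POs and Invoices",
                    own_elements=["Document Date"])
invoice = ContentType("Invoice", parent=model,
                      own_elements=["PO Number", "Vendor", "Invoice Line Items"])
# "PO Number" and "Vendor" here are separate objects from the Invoice ones;
# they only share a name.
purchase_order = ContentType("Purchase Order", parent=model,
                             own_elements=["PO Number", "Vendor"])
price_letters = ContentType("Price Letters", parent=model,
                            own_elements=["Effective Date"])  # invented example element
promo = ContentType("Promo Price Letter", parent=price_letters)  # no Data Model of its own

print(invoice.data_elements())
# ['Document Date', 'PO Number', 'Vendor', 'Invoice Line Items']
print(promo.data_elements())
# ['Document Date', 'Effective Date']
```

The "Promo Price Letter" entry has no elements of its own, yet it still resolves both its parent's and its grandparent's elements, which is the behavior described for the pricing letter Document Types.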


The Batch Process

  1. If you've imported the zip for these tutorials, you will find a sample Batch Process by navigating the following path in the Node Tree:
    • Root Node > Batch Processing > Processes > Working > Export Activity > Export Process - POs and Invoices
    • This is a simple Batch Process used to process our document set. It will recognize their text, classify them according to our Content Model's configuration, and extract data (as described in the previous section). The documents in the sample test Batch provided have been processed according to this Batch Process.
  2. The last step in this Batch Process is an Export activity step.
    • As we discuss CMIS Export set up in the following how-to tutorials, keep in mind the Export activity is the activity that drives document export. It's all part of a process (a Batch Process) where content is ingested into a Batch, processed by various activities, and ultimately exported by the Export activity.
  3. The Export activity exports Batch Folder document content according to an Export Behavior configuration.
    • CMIS Export is one way to get that done, exporting content over a CMIS Connection. Specifically, as will be described in the next tutorial, CMIS Export is an Export Type configuration for an Export Behavior.
    • FYI: The Export step in this Batch Process will be unconfigured if you imported the zip for these tutorials. Part of configuring CMIS Export involves connecting to external systems. Obviously we can't connect to your personal storage environments. However, this content will get you started to follow along using the subsequent lessons.


Perform a Basic CMIS Export

CMIS Exports can range from very simple exports of Batch Folder content, to more complex exports, utilizing Grooper extracted content in a variety of ways. We will start with the most basic configuration of a CMIS Export. These steps will be largely applicable to any CMIS Export. By the end of this tutorial, we will export PDF files generated from the image and OCR text content of the Batch Folders in our Batch, as well as an XML metadata file generated from the extracted Data Model Elements for each Batch Folder.

Establish the CMIS Connection and CMIS Repository

Before configuring a CMIS Export, you must have created a CMIS Connection and imported a CMIS Repository. For more information on how to create a CMIS Connection and import a CMIS Repository refer to the CMIS Connection article.

For this example, we will simply export to a Windows folder on a local drive.

  1. We have created a CMIS Connection using the NTFS Connection Type
  2. We have imported a CMIS Repository connecting Grooper to a folder named "Grooper Import Export".
  3. And we will be exporting to this subfolder named "Export".


Configure an Export Behavior

CMIS Export is one of the Export Type options when configuring an Export Behavior. Export Behaviors control what document content for a Batch Folder is exported where, according to its classified Document Type. As such, in order to configure a CMIS Export, you must first configure an Export Behavior for a Content Type (a Content Model or its child Content Categories or Document Types).

Export Behaviors can be configured in one of two ways:

  1. Using the Behaviors property of a Content Type object
    • A Content Model
    • A Content Category
    • Or, a Document Type
  2. As part of the Export activity's property configuration

Option 1: Content Type Export Behaviors

An Export Behavior configuration can be added to any Content Type object (i.e. Content Models, Content Categories, and Document Types) using its Behaviors property. Doing so will control how a Document Type "behaves" upon export.

  1. For example, here we have a Content Model selected in the Node Tree.
  2. To add an Export Behavior, first select the Behaviors property.
  3. Then, press the ellipsis button at the end of the property.

  1. This will bring up the Behaviors collection editor window.
  2. Press the "Add" button.
  3. Select Export Behavior.
    • You can only configure one Export Behavior per Content Type object.
    • Children Content Type objects will inherit export settings from their parent Content Type's Export Behavior configuration.
    • However, multiple Export Behaviors may be added by configuring the Behaviors property of multiple Content Types. For example, if every Document Type needed a unique Export Behavior configuration, you could configure the Behaviors property for each one, adding one Export Behavior to the Behaviors list for each one.

  1. You will see the Export Behavior added to the Behaviors list.
  2. Selecting it, you can now add one or more Export Definitions with the Export Definitions property.


FYI: When configured using the Behaviors property of a Content Type object, the Export activity will export Batch Folder content in a Batch according to the Export Definition settings configured for the Batch Folder's assigned Document Type, or for its parent Content Category or parent Content Model, depending on which Content Type's Behaviors property is configured in the Content Model's hierarchy.

Option 2: Export Activity Export Behaviors

    Export Behaviors can also be configured as part of the Export activity's configuration. These are called "local" Export Behaviors. They are local to the Export activity in the Batch Process.

    1. For example, here we have a working Batch Process selected in the Node Tree.
      • This is a simple Batch Process used to import purchase order, invoice, and other related documents, recognize their text, and extract some basic data from them. The last step in this Batch Process is an Export step.
    2. Select the Export step of the Batch Process.
    3. To add an Export Behavior, select the Export Behaviors property.
    4. Then, press the ellipsis button at the end of the property.

    1. This will bring up the Export Behaviors collection editor window.
    2. Press the "Add" button to add a new Export Behavior
    3. An Export Behavior will be added to the list.
    4. With the Export Behavior selected you must define which Content Type the behavior applies to using the Content Type property.
      • Note in both cases, a Content Type is involved in configuring Export Behaviors. Whether local to the Export activity or as part of a Content Model's configuration, Grooper needs to know what to do upon export, given a certain Content Type (and its children Content Types if scoped to a Content Model or Content Category). Once Grooper knows what kind of document it's looking at, we can then inform it what to do in terms of exporting its document content.
    5. Using the dropdown menu, select which Content Type scope should utilize the Export Behavior by selecting either a top-level parent Content Model or one of its child Content Categories or Document Types.
      • Keep in mind you can only select a single Content Type here. You can only configure one Export Behavior per Content Type object.
      • Children Content Type objects will inherit export settings from their parent Content Type's Export Behavior configuration.
    6. However, multiple Export Behaviors may be added locally to the Export activity. For example, if every Document Type needed a unique Export Behavior configuration, you could add one Export Behavior to the list for each one.

    1. Once a Content Type is selected, you can add one or more Export Definitions with the Export Definitions property.
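Whichever option you use, the idea that a Document Type falls back to a parent's Export Behavior can be sketched as a lookup that walks up the Content Type hierarchy. The Python below is an assumption-laden illustration, not Grooper's implementation: the `PARENTS` table and `resolve_export_behavior` function are hypothetical, and the "nearest configured Content Type wins" rule is an assumption, since this article does not say how Grooper resolves behaviors configured at more than one level.

```python
from typing import Optional

MODEL = "Export Example Model - POs and Invoices"

# Child -> parent relationships from the example Content Model.
PARENTS = {
    "Invoice": MODEL,
    "Purchase Order": MODEL,
    "Price Letters": MODEL,
    "Price Decrease Letter": "Price Letters",
    "Price Increase Letter": "Price Letters",
    "Promo Price Letter": "Price Letters",
}


def resolve_export_behavior(content_type: str, behaviors: dict) -> Optional[str]:
    """Return the behavior configured on the nearest Content Type, walking upward.

    Assumption: the closest configured ancestor takes precedence.
    """
    current: Optional[str] = content_type
    while current is not None:
        if current in behaviors:
            return behaviors[current]
        current = PARENTS.get(current)
    return None


# One Export Behavior on the Content Model applies to every Document Type.
behaviors = {MODEL: "model-wide export"}
print(resolve_export_behavior("Promo Price Letter", behaviors))  # model-wide export

# Adding a behavior on one Document Type affects only that Document Type.
behaviors["Invoice"] = "invoice-only export"
print(resolve_export_behavior("Invoice", behaviors))         # invoice-only export
print(resolve_export_behavior("Purchase Order", behaviors))  # model-wide export
```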


    Add an Export Definition

    1. Going forward in this tutorial, we will scope our Export Behavior to the parent Content Model "Export Example Model - POs and Invoices".
      • What Content Type scope in your Content Model you choose will be paramount if you want to use extracted Data Element values for data mapping purposes. However, we do not need to concern ourselves with that for this tutorial. This is the most basic (or "unmapped") version of a CMIS Export. For more information on data mapping, visit the Perform a Mapped CMIS Export tutorial later in this article.
    2. We will choose to configure the Export Behavior using "Option 1", adding it to the Content Model's Behaviors property.

    Regardless of whether you configure the Export Behavior on a Content Type object or local to the Export activity's configuration, your next step is adding an Export Definition.

    1. Once you've added an Export Behavior, select the Export Definitions property.
    2. To add an Export Definition, press the ellipsis button at the end of the property.

    1. This will bring up an Export Definition list editor to add one or more Export Types.


    Add a CMIS Export

    Export Definitions functionally determine three things:

    1. Location - Where the document content ends up upon export. In other words, the storage platform you're exporting to.
    2. Content - What document content is exported: image content, full text content, and/or extracted data content.
    3. Format - What format the exported content takes, such as a PDF file or XML data file.

    Export Definitions do this by adding one or more Export Type configurations to the definition list. The Export Type you choose determines how you want to export content to which platform. In our case, we want to use a CMIS Connection to export content to a connected CMIS Repository. We will add a CMIS Export to the definition list.

    1. To do this, press the "Add" button.
    2. Choose CMIS Export from the list.

    1. This will add an unconfigured CMIS Export to the Export Definitions list.
    2. For all CMIS Export configurations, you must choose which CMIS Repository to export to using the CMIS Repository property.
    3. Location! Location! Location! This is how Grooper knows where you want to export your document content. In our case, we want to export documents to a folder in the "Grooper Import Export" folder on this machine's Windows hard drive.
      • Using the drop down menu, we will select the "NTFS - Grooper Import Export" CMIS Connection's CMIS Repository named "Grooper Import Export".

    Selecting an Export Subfolder Location

    With a CMIS Repository selected, a set of Filing Location properties will appear.

    1. The Target Folder property allows you to select a subfolder in the repository as the export path, rather than just putting documents at the root of the repository.
    2. Selecting this property, you can press the ellipsis button at the end to bring up a navigation window to select a subfolder.
    3. This will bring up the following window. The CMIS Repository you selected for the CMIS Repository property will be at the root of this node tree. Expand the root and subsequent nodes to explore the folder structure of the CMIS Repository.
    4. In our case, we selected the subfolder here, named "Export"

    Now that we know what location we're exporting to, we need to define what content we want to export and what format that content should take.

    1. In a very general way, that's what the Object Data set of properties are all about.
      • More specifically, we will use the Export Formats property to export a simple PDF for each Batch Folder in our sample test Batch.


    Configure Export Formats

    Next, we need to tell Grooper what content we want to export and how we want to export it. Keep in mind, we're exporting to a Windows file folder. Regardless of the storage system, we are always in some way limited by the constraints of the storage system. Some have greater capabilities to house customized metadata fields, for example. The NTFS file system, however, is pretty basic. It is a hierarchical file system to store and organize files.

    So, what can we export? Files. The good news is there's all kinds of different file formats out there. Even just exporting simple files, we can get Grooper processed content (the document's image, the document's full text data, and the document's index data) out of Grooper. To do this, we will add Export Formats to dictate what content exports to what file type.

    There are a variety of Export Formats available to generate and export content from Grooper.

    1. PDF Format - This will output a PDF file from the Batch Folder content. This includes capabilities to embed full text data obtained from the Recognize activity.
    2. XML Metadata - This will output extracted Data Model values to an XML file.
    3. JSON Metadata - This will output extracted Data Model values to a JSON file.
    4. Simple Metadata - This will output extracted Data Model values to a text file.
      • This file formats Data Fields and their values as simple "key-value pairs"
    5. Delimited Metadata - This will also output extracted Data Model values to a text file.
      • This formats Data Field values as a delimiter-separated value array.
    6. TIF Format - This will output image content only as a TIF file.
    7. Text Format - This will output full text content only, generated from OCR data, as a text file.
    8. Attachment - For document files that were imported from a digital source, this will output the Batch Folder's attachment file. This option can also output any file attached to a Batch Folder by referencing a filename. This is how Grooper exports custom generated files from activities such as XML Transform or custom scripted activities.
      • If the Batch Folder has no attachment, this option will generate an image version of the document from all child Batch Pages in the folder.
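To make the difference between the four metadata formats concrete, the sketch below renders the same set of field values as key-value pairs, a delimited row, JSON, and XML using only the Python standard library. It illustrates the general shape of each format only: the field values are invented, the function names are hypothetical, and the exact layout, element names, and delimiters Grooper writes may differ.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Invented sample values for three Data Fields named in this article.
fields = {"Document Date": "2021-09-29", "PO Number": "PO-1001", "Vendor": "Acme Supply"}


def as_simple(values: dict) -> str:
    """Key-value pairs, one per line."""
    return "\n".join(f"{name}={value}" for name, value in values.items())


def as_delimited(values: dict, delimiter: str = ",") -> str:
    """A single delimiter-separated row of values."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerow(values.values())
    return buffer.getvalue().rstrip("\n")


def as_json(values: dict) -> str:
    return json.dumps(values, indent=2)


def as_xml(values: dict) -> str:
    # Field names contain spaces, so they are stored as attributes
    # rather than used as element names.
    root = ET.Element("Document")
    for name, value in values.items():
        ET.SubElement(root, "Field", name=name).text = value
    return ET.tostring(root, encoding="unicode")


print(as_simple(fields))
print(as_delimited(fields))
print(as_json(fields))
print(as_xml(fields))
```

Using the `csv` module for the delimited row means values that contain the delimiter are quoted automatically, which is the usual reason to prefer it over a plain string join.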

    We will export PDF files generated from the image and OCR text content of the Batch Folders in our Batch, as well as an XML metadata file generated from the extracted Data Model Elements for each Batch Folder.

    1. To add an Export Format, first select the Export Formats property.
    2. Press the ellipsis button at the end of the property.

    1. This will bring up the Export Formats collection editor.
    2. By default, there will always be an Attachment Export Format present in the list.
    3. We're going to use a different format. We will get rid of it by selecting it in the list, and pressing the "Delete" button.

    1. To add a new Export Format, press the "Add" button.
    2. Select the format you wish to output from the list.
      • We will first choose the PDF Format.

    1. This will add a PDF Format to the list of Export Formats.
    2. With an Export Format selected, the right panel will allow you to further configure the exported file.
      • For example, in our case, we've enabled the Searchable property under Build Options. This will embed the full text data generated by the Recognize activity in our Batch Process into each page in the PDF.

    You can add as many Export Formats as you want. This allows you to export multiple files generated from the Batch Folder content in your Batch. For example, we've extracted data from our documents, using the Extract activity of our Batch Process. We can create an XML metadata file with all that data using the XML Metadata Export Format.

    1. To add a new Export Format to the list, press the "Add" button again.
    2. Select the additional file format you wish to output from the list.
      • We will choose XML Metadata.

    Upon executing the Export activity, now two files will be exported for each Batch Folder in the Batch, one for each Export Format in our list.

    1. A PDF file generated from the PDF Format.
    2. An XML file generated from the XML Metadata format.
    3. Press "OK" on this and all subsequent windows to save your changes.


    Export the Documents

    With the Export Behavior configured, we can now test our export.

    1. The Export activity in our Batch Process will apply our Export Behavior to every Batch Folder in the Batch.
    2. FYI: Because we configured the Export Behavior on our Content Model (using its Behaviors property editor), we do not have to configure the Export activity's local properties.
      • We've given Grooper all the information it needs to export content. The Export activity will go through every Batch Folder, one by one, in the Batch. It will see the Batch Folders are classified with one of the Document Types in our Content Model. Since we configured the Export Behavior on the Content Model, all child Document Types will use its configuration settings to export document content.

    We will test our export using the Export activity's "Unattended Activity Tester" tab.

    1. Expand the Batch Process to reveal its child Batch Step nodes.
    2. Select the Export activity step.
    3. Switch to the "Unattended Activity Tester" tab.
    4. Press the "Process All..." button.
      • On the subsequent screen press the "Start" button to start processing the Batch. This will apply the Export activity, as configured in the Batch Process to all items in the activity's scope (Folder Level 1 in our case)

    Success! We exported the documents in our Batch! All of this was made possible by our Content Model's Export Behavior using the CMIS Export.

    1. All files are exported to the connected CMIS Repository location using our NTFS CMIS Connection.
    2. For each Batch Folder, two files were exported, one for each Export Format added and configured.
      • A PDF file from the PDF Format
      • An XML file from the XML Metadata format
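If you want to confirm the result outside of Grooper, a short script can check that every exported document has both files. The Python sketch below assumes the PDF and XML files for one Batch Folder share a base name, which this article does not confirm. The `missing_formats` function is hypothetical, and the demo runs against a temporary folder with made-up file names rather than a real export location.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def missing_formats(export_dir, expected=(".pdf", ".xml")) -> dict:
    """Map each base file name to the expected extensions it lacks.

    Assumption: files exported for the same document share a base name.
    """
    found = defaultdict(set)
    for path in Path(export_dir).iterdir():
        if path.is_file():
            found[path.stem].add(path.suffix.lower())
    return {
        stem: sorted(set(expected) - suffixes)
        for stem, suffixes in found.items()
        if set(expected) - suffixes
    }


# Demo against a throwaway folder: "doc-b" is missing its XML file.
with tempfile.TemporaryDirectory() as export_dir:
    for name in ("doc-a.pdf", "doc-a.xml", "doc-b.pdf"):
        (Path(export_dir) / name).touch()
    print(missing_formats(export_dir))  # {'doc-b': ['.xml']}
```

To run it against a real export, pass the path of the "Export" subfolder on your own machine in place of the temporary folder.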

    FYI This truly is about the most basic export you could do using CMIS Export. There's a lot more functionality available to CMIS Export to get data out of Grooper and use that data to index your documents better.
    1. If nothing else, these exported document filenames could stand improvement. Because these files were originally brought into Grooper as imported PDF files, and because of how we configured CMIS Export, the generated filenames are simply copied from the original files' names.

    In the next tutorial, we will introduce the concept of "data mapping". We will use extracted data to form folder levels and filenames, mapping Grooper extracted metadata to folder and file metadata upon export.


    Intro to Data Mapping: Folder Pathing and File Naming

    A Further Mapped CMIS Export