2.90:Database Export: Difference between revisions

Revision as of 11:31, 14 April 2020

Database Export is one of the main ways to Deliver data Collected in Grooper.

A completed Content Model and accompanying Batch for what will be built can be found by following this link and downloading the provided file. It is not required to download to understand this article, but can be helpful because it can be used to follow along with the content of this article. This file was exported from and meant for use in Grooper 2.9

About

The most important goal of Grooper is to deliver accurate data to line of business systems that allow the information to be integrated into impactful business decisioning. Tables in databases remain, to this day, one of the main vessels by which this information is stored. Grooper's Database Export activity is the mechanism by which this delivery is performed. Database Export uses a configured Data Connection to establish a link to (Microsoft SQL Server or ODBC-compliant) tables in a database and intelligently populate said tables.

Two key distinctions about Grooper's Database Export activity are its ability to take full advantage of its sophisticated hierarchical data modeling to flatten complex/inherited data structures, and the ease of delivery to multiple tables at once.

How To

Following is an example of how to setup the Database Export activity. In this example are a couple of different document formats whose data will be collected by a single Content Model and used to help illustrate the two key distinctions mentioned above. The first document shows the flattening of inherited data, while the second is used to give an avenue to export to multiple tables at once.

!	Some of the tabs in this tutorial are longer than the others. Please scroll to the bottom of each step's tab before going to the step.

Understanding the FormsUnderstanding the Content ModelIndex DataConfiguring a Data Connection

Understanding the Forms

The thing to understand about the document displayed on the right is that there is a combination of single instance data, as well as a table displaying a dynamic amount of rows, i.e. - a one-to-many relationship. The single instance data (Last Name, First Name, Employee ID), as a result of only being listed once on the document, will only be collected once, but needs to be married to each row of information from the table. While only collected once, it will be reported many times.

The second document is collecting only information from the table and exporting as is. This document was added to demonstrate (later in this article) how to export to multiple tables via one step in a Batch Process.

Understanding the Content Model

The Content Model extracting the data for these documents is fairly straight forward. There are two Document Types, each with their own Data Model.

The first Data Model is the one representing the one-to-many relationship. Notice for the fields represented once in the document there are Data Fields. For the tabular data, a Data Table was established.

The second Data Model is simply collecting all the tabular data from the document with one Data Table. There is no one-to-many relationship here.

Index Data

Before the Database Export activity can send data, it must have data. It's easy to get in the habit of testing extraction on a Data Field or a Data Model and feel good about the results, but it must be understood that the information displayed when doing so is in memory, or temporary. When the Extract activity is successfully run against a document it creates Index Data and marries it to the document via a JSON file called Grooper.DocumentData.json. A couple of ways to verify its existence are as follows.

First, navigate to the Document object in the node tree; not the Batch, not the Batch Folder, not the Page, but specifically the Document object. This is where the information lives.
From there, click on the Index Data tab.
After doing so you can see data displayed. Were the document classified, but not successfully extracted, the data structure would be present, but the fields empty.

Another means of verifying is to again, click on the Document object in the node tree.
Click the Advanced tab.
Click the Files tab.
In the List View you should see the Grooper.DocumentData.json file.
When you click on that file, you should see the stored JSON information of the indexed data displayed in the viewer below.

Configuring a Data Connection

In order for the Database Export activity to run, it needs an established connection to a database and subsequent table(s). Grooper can connect to an existing database and import references to its tables, or easily enough, you can have Grooper create the database FOR you AND create tables based on data structures present in Grooper!

The following setup will have Grooper create the database and its tables.

It is worth noting that this article cannot tell you specifics about permissions in your environment. The configuration for this article is using Microsoft SQL Server, not ODBC, and has given the active Active Directory user full DB Admin privileges to the SQL environment.

@@ Line 12: / Line 12: @@
 {|
 | style="padding:25px; vertical-align:top" |
-Following is an example of how to setup '''Database Export'''. In this example are a couple of different document formats whose data will be collected by a single '''Content Model''' and used to help illustrate the two key ''distinctions'' mentioned above. The first document show the ''flattening'' of inherited data, while the second is used to give an avenue to export to multiple tables at once.
+Following is an example of how to setup the '''Database Export''' activity. In this example are a couple of different document formats whose data will be collected by a single '''Content Model''' and used to help illustrate the two key ''distinctions'' mentioned above. The first document shows the ''flattening'' of inherited data, while the second is used to give an avenue to export to multiple tables at once.
 || [[File:database_export_001.gif]]
 |}
@@ Line 44: / Line 44: @@
 The first '''Data Model''' is the one representing the one-to-many relationship. Notice for the fields represented once in the document there are '''Data Fields'''. For the tabular data, a '''Data Table''' was established.
 <p/>
-The second '''Data Model''' is simply collecting all the tabular data from the document. There is no one-to-many relationship here.
+The second '''Data Model''' is simply collecting all the tabular data from the document with one '''Data Table'''. There is no one-to-many relationship here.
 || [[File:database_export_004.gif]]
 |}
@@ Line 50: / Line 50: @@
 <tab name="Index Data" style="margin:25px">
 ====Index Data====
-Before the '''Database Export''' activity can send data, it must have data. Its easy to get in the habit of testing extraction on a '''Data Field''' or a '''Data Model''' and feel good about the results, but it must be understood that the information displayed when doing so is in memory, or temporary. When the '''Extract''' activity is successfully run against a document it creates '''Index Data''' and marries it to the document via a [https://en.wikipedia.org/wiki/JSON JSON] file called ''Grooper.DocumentData.json''. A couple of ways to verify its existence are as follows.<p/>
+Before the '''Database Export''' activity can send data, it must have data. It's easy to get in the habit of testing extraction on a '''Data Field''' or a '''Data Model''' and feel good about the results, but it must be understood that the information displayed when doing so is in memory, or temporary. When the '''Extract''' activity is successfully run against a document it creates '''Index Data''' and marries it to the document via a [https://en.wikipedia.org/wiki/JSON JSON] file called ''Grooper.DocumentData.json''. A couple of ways to verify its existence are as follows.<p/>
 {|
 | style="padding:25px; vertical-align:center" |
@@ Line 73: / Line 73: @@
 In order for the '''Database Export''' activity to run, it needs an established connection to a database and subsequent table(s). '''Grooper''' can connect to an existing database and import references to its tables, or easily enough, you can have '''Grooper''' create the database FOR you AND create tables based on data structures present in '''Grooper'''!<p/>
 The following setup will have '''Grooper''' create the database and its tables.<p/>
-It is worth noting that this article cannot tell you specifics about permissions in your environment. The configuration for this article is using [Microsoft SQL Server], not ODBC, and has given the active Active Directory user full DB Admin privileges to SQL environment.
+It is worth noting that this article cannot tell you specifics about permissions in your environment. The configuration for this article is using Microsoft SQL Server, not ODBC, and has given the active Active Directory user full DB Admin privileges to the SQL environment.
 </tab>
 </tabs>