Data Element

From Grooper Wiki

Data Elements are a class of node types used to collect data from a document. These include: data_table Data Models, insert_page_break Data Sections, variables Data Fields, table Data Tables, and view_column Data Columns.

About

Types of Data Elements

The "Data Element" nodes in Grooper consist of:

data_table Data Model ...
variables Data Field ...
insert_page_break Data Section ...
table Data Table and ...
view_column Data Column nodes .

Each of these nodes has its own function within Grooper's data extraction architecture but are also intimately related to each other.

The relationship between these Data Elements is hierarchical and modular.

  • The Data Model acts as the overall blueprint for data extraction.
  • Data Sections structure the document into logical parts. Data Sections can also serve as simple organizational objects within a Data Model to bucket similar "Data Elements" together.
  • Data Tables are incorporated into the model to handle tabular data. Each Data Table comprises Data Columns which specify the format and rules for columnar data extraction.
  • Finally, Data Fields are the fundamental units of data of any kind representing individual pieces of non-repeated data within a document. The exception to this is when Data Fields are contained within a "multi instance" Data Section that occurs repeatedly within a document.

Related Node Types

Data Model

data_table Data Models are leveraged during the Extract activity to collect data from documents (folder Batch Folders). Data Models are the root of a Data Element hierarchy. The Data Model and its child Data Elements define a schema for data present on a document. The Data Model's configuration (and its child Data Elements' configuration) define data extraction logic and settings for how data is reviewed in a Data Viewer.

Data Field

variables Data Fields represent a single value targeted for data extraction on a document. Data Fields are created as child nodes of a data_table Data Model and/or insert_page_break Data Sections.

  • Data Fields are frequently referred to simply as "fields".

Data Section

A insert_page_break Data Section is a container for Data Elements in a data_table Data Model. variables They can contain Data Fields, table Data Tables, and even Data Sections as child nodes and add hierarchy to a Data Model. They serve two main purposes:

  1. They can simply act as organizational buckets for Data Elements in larger Data Models.
  2. By configuring its "Extract Method", a Data Section can subdivide larger and more complex documents into smaller parts to assist in extraction.
    • "Single Instance" sections define a division (or "record") that appears only once on a document.
    • "Multi-Instance" sections define collection of repeating divisions (or "records").

Data Table

A table Data Table is a Data Element specialized in extracting tabular data from documents (i.e. data formatted in rows and columns).

  • The Data Table itself defines the "Table Extract Method". This is configured to determine the logic used to locate and return the table's rows.
  • The table's columns are defined by adding view_column Data Column nodes to the Data Table (as its children).

Data Column

view_column Data Columns represent columns in a table extracted from a document. They are added as child nodes of a table Data Table. They define the type of data each column holds along with its data extraction properties.

  • Data Columns are frequently referred to simply as "columns".
  • In the context of reviewing data in a Data Viewer, a single Data Column instance in a single Data Table row, is most frequently called a "cell".


Object Model info

Grooper Type Name

Grooper.Core.DataElement

Inheritance

Grooper Object (Grooper.GrooperObject)
Connected Object (Grooper.ConnectedObject)
Node (Grooper.GrooperNode)
Data Element (Grooper.Core.DataElement)

Derived Types

Data Element (Grooper.Core.DataElement)
Data Field (Grooper.Core.DataField)
Data Column (Grooper.Core.DataColumn)
Data Field Container (Grooper.Core.DataFieldContainer)
Data Element Container (Grooper.Core.DataElementContainer)
Data Model (Grooper.Core.DataModel)
Data Section (Grooper.Core.DataSection)
Data Table (Grooper.Core.DataTable)