Document Type (Object)

From Grooper Wiki

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

This article is a stub. It contains minimal information on the topic and should be expanded.

Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.

Glossary

Batch Folder: Batch Folder objects are container objects within a Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.

  • Batch Folders are frequently referred to simply as "documents".

Batch: Batch objects are fundamental in Grooper's architecture, as they are the containers of documents that get moved through Grooper's workflow mechanisms, known as Batch Processes.

Behavior: A Behavior is a group of functionality configured using a Content Type's Behaviors property. Behaviors enable different features for how documents of a specific Content Type are processed and define the settings for those features. This includes how documents are exported, whether Label Sets are used for the Document Type, and more.

Classification: Classification is the process of identifying and organizing documents into categorical types based on their content or layout. Classification is key for efficient document management and data extraction workflows. Grooper has different methods for classifying documents. These include methods that use machine learning and text pattern recognition. In a Grooper Batch Process, the Classify Activity will assign a Content Type to a Batch Folder.

Content Category: Content Category node objects are containers within a Content Model that hold other Content Categories and Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases, Content Categories are used as organizational buckets to group like Document Types together.

Content Model: Content Model node objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.

Data Element: Data Element refers to the objects in Grooper used to collect data from a document. These include Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.

Document Type: Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.

About

If a set contains Invoices, Checks, Receipts and Purchase Orders, there might be four Document Types, one for each kind of document. Classification, in Grooper, is the process of assigning Document Types to Batch Folders in a Batch.
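The four-Document-Type example above can be pictured with a small toy model. This is a minimal sketch in plain Python, not Grooper's API: the class names `ContentType` and `BatchFolder`, the model name "Accounts Payable", and every Data Element name are hypothetical, and the way child nodes pick up Data Elements from their parents is an assumption based on the glossary's description of the Data Element hierarchy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a node in a Content Type hierarchy."""
    name: str
    data_elements: List[str] = field(default_factory=list)
    parent: Optional["ContentType"] = None

    def add_child(self, child: "ContentType") -> "ContentType":
        """Attach a child node (for example, a Document Type) and return it."""
        child.parent = self
        return child

    def all_data_elements(self) -> List[str]:
        """Collect Data Elements from the root of the hierarchy down to this node."""
        inherited = self.parent.all_data_elements() if self.parent else []
        return inherited + self.data_elements


@dataclass
class BatchFolder:
    """Hypothetical stand-in for a document in a Batch."""
    name: str
    document_type: Optional[ContentType] = None


# One root (playing the role of a Content Model) with four Document Types.
model = ContentType("Accounts Payable", data_elements=["Vendor Name"])
invoice = model.add_child(ContentType("Invoice", ["Invoice Number"]))
check = model.add_child(ContentType("Check", ["Check Number"]))
receipt = model.add_child(ContentType("Receipt", ["Receipt Total"]))
purchase_order = model.add_child(ContentType("Purchase Order", ["PO Number"]))

# Classification, in this toy model, is just assigning a Document Type.
folder = BatchFolder("Folder 1")
folder.document_type = invoice
print(folder.document_type.all_data_elements())
```

In this sketch, the folder classified as an Invoice sees both the model-level element and the Invoice-specific one, which mirrors the idea that a Document Type determines which Data Elements apply to a Batch Folder.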

Purposes of a Document Type

Document Types have four functions within Grooper:

  • Classifying documents.
  • Storing the information and property configuration needed for that classification.
    • For example, training weightings for classification methods such as Lexical classification, along with both Positive and Negative Extractors.
  • Setting up the Data Model for data extraction.
  • Defining any Behavior settings.

Classification

This is the primary function of a Document Type. Classifying a document means assigning a Document Type to its Batch Folder; from then on, the folder is treated as a document of that type. The assignment can be made manually or automatically through a Batch Process.
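The difference between manual and automatic assignment can be sketched as follows. This is an illustrative toy, not how Grooper classifies documents: the `Folder` class, the `KEYWORDS` table, and all function names are hypothetical, and the keyword-count rule is a deliberately simple stand-in for Grooper's actual classification methods.

```python
from typing import Dict, List, Optional

# Hypothetical keyword rules, one entry per Document Type name.
KEYWORDS: Dict[str, List[str]] = {
    "Invoice": ["invoice number", "amount due"],
    "Purchase Order": ["purchase order", "ship to"],
}


class Folder:
    """Hypothetical stand-in for a Batch Folder holding a document's text."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.document_type: Optional[str] = None


def classify_text(text: str) -> Optional[str]:
    """Return the Document Type name with the most keyword hits, or None."""
    lowered = text.lower()
    best_name: Optional[str] = None
    best_hits = 0
    for name, words in KEYWORDS.items():
        hits = sum(1 for word in words if word in lowered)
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name


def assign_manually(folder: Folder, document_type: str) -> None:
    """A person picks the Document Type directly."""
    folder.document_type = document_type


def assign_automatically(folder: Folder) -> None:
    """A rule picks the Document Type from the document's content."""
    folder.document_type = classify_text(folder.text)


scanned = Folder("Invoice Number: 123  Amount Due: $40.00")
assign_automatically(scanned)
print(scanned.document_type)

unreadable = Folder("(illegible scan)")
assign_automatically(unreadable)   # no rule matches, so the type stays None
assign_manually(unreadable, "Receipt")
print(unreadable.document_type)
```

Either path ends the same way: the folder carries a Document Type, and everything downstream depends only on which type was assigned, not on how it was assigned.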

Training and Configuration

Extraction

Behaviors