Document Type (Node Type)

From Grooper Wiki

This article is a stub. It contains minimal information on the topic and should be expanded.

Document Type nodes represent a distinct type of document, such as an invoice or a contract. Document Types are created as child nodes of a Content Model or a Content Category. They serve three primary purposes:

  1. They are used to classify documents. Documents are considered "classified" when the Batch Folder is assigned a Content Type (most typically, a Document Type).
  2. The Document Type's Data Model defines the Data Elements extracted by the Extract activity (including any Data Elements inherited from parent Content Types).
  3. The Document Type defines all "Behaviors" that apply (whether from the Document Type's Behavior settings or those inherited from a parent Content Type).
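
The inheritance described in points 2 and 3 can be pictured with a small conceptual model. The sketch below is not Grooper's API: the `ContentType` class, its fields, and the example node, Data Element, and Behavior names are all hypothetical, written in Python purely for illustration. It also assumes that inherited items are gathered by walking from the top-most parent down to the Document Type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category, or Document Type."""
    name: str
    parent: Optional["ContentType"] = None
    data_elements: List[str] = field(default_factory=list)
    behaviors: List[str] = field(default_factory=list)

    def lineage(self) -> List["ContentType"]:
        """Return the chain of Content Types from the root down to this node."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def effective_data_elements(self) -> List[str]:
        """Own Data Elements plus those inherited from every parent (assumed order: root first)."""
        return [e for ct in self.lineage() for e in ct.data_elements]

    def effective_behaviors(self) -> List[str]:
        """Own Behaviors plus those inherited from every parent (assumed order: root first)."""
        return [b for ct in self.lineage() for b in ct.behaviors]


# Illustrative hierarchy: Content Model -> Content Category -> Document Type
model = ContentType("Accounts Payable", data_elements=["Document Date"], behaviors=["Export"])
category = ContentType("Vendor Documents", parent=model, data_elements=["Vendor Name"])
invoice = ContentType("Invoice", parent=category, data_elements=["Invoice Number", "Total"])

print(invoice.effective_data_elements())
# ['Document Date', 'Vendor Name', 'Invoice Number', 'Total']
print(invoice.effective_behaviors())
# ['Export']
```

In this sketch the "Invoice" node defines only two Data Elements itself, yet four are available for extraction because the rest come from its parents.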

About

For example, if a document set contains Invoices, Checks, Receipts and Purchase Orders, there might be four Document Types, one for each kind of document. Classification, in Grooper, is the process of assigning Document Types to Batch Folders in a Batch.

Purposes of a Document Type

Document Types have four functions within Grooper:

  • Classifying documents.
  • Storing the information and property configuration needed for that classification.
    • For example, training weightings for classification methods such as Lexical classification, along with both Positive and Negative Extractors.
  • Setting up the Data Model for data extraction.
  • Defining any Behavior settings.

Classification

This is the primary function of a Document Type. When a Document Type is assigned to a Batch Folder, the document in that folder is classified as that type. The assignment can be made manually or automatically through the Batch Process.
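
As a rough illustration of classification as assignment, the following Python sketch models a Batch as a list of folders that are either assigned a Document Type by hand or by an automatic pass. Everything here is hypothetical: `BatchFolder`, `classify_batch`, and the keyword rules are invented for the example, and the keyword match is only a toy stand-in, not one of Grooper's classification methods. The choice to leave manually assigned folders untouched is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BatchFolder:
    """Hypothetical stand-in for a Batch Folder holding one document's text."""
    text: str
    document_type: Optional[str] = None

    @property
    def is_classified(self) -> bool:
        # A folder counts as classified once a Document Type is assigned.
        return self.document_type is not None


def classify_batch(folders: List[BatchFolder], rules: Dict[str, str]) -> None:
    """Assign a Document Type to each unclassified folder using toy keyword rules."""
    for folder in folders:
        if folder.is_classified:
            continue  # assumption: keep manual assignments as they are
        for doc_type, keyword in rules.items():
            if keyword.lower() in folder.text.lower():
                folder.document_type = doc_type
                break


batch = [
    BatchFolder("INVOICE #1001 Total due: 250.00"),
    BatchFolder("Purchase Order 77"),
    BatchFolder("Thank you for your payment"),
]

# Manual classification: a person assigns the Document Type directly.
batch[2].document_type = "Receipt"

# Automatic classification: a rule-based pass over the rest of the Batch.
classify_batch(batch, {"Invoice": "invoice", "Purchase Order": "purchase order"})

print([f.document_type for f in batch])
# ['Invoice', 'Purchase Order', 'Receipt']
```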

Training and Configuration

Extraction

Behaviors