Content Type (Concept)
STUB: This article is a stub. It contains minimal information on the topic and should be expanded.
Content Type refers to objects in Grooper used to classify Batch Folders. These include: Content Models, Content Categories, and Document Types.
Glossary
Batch: Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms, known as Batch Processes.
Behavior: A Behavior is a group of functionality configured using a Content Type's Behaviors property. Behaviors enable different features for how documents of a specific Content Type are processed and define the settings for those features. This includes how documents are exported, whether Label Sets are used for the Document Type, and more.
Content Category: Content Category node objects are containers within a Content Model that hold other Content Categories and Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases Content Categories are used as organizational buckets to group like Document Types together.
Content Model: Content Model node objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.
Content Type: Content Type refers to objects in Grooper used to classify Batch Folders. These include: Content Models, Content Categories, and Document Types.
Data Element: Data Element refers to the objects in Grooper used to collect data from a document. These include: Data Models, Data Sections, Data Fields, Data Tables, and Data Columns.
Document Type: Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.
Form Type: Form Type objects represent trained variations of a Document Type. These objects store machine learning training data for Lexical and Visual document classification methods.
Lexical: The Lexical Classification Method classifies Batch Folders based on the text content of trained document examples. This is achieved through the statistical analysis of word frequencies that identify Document Types.
About
In Grooper, the "Content Type Objects" consist of:
- Content Model
- Content Category
- Document Type
Each of these objects serves a distinct function within Grooper's content classification, and they are related to each other through hierarchical relationships.
The relationship between these objects is established through a hierarchical inheritance system. The Content Model is the "tree", and Content Categories and Document Types are the building blocks within it: Content Categories act as the "branches", and Document Types are the "leaves" of the hierarchy.
"Data Elements" can be defined on each "Content Type Object" and are inherited down the "tree" of the hierarchy.
- "Data Elements" defined at the Content Model level are applied to all "Content Types" within the Content Model.
- "Data Elements" defined at the Content Category level are applied to all "Content Types" that exist within that specific "branch".
- "Data Elements" defined on a Document Type will apply to that specific "leaf".
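The inheritance rules above can be sketched as a small tree model. This is a minimal illustration in Python and not Grooper's API: the `ContentType` class, its `add_child` and `effective_data_elements` methods, and the sample names ("Accounting", "Payables", "Invoice", "Receipt") are all hypothetical, and Data Elements are reduced to plain strings for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category, or Document Type."""
    name: str
    kind: str  # "Content Model", "Content Category", or "Document Type"
    data_elements: List[str] = field(default_factory=list)
    parent: Optional["ContentType"] = None

    def add_child(self, name: str, kind: str, data_elements=None) -> "ContentType":
        # Create a child node that points back to this node as its parent.
        return ContentType(name, kind, list(data_elements or []), parent=self)

    def effective_data_elements(self) -> List[str]:
        # Elements defined higher in the tree come first, then this node's own.
        inherited = self.parent.effective_data_elements() if self.parent else []
        return inherited + self.data_elements


# Assumed example hierarchy: one "tree", one "branch", two "leaves".
model = ContentType("Accounting", "Content Model", ["Received Date"])
category = model.add_child("Payables", "Content Category", ["Vendor Name"])
invoice = category.add_child("Invoice", "Document Type", ["Invoice Number"])
receipt = model.add_child("Receipt", "Document Type", ["Total"])

print(invoice.effective_data_elements())
# ['Received Date', 'Vendor Name', 'Invoice Number']
print(receipt.effective_data_elements())
# ['Received Date', 'Total']
```

In this sketch the "Invoice" leaf picks up elements from both the model and its category, while "Receipt", which sits directly under the model, only inherits the model-level element.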
These "Content Type Objects" work together in Grooper to enable sophisticated document processing workflows. With different types of documents properly classified, they can have their data extracted and be handled according to the rules and behaviors defined by their respective Document Types within a Content Model hierarchy.
Related Objects
Content Model
Content Model node objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.
Content Category
Content Category node objects are containers within a Content Model that hold other Content Categories and Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases Content Categories are used as organizational buckets to group like Document Types together.
Document Type
Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.
What about Form Types and Page Types?
Technically speaking, Form Types and Page Types are also Content Types, but they aren't typically used in the same way. Form Types and Page Types are created automatically when training example documents for classification. They hold the feature weighting data for documents.
- Form Types
  - When a Document Type is trained for classification, the training samples are created as Form Types.
  - Form Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).
- Page Types
  - The Page Types are the individual pages of a Form Type. All training weightings are stored on the Page Types for each page of the training document.
  - Page Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).