Content Type (Concept)
STUB
This article is a stub. It contains minimal information on the topic and should be expanded.
Content Type refers to objects in Grooper used to classify Batch Folders. These include: Content Models, Content Categories, and Document Types.
About
In Grooper, the "Content Type Objects" consist of:
- Content Model
- Content Category
- Document Type
Each of these objects serves a distinct function within Grooper's content classification, and they are related to each other through a hierarchy.
The relationship between these objects is established through a hierarchical inheritance system. The Content Model is the "tree", and Content Categories and Document Types are the building blocks within it: Content Categories act as the "branches", and Document Types are the "leaves" of the hierarchy.
"Data Elements" can be defined on each "Content Type Object" and are inherited down the "tree" of the hierarchy.
- "Data Elements" defined at the Content Model level are applied to all "Content Types" within the Content Model.
- "Data Elements" defined at the Content Category level are applied to all "Content Types" that exist within that specific "branch".
- "Data Elements" defined on a Document Type apply only to that specific "leaf".
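The inheritance rules above can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's API: the `ContentType` class, its `add_child` and `effective_data_elements` methods, and the example names ("Accounting", "Invoices", "Received Date", and so on) are all hypothetical. It shows how a "leaf" ends up with its own Data Elements plus everything defined on the "branches" and "tree" above it.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class ContentType:
    """Hypothetical stand-in for a Content Model, Content Category, or Document Type."""
    name: str
    data_elements: list[str] = field(default_factory=list)  # defined on this node only
    parent: ContentType | None = None

    def add_child(self, name: str, data_elements: list[str]) -> ContentType:
        """Create a child Content Type whose parent is this node."""
        return ContentType(name, data_elements, parent=self)

    def effective_data_elements(self) -> list[str]:
        """Own Data Elements plus everything inherited from ancestors, root first."""
        inherited = self.parent.effective_data_elements() if self.parent else []
        return inherited + self.data_elements


# Assumed example hierarchy (all names are made up for illustration).
model = ContentType("Accounting", ["Received Date"])                  # the "tree"
invoices = model.add_child("Invoices", ["Invoice Number"])            # a "branch"
vendor_invoice = invoices.add_child("Vendor Invoice", ["PO Number"])  # a "leaf"
receipt = model.add_child("Receipt", ["Store Name"])                  # a leaf directly under the model

print(vendor_invoice.effective_data_elements())
# ['Received Date', 'Invoice Number', 'PO Number']
print(receipt.effective_data_elements())
# ['Received Date', 'Store Name']
```

In this sketch, "Invoice Number" reaches "Vendor Invoice" because it sits on the same branch, but it does not reach "Receipt", which only inherits from the model.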
These "Content Type Objects" work together in Grooper to enable sophisticated document processing workflows. Once documents are properly classified, their data can be extracted and they can be handled according to the rules and behaviors defined by their respective Document Types within a Content Model hierarchy.
Related Objects
Content Model
Content Model node objects define the taxonomy of document sets in terms of the Document Types they contain. They also house the Data Elements that appear on each Content Category and Document Type within them. Content Models serve as the root of a Content Type hierarchy and are crucial for organizing the different types of documents that Grooper can recognize and process.
Content Category
Content Category node objects are containers within a Content Model that hold other Content Categories and Document Type objects. They allow for further classification and grouping of Document Types within a taxonomy, aiding in the logical structuring of complex document sets. Besides grouping Document Types together, Content Categories also serve to create new branches in a Data Element hierarchy. In most cases, Content Categories are used as organizational buckets to group like Document Types together.
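To make the container role concrete, here is a minimal sketch of a taxonomy in which Content Categories nest inside one another and Document Types sit at the ends. The `Node` class, the `kind` strings, the `document_type_paths` helper, and the "HR" example names are hypothetical and chosen only for illustration; none of them come from Grooper itself.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node; kind is 'model', 'category', or 'document_type'."""
    name: str
    kind: str
    children: list[Node] = field(default_factory=list)


def document_type_paths(node: Node, prefix: str = "") -> list[str]:
    """Return the full path of every Document Type at or below `node`."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    if node.kind == "document_type":
        return [path]
    paths: list[str] = []
    for child in node.children:
        paths.extend(document_type_paths(child, path))
    return paths


# Assumed example taxonomy: categories can hold categories and Document Types.
taxonomy = Node("HR", "model", [
    Node("Onboarding", "category", [
        Node("Tax Forms", "category", [Node("W-4", "document_type")]),
        Node("Offer Letter", "document_type"),
    ]),
    Node("Resume", "document_type"),
])

for p in document_type_paths(taxonomy):
    print(p)
# HR/Onboarding/Tax Forms/W-4
# HR/Onboarding/Offer Letter
# HR/Resume
```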
Document Type
Document Type objects represent a distinct type of document, like an invoice or contract. Document Types are created as children of a Content Model or a Content Category and are used to classify individual Batch Folders. Each Document Type in the hierarchy defines the Data Elements and Behaviors that apply to Batch Folders of that specific classification.
What about Form Types and Page Types?
Technically speaking, Form Types and Page Types are also Content Types, but they aren't typically used in the same way. Form Types and Page Types are created automatically when training example documents for classification. They hold the feature weighting data for documents.
- Form Types
  - When a Document Type is trained for classification, the training samples are created as Form Types.
  - Form Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).
- Page Types
  - Page Types are the individual pages of a Form Type. All training weightings are stored on the Page Types for each page of the training document.
  - Page Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).
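The containment described above (Document Type, then Form Type per training sample, then Page Type per page) can be pictured with a small data model. This is a sketch under stated assumptions rather than a description of how Grooper stores training data: the class layout, the `train` method, and the feature names and weight values are all invented for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class PageType:
    """One page of a training sample; holds that page's weightings."""
    page_number: int
    weightings: dict[str, float] = field(default_factory=dict)  # feature -> weight (made-up format)


@dataclass
class FormType:
    """One training sample for a Document Type, made up of Page Types."""
    name: str
    pages: list[PageType] = field(default_factory=list)


@dataclass
class DocumentType:
    name: str
    form_types: list[FormType] = field(default_factory=list)

    def train(self, sample_name: str, page_features: list[dict[str, float]]) -> FormType:
        """Record a training sample as a new Form Type with one Page Type per page."""
        pages = [PageType(i, dict(features)) for i, features in enumerate(page_features, start=1)]
        form = FormType(sample_name, pages)
        self.form_types.append(form)
        return form


invoice = DocumentType("Invoice")
# Hypothetical two-page training sample with invented feature weights.
invoice.train("Sample 1", [{"invoice": 0.9, "total": 0.6}, {"terms": 0.4}])

form = invoice.form_types[0]
print(form.name, len(form.pages))   # Sample 1 2
print(form.pages[1].weightings)     # {'terms': 0.4}
```

Note that the Form Type itself carries no weightings in this sketch; they live on its Page Types, mirroring the description above.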