Content Type (Concept)

From Grooper Wiki
Revision as of 13:50, 22 December 2023 by Dgreenwood (talk | contribs)

This article is a stub. It contains minimal information on the topic and should be expanded.

Content Types are the building blocks of a Content Model. They represent the content of a document set, both how its documents are classified and what data is extracted from them.

There are five Content Types:

  • The Content Model itself
  • Document Types
    • Each Document Type represents one kind of document. When documents are classified in Grooper, they are assigned a Document Type.
  • Content Categories
    • These are used to create "branches" in the document classification and data extraction hierarchy of a Content Model.
    • These are also often used as simple organizational tools, grouping similar Document Types under a single Content Category.
  • Form Types
    • When a Document Type is trained for classification, the training samples are created as Form Types.
    • Form Types are generated automatically when training documents for Lexical classification.
  • Page Types
    • Page Types are the individual pages of a Form Type. All training weightings for training samples are stored on the Page Types.
    • Page Types are generated automatically when training documents for Lexical classification.


The different Content Types are used to create a hierarchical structure within a Content Model, with each Content Type forming a different level of the document set's classification taxonomy. The Content Model forms the root of this hierarchy.
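
The hierarchy described above can be pictured as a simple tree. The following is only a rough sketch in Python, not Grooper's actual object model or API: the ContentType class, its add and walk methods, the build_example helper, and the sample names ("Accounting", "Invoices", "Vendor Invoice", and so on) are all hypothetical and exist purely for illustration. The nesting shown (Content Category under the Content Model, Document Type under a Content Category, Form Type under a Document Type, Page Type under a Form Type) is an assumption drawn from the descriptions in the list above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Illustrative only: these names are hypothetical and are not part of Grooper.
# The five kinds mirror the five Content Types listed in this article.
KINDS = (
    "Content Model",
    "Content Category",
    "Document Type",
    "Form Type",
    "Page Type",
)


@dataclass
class ContentType:
    """One node in a (hypothetical) Content Model tree."""

    name: str
    kind: str
    children: List["ContentType"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown Content Type kind: {self.kind}")

    def add(self, child: "ContentType") -> "ContentType":
        """Attach a child node and return it, so calls can be chained."""
        # The Content Model is the root of the hierarchy, so it is never a child.
        if child.kind == "Content Model":
            raise ValueError("the Content Model is the root and cannot be a child")
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "ContentType"]]:
        """Yield (depth, node) pairs in depth-first order, starting at this node."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def build_example() -> ContentType:
    """Build a small sample tree; every name here is made up."""
    model = ContentType("Accounting", "Content Model")
    invoices = model.add(ContentType("Invoices", "Content Category"))
    vendor = invoices.add(ContentType("Vendor Invoice", "Document Type"))
    sample = vendor.add(ContentType("Training Sample 1", "Form Type"))
    sample.add(ContentType("Page 1", "Page Type"))
    sample.add(ContentType("Page 2", "Page Type"))
    model.add(ContentType("Purchase Order", "Document Type"))
    return model


if __name__ == "__main__":
    # Print the tree with one level of indentation per level of the hierarchy.
    for depth, node in build_example().walk():
        print("  " * depth + f"{node.name} ({node.kind})")
```

Running the script prints the sample tree with the Content Model at the top and each deeper level indented beneath it, which is one way to visualize how each Content Type occupies its own level of the classification taxonomy.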