Content Type (Concept)

This article is a stub. It contains minimal information on the topic and should be expanded.

Content Types are the building blocks of a Content Model. They represent the content of a document set, both how its documents are classified and what data those documents contain.

There are three main Content Types:

The Content Model itself
Document Types
- These are used to represent one kind of document or another. When documents are classified in Grooper, they are assigned a Document Type.
Content Categories
- These are used to create "branches" in the document classification and data extraction hierarchy of a Content Model.
- These are also often used as simple organizational tools, grouping similar Document Types under a single Content Category.

The different Content Types are used to create a hierarchical structure within a Content Model, with each Content Type forming a different level of the document set's classification taxonomy. The Content Model forms the root of this hierarchy.
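
The hierarchy described above can be pictured as a simple tree. The following is a minimal sketch in Python, not Grooper's actual object model or API: the `ContentType` class, its `add` and `path_to` methods, and the example names ("Accounting", "Invoices", and so on) are all hypothetical and exist only to illustrate how a Content Model, Content Categories, and Document Types nest.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentType:
    """Hypothetical node in a Content Model tree (not a Grooper class)."""
    name: str
    kind: str  # "Content Model", "Content Category", or "Document Type"
    children: List["ContentType"] = field(default_factory=list)

    def add(self, child: "ContentType") -> "ContentType":
        """Attach a child Content Type and return it for chaining."""
        self.children.append(child)
        return child

    def path_to(self, target: str,
                trail: Optional[List[str]] = None) -> Optional[List[str]]:
        """Return the names from this node down to `target`, or None."""
        trail = (trail or []) + [self.name]
        if self.name == target:
            return trail
        for child in self.children:
            found = child.path_to(target, trail)
            if found:
                return found
        return None


# Illustrative names only; the Content Model is the root of the tree.
model = ContentType("Accounting", "Content Model")
invoices = model.add(ContentType("Invoices", "Content Category"))
invoices.add(ContentType("Vendor Invoice", "Document Type"))
invoices.add(ContentType("Credit Memo", "Document Type"))
model.add(ContentType("Purchase Order", "Document Type"))

print(model.path_to("Credit Memo"))
# ['Accounting', 'Invoices', 'Credit Memo']
```

In this sketch a Document Type can sit directly under the Content Model or under a Content Category, which mirrors the idea that Content Categories are optional "branches" used to group similar Document Types.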

What about Form Types and Page Types?

Technically speaking, Form Types and Page Types are also Content Types, but they aren't typically used in the same way. Form Types and Page Types are created automatically when training example documents for classification. They hold the feature weighting data for documents.

Form Types
- When a Document Type is trained for classification, the training samples are created as Form Types.
- Form Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).
Page Types
- The Page Types are the individual pages of a Form Type. All training weightings are stored on the Page Types for each trained page.
- Page Types are generated automatically when training documents for Lexical classification (and less commonly for Visual classification).
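
The relationship between a Document Type, its Form Types, and their Page Types can be sketched as nested records. This is an illustrative Python sketch under stated assumptions, not how Grooper stores training data: the class names, the `train` method, and the feature names and weight values are all hypothetical. It shows only the shape described above, where each training sample becomes a Form Type and the weightings live on one Page Type per trained page.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PageType:
    """Hypothetical stand-in for one trained page and its feature weightings."""
    page_number: int
    weightings: Dict[str, float] = field(default_factory=dict)


@dataclass
class FormType:
    """Hypothetical stand-in for one training sample of a Document Type."""
    name: str
    pages: List[PageType] = field(default_factory=list)


@dataclass
class DocumentType:
    """Hypothetical stand-in for a Document Type that collects Form Types."""
    name: str
    form_types: List[FormType] = field(default_factory=list)

    def train(self, sample_name: str,
              page_features: List[Dict[str, float]]) -> FormType:
        """Record a training sample as a Form Type, one Page Type per page."""
        pages = [PageType(number, dict(features))
                 for number, features in enumerate(page_features, start=1)]
        form = FormType(sample_name, pages)
        self.form_types.append(form)
        return form


# Made-up feature names and weights, for illustration only.
invoice = DocumentType("Invoice")
form = invoice.train("Invoice sample 1",
                     [{"invoice": 0.9, "total": 0.4}, {"remit": 0.7}])

print(len(invoice.form_types), [p.page_number for p in form.pages])
# 1 [1, 2]
```

Note that the Form Type itself carries no weightings in this sketch; they are stored only on its Page Types, matching the description above.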