2023.1:Content Model (Node Type): Difference between revisions

From Grooper Wiki

Revision as of 09:46, 31 January 2024

STUB

This article is a stub. It contains minimal information on the topic and should be expanded.

A Content Model is Grooper's digital representation of a document set's content. Everything you want to glean from your documents is configured within a Content Model, including how documents are classified and what data is extracted from them.

Content Models are the fundamental Content Type. Other Content Types, such as Document Types, are established within a Content Model. Content Models have two main purposes in Grooper: Document Classification and Data Extraction.



Let's look at how Document Classification and Data Extraction can be used on a Content Model:

Document Classification is an important task that the Content Model helps facilitate. The very first property of a Content Model is the Classification Method. This tells Grooper how to classify documents. This can be done in one of five ways:

  • GPT Embeddings
  • Labelset-Based
  • Lexical
  • Rules-Based
  • Visual

Content Models define the classification taxonomy for a set of documents: the list of distinct document types (via Document Types) and their hierarchical structure within the Content Model (via optional Content Categories). How a document is classified is defined here as well (via the Classification Method and the Document Types).
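
The taxonomy described above can be pictured as a small tree. The following is a minimal Python sketch for illustration only, not Grooper's API: the class names, the `document_type_paths` helper, and the sample "HR Documents" model are all hypothetical. Only the five Classification Method names come from this article.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClassificationMethod(Enum):
    # The five methods listed in this article.
    GPT_EMBEDDINGS = "GPT Embeddings"
    LABELSET_BASED = "Labelset-Based"
    LEXICAL = "Lexical"
    RULES_BASED = "Rules-Based"
    VISUAL = "Visual"


@dataclass
class DocumentType:
    # A distinct type of document (a leaf of the taxonomy).
    name: str


@dataclass
class ContentCategory:
    # Optional grouping node; holds categories or document types.
    name: str
    children: list = field(default_factory=list)


@dataclass
class ContentModel:
    # Root of the taxonomy; carries the Classification Method.
    name: str
    classification_method: ClassificationMethod
    children: list = field(default_factory=list)


def document_type_paths(node, prefix=""):
    """Return the full path of every Document Type under node."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    if isinstance(node, DocumentType):
        return [path]
    paths = []
    for child in node.children:
        paths.extend(document_type_paths(child, path))
    return paths


# Hypothetical sample model, invented for this sketch.
model = ContentModel(
    name="HR Documents",
    classification_method=ClassificationMethod.LEXICAL,
    children=[
        ContentCategory("Onboarding", [DocumentType("W-4"), DocumentType("I-9")]),
        DocumentType("Timesheet"),
    ],
)

print(document_type_paths(model))
```

In this sketch the Content Category is optional: "Timesheet" sits directly under the Content Model, while "W-4" and "I-9" are grouped under "Onboarding".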

Hand-in-hand with the classification taxonomy, Content Models also define the hierarchical data structure for the documents and document set (via Data Models of the various Content Types in the Content Model). The Data Models and their Data Elements define what data is extracted from documents and how that is accomplished.
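
One way to picture the hierarchical data structure is to attach a Data Model to each Content Type and walk up the hierarchy to collect the Data Elements that apply to a document. The Python sketch below is illustrative only and is not Grooper's API. The class names, the `effective_elements` helper, and the sample invoice fields are hypothetical, and the idea that a child Content Type also picks up its parents' Data Elements is an assumption of this sketch that the article does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataModel:
    # Names of the Data Elements to extract (hypothetical simplification).
    elements: list[str] = field(default_factory=list)


@dataclass
class ContentType:
    # Any node in the hierarchy: Content Model, Content Category, or Document Type.
    name: str
    data_model: DataModel = field(default_factory=DataModel)
    parent: Optional["ContentType"] = None


def effective_elements(content_type):
    """Collect Data Elements from the root down to content_type.

    Assumption of this sketch: children also use their parents' elements.
    """
    chain = []
    node = content_type
    while node is not None:
        chain.append(node)
        node = node.parent
    elements = []
    for node in reversed(chain):  # root first, most specific last
        elements.extend(node.data_model.elements)
    return elements


# Hypothetical sample hierarchy, invented for this sketch.
invoices = ContentType("Invoices", DataModel(["Invoice Number", "Invoice Date"]))
vendor_a = ContentType("Vendor A Invoice", DataModel(["PO Number"]), parent=invoices)

print(effective_elements(vendor_a))
```

Here the shared fields live once at the Content Model level, and the Document Type adds only what is specific to it.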