Category:Document Modeling: Difference between revisions

From Grooper Wiki



Revision as of 14:51, 30 July 2025

"Document modeling" is the process of designing structured representations of documents to better understand them, manage them, and/or extract information from them. Many different Grooper components are used to represent a document in different ways. These include:

  • Batch Objects - These components represent a document's structure and store raw data about it. For example, Batch Pages represent a document's individual pages and store each page's image, text data obtained from Recognize, and more.
  • Content Types - These components represent how documents fit into a classification schema. Together they form a classification "taxonomy". Content Models are at the top of the taxonomy. They are composed of Document Types, which represent the different kinds of documents that fit within the Content Model. Content Types are key to determining the Data Model used to extract data from a document and the Behaviors that control processing logic for several different Activities in Grooper.
  • Data Elements-
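
The relationships described above can be pictured with a small sketch. This is not Grooper's API or object model; the class names, fields, and sample values (such as "Accounting" and "Invoice") are hypothetical and only illustrate the idea that a Content Model sits at the top of the taxonomy and contains Document Types, while a Batch Page holds a page's image and recognized text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BatchPage:
    """Hypothetical stand-in for a Batch Page: one page of a document."""
    image_path: str   # illustrative location of the page image
    text: str = ""    # text data, such as text obtained from Recognize


@dataclass
class DocumentType:
    """Hypothetical stand-in for a Document Type: one kind of document."""
    name: str


@dataclass
class ContentModel:
    """Hypothetical stand-in for a Content Model: the top of the taxonomy."""
    name: str
    document_types: List[DocumentType] = field(default_factory=list)

    def type_names(self) -> List[str]:
        """Return the names of the Document Types in this Content Model."""
        return [dt.name for dt in self.document_types]


# Illustrative values only; none of these names come from Grooper.
model = ContentModel("Accounting", [DocumentType("Invoice"), DocumentType("Receipt")])
page = BatchPage(image_path="page_001.png", text="Invoice #1001")

print(model.type_names())  # ['Invoice', 'Receipt']
```

In this sketch the classification side (Content Model and Document Types) is kept separate from the structural side (Batch Page), mirroring the split between Content Types and Batch Objects in the list above.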