2.72:What is Classification - DSmith: Difference between revisions

From Grooper Wiki

Revision as of 12:26, 9 January 2024

Overview

Classification is an Activity in Grooper that assigns a Content Type to a Document. While we as humans may be able to classify a document by reading it (or its title, should it have one), to Grooper every incoming document is unclassified, or "blank". If we want Grooper to know what a Purchase Order is, or to tell a Purchase Order from an Invoice, we have to tell it, and we do that through Classification.
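
To make the idea concrete, here is a minimal sketch in Python of what "assigning a Content Type to a Document" means. It is not Grooper code and does not reflect how Grooper implements any of its Classification Methods; the `Document` class, the `RULES` table, and the `classify` function are all hypothetical names invented for this illustration. It assumes a document is just text, and that a content type can be picked by simple keyword matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in for a document; not a Grooper type."""
    text: str
    # None models an unclassified ("blank") document.
    content_type: Optional[str] = None


# Hypothetical keyword rules, chosen only for illustration.
RULES = {
    "Purchase Order": ["purchase order", "po number"],
    "Invoice": ["invoice", "amount due"],
}


def classify(doc: Document) -> Document:
    """Assign the first content type whose keywords appear in the text."""
    text = doc.text.lower()
    for content_type, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            doc.content_type = content_type
            break
    return doc


if __name__ == "__main__":
    doc = Document("PURCHASE ORDER\nPO Number: 1001")
    print(doc.content_type)            # unclassified before the call
    print(classify(doc).content_type)  # content type assigned by the rules
```

A document that matches none of the keywords keeps `content_type` as `None`, mirroring the point above that a document stays "blank" until something tells the system what it is.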

Classification Methods

To classify a document, you must choose one of four Classification Methods. They are:

  • Rules-Based
  • Labelset-Based
  • Lexical
  • Visual



These methods are set on the Content Model via the Classification Method property. Which method you choose depends largely on the sort of document you have: its structure, its complexity, and so on. We will provide a brief overview of each Classification Method here.

Rules-Based

Labelset-Based

Lexical

Visual

For more information about each Classification Method, click the following links: