2.72:What is Classification - DSmith: Difference between revisions

From Grooper Wiki

Revision as of 12:50, 9 January 2024

Overview

Classification is an Activity in Grooper that assigns a Content Type to a Document. While we as humans can classify a document by reading it (or its title, should it have one), to Grooper every incoming document is unclassified, or "blank". If we want Grooper to know what a Purchase Order is, or to tell the difference between a Purchase Order and an Invoice, we have to tell it, and we do that through Classification.

Classification Methods

To classify a document, you must choose one of four Classification Methods. They are:

  • Rules-Based
  • Labelset-Based
  • Lexical
  • Visual



These methods can be set on the Content Model via the Classification Method property. Which method you choose depends largely on the kind of document you have: its structure, its complexity, and so on. We will provide a brief overview of each Classification Method here.

Rules-Based

Rules-Based Classification works by using classification rules set up on a Document Type.
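
To make the idea concrete, here is a minimal conceptual sketch of rule-driven classification in Python. It is not Grooper's API or configuration format: the `RULES` table, the `classify` function, and the regular expressions are all hypothetical, chosen only to show how rules attached to a document type could decide which type a document receives.

```python
import re
from typing import Optional

# Hypothetical rule table (not a Grooper structure): each document type
# name maps to regex patterns that indicate that type when found in the text.
RULES = {
    "Purchase Order": [r"\bpurchase\s+order\b", r"\bPO\s*(number|no\.?|#)"],
    "Invoice": [r"\binvoice\b", r"\bamount\s+due\b"],
}


def classify(text: str) -> Optional[str]:
    """Return the first document type with a matching rule, or None."""
    for doc_type, patterns in RULES.items():
        # A single matching pattern is enough to assign the type.
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            return doc_type
    # No rule matched: the document stays unclassified.
    return None
```

In this sketch a document that matches no rule stays unclassified (`None`), mirroring the "blank" state described in the Overview. Rule order matters here because the first matching type wins.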

Labelset-Based

Lexical

Visual

For more information about each Classification Method, click the following links: