2.72:What is Classification - DSmith: Difference between revisions

From Grooper Wiki

Revision as of 09:27, 9 January 2024

Overview

Classification is an Activity in Grooper that assigns a Content Type to a Document. While we as humans can classify a document by reading it (or its title, should it have one), to Grooper every incoming document is unclassified, or "blank". If we want Grooper to know what a Purchase Order is, or to tell the difference between a Purchase Order and an Invoice, we have to tell it, and we do that through Classification.
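To make the idea concrete, here is a minimal conceptual sketch of classification as "look at an unclassified document and assign it a type". This is not Grooper's implementation or API (Grooper is configured through its own interface); the `RULES` table, the phrases in it, and the `classify` function are hypothetical names invented purely for illustration.

```python
from typing import Optional

# Hypothetical rules: each Content Type name maps to phrases that suggest it.
RULES = {
    "Purchase Order": ["purchase order", "po number"],
    "Invoice": ["invoice", "amount due"],
}

def classify(text: str) -> Optional[str]:
    """Return the first Content Type whose phrase appears in the text, else None."""
    lowered = text.lower()
    for content_type, phrases in RULES.items():
        if any(phrase in lowered for phrase in phrases):
            return content_type
    return None  # no rule matched, so the document stays unclassified
```

A document whose text matches none of the phrases comes back as `None`, which mirrors the "blank" state described above: until something tells the system what a Purchase Order looks like, it has no way to recognize one.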

Why Classification?

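So, why would you want to perform Classification? Is it even necessary? The short answer is yes. The slightly longer answer is yes, if you want proper data extraction. Extraction is configured per Content Type, so Grooper has to know what kind of document it is looking at before it can apply the right extraction logic.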

Classification Methods

To classify a document, you must choose one of four Classification Methods:

  • Rules-Based
  • Labelset-Based
  • Lexical
  • Visual



The method is set on the Content Model via its Classification Method property.

For more information about each Classification Method, click the following links: