Classify Method
Edited by Dgreenwood
Revision as of 16:11, 12 May 2025
STUB: This article is a stub. It contains minimal information on the topic and should be expanded.
There are currently 5 Classify Methods in Grooper:
- Rules-Based - Classifies documents using simple "rules" defined by each Document Type's Positive Extractor and Negative Extractor properties.
- Lexical - Classifies documents using text-based features learned from trained examples of each Document Type in the Content Model.
  - The Lexical method can be used to classify already separated, unclassified Batch Folders during the Classify activity. It can also be used with ESP Auto Separation to separate loose pages into classified Batch Folders during the Separate activity.
- Labelset-Based - Classifies documents based on the presence of text-based labels defined in each Document Type's "label set".
- Search Classifier - Classifies documents by finding similar documents in a search index. The Search Classifier method compares large language model (LLM) embeddings of unclassified documents to the embeddings already collected for documents in the search index.
- Visual - Classifies documents with computer vision, using visual features learned from trained examples of each Document Type in the Content Model.
  - This is a less common Classify Method. It is suitable only for highly structured documents, such as forms, whose general visual appearance does not change from document to document.
  - This is the only Classify Method that does not rely on text data. It can be used at scan time when combined with "Event-Based Separation" using the "Content Type" event.
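
To make two of the ideas above more concrete, here is a minimal Python sketch of (1) rule-style classification with a positive and a negative pattern per Document Type, and (2) embedding-based classification by nearest neighbor in a small index. It is not Grooper code and does not use any Grooper API. The function names, the regular expressions, the toy three-dimensional "embeddings", and the 0.8 similarity threshold are all assumptions made for illustration; a real search index would hold LLM embeddings with many more dimensions.

```python
import math
import re


def classify_by_rules(text, rules):
    """Return the first document type whose positive pattern matches the text
    and whose negative pattern (if one is given) does not. Hypothetical helper."""
    for doc_type, (positive, negative) in rules.items():
        if not re.search(positive, text, re.IGNORECASE):
            continue
        if negative and re.search(negative, text, re.IGNORECASE):
            continue  # the negative pattern vetoes this document type
        return doc_type
    return None


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_by_nearest_embedding(query, index, min_similarity=0.8):
    """Return the document type of the most similar indexed embedding,
    or None if nothing reaches min_similarity (an assumed cutoff)."""
    best_type, best_score = None, min_similarity
    for doc_type, embedding in index:
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_type, best_score = doc_type, score
    return best_type


# Illustrative rules: (positive pattern, negative pattern or None) per document type.
RULES = {
    "Invoice": (r"\binvoice\b", r"\bcredit memo\b"),
    "Purchase Order": (r"\bpurchase order\b", None),
}

# Illustrative index: (document type, made-up embedding) pairs.
INDEX = [
    ("Invoice", [0.9, 0.1, 0.0]),
    ("Contract", [0.0, 0.2, 0.9]),
]

print(classify_by_rules("Invoice No. 1001", RULES))
print(classify_by_nearest_embedding([0.8, 0.2, 0.1], INDEX))
```

With these sample inputs, the first call matches the "Invoice" positive pattern without triggering its negative pattern, and the second query vector lies closest to the indexed "Invoice" embedding. A document that matches no rule, or whose best similarity falls below the threshold, is left unclassified (`None`).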