Classify Method
This article is a stub. It contains minimal information on the topic and should be expanded.
There are currently 5 Classify Methods in Grooper:
- Rules-Based - Classifies documents using simple "rules" defined by each Document Type's Positive Extractor and Negative Extractor properties.
- Lexical - Classifies documents using text-based features learned from trained examples of each Document Type in the Content Model.
  - The Lexical method can be used to classify already separated, unclassified Batch Folders during the Classify activity. It can also be used with ESP Auto Separation to separate loose pages into classified Batch Folders during the Separate activity.
- Labelset-Based - Classifies documents based on the presence of text-based labels defined in each Document Type's "label set".
- Search Classifier - Classifies documents by finding similar documents in a search index. The Search Classifier method compares large language model (LLM) embeddings of unclassified documents with embeddings already collected for documents in the search index.
- Visual - Classifies documents with computer vision, using visual features learned from trained examples of each Document Type in the Content Model.
  - This is a less common Classify Method. It is suitable only for highly structured documents, such as forms, whose general visual appearance does not change from document to document.
  - This is the only Classify Method that does not rely on text data. It can be used at scan-time when combined with "Event-Based Separation" using the "Content Type" event.
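
The idea behind the Search Classifier method, comparing an unclassified document's embedding to embeddings already in an index, can be illustrated with a rough conceptual sketch. The Python below is not Grooper's implementation. It assumes cosine similarity as the comparison measure, a hypothetical `min_similarity` cutoff, and toy three-dimensional vectors standing in for real LLM embeddings. The names `cosine_similarity` and `classify_by_nearest` are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def classify_by_nearest(embedding, index, min_similarity=0.8):
    """Return (document_type, similarity) for the most similar indexed document.

    `index` is a list of (document_type, embedding) pairs.
    The document type is None when no indexed document reaches `min_similarity`
    (a hypothetical threshold, assumed here for illustration).
    """
    best_type, best_score = None, -1.0
    for doc_type, indexed_embedding in index:
        score = cosine_similarity(embedding, indexed_embedding)
        if score > best_score:
            best_type, best_score = doc_type, score
    if best_score < min_similarity:
        return None, best_score
    return best_type, best_score


# Toy 3-dimensional "embeddings"; real LLM embeddings are far larger.
index = [
    ("Invoice", [0.9, 0.1, 0.0]),
    ("Invoice", [0.8, 0.2, 0.1]),
    ("Purchase Order", [0.1, 0.9, 0.2]),
]

doc_type, score = classify_by_nearest([0.85, 0.15, 0.05], index)
print(doc_type, round(score, 3))
```

In this sketch, a document whose embedding lies close to the indexed "Invoice" examples is assigned that Document Type, while a document that resembles nothing in the index is left unclassified.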