Rules-Based Approach: Difference between revisions

From Grooper Wiki
This approach uses [[Data Extractor]]s to find keywords, phrases, or other text-based information in order to identify and classify a document (assigning a '''[[Document Type]]''' to the '''[[Batch Folder|Document Folder]]'''). For example, a document with a centered header of "Purchase Report" might be classified as the "Purchase Report" '''Document Type'''. To identify it, one could build a '''[[Data Type]]''' extractor that uses a regular expression to match the phrase "Purchase Report" centered at the top of the document.
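The header match described above can be sketched outside Grooper with an ordinary regular expression. This is only an illustration in Python, not Grooper's extractor configuration: the function name <code>has_purchase_report_header</code> and the <code>top_lines</code> parameter are hypothetical, and the sketch assumes plain page text. It approximates "centered at the top" by requiring the phrase to stand alone on one of the first few lines, since plain text carries no layout information.

```python
import re

# Assumption: the header is the only text on its line, with optional
# leading/trailing spaces or tabs (a rough stand-in for "centered").
HEADER_PATTERN = re.compile(r"^[ \t]*Purchase Report[ \t]*$",
                            re.IGNORECASE | re.MULTILINE)


def has_purchase_report_header(page_text: str, top_lines: int = 5) -> bool:
    """Return True if the phrase stands alone on one of the first lines."""
    # Only inspect the top of the page, where a header would appear.
    top = "\n".join(page_text.splitlines()[:top_lines])
    return HEADER_PATTERN.search(top) is not None
```

A phrase that appears inside a sentence, or further down the page than <code>top_lines</code>, does not count as a header in this sketch.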
#REDIRECT [[Rules Based (Classification Method)]]
 
The "rules" are set using the '''''Positive Extractor''''' and '''''Negative Extractor''''' properties of a '''Document Type''' object in a '''[[Content Model]]'''. If the extractor set as the '''''Positive Extractor''''' returns a result on a document, the document is classified as that '''Document Type'''. The '''''Negative Extractor''''' works the opposite way: if it returns a result on a document, the document is ''prevented'' from being classified as that '''Document Type'''.
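The positive and negative rule logic can be illustrated with a minimal Python sketch. It is not Grooper's implementation or API: <code>DocumentTypeRule</code> and <code>classify</code> are hypothetical names, the extractors are modeled as compiled regular expressions, and checking rules in list order is an assumption made only for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DocumentTypeRule:
    """Hypothetical stand-in for a Document Type with its two extractors."""
    name: str
    positive: re.Pattern                   # a hit classifies the document
    negative: Optional[re.Pattern] = None  # a hit blocks the classification


def classify(text: str, rules: Sequence[DocumentTypeRule]) -> Optional[str]:
    """Return the first Document Type whose rules accept the text, else None."""
    for rule in rules:
        # A negative hit prevents this type from being assigned.
        if rule.negative is not None and rule.negative.search(text):
            continue
        # A positive hit assigns this type.
        if rule.positive.search(text):
            return rule.name
    return None


rules = [
    DocumentTypeRule(
        name="Purchase Report",
        positive=re.compile(r"Purchase Report"),
        negative=re.compile(r"DRAFT"),
    )
]
```

With these example rules, a page containing "Purchase Report" is classified as that type unless it also contains "DRAFT", in which case the negative rule blocks the classification and <code>classify</code> returns <code>None</code>.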

Revision as of 13:38, 13 October 2020