Rules-Based Approach: Difference between revisions

Revision as of 15:14, 6 October 2020

This approach uses Data Extractors to find keywords, phrases, or other text-based information in order to identify and classify a document (assigning a Document Type to the Folder). For example, a document with a centered header of "Purchase Report" might be classified as a "Purchase Report" Document Type. To identify such a document, one could build a Data Type extractor that uses a regular expression to match the phrase "Purchase Report" centered at the top of the document.
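As a rough illustration of the kind of pattern such an extractor might use, here is a minimal Python sketch. It is not the product's own extractor configuration: the function name, the `top_lines` parameter, and the idea of treating leading whitespace as a stand-in for "centered" in plain OCR text are all assumptions made for the example.

```python
import re

# Hypothetical pattern: the phrase must sit alone on its line, preceded by at
# least two spaces or tabs (an assumed proxy for "centered" in plain text).
HEADER = re.compile(
    r"^[ \t]{2,}Purchase[ \t]+Report[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def has_purchase_report_header(page_text: str, top_lines: int = 5) -> bool:
    """Return True if the header phrase appears within the first few lines."""
    # Only look at the top of the page, since the header is expected there.
    top = "\n".join(page_text.splitlines()[:top_lines])
    return HEADER.search(top) is not None
```

Restricting the search to the first few lines keeps a passing mention of "Purchase Report" deep in the body of some other document from counting as a header match.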

The "rules" are set using the Positive Extractor and Negative Extractor properties of a Document Type object in a Content Model. If an extractor is set as the Positive Extractor of the "Purchase Report" Document Type and returns a result on a document, that document is classified as a "Purchase Report". The Negative Extractor works the opposite way: if it returns a result on a document, that document is prevented from being classified as that Document Type.
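The positive/negative logic can be sketched in Python as follows. This is an illustrative model only, not the product's implementation: the `DocumentType` class, the `classify` function, the first-match ordering, and the choice to check the negative rule before the positive one are assumptions, as is the "Purchase Order" phrase used as a sample negative rule.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# An extractor is modeled here as a function that reports whether it
# found a result in the document text (hypothetical simplification).
Extractor = Callable[[str], bool]

@dataclass
class DocumentType:
    name: str
    positive_extractor: Optional[Extractor] = None
    negative_extractor: Optional[Extractor] = None

def classify(text: str, doc_types: List[DocumentType]) -> Optional[str]:
    """Return the name of the first Document Type whose rules accept the text."""
    for dt in doc_types:
        # A hit on the negative extractor blocks this Document Type.
        if dt.negative_extractor and dt.negative_extractor(text):
            continue
        # A hit on the positive extractor assigns this Document Type.
        if dt.positive_extractor and dt.positive_extractor(text):
            return dt.name
    return None  # no rule matched; the document stays unclassified

purchase_report = DocumentType(
    name="Purchase Report",
    positive_extractor=lambda t: re.search(r"Purchase Report", t) is not None,
    # Hypothetical negative rule for the example.
    negative_extractor=lambda t: re.search(r"Purchase Order", t) is not None,
)
```

With this setup, a page containing only "Purchase Report" would be assigned that type, while a page that also contains "Purchase Order" would be left unclassified because the negative rule fires.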

Revision by Dgreenwood (talk | contribs). No edit summary.