Field Class (Node Type)

From Grooper Wiki
Revision as of 16:33, 27 December 2019 by Configadmin

Field Classes are Data Extractors that use supervised machine learning to find the correct value on a page. They are trained on examples of positive candidates. A Field Class uses two Data Extractors to do this, a Value Extractor and a Feature Extractor:

The Value Extractor finds the specified output. It can return multiple possible values (candidates). To supply the context that differentiates the right candidate from the wrong ones, the Feature Extractor is written to return words, phrases, or other labels that can identify the value in question. During training, the correct value is selected from the list of value candidates and trained as a positive candidate. The features around it returned by the Feature Extractor are given positive weightings using a TF-IDF algorithm. On other documents, the Field Class uses the weightings of these features to identify the correct value.
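The interplay of the two extractors can be illustrated with a minimal Python sketch. It is not Grooper's implementation: the regular expressions standing in for the Value Extractor and Feature Extractor, the 15-character context window, and the plain tf × log(N/df) weighting are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

DATE = r"\d{2}/\d{2}/\d{4}"    # hypothetical Value Extractor pattern
LABELS = r"invoice|due|date"   # hypothetical Feature Extractor pattern

def candidates(text):
    """Return (value, features) pairs: each value match plus the labels just before it."""
    found = []
    for m in re.finditer(DATE, text):
        # Assumption: only the 15 characters before the value count as context.
        context = text[max(0, m.start() - 15):m.start()]
        feats = [f.lower() for f in re.findall(LABELS, context, re.IGNORECASE)]
        found.append((m.group(), feats))
    return found

def train(examples):
    """examples: list of (text, correct_value). Returns a feature -> weight mapping."""
    tf, df, n = Counter(), Counter(), 0
    for text, correct in examples:
        for value, feats in candidates(text):
            n += 1                    # every candidate context counts as one "document"
            df.update(set(feats))     # number of contexts containing each feature
            if value == correct:
                tf.update(feats)      # feature frequency around positive candidates
    # Features that appear around every candidate get weight log(1) = 0.
    return {f: tf[f] * math.log(n / df[f]) for f in tf}

def pick(text, weights):
    """Return the candidate whose surrounding features carry the highest total weight."""
    scored = [(sum(weights.get(f, 0.0) for f in feats), value)
              for value, feats in candidates(text)]
    return max(scored, key=lambda s: s[0])[1] if scored else None

# Train on one document where the invoice date is the positive candidate.
weights = train([("Invoice Date: 01/02/2019   Due Date: 02/01/2019", "01/02/2019")])

# Apply the learned weightings to a document with a different layout.
best = pick("Due Date: 03/15/2020\nInvoice Date: 02/14/2020", weights)
print(best)
```

In this sketch the label "date" appears next to both candidates, so it receives a weight of zero and only "invoice" separates the correct value from the wrong one. That is the general effect of an IDF term: features common to every candidate contribute nothing to the decision.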