Field Class (Node Type)

From Grooper Wiki
Revision as of 11:36, 28 April 2025 by Dgreenwood (talk | contribs) (Dgreenwood moved page Field Class (Object) to Field Class (Node Type))

This article is a stub. It contains minimal information on the topic and should be expanded.

Field Classes are NLP (natural language processing) based extractor nodes. They locate a value using the natural language context around it. Training the extractor associates candidate values, positively or negatively, with nearby text-based "features". During extraction, the extractor selects values according to these trained weightings.

  • Field Classes are most useful when attempting to find values within the flow of natural language.
  • Field Classes can be configured to distinguish values within highly structured documents, but this type of extraction is better suited to simpler extractor nodes like Value Readers or Data Types.
  • Advances in large language models (LLMs) have largely made Field Classes obsolete. LLM-based extraction methods in Grooper (such as AI Extract) can achieve similar results with far less setup.

Field Classes are commonly used to find values within the flow of sentences in a paragraph. For example, you would opt for a Field Class when the value you're after is an entire clause within a contract, or a specific value defined within the flow of text. This method involves training with positive and negative examples to distinguish the correct semantic context around the value.

Field Classes use two Data Extractors to do this:

  • A Value Extractor
  • A Feature Extractor

The Value Extractor finds the kind of value you want to output. It can return multiple possible values (candidates). To find the context that differentiates the right candidate from the wrong ones, the Feature Extractor is written to return words, phrases, or other labels that identify the value in question. During training, the correct value is picked from the list of candidates and marked as a positive example. The features around it that the Feature Extractor returns are given positive weightings using a TF-IDF algorithm. On other documents, the extractor uses these feature weightings to identify the correct value.
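
The interplay between the two extractors can be illustrated with a small Python sketch. This is not Grooper's implementation or API. It is a simplified model under stated assumptions: the Value Extractor is a hypothetical regular expression for dollar amounts, the Feature Extractor is a hypothetical regular expression returning the four words preceding each candidate, and features around wrong candidates receive negative TF-IDF weights (one plausible reading of "positively or negatively associated"). All names (VALUE_PATTERN, FEATURE_PATTERN, train, extract) are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the two extractors (assumptions, not Grooper settings).
VALUE_PATTERN = re.compile(r"\$\d[\d,]*(?:\.\d{2})?")  # "Value Extractor": dollar amounts
FEATURE_PATTERN = re.compile(r"[a-z]+")                # "Feature Extractor": lowercase words
WINDOW_WORDS = 4                                       # words of context kept before each value


def candidates(text):
    """Return (value, features) pairs: each candidate with the words just before it."""
    found = []
    for match in VALUE_PATTERN.finditer(text):
        words = FEATURE_PATTERN.findall(text[:match.start()].lower())
        found.append((match.group(), words[-WINDOW_WORDS:]))
    return found


def train(examples):
    """Learn feature weights from (text, correct_value) training examples."""
    labeled = []
    for text, correct in examples:
        for value, features in candidates(text):
            labeled.append((value == correct, Counter(features)))

    # Document frequency: how many candidate contexts contain each feature.
    doc_freq = Counter()
    for _, counts in labeled:
        doc_freq.update(counts.keys())

    weights = {}
    for is_positive, counts in labeled:
        sign = 1.0 if is_positive else -1.0  # assumed: wrong candidates push weights down
        length = sum(counts.values())
        for feature, count in counts.items():
            tf = count / length
            idf = math.log(len(labeled) / doc_freq[feature])
            weights[feature] = weights.get(feature, 0.0) + sign * tf * idf
    return weights


def extract(text, weights):
    """Return the candidate whose surrounding features score highest."""
    best_value, best_score = None, float("-inf")
    for value, features in candidates(text):
        score = sum(weights.get(feature, 0.0) for feature in features)
        if score > best_score:
            best_value, best_score = value, score
    return best_value


examples = [
    ("The tenant shall pay monthly rent of $1,200.00 and a security deposit "
     "of $2,400.00 at signing.", "$1,200.00"),
    ("A late fee of $50.00 applies. Monthly rent of $950.00 is due on the first.",
     "$950.00"),
]
weights = train(examples)

new_doc = ("The security deposit of $3,000.00 is refundable. "
           "The monthly rent of $1,500.00 is payable in advance.")
print(extract(new_doc, weights))  # prints $1,500.00
```

In this toy data, the word "of" precedes every candidate, so its inverse document frequency is zero and it contributes nothing. Words such as "monthly" and "rent" appear only near correct values and gain positive weight, while "security" and "deposit" appear only near wrong candidates and gain negative weight.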

As with any extractor, data context can be critical to understanding your documents and building the Field Class extractor. For more information on this topic, visit the Data Context article.