2023:Highlight Zone (Value Extractor): Difference between revisions

Revision as of 12:10, 18 December 2023

The Highlight Zone extractor uses zonal information to highlight text on a document. It doesn't actually extract anything and is meant to be a visual aid only during Data Review. It uses many of the same functions as the Read Zone extractor, so please review the Read Zone - 2023 article prior to following the tutorial in this article.

About

Highlight Zone is a zonal extractor specifically designed to aid in Data Review by highlighting specific text on a document that needs to be verified. It's very similar to the Read Zone extractor in that you use one of the four Location options (Fixed Region, Relative Region, Shape Region or Text Region) to draw an extraction zone on a geographic region of the page.

However, rather than returning the OCR or native text data within the zone, nothing is actually extracted. Instead, it simply places a zone in a particular location on a document. In terms of a "data instance", it returns an instance location but no text value whatsoever. Most commonly, this extractor will be used to aid document reviewers, highlighting a troublesome field on the document for manual review.
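
The "location but no text value" idea can be illustrated with a minimal sketch. This is not Grooper's API: the `Zone` and `DataInstance` classes, the `read_zone` and `highlight_zone` functions, the coordinates, and the sample OCR string are all hypothetical names and values invented for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Zone:
    """A rectangular region on a page (hypothetical structure, arbitrary units)."""
    page: int
    left: float
    top: float
    width: float
    height: float


@dataclass(frozen=True)
class DataInstance:
    """A result carrying a location and, optionally, a text value (hypothetical)."""
    location: Zone
    text: Optional[str] = None


def read_zone(zone: Zone, page_text: Dict[Zone, str]) -> DataInstance:
    """Read Zone-style behaviour: return the text found inside the zone."""
    return DataInstance(location=zone, text=page_text.get(zone, ""))


def highlight_zone(zone: Zone) -> DataInstance:
    """Highlight Zone-style behaviour: return the location only, never any text."""
    return DataInstance(location=zone, text=None)


# Made-up zone around a handwritten date, with a made-up unreliable OCR result.
signature_date_zone = Zone(page=1, left=4.5, top=9.0, width=2.0, height=0.4)
ocr_lookup = {signature_date_zone: "l2/l8/2O23"}

read_result = read_zone(signature_date_zone, ocr_lookup)
highlight_result = highlight_zone(signature_date_zone)
```

In this sketch both results point at the same region of the page, but only `read_result` carries a text value. `highlight_result` gives a reviewer somewhere to look and nothing more, which is the behaviour described above.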

⚠	This extractor shares some similarities with the Read Zone extractor. It is recommended you become familiar with the different aspects of the Read Zone extractor before configuring the Highlight Zone extractor. For more information, please see the Read Zone - 2023 article.

How To

In this example, we are going to collect the Signature Date from the Application for Cow Ownership document.

Grooper often has a difficult time extracting handwritten information, because OCR generally does not work well on handwriting. We are going to assume this might be the case for these handwritten dates, so we will need a Reviewer to come back through and manually enter the information from the document. We can use Highlight Zone to make this easier for the Reviewer.

@@ Line 1: / Line 1: @@
-{|class="wip-box"
-|
-'''WIP'''
-|
-This article is a work-in-progress or created as a placeholder for testing purposes.  This article is subject to change and/or expansion.  It may be incomplete, inaccurate, or stop abruptly.
-This tag will be removed upon draft completion.
-|}
 <blockquote>
-The ''Highlight Zone'' extractor uses zonal information to. It uses many of the same functions as the ''Read Zone'' extractor, so please review the [[Read Zone - 2023]] article prior to following the tutorial in this article.
+The ''Highlight Zone'' extractor uses zonal information to highlight text on a document. It doesn't actually extract anything and is meant to be a visual aid only during Data Review. It uses many of the same functions as the ''Read Zone'' extractor, so please review the [[Read Zone - 2023]] article prior to following the tutorial in this article.
 </blockquote>
 == About ==
-''Highlight Zone'' is a zonal extractor specifically designed to . It's very similar to the Read Zone extractor in that you use one of the four Location options (Fixed Region, Relative Region, Shape Region or Text Region) to draw an extraction zone on a geographic region of the page.
+''Highlight Zone'' is a zonal extractor specifically designed to aid in Data Review by highlighting specific text on a document that needs to be verified. It's very similar to the Read Zone extractor in that you use one of the four Location options (Fixed Region, Relative Region, Shape Region or Text Region) to draw an extraction zone on a geographic region of the page.
 However, rather than returning the OCR or native text data within the zone, nothing is actually extracted. Instead, it simply places a zone in a particular location on a document. In terms of a "data instance", it returns an instance location but no text value whatsoever. Most commonly, this extractor will be used to aid document reviewers, highlighting a troublesome field on the document for manual review.
@@ Line 26: / Line 17: @@
 == How To ==
 <br>
-In this example,
+In this example, we are going to collect the Signature Date from the Application for Cow Ownership document.
+Grooper often has a difficult time extracting handwritten information, because OCR generally does not work well on handwriting. We are going to assume this might be the case for these handwritten dates, so we will need a Reviewer to come back through and manually enter the information from the document. We can use ''Highlight Zone'' to make this easier for the Reviewer.
+[[File:2023 Highlight Zone - 2023 01 How To 01.png]]
+[[File:2023 Highlight Zone - 2023 01 How To 02.png]]
+[[File:2023 Highlight Zone - 2023 01 How To 03.png]]
+[[File:2023 Highlight Zone - 2023 01 How To 04.png]]
+[[File:2023 Highlight Zone - 2023 01 How To 05.png]]
+[[File:2023 Highlight Zone - 2023 01 How To 06.png]]