2.90:Tab Marking (Property)

Tab Marking allows you to insert tab characters into a document's text data.
The Tab Marking property enables tab characters for regular expression pattern matching. These characters are inserted into a document's text data wherever there is a large gap of space between characters on a line.
About
Normally, a space is a space is a space. Whether it falls between words, between columns, or anywhere else on a line, each gap is represented by a single space character in a document's text data.
|
However, knowing there's a large amount of space on one or both sides of a label or value is often useful information for how to extract that data. The image here has three columns, each with pairs of numbers. |
![]() |
|
You can visually distinguish the numbers in the second column from the others based on the spatial context around them. The numbers in this column have a large amount of space on either side, separating them from the numbers in the other columns. |
![]() |
|
However, with default extractor settings, the spaces between words are not differentiated from the large spaces between columns. We call words, phrases, numbers, or other data separated by large amounts of space like this "segments". As is, it would be cumbersome to write a regex pattern that differentiates between the pairs of numbers (or other "segments" on the page). |
![]() |
|
With the Tab Marking property enabled, tab characters are inserted wherever there is a large gap between segments. Now we have a character that a regular expression can use to match the large white space on either side of the segments in the second column. The tab characters can act as anchors to help us locate what we want on a document. |
![]() |
|
We can now easily create an extractor to return just the pairs of digits in the center column. The Value Pattern uses the tab characters on either side of the digits as anchors. For more information on how to enable and configure the Tab Marking property, visit the How To section of this article.
|
![]() |
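The idea described above can be sketched outside of Grooper in a few lines of Python. This is only an illustration, not Grooper's implementation: the word positions, the `GAP_THRESHOLD` value, and the `mark_tabs` helper are all hypothetical. The sketch joins words with a single space when they sit close together and with a tab when the horizontal gap is large, then uses the tabs as anchors to return only the center column.

```python
import re

# Hypothetical word boxes for one line: (text, left_x, right_x).
# The coordinates are made up for illustration.
LINE = [
    ("12", 0, 20), ("34", 25, 45),        # first column
    ("56", 200, 220), ("78", 225, 245),   # second column
    ("90", 400, 420), ("11", 425, 445),   # third column
]

GAP_THRESHOLD = 50  # assumed cutoff for a "large" gap; not a Grooper default


def mark_tabs(words, gap_threshold=GAP_THRESHOLD):
    """Join words with a space, or with a tab where the gap is large."""
    parts = [words[0][0]]
    for (_, _, prev_right), (text, left, _) in zip(words, words[1:]):
        gap = left - prev_right
        parts.append("\t" if gap >= gap_threshold else " ")
        parts.append(text)
    return "".join(parts)


marked = mark_tabs(LINE)

# A pair of numbers with a tab on both sides: only the center column qualifies.
center = re.findall(r"(?<=\t)\d+ \d+(?=\t)", marked)
```

Here the first and third columns are skipped because each has a tab on only one side. Python's `re` module is used for convenience; the `\t` escape and the lookaround syntax are assumed to carry over to the pattern you write in a Pattern Editor.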
Use Cases
| WIP | This section is a work-in-progress and may abruptly stop. |
Tab Marking has many uses. Essentially, any time you need to match a text segment by anchoring it to a large amount of white space on one or both sides, you will use tab characters in your regex pattern to do so. However, there are a few very common cases where tab characters come up often.
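As a rough illustration of anchoring to tab characters, the Python sketch below works on a made-up line of tab-marked text (the labels and values are invented, not taken from a real document). It shows two common moves: listing every segment on a line, and capturing the segment that follows a known label.

```python
import re

# Illustrative tab-marked line (made-up data, not actual Grooper output).
line = "Invoice No\t10482\tDate\t01/02/2020"

# Each run of non-tab characters is one segment.
segments = re.findall(r"[^\t\r\n]+", line)

# Anchor a value to its label: the segment right after "Date" and a tab.
match = re.search(r"Date\t([^\t\r\n]+)", line)
date_value = match.group(1) if match else None
```

The character class `[^\t\r\n]` keeps a match from running across a tab or a line break, so a segment may still contain ordinary single spaces, as "Invoice No" does here.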
Segment Extraction
Structured Form Extraction
Table Extraction
How To
Enabling the Tab Marking Property
Where to Begin
|
Tab Marking is enabled on the Pattern properties of Data Types, Data Formats, and objects using an Internal or Text Pattern extractor. Long story short, any time you can get to a Pattern Editor, you can enable Tab Marking. Notice here that our pattern is not matching anything yet. |
Enable the Tab Marking Property
|
Once you're on a Pattern Editor in Grooper, you can turn on Tab Marking from the "Properties" tab.
|
Verify Tab Characters Are Inserted
|
With Tab Marking enabled, tab characters replace single spaces in the document's text data wherever there is a long horizontal gap between characters. We can now see that our pattern matches, because there are tab characters on either side of the second-column segments.
|
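The before-and-after behavior can be reproduced with a small Python sketch. The two strings are made-up stand-ins for a line of text data with and without Tab Marking, and the pattern is an assumed example rather than the one used in the screenshots. The same pattern finds nothing until the tabs are present.

```python
import re

# The same made-up line, without and with Tab Marking.
unmarked = "12 34 56 78 90 11"
marked = "12 34\t56 78\t90 11"

# Pattern that requires a tab on each side of a pair of numbers.
pattern = re.compile(r"\t(\d+ \d+)\t")

before = pattern.findall(unmarked)  # no tabs in the text, so nothing matches
after = pattern.findall(marked)     # tabs now anchor the center column

# Make the inserted tabs visible when inspecting the text.
visible = marked.replace("\t", "<TAB>")
```

Replacing each tab with a visible marker such as `<TAB>` is a quick way to confirm where the tab characters landed if you inspect the text outside of a Pattern Editor.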







