2023:Array (Collation Provider): Difference between revisions
| Line 56: | Line 56: | ||
==== The Flow Layout Property ==== | ==== The Flow Layout Property ==== | ||
The Flow Layout Property should be used when the information you want to extract sits within the flow of a paragraph or other running text. This is common in unstructured documents. | |||
[[File:2023 Arrays - 2023 01 How To 04 Flow Layout 01.png]] | [[File:2023 Arrays - 2023 01 How To 04 Flow Layout 01.png]] | ||
| Line 68: | Line 71: | ||
[[File:2023 Arrays - 2023 01 How To 04 Flow Layout 04.png]] | [[File:2023 Arrays - 2023 01 How To 04 Flow Layout 04.png]] | ||
=== | === The Minimum Elements Property === | ||
You may have noticed that when we first switched to the third document in the '''Batch''' with the Vertical Layout, two words were returned because they happened to be stacked on top of one another. Grooper returned a result when it found two of the words, but not when it found only a single word. This is because the Minimum Elements Property is set to 2 by default. The following screenshots explain how this property works. | |||
[[File:2023 Arrays - 2023 01 How To 05 Maximum Elements 01.png]] | [[File:2023 Arrays - 2023 01 How To 05 Maximum Elements 01.png]] | ||
| Line 77: | Line 83: | ||
[[File:2023 Arrays - 2023 01 How To 05 Maximum Elements 03.png]] | [[File:2023 Arrays - 2023 01 How To 05 Maximum Elements 03.png]] | ||
=== A Practical Example === | |||
Although the above examples make it easier to see how Array works, they are not something you would likely see in the real world. The following screenshots give an example of how you might use a collation property in the real world. | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 01.png]] | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 02.png]] | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 03.png]] | |||
==== Maximum Distance ==== | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 04.png]] | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 05.png]] | |||
==== Enforce Line Boundaries ==== | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 06.png]] | |||
[[File:2023 Arrays - 2023 01 How To 06 Practical Example 07.png]] | |||
Revision as of 12:10, 22 December 2023
|
WIP |
This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly. This tag will be removed upon draft completion. |
An Array is one of many Collation Providers you can use in Grooper to combine or organize extracted data based on the data's layout relationship.
About
Array is one of the Collation Providers and can be used to organize data, depending on what you want to extract from your documents. All of the Collation Providers (except for Individual) essentially take multiple results and combine them into one. Array returns results only when they share a layout orientation. Whether the data is lined up horizontally or vertically, you must select the corresponding layout property for Grooper to return the information.
Essentially, an Array collated result is a collection of results that share a layout relationship and are all lined up together (either horizontally, vertically, or in the left/right and top/bottom text flow of the document).
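The idea of combining results that are lined up together can be illustrated with a small Python sketch. This is only a toy model, not Grooper's implementation: the `Hit` class, the `collate` function, and the `tolerance` parameter are all hypothetical names invented for this illustration, and the coordinates are arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A hypothetical extracted result with the position of its top-left corner."""
    text: str
    x: float
    y: float


def collate(hits, layout, tolerance=2.0):
    """Group hits that share a row ("horizontal") or a column ("vertical").

    Each group is combined into one string. This is an assumption-laden
    sketch of the concept, not how Grooper actually computes alignment.
    """
    if layout == "horizontal":
        align, order = (lambda h: h.y), (lambda h: h.x)
    elif layout == "vertical":
        align, order = (lambda h: h.x), (lambda h: h.y)
    else:
        raise ValueError(f"unsupported layout: {layout!r}")

    groups = []
    for hit in sorted(hits, key=align):
        # Extend the current group while the hit stays on the same row/column.
        if groups and abs(align(hit) - align(groups[-1][-1])) <= tolerance:
            groups[-1].append(hit)
        else:
            groups.append([hit])

    # Within a group, read left-to-right (horizontal) or top-to-bottom (vertical).
    return [" ".join(h.text for h in sorted(g, key=order)) for g in groups]


hits = [Hit("Name", 10, 100), Hit("Date", 80, 100), Hit("Total", 10, 140)]
print(collate(hits, "horizontal"))  # ['Name Date', 'Total']
print(collate(hits, "vertical"))    # ['Name Total', 'Date']
```

The same three hits produce different combined results depending on the layout chosen, which is why the layout property has to match how the data is actually arranged on the page.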
| ⚠ |
The Array collation differs from the Ordered Array collation in one significant way. In an Ordered Array, the order of the data matters. With the Array Collation Provider, the data can be in any order and will still be returned as one result. |
How To
Setting the Collation Property
The Horizontal Layout Property
The Vertical Layout Property
The Flow Layout Property
The Flow Layout Property should be used when the information you want to extract sits within the flow of a paragraph or other running text. This is common in unstructured documents.
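As a rough illustration of what "flow" means here, the sketch below combines hits in reading order (top to bottom, then left to right), so a value that wraps onto the next line of a paragraph is still joined. It is a simplified model under stated assumptions, not Grooper's algorithm: `FlowHit`, `collate_flow`, and the `line` and `x` fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class FlowHit:
    """A hypothetical extracted result located by text line and horizontal offset."""
    text: str
    line: int   # 1-based line number within the paragraph
    x: float    # horizontal position on that line


def collate_flow(hits):
    """Join hits in reading order: earlier lines first, then left to right."""
    ordered = sorted(hits, key=lambda h: (h.line, h.x))
    return " ".join(h.text for h in ordered)


# A date that wraps across a line break in running text.
hits = [FlowHit("2023", 2, 15), FlowHit("December", 1, 300), FlowHit("22", 1, 380)]
print(collate_flow(hits))  # December 22 2023
```

The three hits sit on two different lines and are not stacked in a column, so neither a purely horizontal nor a purely vertical grouping would combine them; following the text flow does.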
The Minimum Elements Property
You may have noticed that when we first switched to the third document in the Batch with the Vertical Layout, two words were returned because they happened to be stacked on top of one another. Grooper returned a result when it found two of the words, but not when it found only a single word. This is because the Minimum Elements Property is set to 2 by default. The following screenshots explain how this property works.
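The effect of a minimum element count can be sketched in a few lines of Python. The default of 2 comes from the text above; everything else (the function name `apply_minimum_elements` and the list-of-lists representation of candidate groups) is a hypothetical simplification, not Grooper's API.

```python
def apply_minimum_elements(groups, minimum_elements=2):
    """Keep only candidate groups containing at least `minimum_elements` hits."""
    return [group for group in groups if len(group) >= minimum_elements]


# Two stacked words form one candidate group; a lone word forms another.
candidates = [["alpha", "beta"], ["gamma"]]

print(apply_minimum_elements(candidates))                      # [['alpha', 'beta']]
print(apply_minimum_elements(candidates, minimum_elements=1))  # [['alpha', 'beta'], ['gamma']]
print(apply_minimum_elements(candidates, minimum_elements=3))  # []
```

With the default of 2, the pair of stacked words survives and the single word is dropped, which matches the behavior described for the third document in the Batch.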
A Practical Example
Although the above examples make it easier to see how Array works, they are not something you would likely see in the real world. The following screenshots give an example of how you might use a collation property in the real world.





















