2023.1:EPI Separation (Separation Provider): Difference between revisions

Revision as of 07:41, 4 April 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

WIP

This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

The EPI Separation provider uses embedded page information ("EPI") to separate loose pages into document folders. A Data Extractor is used to find page numbers from the text on a page and Grooper uses this information to separate the pages.

About

For this Separation Provider, a Data Extractor is used to find page numbers in the text on a page. The extractor must capture the page number in a named group, "PageNo", in its regular expression (regex) pattern. If the page number is formatted as "Page X of Y" (for example, "Page 1 of 3"), a second named group, "PageCount", must also be defined in the pattern.

The pattern Page (?<PageNo>\d+) of (?<PageCount>\d+) would capture the "1" as PageNo and the "3" as PageCount in the "Page 1 of 3" example.
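
As a rough illustration of what the two named groups capture, the same pattern can be tried outside Grooper. The sketch below uses Python purely for demonstration, which is an assumption and not part of Grooper itself: Python's re module writes named groups as (?P<name>...) rather than the (?<name>...) form shown above, and the helper name find_page_info is hypothetical.

```python
import re

# Same pattern as in the article, rewritten with Python's (?P<name>...) syntax.
# Grooper's own extractor would use the (?<PageNo>...) form instead.
PAGE_PATTERN = re.compile(r"Page (?P<PageNo>\d+) of (?P<PageCount>\d+)")


def find_page_info(text):
    """Return (PageNo, PageCount) as integers, or None when nothing matches."""
    match = PAGE_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group("PageNo")), int(match.group("PageCount"))


print(find_page_info("Invoice summary, Page 1 of 3"))  # (1, 3)
print(find_page_info("No page number here"))           # None
```

A page whose text does not match returns None here, which corresponds to the case where the extractor fails to produce a result.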

If the value of PageNo is 1, a new folder is created. Each subsequent page is included in that folder as long as its PageNo value follows in sequence. If a page is out of sequence (or the extractor fails to produce a result), it is left as a loose page.
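
The folder-building rule described above can be sketched in a few lines of Python. This is only an illustrative model, not Grooper's implementation: the function name separate is hypothetical, each page is reduced to its extracted PageNo (None when the extractor produced no result), PageCount is ignored, and it is assumed that an out-of-sequence page also closes the folder that was being filled.

```python
def separate(page_numbers):
    """Group page indexes into folders based on each page's PageNo.

    page_numbers: list of ints, with None where extraction failed.
    Returns (folders, loose): folders is a list of lists of page indexes,
    loose is a list of page indexes that were not placed in any folder.
    """
    folders = []
    loose = []
    current = None   # folder currently being filled, if any
    expected = None  # PageNo the next page must have to join it
    for index, page_no in enumerate(page_numbers):
        if page_no == 1:
            # A PageNo of 1 always starts a new folder.
            current = [index]
            folders.append(current)
            expected = 2
        elif current is not None and page_no == expected:
            # The page continues the sequence, so it joins the open folder.
            current.append(index)
            expected += 1
        else:
            # Out of sequence or no result: the page stays loose.
            # Assumption: this also closes the open folder.
            loose.append(index)
            current = None
            expected = None
    return folders, loose


folders, loose = separate([1, 2, 3, 1, 2, None, 5, 1])
print(folders)  # [[0, 1, 2], [3, 4], [7]]
print(loose)    # [5, 6]
```

In this example the sixth page has no extracted page number and the seventh reports 5 with no open folder, so both remain loose, while the final page starts a new single-page folder.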

How To

@@ Line 1: / Line 1: @@
-{{stubs}}
+{{AutoVersion}}
-<section begin="glossary" />
+{|class="wip-box"
+|
+'''WIP'''
+|
+This article is a work-in-progress or created as a placeholder for testing purposes.  This article is subject to change and/or expansion.  It may be incomplete, inaccurate, or stop abruptly.
+This tag will be removed upon draft completion.
+|}
 <blockquote>
-The '''''EPI Separation''''' provider uses embedded page information ("EPI") to separate loose pages into document folders.
+The '''''EPI Separation''''' provider uses embedded page information ("EPI") to separate loose pages into document folders. A [[Data Extractor]] is used to find page numbers from the text on a page and Grooper uses this information to separate the pages.
 </blockquote>
-<section end="glossary" />
-For this '''''Separation Proivder''''', a [[Data Extractor]] is used to find page numbers from the text on a page (i.e. "Page 1 of 10").  The extractor must also define two groups "PageNo" and "PageCount" in its regular expression pattern.  The pattern "Page (?<PageNo>\d+) of (?<PageCount>\d+)" would group the "1" and "10" of our earlier example properly).  If the value of PageNo is 1, a new folder is created.  As long as each subsequent page's PageNo value follows in sequence, they are included in the folder.  If the page is out of sequence (or the extractor fails to produce a result), it is left as a loose page.
-[[Category:Articles]]
+== About ==
+For this '''''Separation Provider''''', a [[Data Extractor]] is used to find page numbers in the text on a page. The extractor must capture the page number in a named group, "PageNo", in its regular expression (regex) pattern. If the page number is formatted as "Page X of Y" (for example, "Page 1 of 3"), a second named group, "PageCount", must also be defined in the pattern.
+The pattern <code>Page (?<PageNo>\d+) of (?<PageCount>\d+)</code> would capture the "1" as PageNo and the "3" as PageCount in the "Page 1 of 3" example.
+[[File:2023.1 EPI-Separation 01 About 01.png]]
+[[File:2023.1 EPI-Separation 01 About 02.png]]
+If the value of PageNo is 1, a new folder is created.  Each subsequent page is included in that folder as long as its PageNo value follows in sequence.  If a page is out of sequence (or the extractor fails to produce a result), it is left as a loose page.
+== How To ==