2023.1:EPI Separation (Separation Provider): Difference between revisions
Configadmin (talk | contribs) Created page with "For EPI (Embedded Page Information) Separation, a Data Extractor is used to find page numbers from the text on a page (i.e. "Page 1 of 10"). This is set on the "Value Ext..." |
Dgreenwood (talk | contribs) No edit summary |
Revision as of 15:38, 9 December 2022
For EPI (Embedded Page Information) Separation, a Data Extractor is used to find page numbers in the text on a page (e.g. "Page 1 of 10"). The extractor is set on the "Value Extractor" property. Its regular expression pattern must define two named groups, "PageNo" and "PageCount". For example, the pattern "Page (?<PageNo>\d+) of (?<PageCount>\d+)" captures the "1" and "10" of the earlier example. If the value of PageNo is 1, a new folder is created. Each subsequent page is included in that folder as long as its PageNo value follows in sequence. If a page is out of sequence (or the extractor fails to produce a result), it is left as a loose page.
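The separation rule described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name `separate` and the sample page texts are hypothetical, and the sketch assumes that "in sequence" means each PageNo is exactly one greater than the previous page's, and that an out-of-sequence page also ends the current folder. Python writes named groups as `(?P<name>...)`, so the pattern is adapted from the `(?<name>...)` form shown in the text.

```python
import re

# Same pattern as in the text, adapted to Python's named-group syntax.
PAGE_PATTERN = re.compile(r"Page (?P<PageNo>\d+) of (?P<PageCount>\d+)")


def separate(pages):
    """Return (folders, loose) as lists of page indexes.

    Hypothetical helper: 'folders' is a list of index lists, 'loose' holds
    the indexes of pages that were not placed in any folder.
    """
    folders, loose = [], []
    current = None   # the folder currently being filled, if any
    expected = None  # the PageNo the next page must carry to join it
    for index, text in enumerate(pages):
        match = PAGE_PATTERN.search(text)
        page_no = int(match.group("PageNo")) if match else None
        if page_no == 1:
            # PageNo 1 always starts a new folder.
            current = [index]
            folders.append(current)
            expected = 2
        elif current is not None and page_no == expected:
            # The page follows in sequence, so it joins the open folder.
            current.append(index)
            expected += 1
        else:
            # Out of sequence, or the extractor produced no result.
            loose.append(index)
            # Assumption: a loose page also closes the open folder.
            current = None
            expected = None
    return folders, loose


pages = [
    "Invoice ... Page 1 of 3",
    "Page 2 of 3",
    "Page 3 of 3",
    "cover sheet with no page footer",
    "Page 2 of 2",
    "Page 1 of 2",
    "Page 2 of 2",
]
folders, loose = separate(pages)
print(folders)  # [[0, 1, 2], [5, 6]]
print(loose)    # [3, 4]
```

In the sample, the fourth page has no match and the fifth starts at PageNo 2 with no open folder, so both stay loose. PageCount is captured but not used here, since the text only describes sequencing on PageNo.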