Main Page: Difference between revisions

From Grooper Wiki
{|cellpadding="15" cellspacing="10"
|-style="background-color:#36b0a7; color:white; font-size:16pt"
|style="width:50%"|'''Featured Article'''||'''Did you know?'''
|-style="background-color:#d8f3f1" valign="top"
|[[file:simpletable.png|thumb|300px|Data in an Excel spreadsheet is an example of tabular data.]]
<blockquote>
<span style="font-size:14pt">'''[[Table Extraction]]'''</span>
</blockquote>
Tables are one of the most common ways data is organized in documents. Humans were recording information in tables before they began writing literature, and even before paper was invented. There are examples of tables carved into Egyptian pyramids! Tables are excellent structures for representing a lot of information with shared characteristics in a relatively small space (or a pyramid-sized space). However, targeting the data inside them presents its own set of challenges. A table’s structure can range from simple and straightforward to complex (even confounding). Different organizations may organize the same data differently, creating different tables for what is essentially the same data.
In Grooper, tabular data can be extracted using the [[Row Match (Table Extract Method)|Row Match]], [[Header-Value (Table Extract Method)|Header-Value]], or [[Infer Grid (Table Extract Method)|Infer Grid]] table extraction methods.
|Did you know we have a wiki?
You're using it!
|}
{|cellpadding="15" cellspacing="10"
|-style="background-color:#f89420; color:white; font-size:16pt"
|style="width:50%"|'''New in 2.8'''||'''Featured Use Case'''
|-style="background-color:#fde6cb" valign="top"
|
New [[Microfiche Processing]] capabilities including
* Three new batch activities specifically designed for microfiche processing
** [[Initialize Card]]
** [[Detect Frames]]
** [[Clip Frames]]
* Two new IP commands. While these are not strictly limited to microfiche processing, they were created with microfiche processing in mind.
** [[Extract Page]]
** [[Scratch Removal]]
Two additional batch activities
* [[Recognize]] - Combines the old OCR and PDF Extract activities.
* [[Generate PDF]] - Generates PDF content from processed documents, including native-PDF element creation (such as signature widgets).
Two additional IP commands
* [[Shape Removal]]
* [[Shade Removal]]
New extraction methods available to data fields
* [[Anchored Extract]]
* [[Anchored OMR]]
* [[Find Barcode]]
* [[Read Barcode]]
* [[Zonal Extract]]
Simpler and expanded [[Database Lookup]] capabilities.
Expression-based [[Field Mapping]] between data elements and their locations in external storage platforms, allowing for easier data formatting and exporting of batch processing metadata.
|Use case. Use case.  Here a use case.  There a use case.  Everywhere a use case.
Uuuuuuuuuuuuuuuuuuuuuuuuuuuse case.
|}
== Getting started ==
<strong>MediaWiki has been installed.</strong>


Consult the [https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents User's Guide] for information on using the wiki software.


* [https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:Configuration_settings Configuration settings list]
* [https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ MediaWiki FAQ]

Revision as of 11:25, 9 January 2020

Featured Article Did you know?
Data in an Excel spreadsheet is an example of tabular data.

Table Extraction

Tables are one of the most common ways data is organized in documents. Humans were recording information in tables before they began writing literature, and even before paper was invented. There are examples of tables carved into Egyptian pyramids! Tables are excellent structures for representing a lot of information with shared characteristics in a relatively small space (or a pyramid-sized space). However, targeting the data inside them presents its own set of challenges. A table’s structure can range from simple and straightforward to complex (even confounding). Different organizations may organize the same data differently, creating different tables for what is essentially the same data.

In Grooper, tabular data can be extracted using the Row Match, Header-Value, or Infer Grid table extraction methods.

Did you know we have a wiki?

You're using it!


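New in 2.8 Featured Use Case

New Microfiche Processing capabilities including

  • Three new batch activities specifically designed for microfiche processing
      • Initialize Card
      • Detect Frames
      • Clip Frames
  • Two new IP commands. While these are not strictly limited to microfiche processing, they were created with microfiche processing in mind.
      • Extract Page
      • Scratch Removal

Two additional batch activities

  • Recognize - Combines the old OCR and PDF Extract activities.
  • Generate PDF - Generates PDF content from processed documents, including native-PDF element creation (such as signature widgets).

Two additional IP commands

  • Shape Removal
  • Shade Removal

New extraction methods available to data fields

  • Anchored Extract
  • Anchored OMR
  • Find Barcode
  • Read Barcode
  • Zonal Extract

Simpler and expanded Database Lookup capabilities.

Expression-based Field Mapping between data elements and their locations in external storage platforms, allowing for easier data formatting and exporting of batch processing metadata.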

Use case. Use case. Here a use case. There a use case. Everywhere a use case.


Uuuuuuuuuuuuuuuuuuuuuuuuuuuse case.


Getting started

MediaWiki has been installed.

Consult the User's Guide for information on using the wiki software.