Main Page: Difference between revisions

From Grooper Wiki
No edit summary
No edit summary
Line 35: Line 35:
[[file:simpletable.png|thumb|300px|Data in an Excel spreadsheet is an example of tabular data.]]
[[file:simpletable.png|thumb|300px|Data in an Excel spreadsheet is an example of tabular data.]]
<blockquote>
<blockquote>
<span style="font-size:14pt">'''[[Table Extraction]]'''</span>
<span style="font-size:14pt">'''[[OCR]]'''</span>
</blockquote>
</blockquote>
Tables are one of the most common ways data is organized on documents. Human beings were writing information into tables before they started writing literature, and even before paper was invented. There are examples of tables carved onto the walls of ancient Egyptian temples! They are excellent structures for representing a lot of information with various characteristics in common in a relatively small space (or an Egyptian-temple-sized space). However, targeting the data inside them presents its own set of challenges. A table’s structure can range from simple and straightforward to complex (even confounding). Different organizations may organize the same data differently, creating different tables for what is essentially the same data.
OCR stands for Optical Character Recognition. It digitizes text from paper documents so that it can be searched or edited by other software applications. OCR converts typed or printed text in digital images of physical documents into machine-readable, encoded text. This conversion allows Grooper to search the text characters on the image, making it possible to separate images into documents, classify them, and extract data from them.


In Grooper, tabular data can be extracted using the [[Row Match (Table Extract Method)|Row Match]], [[Header-Value (Table Extract Method)|Header-Value]], or [[Infer Grid (Table Extract Method)|Infer Grid]] table extraction methods.
In Grooper, tabular data can be extracted using the [[Row Match (Table Extract Method)|Row Match]], [[Header-Value (Table Extract Method)|Header-Value]], or [[Infer Grid (Table Extract Method)|Infer Grid]] table extraction methods.

Revision as of 09:15, 26 February 2020

Getting Started

Some kind of general intro paragraph about what Grooper is/does.

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Introduction to Grooper
Install and Setup
Third article?


Featured Article

Data in an Excel spreadsheet is an example of tabular data.

OCR

OCR stands for Optical Character Recognition. It digitizes text from paper documents so that it can be searched or edited by other software applications. OCR converts typed or printed text in digital images of physical documents into machine-readable, encoded text. This conversion allows Grooper to search the text characters on the image, making it possible to separate images into documents, classify them, and extract data from them.

In Grooper, tabular data can be extracted using the Row Match, Header-Value, or Infer Grid table extraction methods.

Did you know?

Did you know we have a wiki?

You're using it!


New in 2.8

New Microfiche Processing capabilities including

Two additional batch activities

  • Recognize - Combining the old OCR and PDF Extract activities.
  • Generate PDF - Generating PDF content from processed documents, including native-PDF element creation (such as signature widgets).

Two additional IP commands

New extraction methods available to data fields

Simpler and expanded Database Lookup capabilities.

Expression-based Field Mapping between data elements and their locations in external storage platforms, allowing for easier data formatting and exporting of batch processing metadata.

Featured Use Case

Use case. Use case. Here a use case. There a use case. Everywhere a use case.


Uuuuuuuuuuuuuuuuuuuuuuuuuuuse case.


Other Resources

Getting started (MediaWiki)

MediaWiki has been installed.

Consult the User's Guide for information on using the wiki software.