2.80:Clip Frames (Activity)

From Grooper Wiki

Revision as of 11:04, 3 May 2024

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.

Property panel for Clip Frames

Clip Frames is a specialized Activity for processing microfiche in Grooper. It extracts defined areas from microfiche card images, creating new image frames or layers for focused analysis or processing.

This is the third step in Grooper’s microfiche processing, after the Initialize Card and Detect Frames activities have been performed. The Clip Frames activity takes the frame location information from the Detect Frames activity and crops each frame from a fiche card strip. You are left with the individual document images from each strip. Once you have these images off the microfiche card, they can be processed through Grooper as if they were any other document image.
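
The cropping idea can be illustrated with a minimal sketch. This is not Grooper's implementation: the `Frame` type, the `clip_frames` function, and the representation of an image as nested lists of pixel values are all hypothetical, chosen only to show how frame locations from a detection step could be turned into individual images.

```python
from typing import List, NamedTuple

# Hypothetical image representation: rows of grayscale pixel values.
Image = List[List[int]]


class Frame(NamedTuple):
    """Hypothetical frame rectangle, in strip pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def clip_frames(strip: Image, frames: List[Frame]) -> List[Image]:
    """Crop each frame rectangle out of the strip image."""
    clipped = []
    for f in frames:
        rows = strip[f.top:f.top + f.height]
        clipped.append([row[f.left:f.left + f.width] for row in rows])
    return clipped


# A 4 x 8 "strip" whose pixel values encode their position (row * 10 + column).
strip = [[r * 10 + c for c in range(8)] for r in range(4)]

# Two side-by-side frames, as a detection step might report them.
frames = [
    Frame(left=0, top=0, width=4, height=4),
    Frame(left=4, top=0, width=4, height=4),
]

pages = clip_frames(strip, frames)
print(len(pages), pages[1][0])  # 2 [4, 5, 6, 7]
```

Each clipped image is an independent copy of its region, which mirrors the point above: once a frame is off the card, it can be handled like any other document image.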

Glossary

Activity: Grooper Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page. In a Batch Process, each Batch Process Step executes a single Activity (determined by the step's "Activity" property).

  • Batch Process Steps are frequently referred to by the name of their configured Activity followed by the word "step". For example: "Classify step".

Clip Frames: Clip Frames is a specialized Activity for processing microfiche in Grooper. It extracts defined areas from microfiche card images, creating new image frames or layers for focused analysis or processing.

Detect Frames: Detect Frames is a specialized Activity for processing microfiche in Grooper. It locates and identifies frame lines on microfiche card images, enabling the isolation of areas within the frames for further data extraction or processing.

Extract Page: Extract Page is an IP Command that removes an image from a carrier image while simultaneously removing any image warping or skewing.

Extract: Extract is an Activity that retrieves information from Batch Folder documents, as defined by Data Elements in a Data Model. This is how Grooper locates unstructured data on your documents and collects it in a structured, usable format.

Image Processing: Image Processing is an Activity that enhances Batch Page images and optimizes them for better OCR text recognition and data extraction results.

Initialize Card: Initialize Card is a specialized Activity for processing microfiche in Grooper. It prepares and configures microfiche card images for further processing.

OCR: OCR stands for Optical Character Recognition. It allows text on paper documents to be digitized, in order to be searched or edited by other software applications. OCR converts typed or printed text from digital images of physical documents into machine readable, encoded text.

Version Differences

Clip Frames is a brand new activity in Grooper 2.80. In previous versions, documents from microfiche scans were pre-processed using software local to the scanner, which significantly slowed down the speed at which cards could be scanned. Grooper's microfiche capabilities allow the scanner to run at full speed, while Grooper pre-processes the images independently. Furthermore, while microfiche scanners do have some image cleanup capabilities, they are nowhere near as robust as Grooper's Image Processing activities. The result is end-to-end microfiche document processing with a faster workflow and cleaner images, resulting in more accurate OCR data.

Use Cases

Clip Frames is specifically designed for document processing from microfiche scans. Any set of documents stored on microfiche can take advantage of this activity. Grooper does not currently have support for other microforms (such as microfilm). However, now that the groundwork for microform processing is laid, this functionality will be easier to add in the future. If you'd like to see more microform processing added into Grooper, please let us know!

Microfiche is often used for archiving documents. Court records are one example of documents that may be stored on microfiche cards.

How To: Configure the Activity

Before you begin

The microfiche card must first be scanned directly into Grooper or otherwise imported. The images must then go through the Initialize Card activity to organize the tile images into strip folders and the Detect Frames activity to determine where each document is on the fiche card.

Set the Layout settings

Here you specify four important things:

  • How many total rows of documents are on the card (Row Count).
  • How many columns of documents are on the card (Column Count).
  • How many rows of documents to expect per strip (Rows Per Strip).
  • Whether the card has a Reverse Alignment. By default, Grooper assumes the first strip only contains a single row. However, if the last strip contains a single row instead of the first, change the Reverse Alignment setting to “True”.

These should be the same settings as your Detect Frames activity’s Card Layout settings.
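
To see how these settings interact, here is a small sketch of how rows might be distributed across strips. The function name `rows_in_each_strip` is hypothetical, and the rule that any leftover rows go to the first strip (or the last, with Reverse Alignment) is an assumption generalized from the single-row behavior described above. The value of 2 rows per strip is also only an illustrative choice.

```python
from typing import List


def rows_in_each_strip(row_count: int, rows_per_strip: int,
                       reverse_alignment: bool = False) -> List[int]:
    """Return how many document rows each strip holds, first strip first.

    Assumption: when row_count is not a multiple of rows_per_strip, the
    short strip is the first one, or the last one if reverse_alignment is set.
    """
    if row_count < 1 or rows_per_strip < 1:
        raise ValueError("row_count and rows_per_strip must be positive")

    full_strips, leftover = divmod(row_count, rows_per_strip)
    strips = [rows_per_strip] * full_strips
    if leftover:
        if reverse_alignment:
            strips.append(leftover)      # short strip at the end of the card
        else:
            strips.insert(0, leftover)   # short strip at the start of the card
    return strips


# A 13-row card scanned 2 rows per strip (illustrative values).
print(rows_in_each_strip(13, 2))
# [1, 2, 2, 2, 2, 2, 2]
print(rows_in_each_strip(13, 2, reverse_alignment=True))
# [2, 2, 2, 2, 2, 2, 1]
```

If the strip counts computed this way do not match what you see in the strip folders, that is a hint the Layout values differ from the ones used by Detect Frames.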



Optionally rotate the images

If needed, rotate the image by 0, 90, 180 or 270 degrees using the Tile Rotation property.
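
As an illustration of what a quarter-turn rotation does to an image, the sketch below rotates a small grid of pixel values. The function `rotate_tile` is hypothetical, and the clockwise direction is an assumption; the source does not state which direction Tile Rotation turns the image.

```python
from typing import List


def rotate_tile(tile: List[List[int]], degrees: int) -> List[List[int]]:
    """Rotate a grid of pixel values clockwise by 0, 90, 180 or 270 degrees."""
    if degrees not in (0, 90, 180, 270):
        raise ValueError("rotation must be 0, 90, 180 or 270 degrees")

    rotated = [list(row) for row in tile]
    for _ in range(degrees // 90):
        # One clockwise quarter turn: reverse the row order, then transpose.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


tile = [[1, 2, 3],
        [4, 5, 6]]

print(rotate_tile(tile, 90))   # [[4, 1], [5, 2], [6, 3]]
print(rotate_tile(tile, 180))  # [[6, 5, 4], [3, 2, 1]]
```

Note that a 90 or 270 degree rotation swaps the width and height of the image, while 0 and 180 leave the dimensions unchanged.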



Set the JPEG settings

Set the JPEG compression quality used to save images using the Jpeg Quality property. Compression sacrifices image quality for smaller file size.

Set the JPEG smoothing amount used when saving images with the Jpeg Smoothing property. Smoothing is used to reduce noise or to produce a less pixelated image.



Optionally pad the images

The Padding property can add a defined length in pixels, inches or millimeters to the top, right, left, and/or bottom edge of an image before cropping it out of the frame.

This can help ensure a document is not over-cropped, losing information on the edge of the page. It will leave a border around the document. However, the border can be removed via Image Processing.
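
The arithmetic behind padding can be sketched as follows. The names `Rect`, `to_pixels`, and `pad_frame` are hypothetical, as are the 300 DPI resolution and the choice to clamp the padded rectangle to the image bounds; none of these are confirmed details of how Grooper applies the Padding property.

```python
from typing import NamedTuple

MM_PER_INCH = 25.4


class Rect(NamedTuple):
    """Hypothetical rectangle in pixel coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def to_pixels(value: float, unit: str, dpi: int) -> int:
    """Convert a padding length in px, in, or mm to whole pixels."""
    if unit == "px":
        return round(value)
    if unit == "in":
        return round(value * dpi)
    if unit == "mm":
        return round(value / MM_PER_INCH * dpi)
    raise ValueError(f"unsupported unit: {unit!r}")


def pad_frame(frame: Rect, pad_left: int, pad_top: int, pad_right: int,
              pad_bottom: int, image_width: int, image_height: int) -> Rect:
    """Grow the frame by the given padding, clamped to the image bounds."""
    return Rect(
        left=max(0, frame.left - pad_left),
        top=max(0, frame.top - pad_top),
        right=min(image_width, frame.right + pad_right),
        bottom=min(image_height, frame.bottom + pad_bottom),
    )


dpi = 300                          # assumed scan resolution
pad = to_pixels(0.1, "in", dpi)    # 0.1 inch of padding on every edge
frame = Rect(left=100, top=20, right=400, bottom=500)

print(pad)                                            # 30
print(pad_frame(frame, pad, pad, pad, pad, 1000, 510))
# Rect(left=70, top=0, right=430, bottom=510)
```

In this example the top and bottom edges hit the image boundary, so less than the requested padding is applied there. The left and right edges receive the full 30 pixels.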



Property Details

Property | Default Value | Information
General Properties
Layout | 13 rows x 28 columns | Here you specify four important things:
  • How many total rows of documents are on the card (Row Count).
  • How many columns of documents are on the card (Column Count).
  • How many rows of documents to expect per strip (Rows Per Strip).
  • Whether the card has a Reverse Alignment. By default, Grooper assumes the first strip only contains a single row. However, if the last strip contains a single row instead of the first, change the Reverse Alignment setting to “True”.
Tile Rotation | 0 | If needed, you can rotate all images by 0, 90, 180 or 270 degrees using this property.
Jpeg Quality | 75% | Compression sacrifices image quality for smaller file size. You can configure the image quality by setting the compression percent here.
Jpeg Smoothing | 0% | Smoothing is used to reduce noise or to produce a less pixelated image. The higher the percent set here, the more the images will be smoothed.
Padding | | Here you can change the border size (in px, in, or mm) around the images. You can pad the left, right, top, and bottom edges of the images. This can help ensure a document is not over-cropped, losing information on the edge of the page. It will leave a border around the document. However, the border can be removed via Image Processing (using, for example, Extract Page).