Microfiche Processing

From Grooper Wiki

Grooper has activities and image processing capabilities specifically designed for ingesting microfiche scans and creating document images from them.

In version 2.80, Grooper added capabilities to process scans from microfiche scanners. Grooper’s microfiche processes have several advantages over typical microfiche scanning. First, new Batch Processing Activities detect document frames and digitally cut each document from the fiche card. This allows the microfiche scanner to run at its fastest setting while Grooper does the work of getting the documents off the card. Second, version 2.80 adds some microfiche-specific image processing capabilities on top of Grooper’s existing image cleanup operations. The result is faster end-to-end microfiche processing with far superior image quality compared to anything else on the market.

What is microfiche?

Microfiche is a flat piece of film, called a card, containing scaled-down reproductions of documents. These documents can be viewed through a microfiche reader, which magnifies them to readable proportions. The purpose of microfiche is to store a large number of documents in a small amount of space while providing access to the documents without distributing the originals. Digital storage serves the same purposes. Microfiche is also intended to be a permanent archive. However, film degrades over time, and every time it’s handled, the film is in danger of being scratched or otherwise damaged. Also, the number of people who can access documents on microfiche is limited to the number of copies of that card on hand. Digitizing microfiche cards resolves both these limitations.

Grooper's microfiche processing ability was originally developed for Mekel-brand microfiche scanners. Because of the nature of microfiche, it is likely that these capabilities are generalizable to other scanners, and possibly other film-based media.

The activities in this process make a distinction between:

  • frames, which are individual document images on fiche cards, and
  • tiles, which are sections of the microfiche card generated by microfiche scanners.

The main thing to remember is that tiles are wider than frames, yet a tile does not necessarily contain full frames. In other words, a tile might not (and probably won't) have a full document image on it. Grooper's microfiche processing will stitch those tiles back together and extract the frames, generating individual document images from the full card.
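The stitch-then-extract idea can be illustrated with a minimal Python sketch. This is not Grooper's implementation: the function names `stitch_tiles` and `clip_frame` are hypothetical, images are modeled as plain lists of grayscale pixel rows, and the tiles are assumed to have equal heights and no overlap.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values (illustrative model)


def stitch_tiles(tiles: List[Image]) -> Image:
    """Join tiles left to right into one strip.

    Assumption: all tiles share the same height and do not overlap.
    """
    height = len(tiles[0])
    if any(len(tile) != height for tile in tiles):
        raise ValueError("all tiles must have the same height")
    return [[px for tile in tiles for px in tile[row]] for row in range(height)]


def clip_frame(strip: Image, left: int, right: int) -> Image:
    """Cut the columns left..right-1 out of the stitched strip."""
    return [row[left:right] for row in strip]


# Two 2x4 tiles. The frame (pixel value 9) straddles the tile boundary,
# so neither tile contains the whole frame on its own.
tile_a = [[0, 0, 0, 9], [0, 0, 0, 9]]
tile_b = [[9, 0, 0, 0], [9, 0, 0, 0]]

strip = stitch_tiles([tile_a, tile_b])
frame = clip_frame(strip, 3, 5)
print(frame)  # [[9, 9], [9, 9]]
```

The frame only becomes a complete image once the tiles are joined, which is why stitching has to happen before extraction.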

Steps

An example of a Batch Process for processing microfiche cards.

Microfiche processing happens in six main steps:

1. Full microfiche cards are imported into Grooper, either via a microfiche scanner or other import operation.

2. Run the Initialize Card activity. Microfiche scans are organized into folders by full card. A low-resolution preview image of the full card is also generated.

  • The preview image may be OCR'd (via the Recognize activity), have data extracted from it, and be reviewed just like any other document image.

3. Run the Detect Frames activity. Individual frames surrounding documents on the card are detected.

  • A Review activity can be configured at this point for review and correction using the Fiche Strip Viewer.

4. Run the Clip Frames activity. The documents are clipped from the detected frames, generating one image per page on the fiche card.

  • The Remove Level activity is often used at this point as a "cleanup" activity for the batch structure. This activity removes one or more folder levels in the batch. For example, it can be used to remove the initial folder level created by the Initialize Card activity (removing the low-resolution preview of the full card on that folder at the same time).

5. Film-specific image processing commands, such as Extract Page and Scratch Removal, and other IP commands, such as Contrast Stretch, are applied to the clipped images to prepare them for other Grooper activities.

6. And, it’s off to the Grooper races. These images are now document images just like any other as far as Grooper is concerned. You can get OCR data off them, separate the images into document folders, classify them and extract any data you want.
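To make the frame detection and clipping steps more concrete, here is a minimal Python sketch of one common approach, a column projection: columns that contain any non-background pixel are grouped into runs, and each sufficiently wide run is treated as a frame. This is only an illustration and is not how the Detect Frames or Clip Frames activities are documented to work. The names `detect_frames` and `clip_frames`, the `background` value, and the `min_width` threshold are all assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values (illustrative model)


def detect_frames(strip: Image, background: int = 0,
                  min_width: int = 2) -> List[Tuple[int, int]]:
    """Return (left, right) column ranges that hold non-background pixels.

    Runs narrower than min_width are treated as specks and ignored.
    """
    width = len(strip[0])
    occupied = [any(row[x] != background for row in strip) for x in range(width)]
    frames: List[Tuple[int, int]] = []
    start = None
    # The trailing False is a sentinel that closes a run touching the right edge.
    for x, filled in enumerate(occupied + [False]):
        if filled and start is None:
            start = x
        elif not filled and start is not None:
            if x - start >= min_width:
                frames.append((start, x))
            start = None
    return frames


def clip_frames(strip: Image, frames: List[Tuple[int, int]]) -> List[Image]:
    """Produce one image per detected frame."""
    return [[row[left:right] for row in strip] for left, right in frames]


strip = [
    [0, 5, 5, 0, 0, 7, 7, 7, 0, 3],
    [0, 5, 5, 0, 0, 7, 0, 7, 0, 0],
]
frames = detect_frames(strip)
pages = clip_frames(strip, frames)
print(frames)  # [(1, 3), (5, 8)]
```

The single stray pixel in the last column is dropped by the `min_width` check. Real scans contain more ambiguous cases than this, which is the kind of situation a review step between detection and clipping is meant to catch.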

Microfiche Activities

Microfiche Related IP Commands

Although their use is not limited to microfiche, these commands were developed for microfiche processing in version 2.80.