Attachment Type

This article is about the current version of Grooper.

Note that some content may still need to be updated.

2025

Files attached to a Batch Folder are defined by their "Attachment Type". The Attachment Type (determined by the file's MIME type) controls which properties and commands apply to the file in Grooper.

When a file is imported into Grooper, two things happen:

  1. A Batch Folder is created.
  2. The file is attached to that Batch Folder.

Depending on the file's type (PDF, ZIP, TXT, etc.), Grooper uses a different set of commands to process the file. For example, Grooper can execute the "Unzip" command on ZIP files. Grooper assigns each attached file an "Attachment Type" based on its MIME type, and the Attachment Type determines which commands are available and executable.

  • Importing files is the most common way to attach a file to a Batch Folder. However, the Merge activity can also generate an attachment file.
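The lookup described above can be pictured as a small table. The following is a minimal sketch, not Grooper's implementation: the function name resolve_attachment_type and both dictionaries are hypothetical, and they cover only the MIME types and file extensions that this article states explicitly (PDF, mail messages, ZIP, and vCard).

```python
from pathlib import PurePath
from typing import Optional

# MIME types named explicitly in this article, mapped to their Attachment Type.
MIME_TO_ATTACHMENT_TYPE = {
    "application/pdf": "PDF Document",
    "message/rfc822": "Mail Message",
    "application/vnd.ms-outlook": "Mail Message",
    "application/zip": "ZIP Archive",
    "text/vcard": "vCard",
}

# Extension fallbacks the article documents for files with no MIME type value.
EXTENSION_TO_MIME = {
    ".pdf": "application/pdf",
    ".eml": "message/rfc822",
    ".mht": "message/rfc822",
    ".msg": "application/vnd.ms-outlook",
    ".zip": "application/zip",
    ".vcf": "text/vcard",
}


def resolve_attachment_type(filename: str, mime_type: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return the Attachment Type name, or None if unknown."""
    if not mime_type:
        # No MIME type supplied, so infer one from the file extension.
        mime_type = EXTENSION_TO_MIME.get(PurePath(filename).suffix.lower())
    return MIME_TO_ATTACHMENT_TYPE.get(mime_type)


print(resolve_attachment_type("invoice.PDF"))
print(resolve_attachment_type("archive.bin", "application/zip"))
```

An explicit MIME type takes precedence in this sketch; the extension is consulted only when no MIME type value is present, which mirrors the fallback wording used in the type descriptions.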

Attachment Types and their commands

Note that these Attachment Types are listed by how commonly Grooper users encounter them, not alphabetically.

PDF Document

Represents a PDF document. Provides properties and commands which apply to PDF documents. Handles MIME type "application/pdf". If the file has no MIME type value, "application/pdf" will be inferred from the ".pdf" file extension.

PDF Commands

  • Burst - Splits a PDF document into smaller documents. Generates child Batch Folders with PDF attachments containing pages from the original document according to a "PDF Expand Method" (Fixed Page Count, Tag Based, Bookmarks, or Page Piece).
  • Compact - Reduces the size of a PDF file by removing duplicate fonts, images, and other artifacts. Performs a hash-based deduplication of the objects inside a PDF file.
  • Repair - Repairs PDF files which contain minor errors.
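To illustrate the idea behind the hash-based deduplication that Compact performs, here is a minimal sketch of the general technique. It is an assumption-laden illustration, not Grooper's code: objects are modeled as plain byte strings, the function name deduplicate_objects is hypothetical, and a real PDF would first need its objects parsed and its cross-references rewritten.

```python
import hashlib


def deduplicate_objects(objects):
    """Return (unique_objects, remap), where remap[i] is the index in
    unique_objects that original object i should now reference."""
    unique = []
    index_by_digest = {}
    remap = []
    for data in objects:
        digest = hashlib.sha256(data).digest()
        if digest not in index_by_digest:
            # First time this content is seen: keep it.
            index_by_digest[digest] = len(unique)
            unique.append(data)
        # Duplicates point back at the first copy.
        remap.append(index_by_digest[digest])
    return unique, remap


# Illustrative stand-ins for embedded fonts and images.
sample = [b"font-A", b"image-1", b"font-A", b"image-1", b"font-B"]
unique, remap = deduplicate_objects(sample)
print(len(sample), "objects reduced to", len(unique))
```

Hashing lets identical objects be detected without comparing every pair of objects byte by byte.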

TIFF Document

Represents a TIFF document.

TIFF Commands

  • There are no Grooper commands for TIFF documents (but TIFF files can be processed by several Grooper Activities).

Text Document

Represents a plain text file (TXT).

Text Document Commands

  • Insert Page Breaks - Inserts page breaks into a text document.
  • Normalize - Normalizes the encoding and control characters in a text document.
  • Split - Splits a text document into smaller documents, using an extractor to identify split positions within the text content.
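The Split command's approach of locating split positions in the text can be sketched as follows. This is only an illustration under stated assumptions: a regular expression stands in for a Grooper extractor, and the function name split_text is hypothetical.

```python
import re


def split_text(text, pattern):
    """Split text into pieces, starting a new piece at each pattern match."""
    starts = [m.start() for m in re.finditer(pattern, text, flags=re.MULTILINE)]
    if not starts or starts[0] != 0:
        # Keep any content that precedes the first match as its own piece.
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if a != b]


sample = "Invoice 1\nline a\nInvoice 2\nline b\n"
pieces = split_text(sample, r"^Invoice \d+")
print(len(pieces))
```

Each match marks the start of a new piece, so text before the first match is kept as a leading piece rather than discarded.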

Mail Message

Represents an RFC822 or Outlook mail message. Handles MIME types "message/rfc822" or "application/vnd.ms-outlook". If the document has no MIME type value:

  • "message/rfc822" will be inferred from ".eml" or ".mht" file extensions.
  • "application/vnd.ms-outlook" will be inferred from an ".msg" file extension.

Mail Message Commands

  • Convert To RFC822 - Converts a proprietary Microsoft Outlook mail message (.msg) to an industry-standard RFC822 mail message (.eml).
  • Expand Attachments - Bursts out the individual components of a mail message and saves them as children of the mail message.
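For readers who want a feel for what expanding an RFC822 message involves, the sketch below uses Python's standard email package to list the attachments of an .eml-style message. It is not Grooper's implementation: the function name expand_attachments is hypothetical, the sample message is made up, and only attachments (not the message body) are returned.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def expand_attachments(raw_message: bytes):
    """Return a list of (filename, content_bytes), one per attachment."""
    message = message_from_bytes(raw_message, policy=policy.default)
    children = []
    for part in message.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        children.append((part.get_filename(), payload))
    return children


# Build a small sample message with one PDF-like attachment.
sample = EmailMessage()
sample["Subject"] = "Test"
sample["From"] = "sender@example.com"
sample["To"] = "recipient@example.com"
sample.set_content("Body text")
sample.add_attachment(b"%PDF-1.4 test", maintype="application",
                      subtype="pdf", filename="doc.pdf")

children = expand_attachments(sample.as_bytes())
print([name for name, _ in children])
```

An Outlook .msg file is a different, proprietary container, so it would need to be converted to RFC822 before a standard parser like this could read it.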

Microsoft Office Documents

Word Document

Represents a Word Document.

Word Commands

Excel Document

Represents an Excel document.

Excel Commands

  • Convert to CSV - Converts an Excel spreadsheet to a Comma-Separated Values (CSV) file.

PowerPoint Document

Represents a PowerPoint Document.

PowerPoint Commands

  • There are currently no Grooper commands for PowerPoint documents.

ZIP Archive

Represents a ZIP archive. Handles MIME type "application/zip". If the document has no MIME type value, "application/zip" will be inferred from the ".zip" file extension.

ZIP Commands

  • Unpackage - Replaces the ZIP file with a single file extracted from it, optionally copying additional companion files with it. Use this command to unpackage ZIP files which contain a single file, or to unpackage a transaction containing a primary file and one or more companion files. For example, if a ZIP file contains a PDF document and a CSV file, this might be used to extract the PDF file as the primary document content and save the CSV file with it for later use by Delimited Extract.
  • Unzip - Expands all attachments as children of the folder object.
  • Update - Writes all child attachments to the ZIP file.
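As a rough picture of what expanding a ZIP archive into child files means, the sketch below uses Python's standard zipfile module. It is an illustration only, not how Grooper implements Unzip: the function name unzip_children is hypothetical, and the "children" are modeled as a dictionary of file name to bytes rather than Batch Folders.

```python
import io
import zipfile


def unzip_children(zip_bytes: bytes):
    """Return {member_name: content_bytes} for every file in the archive."""
    children = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # Directory entries carry no file content.
            children[info.filename] = archive.read(info)
    return children


# Build a small in-memory archive holding a primary file and a companion file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("claim.pdf", b"%PDF-1.4")
    archive.writestr("claim.csv", "id,amount\n1,10.00\n")

children = unzip_children(buffer.getvalue())
print(sorted(children))
```

The sample archive mirrors the PDF-plus-CSV example given for Unpackage, which would keep the PDF as the primary content instead of expanding every member.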

HTML Document

Represents an HTML document (such as a webpage).

HTML Commands

  • Condition HTML - Performs cleanup and normalization on HTML documents.
  • Convert To PDF - Converts an HTML document to PDF format.
  • Convert To Text - Converts an HTML document to plain text or markdown.
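A minimal sketch of HTML-to-text conversion, using only Python's standard html.parser module, is shown below. It is an assumption-based illustration rather than Grooper's Convert To Text logic: the class name TextExtractor and function html_to_text are hypothetical, and the sketch simply keeps visible text while skipping script and style content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring the contents of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = ("<html><head><style>p{color:red}</style></head>"
          "<body><h1>Title</h1><p>Hello &amp; welcome</p></body></html>")
plain = html_to_text(sample)
print(plain)
```

This sketch produces plain text only; a Markdown conversion would additionally have to translate tags such as headings and links into Markdown syntax.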

EDI File

Represents an EDI X12 document.

EDI Commands

  • Bundle - Replaces the selected EDI files with a new set of files containing N transactions each.
  • Load Data - Loads data from an EDI document into a Data Model.
  • Split Envelops - Splits an EDI 837 file, creating a child document for each interchange control envelope (i.e., each ISA envelope).
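The idea of splitting an X12 file on its interchange envelopes can be sketched as below. This is a simplified illustration, not Grooper's implementation: the function name split_interchanges is hypothetical, the sample data is abbreviated and not a valid 837, and the sketch assumes "~" as the segment terminator and "*" as the element separator instead of reading them from the ISA segment.

```python
def split_interchanges(edi_text, segment_terminator="~", element_separator="*"):
    """Return one string per interchange, each running from ISA through IEA."""
    envelopes = []
    current = []
    for raw in edi_text.split(segment_terminator):
        segment = raw.strip()
        if not segment:
            continue
        segment_id = segment.split(element_separator, 1)[0]
        if segment_id == "ISA":
            current = [segment]          # A new interchange begins.
        elif current:
            current.append(segment)      # Segment belongs to the open interchange.
        if segment_id == "IEA" and current:
            # The interchange trailer closes the envelope.
            envelopes.append(segment_terminator.join(current) + segment_terminator)
            current = []
    return envelopes


# Abbreviated, illustrative segments only.
sample = (
    "ISA*00*A~GS*HC~ST*837*0001~SE*1*0001~GE*1*1~IEA*1*000000001~\n"
    "ISA*00*B~IEA*0*000000002~"
)
envelopes = split_interchanges(sample)
print(len(envelopes))
```

Segments that fall outside any ISA/IEA pair are ignored in this sketch.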

PST File

Represents a Microsoft Outlook PST (Personal Storage Table) file.

PST Commands

  • Burst - Creates a child document for each message in the PST file. Extracts mail messages only. Appointments, Contacts, Tasks, and other non-mail items present in the PST file will be ignored.

vCard

Represents an RFC6350 vCard (VCF) file. Handles MIME type "text/vcard". If the file has no MIME type value, "text/vcard" will be inferred from the ".vcf" file extension.

vCard Commands

  • Expand Photo - Expands included photo (if available) as a child of the folder object.