Microsoft Office Integration
Easier access to the contents of files from the world's most used business application suite.
About
Microsoft Office integration lets a Grooper user leverage the native text of files generated in the Microsoft Office Suite, such as Microsoft Word documents and Microsoft Excel spreadsheets. This feature can pull the native text from these files and perform type-specific activities on them.
Supported File Types
- Microsoft Word documents (.doc and .docx)
  - For Word documents, you can generate a Grooper-usable document with the Execute activity, using the Word to PDF command for the Word Document object type. The PDF will contain all the native text from the Word document, obtainable for further Grooper processing using the Recognize activity.
- Microsoft Excel spreadsheets (.xls and .xlsx)
  - For Excel documents, you can generate a Grooper-usable document with the Execute activity, using the Excel to CSV command for the Excel Document object type. CSV files are natively readable by Grooper in version 2.90. The Recognize activity is not required.
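The list above can be summarized as a small lookup from file extension to the object type and command you would select in the Execute activity. This is only an illustrative sketch: `CONVERSIONS` and `plan_conversion` are hypothetical names, and nothing here calls into Grooper itself.

```python
from pathlib import PurePath

# Hypothetical lookup table summarizing the list above. The strings are labels
# for the Execute activity's settings, not calls into any Grooper API.
# Each value is (object type, command, Recognize activity needed afterwards).
CONVERSIONS = {
    ".doc": ("Word Document", "Word to PDF", True),
    ".docx": ("Word Document", "Word to PDF", True),
    ".xls": ("Excel Document", "Excel to CSV", False),
    ".xlsx": ("Excel Document", "Excel to CSV", False),
}


def plan_conversion(filename):
    """Return the conversion settings for a file, or None if unsupported."""
    suffix = PurePath(filename).suffix.lower()
    return CONVERSIONS.get(suffix)


print(plan_conversion("Invoice.DOCX"))
print(plan_conversion("ledger.xls"))
print(plan_conversion("scan.tif"))
```

The third tuple element records the one behavioral difference the list calls out: PDFs produced from Word files still go through Recognize, while CSV files do not.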
How to Use
⚠ To make use of this feature, ensure that Microsoft Office is installed on the machine running Grooper Design Studio. Furthermore, the bit version of Grooper and Microsoft Data Access Components (MDAC) must match.
Ad Hoc Execution: Testing in Grooper Design Studio
Like any Activity, the Execute activity can be applied to a document in an "ad hoc" manner in Grooper Design Studio. This is typical for Grooper architects testing and designing solutions before building a Batch Process.
Getting a Result with Microsoft Word Documents
The Execute activity applies simple processing commands to a specified object type. To turn the imported Word file into a PDF file for further Grooper processing, you will indicate you want to process the Word Document object type and execute the Word to PDF command.
This will create a PDF copy of the Word document, stored on the document folder. This document is now viewable in Grooper's Document Viewer and contains all the native text data from the Word file. This document folder can now be processed by the Recognize activity to extract that native text for further document processing (classification, data extraction, etc.).
Excel Spreadsheets
The Execute activity applies simple processing commands to a specified object type. To turn the imported Excel file into a CSV file for further Grooper processing, you will indicate you want to process the Excel Document object type and execute the Excel to CSV command.
Burst Conversion
The Burst option converts the Excel file to CSV and saves the results as child object(s). This is the default, and most typical, configuration option. As seen in this image, if there are multiple sheets, they are saved as multiple child objects of the document folder. The native Excel file had two sheets, so we get two child CSV files.
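The one-CSV-per-sheet behavior can be pictured with a short Python sketch. It is not Grooper's implementation: the workbook is modeled as a plain dictionary rather than a real Excel file, and the function name and the `<sheet>.csv` child naming are assumptions made for illustration.

```python
import csv
import io


def burst_to_csv(workbook):
    """Model of Burst: return one CSV text per sheet, keyed by a child name.

    `workbook` is a plain dict of sheet name -> list of rows, standing in for
    a parsed Excel file (an assumption; no real .xlsx is read here).
    """
    children = {}
    for sheet_name, rows in workbook.items():
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        # Hypothetical naming scheme: one child per sheet, named after it.
        children[f"{sheet_name}.csv"] = buffer.getvalue()
    return children


workbook = {
    "Sheet1": [["Name", "Total"], ["Acme", 120]],
    "Sheet2": [["Name", "Total"], ["Globex", 75]],
}
children = burst_to_csv(workbook)
print(sorted(children))  # two sheets in, two CSV children out
```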
SaveNew Conversion
The SaveNew option converts the Excel file to a CSV file and saves the result as a new file. The new file is stored on the Batch Folder alongside the native file (more specifically, in the file store location associated with the Batch Folder). When choosing this option, you name the generated file using the File Name property. As seen in this image, the document folder now has two files associated with it, listed in the "Advanced > Files" tab.
Convert Conversion
The Convert option converts the Excel file to a CSV file and replaces the native file. This is a "true" conversion. Rather than making a CSV copy of the Excel file in one way or another, the original Excel file is transformed into a CSV file. As seen in this image, the original native file has been converted from an XLSX file to a CSV file.
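The three options differ mainly in where the CSV output ends up. The toy model below makes that contrast concrete. `DocumentFolder` and `apply_option` are hypothetical names rather than Grooper objects. The resulting file names are assumptions, as is the handling of multi-sheet workbooks under SaveNew and Convert, which the text above does not describe.

```python
from dataclasses import dataclass, field
from pathlib import PurePath


@dataclass
class DocumentFolder:
    """Toy stand-in for a Grooper document folder (hypothetical model)."""
    files: list                                   # files attached to the folder
    children: list = field(default_factory=list)  # child objects


def apply_option(folder, option, sheets, file_name="converted.csv"):
    """Mimic where the CSV output lands for each option described above."""
    if option == "Burst":
        # One child CSV per sheet; the native file is left in place.
        folder.children.extend(f"{sheet}.csv" for sheet in sheets)
    elif option == "SaveNew":
        # A new CSV, named by the user, is stored alongside the native file.
        folder.files.append(file_name)
    elif option == "Convert":
        # The native file itself becomes a CSV (assumed to keep its stem).
        native = PurePath(folder.files[0])
        folder.files = [native.with_suffix(".csv").name]
    else:
        raise ValueError(f"unknown option: {option}")
    return folder


for option in ("Burst", "SaveNew", "Convert"):
    folder = apply_option(
        DocumentFolder(files=["report.xlsx"]), option, ["Sheet1", "Sheet2"]
    )
    print(option, folder.files, folder.children)
```

Only Convert removes the original XLSX from the folder's files. Burst and SaveNew both keep it and add CSV output next to it, as children or as an extra file respectively.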
Batch Processing Execution: Automating the Conversions
When automating a Word to PDF or Excel to CSV conversion, you will add the Execute activity to a Batch Process. Once added to a Batch Process, its configuration to convert Word files into PDFs and Excel files into CSVs is the same as described above.
To add the Execute activity to a Batch Process:
Version Differences
Prior to Grooper 2.9, files from the Microsoft Office Suite had to be rendered to PDF (essentially a "print..." function) in order to view their contents and use activities more effectively.