XML Transform (Activity)


This article was migrated from an older version and has not been updated for the current version of Grooper.

This tag will be removed upon article review and update.

This article is about the current version of Grooper.

Note that some content may still need to be updated.

The XML Transform activity's property panel.

XML Transform is an Activity that applies XSLT stylesheets to XML data to modify or reformat the output structure for various purposes.


Version Differences

As of 2.72, Grooper uses XSLT 1.0 to apply XML transformations. XSLT (or eXtensible Stylesheet Language Transformations) is a language for transforming XML documents into other formats. Using XSLT, Grooper can output XML in virtually any layout.

A good example of how XSLT transforms an XML document into HTML can be found following this link.
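As a rough illustration of the XML-to-HTML case, the sketch below renders a document's fields as an HTML table. It assumes the input has a `Document` root element with a `Name` attribute and `Field` children that each carry a `Name` attribute, as in the sample data in the Examples section of this article. The table layout is purely illustrative and is not taken from Grooper's documentation.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Emit HTML rather than XML. -->
  <xsl:output method="html" indent="yes" />

  <!-- Assumes the root element is named Document. -->
  <xsl:template match="/Document">
    <html>
      <body>
        <!-- Use the document's Name attribute as the page heading. -->
        <h1><xsl:value-of select="@Name" /></h1>
        <table border="1">
          <tr><th>Field</th><th>Value</th></tr>
          <!-- One table row per extracted field. -->
          <xsl:for-each select="Field">
            <tr>
              <td><xsl:value-of select="@Name" /></td>
              <td><xsl:value-of select="." /></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Only XSLT 1.0 constructs are used here, in keeping with the version noted above.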

Examples

For example, this is the XML data produced by an extremely simple Batch Process that extracts the invoice number from each document.

<Document Id="94b9a646-f926-4a7e-9df3-3f70d88fcd80" Name="Generic Invoice (1)" TypeId="22684b3b-e266-4350-bfdb-96bf90bae207" TypeName="Acme">
  <Field Name="Invoice Number" Confidence="1.00" Page="1" Valid="True" Location="5.103, 7.030, 1.927, 0.093">74449788</Field>
</Document>

XML is designed to be both machine-readable and human-readable. You can parse through the information here: to find the invoice number, locate Field Name="Invoice Number" and read along the line until you reach the actual number. But maybe we don't need or want all that extra metadata. What if it's just junk to our end process?

We can edit our XSL stylesheet to strip that metadata out. Under the "General" heading, selecting the XML Transform property brings up an editor where you can write the stylesheet to suit your needs.

Configure

Select the XML Transform property and press the ellipsis button.



Customize

This is the boilerplate XSL stylesheet. From here you can edit the stylesheet to output whatever format you desire.
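The boilerplate itself is not reproduced in this text. A common XSLT 1.0 starting point is the identity transform, which copies the input unchanged until you add overriding templates. The sketch below shows that pattern. Whether it matches Grooper's boilerplate exactly is an assumption, and the `Field/@Location` override is only an illustrative edit.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />

  <!-- Identity template: copy every attribute and node unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Illustrative override: an empty template drops the Location
       attribute from every Field while everything else is copied. -->
  <xsl:template match="Field/@Location" />
</xsl:stylesheet>
```

Starting from the identity transform means each customization is a small, more specific template that overrides the default copy behavior for just the nodes you want to change or remove.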



Set File Name

Give the transformation a filename under "Output Filename".



Observing Results

If you select a document's Batch Folder within its Batch, a transformed XML file will appear in the "Files" sub-tab under the "Advanced" tab.



Given our example, if we were to type the following XSL stylesheet into the editor...

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />
  <!-- The first node matched is the Document element; its Field children become the output. -->
  <xsl:template match="@*|node()">
    <Invoice>
      <xsl:for-each select="Field">
        <!-- Name each output element after the field, replacing spaces with underscores. -->
        <xsl:element name="{translate(@Name, ' ', '_')}">
          <xsl:value-of select="." />
        </xsl:element>
      </xsl:for-each>
    </Invoice>
  </xsl:template>
</xsl:stylesheet>

...this is how our original XML data would be transformed.

<Invoice>
  <Invoice_Number>74449788</Invoice_Number>
</Invoice>

All that extra junk we didn't want is gone. Now imagine you had forty data elements you wanted to extract and just wanted the field information and nothing else. XML transform using XSLT is a way to accomplish just that.

Here's a slightly more complex example, with the stylesheet on the left and the input/output files on the right.
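As a text-only sketch of a somewhat more involved stylesheet (not a reproduction of the screenshot), the example below keeps only fields marked `Valid="True"`, carries each field's `Confidence` over as an attribute, and records the document's `TypeName` on the output element. It assumes the same input shape as the sample above. The output attribute names `Type` and `Confidence` are illustrative choices.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />

  <!-- One Invoice element per Document; the document type becomes an attribute. -->
  <xsl:template match="Document">
    <Invoice Type="{@TypeName}">
      <!-- Process only the fields flagged as valid. -->
      <xsl:apply-templates select="Field[@Valid='True']" />
    </Invoice>
  </xsl:template>

  <!-- Rename each field (spaces become underscores) and keep its confidence score. -->
  <xsl:template match="Field">
    <xsl:element name="{translate(@Name, ' ', '_')}">
      <xsl:attribute name="Confidence">
        <xsl:value-of select="@Confidence" />
      </xsl:attribute>
      <xsl:value-of select="." />
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample document above, this should yield an `Invoice` element with `Type="Acme"` containing `<Invoice_Number Confidence="1.00">74449788</Invoice_Number>`. Note that `translate` only handles spaces: a field name containing other characters that are not legal in an XML element name (or one that starts with a digit) would cause the transform to fail.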


XML Transform Tester

Once you add an "XML Transform" step to your batch process, you can use the "XML Transform Tester". This is an exceptionally handy tool to transform XML data using XSL stylesheets. This way you can see if your transform works and how it will output, all within Grooper.

Location

Navigate to the "XML Transform" step in your "Batch Process".



Testing Tab

Switch to the "XML Transform Tester" tab.



Select a Batch

Select a Test Batch from the Batch dropdown.



Write Customized Transform

Write your transformation under the "XSL Transform" panel.



Testing

Press the "Execute" button to see the fruits of your labor, or, if it fails, the anti-fruits of your labor under the "Output File" panel.



Save

Press the "Save" button.



Exporting Transformed XML

Once you run the "XML Transform" activity, the transformed metadata is saved to a file on the document at the Batch Folder level of the Batch. You can export this file using "Document Export". To do this, in the Document Export Settings, change the "Metadata Format" property to "Custom" and, in the "Custom File" property, enter the filename you gave in the XML Transform step. This will export the transformed version of the metadata rather than the original metadata.

Glossary

Activity: Activity is a property on Batch Process Steps. Activities define specific document processing operations done to a Batch, Batch Folder, or Batch Page. Batch Process Steps configured with specific Activities are frequently referred to by the name of the Activity followed by the word "step". For example: Classify step.

Batch Folder: Batch Folder objects are defined as container objects within a Batch that are used to represent and organize both folders and pages. They can hold other Batch Folders or Batch Page objects as children. The Batch Folder acts as an organizational unit within a Batch, allowing for a structured approach to managing and processing a collection of documents.

  • Batch Folders are frequently referred to simply as "documents".

Batch Process: Batch Process objects are crucial components in Grooper's architecture. A Batch Process orchestrates the document processing strategy and ensures each Batch of documents is managed systematically and efficiently.

  • Batch Processes by themselves do nothing. Instead, the workflows they execute are designed by adding child Batch Process Steps.
  • A Batch Process is often referred to as simply a "process".

Batch: Batch objects are fundamental in Grooper's architecture as they are the containers of documents that get moved through Grooper's workflow mechanisms known as Batch Processes.

Execute: Execute is an Activity that runs one or more specified object commands. This gives access to a variety of Grooper commands in a Batch Process for which there is no Activity, such as the "Sort Children" command for Batch Folders or the "Expand Attachments" command for email attachments.

Export: Export is an Activity that transfers documents and extracted information to external file systems and content management systems, completing the data processing workflow.

Test Batch: A "Test Batch" refers to any Batch created in the Test folder of the Batches folder in the Node Tree. Test Batches are used to test various configurations in Grooper, such as Batch Process Step configurations and Data Model configurations.

XML Transform: XML Transform is an Activity that applies XSLT stylesheets to XML data to modify or reformat the output structure for various purposes.