2023.1:Classify (Activity): Difference between revisions

From Grooper Wiki
 
(31 intermediate revisions by 3 users not shown)
{{stubs}}
{{AutoVersion}}
<section begin="glossary" />
 
<blockquote>
<blockquote>{{#lst:Glossary|Classify}}</blockquote>
Classify is an [[Activity|Unattended Activity]] that [[Classification|classifies]] documents in a [[Batch]] according to a [[Content Model]]. 
 
</blockquote>
All the logic and setup to take an unorganized document set and categorize documents within them as [[Document Type|Document Types]] is created in the '''Content Model'''.  The Classify activity uses that information to automatically classify documents in a '''Batch''' as '''Document Types''' in the model.
<section end="glossary" />
<br>
All the logic and setup needed to take an unorganized document set and categorize its documents as [[Document Type]]s is created in the [[Content Model]].  The Classify activity uses that information to automatically classify documents in a [[Batch]] as [[Document Type]]s in the model.
<br>
{|class="download-box"
|
[[File:Asset 22@4x.png]]
|
You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). The first contains one or more '''Batches''' of sample documents.  The second contains one or more '''Projects''' with resources used in examples throughout this article.
* [[Media:2023.1_Wiki_Classify-(Activity)_Batch.zip]]
* [[Media:2023.1_Wiki_Classify-(Activity)_Project.zip]]
|}


== How To ==
== How To ==
Classification can be performed on a document in three different ways:
Classification can be performed on a document in two different ways:
# Through the ''Classify'' step on an automated '''Batch Process'''.
# Through the ''Classify'' step on an automated '''Batch Process'''.
# Through ESP Auto Separation
#* '''ESP Auto Separation''' is a classification-based separation method; during the ''Separate'' activity, it will both separate and classify your documents.
# Manually, by right-clicking the document and using the "Apply Document Type" command.
# Manually, by right-clicking the document and using the "Apply Document Type" command.
<br>
 
<br>
You can also classify documents through '''[[ESP Auto Separation]]''', but that is primarily a separation method that performs both separation and classification during the separation step.
 
Let's go over each method:
Let's go over each method:
{|
 
[[insert image here]]
=== Automated Classification ===
To see how classification works during the activity, let's take a look at classification on the Activity Tester. This is basically what's happening when Grooper runs classification during the '''Batch Process'''.
# Naturally, you will need to have a '''Content Model''' with the appropriate '''Document Type''' created.
# Once that's been done, create a '''Batch Process''' and set up the ''Classify'' step.
{|class="fyi-box"
|-
|-
|
|
[[insert image here]]
'''FYI'''
|-
|
|
[[insert image here]]
* Pay attention to the Folder Level on which classification is taking place. Grooper will default classification to Folder Level 1, which is the document level, just beneath the Batch itself (the Batch being Level 0).
* Since we want to classify the folders as Document Types, we are running classification at Level 2.
|}
|}
[[File:2023_Classify_(Activity)_How_To_Automatic_Classification_Within_a_Batch_Process_01(1).png]]
# Once the classification step has been properly configured, switch over to the Activity Tester tab.
# Press the Play button.
# And now all of the documents have been classified.
[[File:2023_Classify_(Activity)_How_To_Automatic_Classification_Within_a_Batch_Process_02.png]]
This is the function that Grooper will perform automatically when the '''Batch Process''' is run. It is important to test your ''Classify'' step first to make sure nothing goes awry during actual classification.


== For More Information ==
== For More Information ==
For more information on Classification as a whole, please see the following articles:
For more information on Classification as a whole, please see the following articles:
* [[Classification]]
* [[Classification (Concept)]]
* [[Content Model]]
* [[Content Model (Node Type)]]
[[Category:Articles]]
* [[Rules-Based (Classify Method)]]
* [[Labeling Behavior (Behavior)]]
* [[Lexical (Classify Method)]]
* [[Visual (Classify Method)]]
* [[ESP Auto Separation (Separation Provider)]]

Latest revision as of 09:46, 25 June 2025

This article is about an older version of Grooper.

Information may be out of date and UI elements may have changed.


Classify is an Activity that "classifies" Batch Folders in a Batch by assigning them a Document Type.

  • Classification is key to Grooper's document processing. It affects how data is extracted from a document (during the Extract activity) and how Behaviors are applied.
  • Classification logic is controlled by a Content Model's "Classify Method". These methods include using text patterns, previously trained document examples, and Label Sets to identify documents.
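
To make the idea of classifying by text patterns concrete, here is a minimal Python sketch. It is not Grooper's API or its actual Classify Method logic. The Document Type names, the patterns, and the `classify_text` function are all hypothetical and exist only to illustrate how a text pattern can map a document to a Document Type.

```python
import re

# Hypothetical Document Types and patterns, for illustration only.
# These are NOT taken from a Grooper Content Model.
DOCUMENT_TYPE_PATTERNS = {
    "Invoice": re.compile(r"\binvoice\s+(number|no\.?)", re.IGNORECASE),
    "Purchase Order": re.compile(r"\bpurchase\s+order\b", re.IGNORECASE),
}


def classify_text(text):
    """Return the first Document Type whose pattern matches the text.

    Returns None when no pattern matches, leaving the document unclassified.
    """
    for doc_type, pattern in DOCUMENT_TYPE_PATTERNS.items():
        if pattern.search(text):
            return doc_type
    return None
```

In this sketch, the first matching pattern wins, so the order of the entries matters. A real classification setup would normally weigh several pieces of evidence rather than stop at the first match.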

All the logic and setup needed to take an unorganized document set and categorize its documents as Document Types is created in the Content Model. The Classify activity uses that information to automatically classify documents in a Batch as Document Types in the model.

You may download the ZIP(s) below and upload them into your own Grooper environment (version 2023.1). The first contains one or more Batches of sample documents. The second contains one or more Projects with resources used in examples throughout this article.

How To

Classification can be performed on a document in two different ways:

  1. Through the Classify step on an automated Batch Process.
  2. Manually, by right-clicking the document and using the "Apply Document Type" command.

You can also classify documents through ESP Auto Separation, but that is primarily a separation method that performs both separation and classification during the separation step.

Let's go over each method:

Automated Classification

To see how classification works during the activity, let's take a look at classification on the Activity Tester. This is basically what's happening when Grooper runs classification during the Batch Process.

  1. Naturally, you will need to have a Content Model with the appropriate Document Type created.
  2. Once that's been done, create a Batch Process and set up the Classify step.

FYI

  • Pay attention to the Folder Level on which classification is taking place. Grooper will default classification to Folder Level 1, which is the document level, just beneath the Batch itself (the Batch being Level 0).
  • Since we want to classify the folders as Document Types, we are running classification at Level 2.
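
The Folder Level idea can be pictured as depth in a tree, with the Batch at Level 0. The Python sketch below is an illustration under that assumption only. The `Folder` class, `folders_at_level`, `classify_level`, and the sample folder names are hypothetical and do not come from Grooper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Folder:
    """Hypothetical stand-in for a Batch or Batch Folder node."""
    name: str
    children: List["Folder"] = field(default_factory=list)
    document_type: Optional[str] = None


def folders_at_level(root: Folder, level: int) -> List[Folder]:
    """Return every folder at the given depth, where the root (Batch) is Level 0."""
    current = [root]
    for _ in range(level):
        current = [child for node in current for child in node.children]
    return current


def classify_level(root: Folder, level: int,
                   doc_type_for: Callable[[Folder], str]) -> List[Folder]:
    """Assign a Document Type only to the folders at the chosen level."""
    targets = folders_at_level(root, level)
    for folder in targets:
        folder.document_type = doc_type_for(folder)
    return targets


# Sample Batch: two Level 1 folders that each hold Level 2 document folders.
batch = Folder("Batch", [
    Folder("Box 1", [Folder("Doc A"), Folder("Doc B")]),
    Folder("Box 2", [Folder("Doc C")]),
])
```

Calling `classify_level(batch, 2, ...)` here touches only "Doc A", "Doc B", and "Doc C". The Level 1 folders are left unclassified, which is why the chosen level has to match where your documents actually sit in the Batch.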


  1. Once the classification step has been properly configured, switch over to the Activity Tester tab.
  2. Press the Play button.
  3. And now all of the documents have been classified.


This is the function that Grooper will perform automatically when the Batch Process is run. It is important to test your Classify step first to make sure nothing goes awry during actual classification.

For More Information

For more information on Classification as a whole, please see the following articles: