Training Batch (Concept): Difference between revisions

From Grooper Wiki
Redirected page to Training Batch - 2023
 
#REDIRECT[[Training Batch - 2023]]
[[File:Training Batch01.PNG|thumb|275px|This is a snippet of the '''Grooper Design Studio UI''' showing the '''Training Set''' batch.]]
 
<blockquote style="font-size:14pt">
The '''Training Set''' batch is a more convenient way to work with all of the samples a Content Model has been trained against.
</blockquote>
</p><br/>
A '''Content Model''' and accompanying set of '''Batches''' can be downloaded '''[[:Media:Training_Batch_Example.zip|here.]]'''  Downloading it is not required to understand this article, but it can be used to follow along with the steps. ''This file was exported from, and is meant for use in, Grooper 2.9.''
 
==About==
During the development and training of '''TF-IDF Classification''' in a '''Grooper Content Model''', it can be challenging to keep track of all of the samples that are used during training.  In previous versions, each trained sample was stored under each content type in the Grooper Design Studio node tree.  In 2.9, the trained samples are stored both under each content type and in the '''Training Set''' batch.
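
For readers unfamiliar with TF-IDF, the general idea can be sketched in a few lines of Python. This is a minimal illustration of the textbook technique, not Grooper's implementation: the function name <code>tfidf_weights</code>, the sample text, and the choice to treat each content type's combined samples as one document are all assumptions made for the example, and no stemming is applied.

```python
import math
from collections import Counter


def tfidf_weights(samples_by_type):
    """Return {content_type: {term: weight}} using plain TF-IDF.

    Hypothetical helper for illustration only; each content type's
    trained samples are merged and treated as a single document.
    """
    # Term counts per content type (lowercased, split on whitespace).
    counts = {
        name: Counter(" ".join(texts).lower().split())
        for name, texts in samples_by_type.items()
    }
    n_types = len(counts)
    # Document frequency: in how many content types does each term occur?
    df = Counter(term for c in counts.values() for term in c)

    weights = {}
    for name, c in counts.items():
        total = sum(c.values())
        weights[name] = {
            term: (n / total) * math.log(n_types / df[term])
            for term, n in c.items()
        }
    return weights


# Made-up sample text standing in for trained documents.
samples = {
    "Invoice": ["invoice number total due", "invoice total amount due"],
    "Receipt": ["receipt total paid", "receipt amount paid thank you"],
}
weights = tfidf_weights(samples)
```

In this sketch a term that appears under every content type (such as "total") receives a weight of zero, while a term unique to one content type (such as "invoice") receives a positive weight, which is why distinctive words drive the classification.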
 
==How To==
{|
| style="padding:25px; vertical-align:top" |
The following example shows how to perform '''TF-IDF classification''' training, which creates the '''Training Set''' batch. The example content model contains five content types, with samples drawn from three batches.
|}
 
{|cellpadding="10" cellspacing="5"
|-style="background-color:#f89420; color:white"
|style="font-size:14pt"|'''!'''||Some of the tabs in this tutorial are longer than the others.  Please scroll to the bottom of each step's tab before going to the next step.
|}
 
<tabs style="margin:20px">
<tab name="Prerequisites" style="margin:25px">
====Prerequisites====
{|
| style="padding:25px; vertical-align:top" |
These steps assume you have already created a content model with '''Lexical''' set as the '''Classification Method''' and the appropriate '''Text Feature Extractor''' selected.  In the example content model, this property is set to '''Words (Stemmed)'''.
|| [[File:Training Batch02.PNG]]
|}
</tab>
<tab name="Train Content Types" style="margin:25px">
====Train Content Types====
{| class="wikitable"
| style="padding:25px;" |
1. Browse to the '''Content Model''' node and select the '''Classification Testing''' tab on the right.<br/>
2. Select the appropriate batch in the '''Batch''' drop-down.<br/>
3. Select the document to be trained and select '''Train Document'''.
|| [[File:Training_Batch03.PNG|1000px]]
|-
| style="padding:25px;" |
4. Repeat these steps for the remaining '''Content Types'''.  In the example '''Content Model''' provided, train all five '''Content Types''' from all three example batches.
|}
</tab>
<tab name="Review the Training Set batch" style="margin:25px">
====Review the Training Set batch====
{|
| style="padding:25px; vertical-align:top" |
As you train your content types, you will see a '''Training Set''' batch begin to populate under the '''Local Resources''' folder.<br/>
A Grooper engineer can review and keep track of all of the documents that have been used for '''TF-IDF''' Classification training.  As the development cycle of Classification continues and more content types are trained, the Grooper engineer now has a single place to review, test, and perform regression testing for Classification.<br/>
<br/>
|| [[File:Training Batch04.PNG]]
|}
</tab>
</tabs>
<br/>
It is important to understand that the '''Training Set''' is not tied to the actual '''TF-IDF Weightings''' that are associated with the '''Content Type''' or '''Content Category'''.  Purging the training from a '''Content Model''' does not delete any of the documents in the '''Training Set'''.  Conversely, deleting a document from the '''Training Set''' does not remove or purge any '''TF-IDF Weightings''' from a '''Content Type''' or '''Content Category'''.
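
The independence described above can be pictured with a small toy model in Python. This is only a conceptual sketch: <code>ToyContentModel</code> and its methods are hypothetical names invented for the illustration and do not correspond to any Grooper class or API. The stored weight is a placeholder count, not a real TF-IDF value.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContentModel:
    """Hypothetical stand-in that keeps samples and weightings separately."""

    training_set: list = field(default_factory=list)  # stored sample documents
    weightings: dict = field(default_factory=dict)    # content type -> placeholder weight

    def train(self, content_type, document):
        # Training records the sample AND updates the weightings,
        # but the two are stored independently of each other.
        self.training_set.append(document)
        self.weightings[content_type] = self.weightings.get(content_type, 0) + 1

    def purge_training(self):
        # Clears the weightings only; the stored samples are untouched.
        self.weightings.clear()

    def delete_sample(self, document):
        # Removes the stored sample only; the weightings are untouched.
        self.training_set.remove(document)


model = ToyContentModel()
model.train("Invoice", "invoice_001")
model.train("Receipt", "receipt_001")

model.delete_sample("invoice_001")
weightings_after_delete = dict(model.weightings)   # "Invoice" is still present

model.purge_training()
samples_after_purge = list(model.training_set)     # "receipt_001" is still present
```

The practical consequence is that cleaning up one side does not clean up the other: after purging training, the samples remain available for retraining, and after deleting a sample, its influence on the weightings remains until the training itself is purged.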
<br/>
 
 
==Version Differences==
Versions prior to '''Grooper 2.9''' do not automatically generate a '''Training Set''' batch in the '''Local Resources''' folder.
 
[[Category:Articles]]
[[Category:Version 2.90]]

Revision as of 08:53, 18 October 2023