Thread Processing Guidance
Grooper Activity Processing services distribute processing resources to automated tasks in a Batch Process. This article answers common questions about thread processing and documents best practices.
What is a thread?
A "thread" is the smallest unit of execution that an operating system can schedule.
- Threads are the "workers" that carry out "tasks" in a software application.
- On CPUs with a hybrid core design, performance cores typically provide 2 threads per core.
- Efficiency cores provide 1 thread per core.
How can I find out how many threads my machine has?
Physical machines:
- Open Windows Task Manager.
- Go to "Performance" and select "CPU".
- Threads are listed as "logical processors".
Virtual machines:
- Open Windows Task Manager.
- Go to "Performance" and select "CPU".
- Threads are listed as "virtual processors".
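The same count can also be read from a script, which is handy when checking several servers. The following is a minimal sketch using Python's standard library. The function name is illustrative, and the value returned should generally match what Task Manager reports, though that is an assumption worth verifying on your own hardware.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors the operating system reports."""
    count = os.cpu_count()  # returns None when the count cannot be determined
    if count is None:
        raise RuntimeError("Could not determine the logical processor count")
    return count


if __name__ == "__main__":
    print(f"Logical processors (threads): {logical_processor_count()}")
```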
How does Grooper utilize threads?
Primary
Threads are utilized by Activity Processing services to automate Activity tasks in a Batch Process.
- The number of threads an Activity Processing service can use is defined by the Number of Threads setting.
- Multiple Activity Processing services can be installed on a single machine.
- Activity Processing services can be installed on additional machines connected to a Grooper Repository to distribute task processing across several servers.
Secondary
The Grooper App Pool uses threads on the Grooper Web Server to run the application as users interact with the Grooper Web App.
Is it better to have one Activity Processing service with a large number of threads or several with a smaller number?
It is more efficient to spread available threads across multiple Activity Processing services on a machine.
- The sweet spot appears to be 3-4 threads per service.
- Example: If you have a single processing server with 50 available threads, you'll see better overall throughput with 16 Activity Processing services of 3 threads each (48 threads in total) than with 1 Activity Processing service of 50 threads.
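The arithmetic in the example above can be sketched as a small helper. This is only an illustration: `plan_services` is a hypothetical name, not a Grooper tool, and it simply divides a thread budget into equally sized services.

```python
def plan_services(available_threads: int, threads_per_service: int = 3) -> tuple[int, int]:
    """Split a thread budget into equally sized services.

    Returns (number_of_services, leftover_threads).
    """
    if available_threads < 0:
        raise ValueError("available_threads must not be negative")
    if threads_per_service < 1:
        raise ValueError("threads_per_service must be at least 1")
    # divmod gives the whole number of services and the threads left over
    return divmod(available_threads, threads_per_service)


services, leftover = plan_services(50, threads_per_service=3)
print(f"{services} services x 3 threads, {leftover} threads left unassigned")
```

With 50 threads and 3 threads per service, this yields 16 services and 2 unassigned threads.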
How should I be using Processing Queues?
Processing Queues facilitate thread distribution across specific steps in a Batch Process.
- Processing Queues are assigned to both Batch Process Steps and Activity Processing services.
  - Processing Queues are assigned to Batch Process Steps using their Queue Name property.
  - Processing Queues are assigned to Activity Processing services using their Queue Name property.
- If threads are "workers", Processing Queues are "managers" marshalling workers to particular activities.
- Processing Queues should be assigned to resource-intensive or long-running Activity tasks to avoid bottlenecks in Batch processing.
  - Such a bottleneck may not slow down total processing over a long period of time, but it can lead to a less steady flow of work to Review or Export steps.
  - Example: Recognize tends to be a longer-running step in a Batch Process (OCR takes time). If all of your threads are busy processing Recognize tasks, other tasks will sit idle until those Recognize tasks are finished.
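The effect of a dedicated queue can be illustrated with a toy model. The sketch below is not Grooper's scheduler. It assumes tasks are handled strictly first-in, first-out, that every thread is identical, and that the durations (10 time units for Recognize, 1 for a short task) are invented for illustration.

```python
import heapq


def completion_times(durations: list[float], threads: int) -> list[float]:
    """Simulate FIFO processing: each task runs on whichever thread frees up first."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    free_at = [0.0] * threads  # min-heap of the times at which each thread is free
    heapq.heapify(free_at)
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available thread
        end = start + duration
        heapq.heappush(free_at, end)
        finished.append(end)
    return finished


recognize_tasks = [10] * 8  # hypothetical long-running tasks
short_tasks = [1] * 4       # hypothetical quick tasks queued behind them

# One shared pool of 4 threads: short tasks wait behind every Recognize task.
shared = completion_times(recognize_tasks + short_tasks, threads=4)
print("Shared pool, short tasks finish at:", shared[len(recognize_tasks):])

# Two queues: 3 threads for Recognize, 1 thread reserved for the short tasks.
recognize = completion_times(recognize_tasks, threads=3)
dedicated = completion_times(short_tasks, threads=1)
print("Dedicated queue, short tasks finish at:", dedicated)
print("Recognize finishes at:", max(recognize))
```

In this toy run, the short tasks finish at times 1 through 4 with a dedicated queue instead of all finishing at 21. The Recognize work finishes later (30 instead of 20) because it has one fewer thread, so the benefit is a steadier flow rather than a shorter total run.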
About Thread Allocation for Grooper Services
Understand the "n minus one" rule
When Grooper services are installed, each service is assigned a number of CPU threads.
- Some services (for example, Import Watcher) always run on a single thread.
- Activity Processing services can use multiple threads.
Your machine has a finite number of processing threads, and over-allocating them will cause errors.
The operating system must always have at least one thread available. For that reason, the total number of threads assigned to Grooper services must never exceed the number of available threads minus one.
This is known as the "n minus one" rule:
- If n is the total number of threads available on the machine, the maximum number of threads you may assign to Grooper services is n − 1.
Consider the "n minus x" rule
In real-world environments, Grooper is rarely the only software running.
Other applications also require CPU threads, so additional reservations may be necessary:
- SQL installed on the same machine
  - Follow an n − 2 rule (1 thread for the OS, 1 for SQL)
- SQL and IIS installed on the same machine
  - Follow an n − 3 rule (1 thread for the OS, 1 for SQL, 1 for IIS)
- Other background applications (for example, antivirus software)
  - Reserve additional threads as needed
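The "n minus one" and "n minus x" rules reduce to simple subtraction. The helper below is a hypothetical sketch, not part of Grooper. It always reserves 1 thread for the operating system and subtracts any additional reservations you specify.

```python
def grooper_thread_budget(total_threads: int, reserved_for_other_software: int = 0) -> int:
    """Return the maximum number of threads to assign to Grooper services.

    Applies n - x, where x is 1 (operating system) plus any extra reservations.
    """
    if reserved_for_other_software < 0:
        raise ValueError("reserved_for_other_software must not be negative")
    budget = total_threads - 1 - reserved_for_other_software
    if budget < 1:
        raise ValueError("No threads are left for Grooper services")
    return budget


n = 16  # example machine with 16 logical processors
print("Grooper only (n - 1):", grooper_thread_budget(n))
print("With SQL (n - 2):", grooper_thread_budget(n, reserved_for_other_software=1))
print("With SQL and IIS (n - 3):", grooper_thread_budget(n, reserved_for_other_software=2))
```

For a 16-thread machine, this gives budgets of 15, 14, and 13 threads respectively.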
This is one of the key reasons we recommend distributing SQL, IIS, and Grooper processing services across multiple machines whenever possible.
Bottom line
Do not over-allocate your available threads.
- If you do, Grooper can behave erratically or fail unexpectedly.
- Always leave sufficient CPU resources for the operating system and any supporting services.
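Before installing or reconfiguring services, it can help to add up the planned thread assignments and compare them against the machine's limit. The sketch below assumes you maintain your own list of planned services. The service names and the `check_allocation` function are hypothetical, and nothing here reads settings from Grooper itself.

```python
def check_allocation(total_threads: int, service_threads: dict[str, int], reserved: int = 1) -> int:
    """Return the remaining headroom, or raise if the plan over-allocates threads.

    `reserved` is the number of threads held back for the OS and other software.
    """
    assigned = sum(service_threads.values())
    limit = total_threads - reserved
    if assigned > limit:
        raise ValueError(
            f"Over-allocated: {assigned} threads assigned, but only {limit} are available"
        )
    return limit - assigned


planned = {
    "Import Watcher": 1,          # single-threaded service
    "Activity Processing 1": 3,
    "Activity Processing 2": 3,
}
headroom = check_allocation(total_threads=8, service_threads=planned)
print(f"Plan is within limits with {headroom} thread(s) to spare")
```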