Deployment in AWS with Auto Scaling (DRAFT)
WIP: This article is a work-in-progress. This article represents our current understanding of the topic. It is subject to revision and expansion in the future. This tag will be removed upon draft completion.
Easily scale your Grooper processing with cloud-based computing using Amazon Web Services and Auto Scaling.
About AWS
Imagine you normally process a hundred documents a day. Then, out of nowhere, you start getting a thousand documents a day. Maybe it's a busy time of year for you. Maybe you landed a huge client and need to process their paperwork. Great news! But how do you account for the increase in computing demands? How do you both scale up your IT infrastructure to meet the new demand and scale it down once the demand has subsided?
One answer lies in cloud-based computing. Spinning up a pool of virtual machines to process the additional workload is, generally speaking, quicker and more cost-effective than purchasing and deploying physical computers. Furthermore, what are you going to do once you're done processing all that extra work? With a cloud-based IT infrastructure, you can quickly deactivate those machines and reactivate them on demand. With a physical infrastructure, you're stuck holding a lot of additional hardware until you need it again.
Amazon Web Services (AWS) is one of the most popular on-demand cloud computing providers. Their Elastic Compute Cloud (EC2) gives users a pay-as-you-go model to create, launch and terminate virtual computers, as needed. These virtual machines (which Amazon calls "instances") can be built to mirror physical computer specifications, including operating system, CPU, RAM and storage options. EC2 encourages scalability with web services that allow you to take a snapshot of a server's configuration (an Amazon Machine Image or "AMI") and boot a virtual machine with whatever software you need already loaded.
Furthermore, the process of spinning up and spinning down virtual machines can be automated with AWS Auto Scaling. This allows users to set automatic scaling parameters, defining when to launch new EC2 instances to meet surges in processing demand. The virtual machines are then terminated as soon as demand dies down. With a pay-for-use model, you only pay for the machines while they're running. Ultimately, this saves you the time of scaling your IT infrastructure up and down by hand, and it saves you money because you pay only for what you need, when you need it.
In this article, we will show you how to set up a scalable Grooper deployment using Amazon Web Services (AWS), Elastic Compute Cloud (EC2) instances and Auto Scaling.
- For more information on EC2 instances, visit Amazon's EC2 documentation.
- For more information on Auto Scaling, visit Amazon's Auto Scaling documentation.
The General Process
The general steps to complete this deployment setup are as follows:
1. Launch the "Main" EC2 instance.
   - This is the virtual machine that hosts the Grooper Repository's database and file store and the Grooper license for the "Worker" machines.
2. Set up the Security Group.
   - In AWS, a Security Group is a set of firewall rules restricting inbound and outbound traffic based on protocols and port numbers.
3. Launch the "Worker" EC2 instance.
   - This is the virtual machine performing automated processing tasks.
   - Grooper will need to be installed. Grooper will need to be licensed using the Main instance's hosted licensing. The Worker instance will need to be able to connect to the Main instance's Grooper Repository.
4. Create an Amazon Machine Image (AMI) of the Worker instance.
   - This image will be a snapshot of the Worker machine, with Grooper installed and ready for document processing.
   - This is used to create new Worker instances automatically according to the Auto Scaling setup.
5. Create a Launch Template.
   - Essentially, a Launch Template is a set of instructions to launch a new EC2 instance programmatically (rather than manually, as done in step 3).
   - Auto Scaling will use the Launch Template to create new Worker instances, using the AMI created in step 4, and assign the right Security Group.
6. Configure the Auto Scaling policies.
   - This defines the minimum and maximum number of virtual machines as well as the scaling rules: the conditions that must be met to spin EC2 instances up and down.

How To
Step 1: Launch the "Main" EC2 Instance
Getting Started
System Specs
Next, you will choose the virtual machine's system specifications. There is some flexibility in your choices. However, there are some requirements and best practice suggestions you'll need to keep in mind. You will configure the following:
Network Settings
Generally speaking, every organization has its own network security requirements. Your IT department will need to configure the EC2 instance's Network Settings panel according to your needs. For the purposes of this demo, we will use the default Network Settings.
- For more information on networking, visit the Amazon documentation here.
- For more information on security, visit the Amazon documentation here.
We will create a new Security Group in this demonstration.
Login Info
Amazon strongly recommends you log into EC2 instances using a key pair. If you plan on connecting to your virtual machines using RDP (which we will in this tutorial), you must specify a key pair.
- For more information on key pairs, please visit the Amazon documentation here.
We have already generated a key pair and selected it.
When you've finished configuring the EC2 instance, you will need to launch it, and then connect to it to install and configure Grooper.
Launch Instance
Connect to Instance
Grooper Installation and Configuration
Once connected to the "Main" instance, you must do the following:
- Install Grooper.
- Configure a new Grooper Repository.
  - You will need to know the server's hostname when creating the Grooper database.
  - If you are unsure of the VM's hostname, open a Command Prompt and enter "hostname".
- Install a Grooper Licensing service.
  - Be sure to include the VM's hostname in the security credentials for the user name.
- Manually adjust the file store's path in Grooper Design Studio.
  - You will need to change the root of the UNC path from the machine's name to its private IP address. See the "Manually Adjust File Store Path" section below for more details.
If you need guidance on installing and configuring Grooper or a Grooper Licensing service, please visit the Install and Setup article on the Grooper Wiki.
For Version 2022 and beyond
If you are using the Grooper Web Client, you must install IIS on the server and install the Grooper Web Client application. If you need guidance, please visit the Web Client article on the Grooper Wiki.
Manually Adjust File Store Path
Typically, when you initialize a Grooper Repository, you give a UNC path for the folder location in the following format:
\\<machine name>\<folder path>
However, when the Worker instance accesses the file store, it will need to use the Main instance's private IP address, using the following UNC path format:
\\<private IP address>\<folder path>
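The substitution described above can be sketched as a small helper. This is only an illustration of the path format: the function name, machine name, share name and IP address are all hypothetical, and Grooper itself does not require any script for this step.

```python
def use_private_ip(unc_path: str, private_ip: str) -> str:
    """Swap the host portion of a UNC path for the given IP address."""
    if not unc_path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path!r}")
    # Split "\\host\folder\..." into the host and the remainder at the
    # first backslash after the leading "\\".
    host, sep, rest = unc_path[2:].partition("\\")
    if not host or not sep or not rest:
        raise ValueError(f"UNC path needs a host and a folder path: {unc_path!r}")
    return "\\\\" + private_ip + "\\" + rest


if __name__ == "__main__":
    # Hypothetical machine name, share and address, for illustration only.
    print(use_private_ip(r"\\MAIN-SERVER\Grooper\FileStore", "10.0.0.12"))
```

Only the host portion changes. The folder path after it stays exactly as it was when the repository was initialized.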
FYI: You can copy the Main instance's IP address from the EC2 Dashboard. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.
Therefore, you will need to manually adjust the Grooper file store's storage path. To do so, open Grooper Design Studio and change the root of the file store's UNC path from the machine name to the Main instance's private IP address.
Step 2: Set Up the Security Group
For this deployment, you will need to configure the following protocols and ports for inbound traffic:
TCP (Transmission Control Protocol)
- Port 13950
  - This will open access for the Grooper REST API.
- Port 13900
  - This will open access for the Grooper Licensing service, allowing the Main instance to supply licenses for the Worker instances.
- Port 13930
  - Only required for version 2022 and later.
  - This will open access for the Grooper Web Client.
- Port 13905
  - Only required for version 2022 and later.
  - This will open access for the Grooper Desktop scanning application.
ICMP (Internet Control Message Protocol)
- All ports
  - This will allow communication between the Main instance and the Worker instances, ultimately allowing the Worker instances to access Main's Grooper Repository for document processing.
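If you prefer to script the Security Group rather than click through the console, the inbound rules listed above could be expressed roughly as follows. This is a minimal sketch, not the method used in this article: it only builds the rule list in the shape that boto3's `authorize_security_group_ingress` accepts for its `IpPermissions` parameter. The function name and the source CIDR range are assumptions for illustration; use the address range your IT department approves.

```python
def grooper_ingress_rules(source_cidr: str, include_2022_ports: bool = True) -> list:
    """Build the inbound rules from this section as a boto3-style IpPermissions list."""
    tcp_ports = {
        13950: "Grooper REST API",
        13900: "Grooper Licensing service",
    }
    if include_2022_ports:
        # Only required for Grooper version 2022 and later.
        tcp_ports[13930] = "Grooper Web Client"
        tcp_ports[13905] = "Grooper Desktop scanning application"

    rules = [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": source_cidr, "Description": description}],
        }
        for port, description in tcp_ports.items()
    ]
    # ICMP uses types and codes rather than ports; -1 means "all" in the EC2 API.
    rules.append(
        {
            "IpProtocol": "icmp",
            "FromPort": -1,
            "ToPort": -1,
            "IpRanges": [{"CidrIp": source_cidr, "Description": "All ICMP"}],
        }
    )
    return rules


if __name__ == "__main__":
    # "10.0.0.0/16" is a placeholder range, not a recommendation.
    for rule in grooper_ingress_rules("10.0.0.0/16"):
        print(rule["IpProtocol"], rule["FromPort"], rule["ToPort"])
```

With boto3 installed and credentials configured, the list would be passed as `IpPermissions=` together with your Security Group's `GroupId=`.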
⚠ These inbound rules will also need to be added to the Windows Firewall in the Main and Worker instances (or you can turn off the Windows Firewall on both machines).
Step 3: Launch the "Worker" EC2 Instance
Launch Configuration
For the most part, launching this instance will be very similar to launching the Main instance, with a few things to keep in mind.
Grooper Installation and Configuration
Once connected to the "Worker" instance, you must do the following:
- Install Grooper.
- Connect to the Main instance's Grooper Repository.
  - See the "Connect to the Grooper Repository" section below for more specifics.
- License the Worker instance using Main's license URL.
  - See the "License the Worker Instance" section below for more specifics.
- Install an Activity Processing service, running four threads.
  - Please note that the User Name property typically uses the following format:
    <machine name>\<user name>
  - Because Auto Scaling creates new machines from an image of this Worker instance, the machine names will necessarily change. Use the "dot" shortcut instead:
    .\<user name>
- Verify the Worker instance has access to the Grooper Repository's database and file store.
  - See the "Verification" section below for more details.
If you need guidance on installing Grooper, please visit the Install and Setup article on the Grooper Wiki.
If you need more information about Activity Processing services, please visit the Activity Processing article on the Grooper Wiki.
- Activity Processing - For Versions 2021 and earlier
- Activity Processing - For Versions 2022 and later
Connect to the Grooper Repository
For the most part, you will connect to Main's Grooper Repository as you would normally. However, there is one key difference:
- Instead of using the server's hostname when connecting to the Grooper Repository database, you will need to use its private IP address.
FYI: You can copy the Main instance's IP address from the EC2 Dashboard. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.
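As an alternative to copying the address from the dashboard, it can also be read from the EC2 API. The sketch below assumes you already have the dictionary returned by boto3's `describe_instances` call and simply pulls the `PrivateIpAddress` field out of it. The function name, instance ID and sample response are made up for illustration.

```python
def private_ip_from_response(response: dict, instance_id: str) -> str:
    """Find an instance's private IPv4 address in a describe_instances response."""
    # The response nests instances inside reservations.
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("InstanceId") == instance_id:
                return instance["PrivateIpAddress"]
    raise LookupError(f"instance {instance_id} not found in response")


if __name__ == "__main__":
    # Trimmed, hypothetical response; a real one carries many more fields.
    sample = {
        "Reservations": [
            {
                "Instances": [
                    {"InstanceId": "i-0123456789abcdef0", "PrivateIpAddress": "10.0.0.12"}
                ]
            }
        ]
    }
    print(private_ip_from_response(sample, "i-0123456789abcdef0"))
```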
License the Worker Instance
When licensing the Worker instance, you will also need to use the Main instance's private IP address.
Typically, when licensing a client machine with a hosted license, you will use a URL with the following format:
http://<hostname>:13900/LicenseService.svc
However, for this deployment, you will replace the hostname with the Main instance's private IP address, using the following format:
http://<private IP address>:13900/LicenseService.svc
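To make the format concrete, here is a small sketch that assembles the license URL from a private IP address. The helper name and the sample address are hypothetical. The port and service path are the ones shown above.

```python
import ipaddress


def license_url(private_ip: str, port: int = 13900) -> str:
    """Build the hosted-license URL for a Main instance's private IPv4 address."""
    # IPv4Address() raises a ValueError subclass for anything that is not a
    # valid IPv4 address, which catches a hostname pasted in by mistake.
    address = ipaddress.IPv4Address(private_ip)
    return f"http://{address}:{port}/LicenseService.svc"


if __name__ == "__main__":
    # Placeholder address, for illustration only.
    print(license_url("10.0.0.12"))
```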
Verification
Prior to continuing, you should verify the following things:
- The Worker instance has access to Main's Grooper Repository database.
- The Worker instance has access to Main's Grooper Repository file store.
- The Worker instance's Grooper is licensed using Main's hosted license.
  - Main's Grooper Licensing service must be running to verify this.
- The Worker instance's Activity Processing service is running and executing Unattended Activity steps for production Batches.
To test this, you should do the following:
While logged into the Worker instance, try to copy and paste a document into a test Batch. Then, load that document's content (i.e., right-click the Batch Folder and select File System Link > Load Content).
If you can do so, you've verified the following:
- The Worker instance's Grooper is licensed (you would get an error message upon opening Grooper Design Studio if it weren't).
- The Worker instance has access to Main's Grooper Repository database (you wouldn't be able to create a test Batch if something were restricting your database access).
- The Worker instance has access to Main's Grooper Repository file store (you wouldn't be able to paste a document into Grooper and load its content if something were restricting your file store access).
While logged into the Main instance, test that the Activity Processing service is running correctly.
- To do this, make a simple Batch Process. It can have just one step, as long as that step is an automatable Unattended Activity.
- Then, import a document into a new Batch, assigning it the Batch Process.
- Last, ensure the Batch is not "paused".
If the Batch Process's steps execute automatically, you've verified the Worker instance's Activity Processing service is running and executing Unattended Activity steps for production Batches.
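If any of the checks above fail, it can help to rule out the network first. The following sketch, run from the Worker instance, simply tests whether a TCP connection can be opened to a given host and port. It is an assumption-laden troubleshooting aid, not part of Grooper: a successful connection only shows that the Security Group and Windows Firewall let traffic through, not that licensing or the repository is configured correctly. The function names and the sample address are hypothetical. The default ports are the licensing and REST API ports from Step 2.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_main(host: str, ports=(13900, 13950)) -> dict:
    """Check each port on the Main instance and report the result per port."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # Replace the placeholder with the Main instance's private IP address.
    for port, is_open in check_main("10.0.0.12").items():
        print(f"port {port}: {'open' if is_open else 'blocked or closed'}")
```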
Step 4: Create an Amazon Machine Image (AMI) of the Worker Instance
Step 5: Create a Launch Template
Create a New Launch Template
Configure the Launch Template
Configuring a Launch Template is a lot like launching a new instance. In our case, however, instead of creating an instance from a "blank" operating system, we will create instances from our Worker image.
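For readers who script their AWS setup, the same idea can be sketched in code. The helper below only assembles the `LaunchTemplateData` dictionary that boto3's `create_launch_template` accepts, tying together the Worker AMI, the key pair and the Security Group. The helper name and every ID, name and instance type shown are placeholders, not values from this article.

```python
def worker_launch_template_data(ami_id, instance_type, key_name, security_group_ids) -> dict:
    """Assemble boto3-style LaunchTemplateData for new Worker instances."""
    if not ami_id.startswith("ami-"):
        raise ValueError(f"expected an AMI ID such as 'ami-...', got {ami_id!r}")
    if not security_group_ids:
        raise ValueError("at least one Security Group ID is required")
    return {
        "ImageId": ami_id,                # the Worker image from Step 4
        "InstanceType": instance_type,    # same sizing as the Worker instance
        "KeyName": key_name,              # key pair used to log in
        "SecurityGroupIds": list(security_group_ids),  # group from Step 2
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    data = worker_launch_template_data(
        "ami-0123456789abcdef0", "m5.xlarge", "grooper-key", ["sg-0123456789abcdef0"]
    )
    print(data)
```

With boto3 available, this dictionary would be passed as `LaunchTemplateData=` alongside a `LaunchTemplateName=` of your choosing.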
Step 6: Configure the Auto Scaling Policies
Create a New Auto Scaling Group
Name the Group and Select Launch Template
Select Availability Zone
Set Capacity and Scaling Policies
On the next screen, you can configure the group size and scaling policies. These set the parameters used to scale your Worker instances dynamically.
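As a rough illustration of what these settings amount to, the sketch below builds the arguments for boto3's `create_auto_scaling_group` and `put_scaling_policy` calls. It assumes a target tracking policy on average CPU utilization, which is only one of the policy types AWS offers and is not necessarily the one configured in this article. The function names, group name, subnet ID, sizes and target value are all hypothetical.

```python
def auto_scaling_group_args(name, launch_template_name, subnet_ids,
                            min_size, max_size, desired) -> dict:
    """Build keyword arguments for create_auto_scaling_group."""
    if not 0 <= min_size <= desired <= max_size:
        raise ValueError("expected 0 <= min_size <= desired <= max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


def cpu_target_tracking_args(group_name, target_percent) -> dict:
    """Build keyword arguments for put_scaling_policy (CPU target tracking)."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_percent),
        },
    }


if __name__ == "__main__":
    # Placeholder names and numbers, chosen only to show the shape of the data.
    print(auto_scaling_group_args("grooper-workers", "grooper-worker",
                                  ["subnet-0123456789abcdef0"], 1, 4, 1))
    print(cpu_target_tracking_args("grooper-workers", 50))
```

The size check mirrors the rule that the desired capacity must sit between the minimum and maximum group size.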
Review and Create Auto Scaling Group