Deployment in AWS with Auto Scaling (DRAFT)

From Grooper Wiki

WIP

This article is a work in progress. It represents our current understanding of the topic and is subject to revision and expansion in the future.

This tag will be removed upon draft completion.

Easily scale your Grooper processing with cloud-based computing using Amazon Web Services and Auto Scaling.

About AWS

Imagine you normally process a hundred documents a day. Then, out of nowhere, you start getting a thousand documents a day. Maybe it's a busy time of year for you. Maybe you landed a huge client and need to process their paperwork. Great news! But how do you account for the increase in computing demands? How do you both scale up your IT infrastructure to meet the new demand and scale it down once the demand has subsided?

One answer lies in cloud-based computing. Spinning up a pool of virtual machines to process the additional workload is, generally speaking, quicker and more cost effective than purchasing and deploying physical computers. Furthermore, what are you going to do once you're done processing all that extra work? With a cloud-based IT infrastructure, you can quickly deactivate those machines and re-activate them as needed on-demand. With a physical infrastructure, you're stuck holding a lot of additional hardware until you need it again.

Amazon Web Services (AWS) is one of the most popular on-demand cloud computing providers. Their Elastic Compute Cloud (EC2) gives users a pay-as-you-go model to create, launch and terminate virtual computers, as needed. These virtual machines (which Amazon calls "instances") can be built to mirror physical computer specifications, including operating system, CPU, RAM and storage options. EC2 encourages scalability with web services that allow you to take a snapshot of a server's configuration (an Amazon Machine Image or "AMI") and boot a virtual machine with whatever software you need already loaded.

Furthermore, the process of spinning up and spinning down virtual machines can be automated with AWS Auto Scaling. This allows users to set automatic scaling parameters, defining when to launch new EC2 instances to meet surges in processing demand. The virtual machines are then terminated as soon as demand dies down. With a pay-for-use model, you only pay for the machines while they're running. Ultimately, this saves you time scaling your IT infrastructure up and down and saves you money only paying for what you need when you need it.

In this article, we will show you how to set up a scalable Grooper deployment using Amazon Web Services (AWS), Elastic Compute Cloud (EC2) instances and Auto Scaling.

The General Process

The general steps to complete this deployment setup are as follows:

  1. Launch the "Main" EC2 instance.
    • This is the virtual machine that hosts the Grooper Repository's database and file store, as well as the Grooper license for the "Worker" machines.
  2. Set up the Security Group.
    • In AWS, Security Groups are a set of firewall rules, restricting inbound and outbound traffic based on protocols and port numbers.
  3. Launch the "Worker" EC2 instance.
    • This is the virtual machine performing automated processing tasks.
    • Grooper will need to be installed and licensed using the Main instance's hosted licensing, and the Worker instance will need to be able to connect to the Main instance's Grooper Repository.
  4. Create an Amazon Machine Image (AMI) of the Worker instance.
    • This image will be a snapshot of the Worker machine, with Grooper installed and ready for document processing.
    • This is used to create new Worker instances automatically according to the Auto Scaling setup.
  5. Create a Launch Template.
    • Essentially, a Launch Template is a set of instructions to launch a new EC2 instance programmatically (rather than manually, as done in step 3).
    • Auto Scaling will use the Launch Template to create new Worker instances, using the AMI created in step 4, and assign the right Security Group.
  6. Configure the Auto Scaling policies.
    • This defines the minimum and maximum number of virtual machines in the pool, as well as the scaling rules: the conditions that must be met to spin EC2 instances up and down.

How To

Step 1: Launch the "Main" EC2 Instance

First, you must launch and configure the "Main" EC2 instance. Most importantly, this virtual machine will host the Grooper Repository environment. It will house the Grooper database and file store location. It will also host the Grooper Licensing service that supplies licenses to the "Worker" machines. As such, the following must be installed on the Main instance after launching it:

  • SQL Server must be installed.
  • Grooper must be installed.
  • A Grooper Repository must be configured.
  • The Grooper Licensing service must be installed (and running).

Getting Started


To launch a new EC2 instance, log into the AWS Management Console at console.aws.amazon.com

  1. In the search bar, search for "EC2".
  2. Under Services, select EC2.
    • This will take you to the EC2 Dashboard.


  1. In the EC2 Dashboard, find the Launch instance section.
  2. Then, select Launch instance.
    • FYI: We will create the Main instance "from scratch". Launch Templates (the other option) allow users to save launch configurations. We will actually make a Launch Template later in order to set up AWS Auto Scaling.


  1. This will take you to the Launch an Instance configuration panel.
  2. Name your virtual machine.
    • We've simply named this machine "Main"
    • Use whatever naming convention works for your organization. Just be sure to make a distinction between this machine, which is the host server for the Grooper Repository and licensing, and the "Worker" machines we will create later, which will be our scalable machines used purely to process work in Grooper.

System Specs

Next, you will choose the virtual machine's system specifications. There is some flexibility in your choices. However, there are some requirements and best practice suggestions you'll need to keep in mind. You will configure the following:

  1. Application and OS Image
    • Windows Server 2012 or later or Windows 10 or later is required.
    • The machine will need to host the Grooper database as well. SQL Server 2012 or later must be installed (or accessible from a hosted server).
    • For the purposes of this tutorial, we selected Microsoft Windows Server 2019 with SQL Server 2012 Standard. This takes care of both these requirements.
    • For more information on instances and AMIs, visit Amazon's documentation here.

  1. Instance Type
    • These settings define the CPU and RAM for the virtual machine.
    • Server RAM should be 16GB or more.
    • The CPU should consist of 4 or more cores.
    • We selected the "m6i.xlarge" instance type, which meets these requirements. There are, however, other instance types that also meet these requirements. For example, the "t3.xlarge" has the same number of cores and RAM, but is cheaper and performs slower.
    • For more information on instance types, visit Amazon's documentation here.

  1. Storage
    • The storage will need to accommodate the Grooper database and file store.
    • AWS has a variety of storage options for you to choose from, including SSD and HDD options. There are also scalable storage options, allowing you to dynamically increase the instance's storage, as needed.
    • We selected 65 GiB of "gp2" (General Purpose SSD) storage, which is the default storage size for the selected OS image. We certainly won't need more storage than that for this demo.
    • For more information on storage options, visit Amazon's documentation here.

Network Settings

Generally speaking, every organization has its own network security requirements. Your IT department will need to configure the EC2 instance's Network Settings panel according to your needs. For the purposes of this demo, we will use the default Network Settings.

  • For more information on networking, visit the Amazon documentation here.
  • For more information on security, visit the Amazon documentation here.


However, for all AWS Grooper deployments we will need to configure port access using a Security Group.

  1. To create a new Security Group, select the "Create security group" option.
  2. If you have an existing Security Group you can use, select the "Select existing security group" option and select the Security Group.

We will create a new Security Group in this demonstration.

Login Info

Amazon strongly recommends you log into EC2 instances using a key pair. If you plan on connecting to your virtual machines using RDP (which we will in this tutorial), you must specify a key pair.

  • For more information on key pairs, please visit the Amazon documentation here.


  1. You may specify an existing key pair using the dropdown menu.
  2. You may create a new key pair by following this link.

We have already generated a key pair and selected it.

When you've finished configuring the EC2 instance, you will need to launch it, and then connect to it to install and configure Grooper.

Launch Instance

  1. At the bottom of the Summary panel, press the Launch instance button.


  1. You will see the following success message if the instance was successfully launched.
    • It may take a few minutes for Amazon to create the instance. The virtual machine will be in the running state once it launches.
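
If you would rather script this step than use the console, the AWS SDK for Python (boto3) can launch an equivalent instance. The following is a minimal sketch only; the region, AMI ID, key pair name, and Security Group ID are placeholder values you would replace with your own, and the instance type and volume simply mirror the choices made above.

import boto3

# Launch a "Main" instance equivalent to the one configured in the console above.
# All IDs, names, and the region are placeholders; substitute your own values.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",           # placeholder: Windows Server 2019 with SQL Server AMI
    InstanceType="m6i.xlarge",                 # 4 cores / 16 GB RAM, as selected above
    KeyName="grooper-key-pair",                # placeholder: the key pair chosen under Login Info
    SecurityGroupIds=["sg-0123456789abcdef0"], # placeholder: an existing Security Group, if you have one
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[
        {"DeviceName": "/dev/sda1",            # root volume device name for Windows AMIs
         "Ebs": {"VolumeSize": 65, "VolumeType": "gp2"}},
    ],
    TagSpecifications=[
        {"ResourceType": "instance",
         "Tags": [{"Key": "Name", "Value": "Main"}]},
    ],
)
print(response["Instances"][0]["InstanceId"])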

Connect to Instance

  1. To connect to the new instance, select Instances in the navigation bar.
  2. Check the box next to the instance you want to connect to.
    • The instance must be running in order to connect to it. Right-click an instance to start/stop it.
    • If this is a new instance, Amazon will perform a series of initialization checks before you can connect to it. This may take a few minutes.
  3. Press the Connect button to connect to the selected instance.


  1. To connect to the instance using RDP, select the RDP tab.
  2. If this is your first time connecting to the instance, you will need to decrypt a password from your key pair. To do this, press the Get Password link and follow its instructions.
  3. Once you have your password, select the Download remote desktop file button to download an RDP file.


Open the downloaded RDP file and enter the password to connect to the virtual machine.

Grooper Installation and Configuration

Once connected to the "Main" instance, you must do the following:

  1. Install Grooper.
  2. Configure a new Grooper Repository.
    • You will need to know the server's hostname when creating the Grooper database.
    • If you are unsure of the VM's hostname, open a Command Prompt and enter "hostname".
  3. Install a Grooper Licensing service.
    • Be sure to include the VM's hostname in the security credentials for the user name.
  4. Manually adjust the file store's path in Grooper Design Studio.
    • You will need to change the root of the UNC path from the machine's name to its private IP address. See the "Manually Adjust File Store Path" section below for more details.

If you need guidance on installing and configuring Grooper or a Grooper Licensing service, please visit the Install and Setup article on the Grooper Wiki.

For Version 2022 and beyond

If you are using the Grooper Web Client, you must install IIS on the server and install the Grooper Web Client application. If you need guidance, please visit the Web Client article on the Grooper Wiki.

Manually Adjust File Store Path

Typically, when you initialize a Grooper Repository, you give a UNC path for the folder location in the following format:

\\<machine name>\<folder path>

However, when the Worker instance accesses the file store, it will need to use the Main instance's private IP address, using the following UNC path format:

\\<private IP address>\<folder path>
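
For example, assuming the Main instance's Windows hostname is EC2AMAZ-MAIN01, its private IPv4 address is 172.31.20.15, and the file store share is named GrooperFileStore (all three are placeholder values), the path would change from:

\\EC2AMAZ-MAIN01\GrooperFileStore

to:

\\172.31.20.15\GrooperFileStore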

FYI

You can copy the Main instance's IP address from the EC2 Dashboard. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.

Therefore, you will need to manually adjust the Grooper file store's storage path. To do so, open Grooper Design Studio and perform the following steps:

  1. Expand the File Stores folder.
  2. Select the Primary file store object.
  3. Navigate to the Advanced tab.
  4. Navigate to the Properties tab.
  5. Change the root of the UNC path to the Main instance's private IP address.


It is generally ill-advised to manually adjust a Grooper object's properties using the Properties tab.

This is a rare exception to this rule. Please do not get in the habit of changing object properties using the Properties tab.


Step 2: Set Up the Security Group

Next, you will need to configure the Main instance's Security Group. In AWS, a Security Group acts as a virtual firewall. You can define rules to control either inbound or outbound traffic (or both) based on protocols and port numbers.

For this deployment, you will need to configure the following protocols and ports for inbound traffic:

TCP (Transmission Control Protocol)

  • Port 13950
    • This will open access for the Grooper REST API.
  • Port 13900
    • This will open access for the Grooper Licensing service, allowing the Main instance to supply licenses for the Worker instances.
  • Port 13930
    • Only required for version 2022 and later
    • This will open access for the Grooper Web Client.
  • Port 13905
    • Only required for version 2022 and later
    • This will open access for the Grooper Desktop scanning application.

ICMP (Internet Control Message Protocol)

  • All ICMP types (ICMP does not use port numbers)
    • This will allow communication between the Main instance and the Worker instances, ultimately allowing the Worker instances to access Main's Grooper Repository for document processing.


First, you will need to access the Main instance's Security Group from the EC2 Dashboard.

  1. You can access all Security Groups using the Security Groups link in the left hand navigation panel.
  2. Or, you can select the Main instance from the Instances interface.
  3. Then, select the Security tab.
  4. Then, select the item under "Security groups".
    • In our case, this is the security group we created when we launched our Main instance.
    • Remember this Security Group's name. We will need to reference it later when creating our Worker instances.


With the Security Group selected, you will need to edit inbound rules for the protocols and ports described above.

  1. Press the Edit inbound rules button.


  1. Add new inbound rules by pressing the Add rule button, toward the bottom of the page.
  2. Configure the following inbound rules:
Type                 Port range   Source          Additional Info
Custom TCP           13950        Anywhere-IPv4   For Grooper REST API
Custom TCP           13900        Anywhere-IPv4   For Grooper Licensing
Custom TCP           13930        Anywhere-IPv4   For Grooper Web Client (only necessary for 2022 installs and beyond)
Custom TCP           13905        Anywhere-IPv4   For Grooper Desktop (only necessary for 2022 installs and beyond)
Custom ICMP - IPv4   All          Anywhere-IPv4   To allow communication between Main and Worker instances
  3. Press the Save rules button when finished.
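
If you prefer to script these rules, the same inbound rules can be added with boto3. This is a minimal sketch only; the Security Group ID and region are placeholders, and the 0.0.0.0/0 source mirrors the "Anywhere-IPv4" setting above (your IT department may want a narrower source range).

import boto3

# Add the inbound rules from the table above to an existing Security Group.
# The group ID and region are placeholders; substitute your own values.
ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 13950, "ToPort": 13950,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "Grooper REST API"}]},
        {"IpProtocol": "tcp", "FromPort": 13900, "ToPort": 13900,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "Grooper Licensing"}]},
        {"IpProtocol": "tcp", "FromPort": 13930, "ToPort": 13930,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "Grooper Web Client (2022+)"}]},
        {"IpProtocol": "tcp", "FromPort": 13905, "ToPort": 13905,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "Grooper Desktop (2022+)"}]},
        {"IpProtocol": "icmp", "FromPort": -1, "ToPort": -1,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "Main/Worker communication"}]},
    ],
)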

These inbound rules will also need to be added to the Windows Firewall in the Main and Worker instances (or you can turn off the Windows Firewall on both machines).
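
One way to script the Windows Firewall side is with netsh, called here from Python so the port list stays in one place. This is a sketch only; the rule names are arbitrary placeholders, and the ports mirror the Security Group table above. Run it on both the Main and Worker instances from an elevated session.

import subprocess

# Open the Grooper ports in Windows Firewall (run elevated on Main and Worker).
tcp_rules = [
    (13950, "Grooper REST API"),
    (13900, "Grooper Licensing"),
    (13930, "Grooper Web Client"),   # only needed for version 2022 and later
    (13905, "Grooper Desktop"),      # only needed for version 2022 and later
]

for port, name in tcp_rules:
    subprocess.run(
        ["netsh", "advfirewall", "firewall", "add", "rule",
         f"name={name}", "dir=in", "action=allow",
         "protocol=TCP", f"localport={port}"],
        check=True,
    )

# Allow inbound ICMPv4 echo requests so the Main and Worker instances can reach each other.
subprocess.run(
    ["netsh", "advfirewall", "firewall", "add", "rule",
     "name=Grooper ICMPv4 echo", "dir=in", "action=allow",
     "protocol=icmpv4:8,any"],
    check=True,
)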

Step 3: Launch the "Worker" EC2 Instance

Next, we will launch and configure the "Worker" EC2 instance. This virtual machine will be the image template for our processing machines. These virtual machines will process work supplied by the Main server's Grooper Repository.

As such, the following must be installed on the Worker instance after launching it:

  • Grooper must be installed.
  • It must be able to connect to the Main instance's Grooper Repository.
  • Licensing must be configured by pointing to the Grooper Service running on the Main instance.
  • A Grooper Activity Processing service must be installed (and running).

Launch Configuration

For the most part, launching this instance will be very similar to launching the Main instance, with a few things to keep in mind.


  1. Be sure to name the Worker instance something that sets it apart from the Main instance.
    • We've gone with "Worker"


  1. The Worker instance does NOT need to have SQL installed.
    • This means you can use a simpler OS image, like Microsoft Windows Server 2019 Base, which will decrease your operating costs.


  1. Because we're eventually taking advantage of Auto Scaling, you do not need more than 4 CPU cores.
    • Exactly 4 cores is preferred for this deployment setup.
    • We've gone with the same "m6i.xlarge" instance type we used when launching the Main instance.


  1. You can use the same Key Pair you used for the Main instance to create this instance's password.
    • A single Key Pair can be used to create passwords for multiple instances.


  1. CRITICAL!! Be sure to select the same Security Group the Main instance is using, and whose inbound rules you've edited in the previous step.

Grooper Installation and Configuration

Once connected to the "Worker" instance, you must do the following:

  1. Install Grooper.
  2. Connect to the Main instance's Grooper Repository.
    • See the "Connect to the Grooper Repository" section below for more specifics.
  3. License the Worker instance using Main's license URL.
    • See the "License the Worker Instance" section below for more specifics.
  4. Install an Activity Processing service, running four threads.
    • Please note when entering the User Name property, you typically use the following format:
    <machine name>\<user name>
    • Because Auto Scaling is going to create new machines using an image of this Worker instance, the machine names will necessarily change. Use the "dot" shortcut instead, like so:
    .\<user name>
  5. Verify the Worker instance has access to the Grooper Repository's database and file store.
    • See the "Verification" section below for more details.

If you need guidance on installing Grooper, please visit the Install and Setup article on the Grooper Wiki.

If you need more information about Activity Processing services, please visit the Activity Processing article on the Grooper Wiki.

Connect to the Grooper Repository

For the most part, you will connect to Main's Grooper Repository as you would normally. However, there is one key difference:

  • Instead of using the server's hostname when connecting to the Grooper Repository database, you will need to use its private IP address.

FYI

You can copy the Main instance's IP address from the EC2 Dashboard. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.


  1. In the Server Name property, enter the Main instance's private IP address.
  2. FYI: Don't forget to enter the logon credentials to access the database, if necessary.

License the Worker Instance

When licensing the Worker instance, you will also need to use the Main instance's private IP address.

Typically, when licensing a client machine with a hosted license, you will use a URL with the following format:

http://<hostname>:13900/LicenseService.svc

However, for this deployment, you will replace the hostname with the Main instance's private IP address, using the following format:

http://<private IP address>:13900/LicenseService.svc
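
For example, if the Main instance's private IPv4 address were 172.31.20.15 (a placeholder value), the License Server URL would be:

http://172.31.20.15:13900/LicenseService.svc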


To license the Worker instance, after connecting to Main's Grooper Repository, open Grooper Design Studio, then:

  1. Select the root node of the repository.
  2. Select the License Server URL property.
  3. Enter the URL with the private IP address in place of the hostname.

Verification

Prior to continuing, you should verify the following things:

  1. The Worker instance has access to Main's Grooper Repository database.
  2. The Worker instance has access to Main's Grooper Repository file store.
  3. The Worker instance's Grooper is licensed using Main's hosted license.
    • Main's Grooper Licensing service must be running to verify this.
  4. The Worker instance's Activity Processing service is running and executing Unattended Activity steps for production Batches.


To test this, you should do the following:


While logged into the Worker instance, try copying and pasting a document into a test Batch. Then, load that document's content (i.e. right-click the Batch Folder and select File System Link > Load Content).

If you can do so, you've verified the following:

  1. The Worker instance's Grooper is licensed (You'd get an error message upon opening Grooper Design Studio if it wasn't).
  2. The Worker instance has access to Main's Grooper Repository database (You wouldn't be able to create a test Batch if there was some restriction on your database access).
  3. The Worker instance has access to Main's Grooper Repository file store (You wouldn't be able to paste a document into Grooper and load its content if there is something restricting your file store access).


While logged into the Main instance, test that the Activity Processing service is running correctly:

  • To do this, make a simple Batch Process (it only needs a single step, as long as that step is an automatable Unattended Activity step).
  • Then, import a document into a new Batch, assigning it the Batch Process.
  • Last, ensure the Batch is not "paused".

If the Batch Process's steps execute automatically, you've verified the Worker instance's Activity Processing service is running and executing Unattended Activity steps for production Batches.

Step 4: Create an Amazon Machine Image (AMI) of the Worker Instance

In order to set up AWS Auto Scaling, you first need an Amazon Machine Image (AMI) and a Launch Template.

Next, we will take a snapshot of our Worker instance and save it as an image. Now that the Worker instance has Grooper installed and licensed, has a verified connection to Main's Grooper Repository, and has an Activity Processing service installed, we can create its image. This is easily done in just a few steps.


First, ensure the Worker instance is running and that the Activity Processing service is running as well.

  1. In the EC2 Dashboard, navigate to the Instances interface in the navigation panel.
  2. Right-click the Worker instance.
  3. Select Images and templates.
  4. Then, select Create image.


  1. On the following screen, name your image, using the Image name property.
    • We've named ours "Worker Image".
  2. Press the Create image button.
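
The same image can be created with boto3 if you are scripting the deployment. This is a minimal sketch; the instance ID and region are placeholders, and the image name matches the one used in the console steps above.

import boto3

# Create an AMI from the configured Worker instance.
# The instance ID and region are placeholders; substitute your own values.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # placeholder: the Worker instance's ID
    Name="Worker Image",
    Description="Grooper Worker with Activity Processing service installed",
    NoReboot=False,                     # allow a reboot so the snapshot is file-system consistent
)
print(response["ImageId"])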

Step 5: Create a Launch Template

Now that we have an image, we can create our Launch Template.

A Launch Template defines settings to launch a new instance. It will allow Auto Scaling to create new virtual machines from the image we took in the previous step and place them within the right Security Group. Auto Scaling will use this Launch Template to launch multiple new Worker instances, as processing demands require.


Create a New Launch Template


  1. To create a Launch Template, navigate to the Launch Templates interface in the navigation panel.
  2. Press the Create launch template button.

Configure the Launch Template

Configuring a Launch Template is a lot like launching a new instance. In our case, however, instead of creating an instance from a "blank" operating system, we're going to be creating instances from our Worker image.


  1. First, you'll need to give your Launch Template a name.


  1. Then, you'll need to select your image.
a. Select the My AMIs tab.
b. Select your Worker image from the dropdown selection list.


  1. Next, choose your instance type.
    • You should choose whatever CPU and RAM configuration you chose for the Worker instance.


  1. Be sure to use the same Key Pair you used for the Main and Worker instances.


  1. CRITICAL!! Be sure to select the same Security Group the Main and Worker instances are using.


  1. When finished, at the bottom of the Summary panel, press the Create launch template button.
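
For those scripting the deployment, an equivalent Launch Template can be created with boto3. This is a minimal sketch; the template name, AMI ID, key pair name, Security Group ID, and region are placeholders, and the instance type mirrors the Worker instance.

import boto3

# Create a Launch Template that boots new Workers from the Worker AMI.
# All names and IDs are placeholders; substitute your own values.
ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.create_launch_template(
    LaunchTemplateName="grooper-worker-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",            # placeholder: the Worker AMI from Step 4
        "InstanceType": "m6i.xlarge",                  # same specs as the Worker instance
        "KeyName": "grooper-key-pair",                 # same key pair as the Main and Worker instances
        "SecurityGroupIds": ["sg-0123456789abcdef0"],  # same Security Group as the Main and Worker instances
    },
)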

Step 6: Configure the Auto Scaling Policies

Finally, we come to the Auto Scaling setup.

With an Auto Scaling Group, we will define a capacity and a scaling policy for our Worker machines. The capacity (or "group size") will determine the minimum and maximum numbers of Worker instances running at any one time. The scaling policy will contain the rules for when to spin up and spin down Worker instances. For example, we will tell our scaling policy to create a new Worker instance whenever the Worker pool's CPU utilization exceeds 50%.

Create a New Auto Scaling Group


To make a new Auto Scaling Group:

  1. In the EC2 Dashboard, scroll to the bottom of the left navigation panel and select Auto Scaling Groups.
  2. Press the Create an Auto Scaling group button.

Name the Group and Select Launch Template


  1. In the Name panel, give your Auto Scaling Group a name.
  2. In the Launch template panel, select the Launch Template you created previously.
    • This is the Launch Template designed to launch new Worker instances from your Worker image.
  3. Press the Next button at the bottom of the page to continue.

Select Availability Zone


The most important part of the next set of configuration panels is selecting your Availability Zone. The Availability Zone for your Auto Scaling Group must match that of your Main and Worker instances.

FYI

If you have forgotten what Availability Zone you've placed your Main and Worker instances in, you can find your instances' Availability Zones in the Instances interface.


  1. On the "Choose instance launch options" page, select the appropriate Availability Zone from the dropdown list.
  2. Press the Next button at the bottom of the page to continue.


FYI

The next configuration screen allows you to configure advanced options, if you so choose, such as adding a load balancer and health checks.

We will not do so for the purposes of this tutorial. We will move on to the next configuration screen "Step 4: Configure group size and scaling policies".

Set Capacity and Scaling Policies

On the next screen, you can configure the Group size and Scaling policies. These allow you to give the parameters to dynamically scale your Worker instances.


The Group size determines three important settings regarding the capacity of your Auto Scaling Group: in other words, how many Worker instances will be running at any one time.

  1. The Desired capacity determines the base number of instances running at all times.
    • We've set this to "2". So, when Auto Scaling is enabled, we will start out with two Worker instances. AWS will create and boot up two Worker instances for us.
    • This number must be within the minimum and maximum capacity.
  2. The Minimum capacity determines the minimum number of instances running at any one time.
    • Auto Scaling will never terminate instances below this number.
  3. The Maximum capacity determines the maximum number of instances running at any one time.
    • Auto Scaling will never create instances beyond this number.


The Scaling policies determine what metric is used to spin up new Worker instances, as needed, and spin them down whenever they're not needed anymore.

  1. Select Target tracking scaling policy to set the scaling metric.
  2. Use the Metric type property to select what will determine if a new instance is created or terminated.
    • In our case, we're using Average CPU utilization. Instances will be scaled based on what percentage of the processing resources are used by the various Worker instances in the group.
  3. Set the Target value.
    • In our case, this determines the percentage of CPU utilization used to spin Worker instances up and down. If more than 50% of the combined CPU resources are in use, a new instance will be created (up to the maximum capacity). If less than 50% of the combined CPU resources are in use, instances will be terminated (down to the minimum capacity).
  4. Optionally, the Instances need property can be configured to define a warm-up period before a new instance's metrics are included in the measurement.
  5. When finished configuring the Group Size and Scaling Policies, press the Next button at the bottom of the page to continue.
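
If you are scripting the deployment, the same group size and target tracking policy can be created with boto3. This is a minimal sketch; the group name, policy name, minimum and maximum capacities, Availability Zone, and region are placeholders, and in a non-default VPC you would supply subnet IDs via VPCZoneIdentifier rather than AvailabilityZones.

import boto3

# Create the Auto Scaling Group from the Step 5 Launch Template and attach a
# target tracking policy that keeps average CPU utilization around 50%.
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="grooper-workers",                   # placeholder group name
    LaunchTemplate={
        "LaunchTemplateName": "grooper-worker-template",      # the Launch Template from Step 5
        "Version": "$Latest",
    },
    DesiredCapacity=2,                  # baseline of two Workers, as configured above
    MinSize=1,                          # placeholder minimum capacity
    MaxSize=4,                          # placeholder maximum capacity
    AvailabilityZones=["us-east-1a"],   # must match the Main and Worker instances' Availability Zone
)

autoscaling.put_scaling_policy(
    AutoScalingGroupName="grooper-workers",
    PolicyName="grooper-cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,            # scale out above 50% average CPU, scale in below it
    },
)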

Review and Create Auto Scaling Group


  1. The last step is to review your Auto Scaling Group in the Review screen.

FYI

You can optionally add notifications and tags in Steps 5 and 6 of the Auto Scaling Group setup. We chose not to for the purposes of this tutorial.


  1. When you are satisfied with your configuration, scroll down to the bottom of the Review screen and press the Create Auto Scaling group button.


  1. Upon creating the Auto Scaling Group, AWS will automatically spin up the number of Worker instances listed for the desired capacity.
    • Since our Worker image has an Activity Processing service installed, these instances will start grabbing work from the Main instance's Grooper Repository.
    • Whenever the group's average CPU utilization exceeds 50%, a new Worker instance will be created (up to the maximum capacity).
    • Whenever the group's average CPU utilization falls below 50%, a Worker instance will be terminated (down to the minimum capacity).