Deployment in AWS with Auto Scaling (DRAFT)

From Grooper Wiki
Revision as of 11:30, 29 June 2022

WIP This article is a work-in-progress or created as a placeholder for testing purposes. This article is subject to change and/or expansion. It may be incomplete, inaccurate, or stop abruptly.

This tag will be removed upon draft completion.

Easily scale your Grooper processing with cloud-based computing using Amazon Web Services and Auto Scaling.

About AWS

Imagine you normally process a hundred documents a day. Then, out of nowhere, you start getting a thousand documents a day. Maybe it's a busy time of year for you. Maybe you landed a huge client and need to process their paperwork. Great news! But how do you account for the increase in computing demands? How do you both scale up your IT infrastructure to meet the new demand and scale it down once the demand has subsided?

One answer lies in cloud-based computing. Spinning up a pool of virtual machines to process the additional workload is, generally speaking, quicker and more cost-effective than purchasing and deploying physical computers. Furthermore, what are you going to do once you're done processing all that extra work? With a cloud-based IT infrastructure, you can quickly deactivate those machines and reactivate them on demand. With a physical infrastructure, you're stuck holding a lot of additional hardware until you need it again.

Amazon Web Services (AWS) is one of the most popular on-demand cloud computing providers. Their Elastic Compute Cloud (EC2) gives users a pay-as-you-go model to create, launch and terminate virtual computers, as needed. These virtual machines (which Amazon calls "instances") can be built to mirror physical computer specifications, including operating system, CPU, RAM and storage options. EC2 encourages scalability with web services that allow you to take a snapshot of a server's configuration (an Amazon Machine Image or "AMI") and boot a virtual machine with whatever software you need already loaded.

Furthermore, the process of spinning up and spinning down virtual machines can be automated with AWS Auto Scaling. This allows users to set automatic scaling parameters, defining when to launch new EC2 instances to meet surges in processing demand. The virtual machines are then terminated as soon as demand dies down. With a pay-for-use model, you only pay for the machines while they're running. Ultimately, this saves you time scaling your IT infrastructure up and down, and saves you money by only paying for what you need when you need it.

In this article, we will show you how to set up a scalable Grooper deployment using Amazon Web Services (AWS), Elastic Compute Cloud (EC2) instances and Auto Scaling.

The General Process

The general steps to complete this deployment setup are as follows:

  1. Launch the "Main" EC2 instance.
    • This is the virtual machine that hosts the Grooper Repository's database and file store and the Grooper license for the "Worker" machines.
  2. Set up the Security Group.
    • In AWS, Security Groups are a set of firewall rules, restricting inbound and outbound traffic based on protocols and port numbers.
  3. Launch the "Worker" EC2 instance.
    • This is the virtual machine performing automated processing tasks.
    • Grooper will need to be installed. Grooper will need to be licensed using the Main instance's hosted licensing. The Worker instance will need to be able to connect to the Main instance's Grooper Repository.
  4. Create an Amazon Machine Image (AMI) of the Worker instance.
    • This image will be a snapshot of the Worker machine, with Grooper installed and ready for document processing.
    • This is used to create new Worker instances automatically according to the Auto Scaling setup.
  5. Create a Launch Template.
    • Essentially, a Launch Template is a set of instructions to launch a new EC2 instance programmatically (rather than manually, as done in step 3).
    • Auto Scaling will use the Launch Template to create new Worker instances, using the AMI created in step 4, and assign the right Security Group.
  6. Configure the Auto Scaling policies.
    • This defines the minimum and maximum number of virtual machines to run, as well as the scaling rules: the conditions that must be met to spin EC2 instances up and down.
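
As a rough illustration of how steps 4 through 6 fit together, the sketch below expresses each step as the request parameters for the corresponding boto3 call (`create_image`, `create_launch_template` and `create_auto_scaling_group`). This is not part of the console walkthrough in this article. Every ID, name and size shown is a hypothetical placeholder, and the scaling rules themselves (the policies that decide when to add or remove instances) are not covered by this sketch.

```python
"""Sketch: steps 4-6 as boto3 request parameters. All names and IDs are placeholders."""


def ami_request(worker_instance_id, name="grooper-worker"):
    # Step 4: parameters for ec2.create_image(), which snapshots the Worker
    return {"InstanceId": worker_instance_id, "Name": name}


def launch_template_request(ami_id, security_group_id, key_name,
                            instance_type="m6i.xlarge", name="grooper-worker"):
    # Step 5: parameters for ec2.create_launch_template()
    return {
        "LaunchTemplateName": name,
        "LaunchTemplateData": {
            "ImageId": ami_id,
            "InstanceType": instance_type,
            "KeyName": key_name,
            "SecurityGroupIds": [security_group_id],
        },
    }


def auto_scaling_request(template_name, subnet_ids, min_size, max_size,
                         name="grooper-workers"):
    # Step 6: parameters for autoscaling.create_auto_scaling_group()
    if not 0 <= min_size <= max_size:
        raise ValueError("need 0 <= min_size <= max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name,
                           "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        # Auto Scaling expects the subnet IDs as one comma-separated string
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


def create_all(worker_instance_id, security_group_id, key_name, subnet_ids):
    # boto3 is imported lazily so the builders above work without it installed
    import boto3
    ec2 = boto3.client("ec2")
    autoscaling = boto3.client("autoscaling")
    ami_id = ec2.create_image(**ami_request(worker_instance_id))["ImageId"]
    # Wait until the AMI is available before anything launches from it
    ec2.get_waiter("image_available").wait(ImageIds=[ami_id])
    template = launch_template_request(ami_id, security_group_id, key_name)
    ec2.create_launch_template(**template)
    autoscaling.create_auto_scaling_group(
        **auto_scaling_request(template["LaunchTemplateName"], subnet_ids, 0, 4))
```

Keeping the parameter builders separate from `create_all` means the values can be reviewed (or printed) before anything is created in an AWS account.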

How To

Step 1: Launch the "Main" EC2 Instance

First, you must launch and configure the "Main" EC2 instance. Most importantly, this virtual machine will host the Grooper Repository environment. It will house the Grooper database and file store location. It will also host the Grooper Licensing service supplying licenses to the "Worker" machines. As such, the following must be installed and configured on the Main instance after launching it:

  • SQL Server must be installed.
  • Grooper must be installed.
  • A Grooper Repository must be configured.
  • The Grooper Licensing service must be installed (and running).

Getting Started


To launch a new EC2 instance, log into the AWS Management Console at console.aws.amazon.com

  1. In the search bar, search for "EC2".
  2. Under Services, select EC2.
    • This will take you to the EC2 Dashboard.


  1. In the EC2 Dashboard, select the Launch Instance dropdown.
  2. Then, select the Launch Instance option.
    • FYI: We will create the Main instance "from scratch". Launch Templates (the other option) allow users to save launch configurations. We will actually make a Launch Template later in order to set up AWS Auto Scaling.


  1. This will take you to the Launch an Instance configuration panel.
  2. Name your virtual machine.
    • We've simply named this machine "Main"
    • Use whatever naming convention works for your organization. Just be sure to make a distinction between this machine, which is the host server for the Grooper Repository and licensing, and the "Worker" machines we will create later, which will be our scalable machines used purely to process work in Grooper.

System Specs


Next, you will choose the virtual machine's system specifications. There is some flexibility in your choices. However, there are some requirements and best practice suggestions you'll need to keep in mind. You will configure the following:

  1. Application and OS Image
    • Windows Server 2012 or later or Windows 10 or later is required.
    • The machine will need to host the Grooper database as well. SQL Server 2012 or later must be installed (or accessible from a hosted server).
    • For the purposes of this tutorial, we selected Microsoft Windows Server 2019 with SQL Server 2012 Standard. This takes care of both these requirements.
    • For more information on instances and AMIs, visit Amazon's documentation here.
  2. Instance Type
    • These settings define the CPU and RAM for the virtual machine.
    • Server RAM should be 16GB or more.
    • The CPU should consist of 4 or more cores.
    • We selected the "m6i.xlarge" instance type, which meets these requirements. There are, however, other instance types that also meet these requirements. For example, the "t3.xlarge" has the same number of cores and the same amount of RAM, but is cheaper and slower.
    • For more information on instance types, visit Amazon's documentation here.
  3. Storage
    • The storage will need to accommodate the Grooper database and file store.
    • AWS has a variety of storage options for you to choose from, including SSD and HDD options. There are also scalable storage options, allowing you to dynamically increase the instance's storage, as needed.
    • We selected 65 GiB of "gp2" storage, which is the "default" storage size for the selected OS image. We certainly won't need more storage than that for this demo.
    • For more information on storage options, visit Amazon's documentation here.
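
For readers who prefer scripting over the console, the choices above map onto the parameters of boto3's `run_instances` call. The following is a minimal sketch, not the method used in this article. The AMI ID is a placeholder you would look up yourself, the root device name `/dev/sda1` is an assumption about the selected image, and the key pair and Security Group (covered in the following sections) are left out, so you would add `KeyName` and `SecurityGroupIds` for a real launch.

```python
# vCPU count and RAM (GiB) for the two instance types named above
INSTANCE_SPECS = {"m6i.xlarge": (4, 16), "t3.xlarge": (4, 16)}


def meets_minimums(instance_type, min_vcpus=4, min_ram_gib=16):
    # Check an instance type against the 4-core / 16GB guidance
    vcpus, ram_gib = INSTANCE_SPECS[instance_type]
    return vcpus >= min_vcpus and ram_gib >= min_ram_gib


def main_instance_request(ami_id, instance_type="m6i.xlarge", volume_gib=65,
                          volume_type="gp2", root_device="/dev/sda1",
                          name="Main"):
    # Build the keyword arguments for ec2.run_instances()
    if not meets_minimums(instance_type):
        raise ValueError(instance_type + " is below the recommended specs")
    return {
        "ImageId": ami_id,            # placeholder: Windows Server + SQL Server image
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "BlockDeviceMappings": [{
            "DeviceName": root_device,  # assumed root device name
            "Ebs": {"VolumeSize": volume_gib, "VolumeType": volume_type},
        }],
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": name}],
        }],
    }


def launch_main(ami_id):
    # boto3 is imported lazily so the builder above works without it installed
    import boto3
    response = boto3.client("ec2").run_instances(**main_instance_request(ami_id))
    return response["Instances"][0]["InstanceId"]
```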

Network Settings

Generally speaking, every organization will have their own network security requirements. Your IT department will need to configure the EC2 instance's Network Settings panel according to your needs. For the purposes of this demo, we will use the default Network Settings.

  • For more information on networking, visit the Amazon documentation here.
  • For more information on security, visit the Amazon documentation here.


However, for all AWS Grooper deployments we will need to configure port access using a Security Group.

  1. To create a new Security Group, select the "Create security group" option.
  2. If you have an existing Security Group you can use, select the "Select existing security group" option and select the Security Group.

We will create a new Security Group in this demonstration.

Login Info

Amazon strongly recommends you log into EC2 instances using a key pair. If you plan on connecting to your virtual machines using RDP (which we will in this tutorial), you must specify a key pair.

  • For more information on key pairs, please visit the Amazon documentation here.


  1. You may specify an existing key pair using the dropdown menu.
  2. You may create a new key pair by following this link.

We have already generated a key pair and selected it.

When you've finished configuring the EC2 instance, you will need to launch it, and then connect to it to install and configure Grooper.

Launch Instance

  1. At the bottom of the Summary panel, press the Launch instance button.


  1. You will see the following success message if the instance was successfully launched.
    • It may take a few minutes for Amazon to create the instance. The virtual machine will be running at launch time.

Connect to Instance

  1. To connect to the new instance, select Instances in the navigation bar.
  2. Check the box next to the instance you want to connect to.
    • The instance must be running in order to connect to it. Right-click an instance to start/stop it.
    • If this is a new instance, Amazon will perform a series of initialization checks before you can connect to it. This may take a few minutes.
  3. Press the Connect button to connect to the selected instance.


  1. To connect to the instance using RDP, select the RDP tab.
  2. If this is your first time connecting to the instance, you will need to decrypt a password from your key pair. To do this, press the Get Password link and follow its instructions.
  3. Once you have your password, select the Download remote desktop file button to download an RDP file.


Open the downloaded RDP file and enter the password to connect to the virtual machine.

Grooper Installation and Configuration

Once connected to the "Main" instance, you must do the following:

  1. Install Grooper.
  2. Configure a new Grooper Repository.
    • You will need to know the server's hostname when creating the Grooper database.
    • If you are unsure of the VM's hostname, open a Command Prompt and enter "hostname".
  3. Install a Grooper Licensing service.
    • Be sure to include the VM's hostname in the security credentials for the user name.
  4. Manually adjust the file store's path in Grooper Design Studio.
    • You will need to change the root of the UNC path from the machine's name to its private IP address. See the "Manually Adjust File Store Path" section below for more details.

If you need guidance on installing and configuring Grooper or a Grooper Licensing service, please visit the Install and Setup article on the Grooper Wiki.

For Version 2022 and beyond

If you are using the Grooper Web Client, you must install IIS on the server and install the Grooper Web Client application. If you need guidance, please visit the Web Client article on the Grooper Wiki.

Manually Adjust File Store Path

Typically, when you initialize a Grooper Repository, you give a UNC path for the folder location in the following format:

\\<machine name>\<folder path>

However, when the Worker instance accesses the file store, it will need to use the Main instance's private IP address, using the following UNC path format:

\\<private IP address>\<folder path>

FYI: You can copy the Main instance's private IP address from the AWS console. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.

Therefore, you will need to manually adjust the Grooper file store's storage path. To do so, open Grooper Design Studio and perform the following steps:

  1. Expand the File Stores folder.
  2. Select the Primary file store object.
  3. Navigate to the Advanced tab.
  4. Navigate to the Properties tab.
  5. Change the root of the UNC path to the Main instance's private IP address.


It is generally ill-advised to manually adjust a Grooper object's properties using the Properties tab.

This is a rare exception to this rule. Please do not get in the habit of changing object properties using the Properties tab.
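
The path change described in this section is a simple substitution of the UNC root. As a small illustration, the helper below swaps the machine name in a UNC path for an IP address. The function name, the example share path and the IP address are all hypothetical. It only computes the new string; the value still has to be entered in Grooper Design Studio as described above.

```python
import ipaddress


def replace_unc_host(unc_path, private_ip):
    """Return unc_path with its machine name replaced by private_ip."""
    # Raises ValueError if private_ip is not a valid IP address
    ipaddress.ip_address(private_ip)
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path: %r" % unc_path)
    # Split "\\host\folder\..." into the host and everything after it
    host, _, rest = unc_path[2:].partition("\\")
    if not host or not rest:
        raise ValueError("UNC path needs a host and a folder: %r" % unc_path)
    return "\\\\" + private_ip + "\\" + rest


# Hypothetical machine name, share and address
print(replace_unc_host(r"\\MAIN\Grooper\FileStore", "10.0.0.12"))
```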


Step 2: Set Up the Security Group

Next, you will need to configure the Main instance's Security Group. In AWS, a Security Group acts as a virtual firewall. You can define rules to control either inbound or outbound traffic (or both) based on protocols and port numbers.

For this deployment, you will need to configure the following protocols and ports for inbound traffic:

TCP (Transmission Control Protocol)

  • Port 13950
    • This will open access for the Grooper REST API.
  • Port 13900
    • This will open access for the Grooper Licensing service, allowing the Main instance to supply licenses for the Worker instances.
  • Port 13930
    • Only required for version 2022 and later
    • This will open access for the Grooper Web Client
  • Port 13905
    • Only required for version 2022 and later
    • This will open access for the Grooper Desktop scanning application.

ICMP (Internet Control Message Protocol)

  • All ICMP types (ICMP does not use port numbers)
    • This will allow communication between the Main instance and the Worker instances, ultimately allowing the Worker instances to access Main's Grooper Repository for document processing.
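
The protocol and port list above can also be written as data. The sketch below builds the `IpPermissions` structure accepted by boto3's `authorize_security_group_ingress` call. It is an illustration only, not a replacement for the console steps in this article. The default source range `0.0.0.0/0` (any IPv4 address) and the Security Group ID passed to `apply_rules` are assumptions you would replace with your own values.

```python
# Inbound TCP ports from the list above, with what each one is for
GROOPER_TCP_PORTS = {
    13950: "Grooper REST API",
    13900: "Grooper Licensing",
}
# Only required for version 2022 and later
GROOPER_2022_TCP_PORTS = {
    13930: "Grooper Web Client",
    13905: "Grooper Desktop",
}


def inbound_rules(cidr="0.0.0.0/0", include_2022_ports=True):
    """Build the IpPermissions list for the Grooper Security Group."""
    ports = dict(GROOPER_TCP_PORTS)
    if include_2022_ports:
        ports.update(GROOPER_2022_TCP_PORTS)
    rules = [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": purpose}],
        }
        for port, purpose in ports.items()
    ]
    # For ICMP, -1 in both fields means "all types and codes"
    rules.append({
        "IpProtocol": "icmp",
        "FromPort": -1,
        "ToPort": -1,
        "IpRanges": [{"CidrIp": cidr, "Description": "Main/Worker communication"}],
    })
    return rules


def apply_rules(group_id, **kwargs):
    # boto3 is imported lazily so inbound_rules() works without it installed
    import boto3
    boto3.client("ec2").authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=inbound_rules(**kwargs))
```

Passing a narrower `cidr` (for example, your VPC's address range) restricts the rules to that range instead of any IPv4 address.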


First, you will need to access the Main instance's Security Group from the AWS console.

  1. You can access all Security Groups using the Security Groups link in the left hand navigation panel.
  2. Or, you can select the Main instance from the Instances interface.
  3. Then, select the Security tab.
  4. Then, select the item under "Security groups".
    • In our case, this is the security group we created when we launched our Main instance.
    • Remember this Security Group's name. We will need to reference it later when creating our Worker instances.


With the Security Group selected, you will need to edit inbound rules for the protocols and ports described above.

  1. Press the Edit inbound rules button.


  1. Add new inbound rules by pressing the Add rule button, toward the bottom of the page.
  2. Configure the following inbound rules:

Type               | Port range | Source        | Additional Info
-------------------|------------|---------------|--------------------------------------------------------------------
Custom TCP         | 13950      | Anywhere-IPv4 | For Grooper REST API
Custom TCP         | 13900      | Anywhere-IPv4 | For Grooper Licensing
Custom TCP         | 13930      | Anywhere-IPv4 | For Grooper Web Client. Only necessary for 2022 installs and beyond
Custom TCP         | 13905      | Anywhere-IPv4 | For Grooper Desktop. Only necessary for 2022 installs and beyond
Custom ICMP - IPv4 | All        | Anywhere-IPv4 | To allow communication between Main and Worker instances

  3. Press the Save rules button when finished.

These inbound rules will also need to be added to the Windows Firewall in the Main and Worker instances (or you can turn off the Windows Firewall on both machines).
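
If you keep the Windows Firewall on, one way to add matching rules is with the built-in `netsh advfirewall firewall add rule` command. The sketch below only builds the commands so they can be reviewed first. The rule names are made up for this example, and `apply_firewall_rules` assumes it is run on the Windows instance from an elevated (administrator) session.

```python
import subprocess

GROOPER_TCP_PORTS = (13950, 13900, 13930, 13905)


def firewall_commands(ports=GROOPER_TCP_PORTS):
    """Build one netsh command (as an argument list) per inbound rule."""
    commands = [
        ["netsh", "advfirewall", "firewall", "add", "rule",
         "name=Grooper-TCP-%d" % port,   # hypothetical rule name
         "dir=in", "action=allow", "protocol=TCP", "localport=%d" % port]
        for port in ports
    ]
    # Allow inbound ICMPv4, matching the ICMP rule in the Security Group
    commands.append(
        ["netsh", "advfirewall", "firewall", "add", "rule",
         "name=Grooper-ICMPv4", "dir=in", "action=allow", "protocol=icmpv4"])
    return commands


def apply_firewall_rules():
    # Must run on Windows with administrator rights
    for command in firewall_commands():
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands for review instead of running them
    for command in firewall_commands():
        print(" ".join(command))
```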

Step 3: Launch the "Worker" EC2 Instance

Next, we will launch and configure the "Worker" EC2 instance. This virtual machine will be the image template for our processing machines. These virtual machines will process work supplied by the Main server's Grooper Repository.

As such, the following must be installed and configured on the Worker instance after launching it:

  • Grooper must be installed.
  • It must be able to connect to the Main instance's Grooper Repository.
  • Licensing must be configured by pointing to the Grooper Licensing service running on the Main instance.
  • A Grooper Activity Processing service must be installed (and running).

Launch Configuration

For the most part, launching this instance will be very similar to launching the Main instance, with a few things to keep in mind.


  1. Be sure to name the Worker instance something that sets it apart from the Main instance.
    • We've gone with "Worker"


  1. The Worker instance does NOT need to have SQL Server installed.
    • This means you can use a simpler OS image like Microsoft Windows Server 2019 Base. This will decrease your operating costs.


  1. Because we're eventually taking advantage of Auto Scaling, you do not need more than 4 CPUs.
    • Exactly 4 is preferred for this deployment setup.
    • We've gone with the same "m6i.xlarge" instance type we used when launching the Main instance.


  1. You can use the same Key Pair you used for the Main instance to create your password.
    • A single Key Pair can be used to create passwords for multiple instances.


  1. CRITICAL!! Be sure to select the same Security Group the Main instance is using, and whose inbound rules you've edited in the previous step.

Grooper Installation and Configuration

Once connected to the "Worker" instance, you must do the following:

  1. Install Grooper.
  2. Connect to the Main instance's Grooper Repository.
    • See the "Connect to the Grooper Repository" section below for more specifics.
  3. License the Worker instance using Main's license URL.
    • See the "License the Worker Instance" section below for more specifics.
  4. Install an Activity Processing service, running four threads.
  5. Verify the Worker instance has access to the Grooper Repository's database and file store.
    • See the "Verification" section below for more details.

If you need guidance on installing Grooper, please visit the Install and Setup article on the Grooper Wiki.

If you need more information about Activity Processing services, please visit the Activity Processing article on the Grooper Wiki.

Connect to the Grooper Repository

For the most part, you will connect to Main's Grooper Repository as you would normally. However, there is one key difference:

  • Instead of using the server's hostname when connecting to the Grooper Repository database, you will need to use its private IP address.

FYI: You can copy the Main instance's private IP address from the AWS console. Simply select the instance using the Instances interface and look for "Private IPv4 addresses" in the Details tab.


  1. In the Server Name property, enter the Main instance's private IP address.
  2. FYI: Don't forget to enter the logon credentials to access the database, if necessary.
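
As an alternative to copying the address from the console, the private IP can be read from the response of boto3's `describe_instances` call. This is a minimal sketch, assuming the Main instance carries the Name tag "Main" as in this tutorial. The helper names are made up for this example.

```python
def private_ip_by_name(response, name):
    """Find the private IP of the instance whose Name tag equals `name`.

    `response` is the dictionary returned by ec2.describe_instances().
    Returns None when no matching instance has a private IP.
    """
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tags.get("Name") == name and "PrivateIpAddress" in instance:
                return instance["PrivateIpAddress"]
    return None


def lookup_main_ip(name="Main"):
    # boto3 is imported lazily so the parser above works without it installed
    import boto3
    response = boto3.client("ec2").describe_instances(Filters=[
        {"Name": "tag:Name", "Values": [name]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ])
    return private_ip_by_name(response, name)
```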

License the Worker Instance

When licensing the Worker instance, you will also need to use the Main instance's private IP address.

Typically, when licensing a client machine with a hosted license, you will use a URL with the following format:

http://<hostname>:13900/LicenseService.svc

However, for this deployment, you will replace the hostname with the Main instance's private IP address, using the following format:

http://<private IP address>:13900/LicenseService.svc


To license the Worker instance, after connecting to Main's Grooper Repository, open Grooper Design Studio, then:

  1. Select the root node of the repository.
  2. Select the License Server URL property.
  3. Enter the URL with the private IP address in place of the hostname.
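
If licensing fails, a quick first check is whether the Worker can reach the licensing port on the Main instance at all. The sketch below builds the license URL in the format shown above and tests a plain TCP connection. The IP address in the example is hypothetical, and a successful connection only shows that the port is reachable, not that the license service itself is healthy.

```python
import socket


def license_url(host, port=13900):
    # Same format as above, with the private IP in place of the hostname
    return "http://%s:%d/LicenseService.svc" % (host, port)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    main_ip = "10.0.0.12"  # hypothetical private IP of the Main instance
    print(license_url(main_ip))
    print("Licensing port reachable:", can_connect(main_ip, 13900))
```

Run from the Worker instance, a `False` result usually points back to the Security Group or Windows Firewall rules from Step 2.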

Verification

Step 4: Create an Amazon Machine Image (AMI) of the Worker Instance

Step 5: Create a Launch Template

Step 6: Configure the Auto Scaling Policies