
structurely / ecs-autoscale

License: MIT
A framework that runs on AWS Lambda for autoscaling ECS clusters and services

Programming Languages

  • python
  • shell
  • Makefile
  • Dockerfile

Projects that are alternatives to or similar to ecs-autoscale

Ecs Formation
Tool to build Docker cluster composition for Amazon EC2 Container Service (ECS)
Stars: ✭ 114 (+65.22%)
Mutual labels:  ec2, ecs, autoscaling
Aws Scalable Big Blue Button Example
Demonstration of how to deploy a scalable video conference solution based on Big Blue Button
Stars: ✭ 29 (-57.97%)
Mutual labels:  ec2, ecs, autoscaling
Terraform Aws Alb
Terraform module to provision a standard ALB for HTTP/HTTPS traffic
Stars: ✭ 53 (-23.19%)
Mutual labels:  ec2, ecs
Awesome Aws
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
Stars: ✭ 9,895 (+14240.58%)
Mutual labels:  ec2, ecs
Amazon Ec2 Instance Selector
A CLI tool and go library which recommends instance types based on resource criteria like vcpus and memory
Stars: ✭ 146 (+111.59%)
Mutual labels:  ec2, aws-ec2
terraform-ecs
Terraform ECS module
Stars: ✭ 15 (-78.26%)
Mutual labels:  ec2, ecs
Aws.ec2
AWS EC2 Client Package
Stars: ✭ 47 (-31.88%)
Mutual labels:  ec2, aws-ec2
Aws Workflows On Github
Workflows for automation of AWS services setup from Github CI/CD
Stars: ✭ 95 (+37.68%)
Mutual labels:  ec2, ecs
terraform-aws-mongodb
Simplify MongoDB provisioning on AWS using Terraform
Stars: ✭ 20 (-71.01%)
Mutual labels:  ec2, ecs
Aws Ec2 Assign Elastic Ip
Automatically assign Elastic IPs to AWS EC2 Auto Scaling Group instances
Stars: ✭ 172 (+149.28%)
Mutual labels:  ec2, autoscaling
Autospotting
Saves up to 90% of AWS EC2 costs by automating the use of spot instances on existing AutoScaling groups. Installs in minutes using CloudFormation or Terraform. Convenient to deploy at scale using StackSets. Uses tagging to avoid launch configuration changes. Automated spot termination handling. Reliable fallback to on-demand instances.
Stars: ✭ 2,014 (+2818.84%)
Mutual labels:  ec2, autoscaling
Awsssmchaosrunner
Amazon's light-weight library for chaos engineering on AWS. It can be used for EC2, ECS (with EC2 launch type) and Fargate.
Stars: ✭ 214 (+210.14%)
Mutual labels:  ec2, ecs
Ecs Refarch Continuous Deployment
ECS Reference Architecture for creating a flexible and scalable deployment pipeline to Amazon ECS using AWS CodePipeline
Stars: ✭ 776 (+1024.64%)
Mutual labels:  ec2, ecs
Aegea
Amazon Web Services Operator Interface
Stars: ✭ 51 (-26.09%)
Mutual labels:  ec2, ecs
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+275.36%)
Mutual labels:  ec2, aws-ec2
ecs-ansible-packer-terraform-wordpress
Proof of concept: Install wordpress environment using ansible, packer, docker, terraform and AWS.
Stars: ✭ 29 (-57.97%)
Mutual labels:  ec2, autoscaling
sensu-plugins-aws
This plugin provides native AWS instrumentation for monitoring and metrics collection, including: health and metrics for various AWS services, such as EC2, RDS, ELB, and more, as well as handlers for EC2, SES, and SNS.
Stars: ✭ 79 (+14.49%)
Mutual labels:  ec2, autoscaling
Aws Sdk Perl
A community AWS SDK for Perl Programmers
Stars: ✭ 153 (+121.74%)
Mutual labels:  ec2, autoscaling
amazon-cloudwatch-auto-alarms
Automatically create and configure Amazon CloudWatch alarms for EC2 instances, RDS, and AWS Lambda using tags for standard and custom CloudWatch Metrics.
Stars: ✭ 52 (-24.64%)
Mutual labels:  ec2, aws-ec2

ecs-autoscale

ecs-autoscale is a Lambda function that automatically scales the EC2 instances and the services within an ECS cluster simultaneously, based on arbitrary metrics from sources not limited to CloudWatch.

Requirements

The only requirements are an AWS account with programmatic access and Docker.

Quick start

Suppose we want to set up autoscaling for a cluster on ECS called my_cluster with two services running: backend and worker. Suppose backend is a simple web server and worker is a Celery worker that handles long-running tasks for the web server, using a RabbitMQ instance as the broker.

In this case we want to scale the web server based on CPU utilization and scale the Celery worker based on the number of waiting tasks (which is given by the number of ready messages on the RabbitMQ instance).

We can get the CPU utilization of the web server directly from CloudWatch, but in order to get the number of queued messages on the RabbitMQ instance, we need to make an HTTP GET request to the RabbitMQ management API.

To learn more about the RabbitMQ API, see https://cdn.rawgit.com/rabbitmq/rabbitmq-management/v3.7.2/priv/www/api/index.html.
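
For illustration, here is roughly what such a request looks like in Python, using the hypothetical host and credentials from the cluster definition below, and assuming the endpoint returns a JSON object containing a messages_ready field (requires the requests library):

import requests

# requests parses the username:password embedded in the URL and uses it
# for basic auth against the RabbitMQ management API.
resp = requests.get("https://username:password@my_rabbitmq_host.com/api/queues/celery")
resp.raise_for_status()
print(resp.json()["messages_ready"])  # number of queued tasks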

Step 1: Define the cluster scaling requirements

Create a YAML file ./lambda/clusters/my_cluster.yml.

NOTE: The name of the YAML file sans extension must exactly match the name of the cluster on ECS.

Our cluster definition will look like this:

# Exact name of the autoscaling group.
autoscale_group: EC2ContainerService-my_cluster-EcsInstanceAsg-AAAAA

# Set to false to ignore this cluster when autoscaling.
enabled: true

# Buffer room: you can think of this as an empty service / task.
cpu_buffer: 0  # Size of buffer in CPU units.
mem_buffer: 0  # Size of buffer in memory.

# Optionally specify the minimum and maximum number of instances for the cluster's
# autoscaling group from here.
min: 1
max: 4

# Defines scaling for individual services.
services:
  # Here we specify scaling for the "worker" service.
  worker:  # This should be the exact name of the service as on ECS.
    # Set to false to ignore service when autoscaling.
    enabled: true

    min: 1  # Min number of tasks.
    max: 3  # Max number of tasks.

    metric_sources:
      # Data sources needed for gathering metrics. Currently only `third_party` and 
      # `cloudwatch` are supported. Only one statistic from one source is needed.
      # For more information on the metrics available, see below under "Metrics".
      third_party:
        - url: https://username:password@my_rabbitmq_host.com/api/queues/celery
          method: GET  # Either GET or POST
          payload: null  # Optional JSON payload to include with the request
          statistics:
            - name: messages_ready
              alias: queue_length
          # In this case it is assumed that we will make a GET request to the url
          # given, and that request will return a JSON object that contains
          # the field `messages_ready`.

    # Autoscaling events which determine when to scale up or down.
    events:
      - metric: queue_length  # Name of metric to use.
        action: 1  # Scale up by one.
        # Conditions of the event:
        min: 5
        max: null
      - metric: queue_length
        action: -1  # Scale down by one.
        min: null
        max: 3

  # Here we specify scaling for the "backend" service.
  backend:
    enabled: true
    min: 1
    max: 3
    metric_sources:
      # We only need metrics from CloudWatch this time.
      cloudwatch:
        - namespace: AWS/ECS
          metric_name: CPUUtilization
          dimensions:
            - name: ClusterName
              value: my_cluster
            - name: ServiceName
              value: backend
          period: 300
          statistics:
            - name: Average
              alias: cpu_usage

    events:
      - metric: cpu_usage
        action: 1  # Scale up by 1
        min: 10
        max: null
      - metric: cpu_usage
        action: -1  # Scale down by 1
        min: null
        max: 1

NOTE: You may not want to store sensitive information in your cluster definition, such as the username and password in the RabbitMQ URL above. In this case you could store those values in environment variables and pass them to the cluster definition using our special syntax: %(VARIABLE_NAME). So, for example, suppose we have the environment variables USERNAME and PASSWORD. Then the line above with the url for RabbitMQ would become url: https://%(USERNAME):%(PASSWORD)@my_rabbitmq_host.com/api/queues/celery.
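
For illustration, the %(VARIABLE_NAME) substitution could be implemented with a few lines of Python like the following (a sketch, not necessarily the project's actual code):

import os
import re

def expand_env_vars(raw: str) -> str:
    # Replace each %(VAR) placeholder with the value of the VAR
    # environment variable.
    return re.sub(r"%\((\w+)\)", lambda m: os.environ[m.group(1)], raw)

# expand_env_vars("https://%(USERNAME):%(PASSWORD)@my_rabbitmq_host.com/api/queues/celery")
# -> "https://my_user:my_pass@my_rabbitmq_host.com/api/queues/celery"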

Step 2: Build the docker image

The Docker image is used to test the Lambda function locally, as well as to set up the Lambda environment on AWS and upload deployment packages.

You can build the image with

docker build -t epwalsh/ecs-autoscale .

Step 3: Test the function locally

We can use the Docker image created in step 2 to test the Lambda function locally. In order to start the container, create a file called access.txt that looks like this:

AWS_DEFAULT_REGION=***
AWS_ACCESS_KEY_ID=***
AWS_SECRET_ACCESS_KEY=***

If your cluster definitions require other environment variables, you can put those in there as well.
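
For example, if you used the hypothetical USERNAME and PASSWORD variables from the note in step 1, your access.txt would look like this:

AWS_DEFAULT_REGION=***
AWS_ACCESS_KEY_ID=***
AWS_SECRET_ACCESS_KEY=***
USERNAME=my_rabbitmq_user
PASSWORD=***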

Then run:

docker run --env-file=./access.txt --rm epwalsh/ecs-autoscale make test-run

Step 4: Setup and deployment

We can now create the Lambda function on AWS.

All you have to do is run the Docker container again with the bootstrap.sh script as the command:

docker run --env-file=./access.txt --rm epwalsh/ecs-autoscale ./bootstrap.sh

This will:

  • Create an IAM policy that gives access to the resources the Lambda function will need.
  • Create a role for the Lambda function to use, and attach the policy to that role.
  • Build a deployment package.
  • Create a Lambda function on AWS with the role attached and upload the deployment package.

Step 5: Create a trigger to execute your function

In this example we will create a simple CloudWatch rule that triggers our Lambda function to run every 5 minutes.

To do this, first log in to the AWS Console and then go to the CloudWatch service. In the menu on the left, click on "Rules". You should see a page that looks like this:

[screenshot: the CloudWatch Rules page]

Then click "Create rule" near the top. You should now see a page that looks like this:

[screenshot: the Create rule page]

Make sure you check "Schedule" instead of "Event Pattern", and then set it to a fixed rate of 5 minutes. Then on the right side click "Add target" and choose "ecs-autoscale" from the drop-down.

Next click "Configure details", give your rule a name, and then click "Create rule".

You're all set! After 5 minutes your function should run.
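
Alternatively, the same schedule can be created from the AWS CLI. Here is a sketch; the rule name is arbitrary and the function ARN is a placeholder you would replace with your own:

aws events put-rule \
    --name ecs-autoscale-every-5-min \
    --schedule-expression "rate(5 minutes)"

aws lambda add-permission \
    --function-name ecs-autoscale \
    --statement-id ecs-autoscale-every-5-min \
    --action lambda:InvokeFunction \
    --principal events.amazonaws.com

aws events put-targets \
    --rule ecs-autoscale-every-5-min \
    --targets "Id"="1","Arn"="arn:aws:lambda:<region>:<account-id>:function:ecs-autoscale"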

Deploying updates

If you update your cluster definitions, you can easily redeploy ecs-autoscale by running:

docker run --env-file=./access.txt --rm epwalsh/ecs-autoscale make deploy

Scaling details

Scaling individual services

Individual services can be scaled up or down according to arbitrary metrics, as long as those metrics can be gathered through a simple HTTP request. For example, celery workers can be scaled according to the number of queued messages.

Scaling up the cluster

A cluster is triggered to scale up by one instance when both of the following conditions are met:

  • the desired capacity of the corresponding autoscaling group is less than the maximum capacity, and
  • the additional tasks for services that need to scale up cannot fit on the existing instances with room left over for the predefined CPU and memory buffers.

Scaling down the cluster

A cluster is triggered to scale down by one instance when both of the following conditions are met:

  • the desired capacity of the corresponding autoscaling group is greater than the minimum capacity, and
  • all of the tasks on the EC2 instance with the smallest amount of reserved CPU units or memory could fit entirely on the other instances in the cluster, and those instances could still support all additional tasks for services that need to scale up, with room left over for the predefined CPU and memory buffers (see the sketch below this list).
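
To make these two checks concrete, here is a minimal, self-contained Python sketch. All of the names here (Task, Instance, fits_on, and so on) are hypothetical and do not correspond to the project's actual internals:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    cpu: int  # reserved CPU units
    mem: int  # reserved memory

@dataclass
class Instance:
    cpu_free: int
    mem_free: int
    tasks: List[Task] = field(default_factory=list)

def fits_on(instances: List[Instance], tasks: List[Task]) -> bool:
    # First-fit check: can every task be placed on some instance?
    free = [[i.cpu_free, i.mem_free] for i in instances]
    for task in tasks:
        for slot in free:
            if slot[0] >= task.cpu and slot[1] >= task.mem:
                slot[0] -= task.cpu
                slot[1] -= task.mem
                break
        else:
            return False
    return True

def should_scale_up(desired, maximum, instances, pending, buffer_task):
    # Scale up when below the ASG maximum and the pending tasks plus
    # the CPU/memory buffer do not fit on the existing instances.
    return desired < maximum and not fits_on(instances, pending + [buffer_task])

def should_scale_down(desired, minimum, instances, pending, buffer_task):
    # Scale down when above the ASG minimum and the least-loaded
    # instance (a crude proxy for "smallest amount of reserved CPU or
    # memory") could be drained onto the others, which must still hold
    # the pending tasks and the buffer.
    if desired <= minimum:
        return False
    emptiest = max(instances, key=lambda i: i.cpu_free + i.mem_free)
    others = [i for i in instances if i is not emptiest]
    return fits_on(others, emptiest.tasks + pending + [buffer_task])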

Metrics

Sources

In order to use a metric, you need to define the source of the metric under metric_sources in the YAML definition. In the example above, we defined a third party metric like this:

third_party:
  - url: https://username:password@my_rabbitmq_host.com/api/queues/celery
    method: GET
    payload: null
    statistics:
      - name: messages_ready
        alias: queue_length

This creates a metric called queue_length based on messages_ready. The alias queue_length is arbitrary, and is the name used to reference this metric when defining events, as in the example above:

events:
  - metric: queue_length
    action: 1
    min: 5
    max: null
  - metric: queue_length
    action: -1
    min: null
    max: 3

In general, third party metrics are gathered by making an HTTP request to the url given. It is then assumed that the request will return a JSON object with a field name corresponding to the name of the metric. To retrieve a nested field in the JSON object, you can use dot notation.
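
For instance, if a (hypothetical) endpoint returned the JSON object {"queue": {"messages_ready": 5}}, the nested count could be referenced like this:

third_party:
  - url: https://my_host.com/api/stats
    method: GET
    payload: null
    statistics:
      - name: queue.messages_ready  # dot notation for the nested field
        alias: queue_length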

Defining metrics from CloudWatch is pretty straightforward as well, like in our example:

metric_sources:
  cloudwatch:
    - namespace: AWS/ECS
      metric_name: CPUUtilization
      dimensions:
        - name: ClusterName
          value: my_cluster
        - name: ServiceName
          value: backend
      period: 300
      statistics:
        - name: Average
          alias: cpu_usage

One thing to watch out for is how you define the statistics field above. The name part has to exactly match a statistic used by CloudWatch (such as Average, Sum, Minimum, Maximum, or SampleCount), and the alias part is an arbitrary name you use to reference this metric when defining events.

Metric arithmetic

You can easily combine metrics with arbitrary arithmetic operations. For example, suppose we create two metrics with aliases cpu_usage and mem_usage. We could create an event based on the product of these two metrics like this:

events:
  - metric: cpu_usage * mem_usage * 100
    action: 1
    min: 50
    max: 100

You could even go crazy for no reason:

events:
  - metric: cpu_usage ** 2 - (mem_usage - 2000 + 1) * mem_usage + mem_usage * 0
    action: 1
    min: 50
    max: 100

In fact, metric arithmetic is interpreted directly as a Python expression, so you can even use built-in functions like min and max:

events:
  - metric: max([cpu_usage, mem_usage])
    action: 1
    min: 0.5
    max: 1.0
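
Presumably each event's metric expression is evaluated with the metric aliases in scope, along the lines of the snippet below (an illustration of the idea, not necessarily the project's exact mechanism):

# Metric values gathered for this service (example numbers).
metrics = {"cpu_usage": 0.73, "mem_usage": 0.41}

# Evaluate the event's metric expression with the aliases bound to
# their values; builtins like max() remain available.
value = eval("max([cpu_usage, mem_usage])", {}, metrics)
print(value)  # -> 0.73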

Logging

Logs from the Lambda function are sent to the CloudWatch log group /aws/lambda/ecs-autoscale. You can also set the log level by setting the environment variable LOG_LEVEL to one of:

  • debug
  • info
  • warning
  • error
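
For example, one way to set this on the deployed function is through the AWS CLI (assuming the function is named ecs-autoscale, as above):

aws lambda update-function-configuration \
    --function-name ecs-autoscale \
    --environment "Variables={LOG_LEVEL=debug}"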

Contributing

This project is in its very early stages and we encourage developer contributions. Please read CONTRIBUTING.md before submitting a PR.

Bugs

To report a bug, submit an issue at https://github.com/structurely/ecs-autoscale/issues/new.

Credit where credit is due

This project was inspired by the following articles and projects:
