elsevier-core-engineering / Replicator

Licence: MIT
Automated Cluster and Job Scaling For HashiCorp Nomad

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to Replicator

Hashi Up
Bootstrap HashiCorp Consul, Nomad, or Vault over SSH in under 1 minute
Stars: ✭ 113 (-31.93%)
Mutual labels:  hashicorp, nomad, consul
Sherpa
Sherpa is a highly available, fast, and flexible horizontal job scaling tool for HashiCorp Nomad. It is capable of running in a number of different modes to suit different requirements, and can scale based on Nomad resource metrics or external sources.
Stars: ✭ 165 (-0.6%)
Mutual labels:  hashicorp, nomad, autoscaling
libra
A Nomad auto scaler
Stars: ✭ 72 (-56.63%)
Mutual labels:  hashicorp, nomad, autoscaling
Nomad Firehose
Firehose all nomad job, allocation, nodes and evaluations changes to rabbitmq, kinesis or stdout
Stars: ✭ 96 (-42.17%)
Mutual labels:  hashicorp, nomad, consul
nomad-consult-ansible-centos
Deploy Nomad & Consul on CentOS with Ansible
Stars: ✭ 17 (-89.76%)
Mutual labels:  consul, hashicorp, nomad
nomad-droplets-autoscaler
DigitalOcean Droplets target plugin for HashiCorp Nomad Autoscaler
Stars: ✭ 42 (-74.7%)
Mutual labels:  hashicorp, nomad, autoscaling
hashicorp-labs
Deploy a HashiCorp cluster (Vault, Consul and Nomad) locally on VMs, ready for deploying and testing your apps.
Stars: ✭ 32 (-80.72%)
Mutual labels:  consul, hashicorp, nomad
local-hashicorp-stack
Local Hashicorp Stack for DevOps Development without Hypervisor or Cloud
Stars: ✭ 23 (-86.14%)
Mutual labels:  consul, hashicorp, nomad
vim-hcl
Syntax highlighting for HashiCorp Configuration Language (HCL)
Stars: ✭ 83 (-50%)
Mutual labels:  consul, hashicorp, nomad
nomad-box
Nomad Box - Simple Terraform-powered setup on Azure of clustered Consul, Nomad and a Traefik load balancer that runs Docker/GoLang/Java workloads. NOTE: Only suitable for dev environments at the moment until I learn more Terraform, Consul, Nomad, Vault :P
Stars: ✭ 18 (-89.16%)
Mutual labels:  consul, hashicorp, nomad
Hashi Ui
A modern user interface for @hashicorp Consul & Nomad
Stars: ✭ 1,119 (+574.1%)
Mutual labels:  hashicorp, nomad, consul
Escalator
Escalator is a batch- or job-optimized horizontal autoscaler for Kubernetes
Stars: ✭ 539 (+224.7%)
Mutual labels:  aws, autoscaling, scaling
Terraform Modules
Reusable Terraform modules
Stars: ✭ 63 (-62.05%)
Mutual labels:  aws, nomad, consul
Kube Aws Autoscaler
Simple, elastic Kubernetes cluster autoscaler for AWS Auto Scaling Groups
Stars: ✭ 94 (-43.37%)
Mutual labels:  aws, autoscaling
Python Nomad
Client library for HashiCorp Nomad
Stars: ✭ 90 (-45.78%)
Mutual labels:  hashicorp, nomad
Vaultron
🤖 Vault clusters Terraformed onto Docker for great fun and learning!
Stars: ✭ 96 (-42.17%)
Mutual labels:  hashicorp, consul
Nomadfiles
A collection of Nomad job files for deploying applications to a cluster
Stars: ✭ 89 (-46.39%)
Mutual labels:  hashicorp, nomad
Nomad Helper
Useful tools for working with @hashicorp Nomad at scale
Stars: ✭ 96 (-42.17%)
Mutual labels:  hashicorp, nomad
Aws Auto Scaling Custom Resource
Libraries, samples, and tools to help AWS customers onboard with custom resource auto scaling.
Stars: ✭ 78 (-53.01%)
Mutual labels:  aws, autoscaling
Ecs Formation
Tool to build Docker cluster composition for Amazon EC2 Container Service (ECS)
Stars: ✭ 114 (-31.33%)
Mutual labels:  aws, autoscaling

Replicator

Badges: Build Status · Go Report Card · GoDoc · Gitter chat: https://gitter.im/els-core-engineering/replicator/Lobby

Replicator is a fast and highly concurrent Go daemon that provides dynamic scaling of Nomad jobs and worker nodes.

  • Replicator job scaling policies are configured as meta parameters within the job specification. A job scaling policy allows scaling constraints to be defined per task-group. Currently supported scaling metrics are CPU and Memory; there are plans for additional metrics as well as different metric backends in the future. Details of configuring job scaling and other important information can be found on the Replicator Job Scaling wiki page.

  • Replicator supports dynamic scaling of multiple, distinct pools of cluster worker nodes, each backed by an AWS autoscaling group. Worker pool autoscaling is configured through Nomad client meta parameters. Details of configuring worker pool scaling and other important information can be found on the Replicator Cluster Scaling wiki page.

At present, worker pool autoscaling is only supported on AWS; however, future support for GCE and Azure is planned using the Go factory/provider pattern.

Download

Pre-compiled releases for a number of platforms are available on the GitHub release page. Docker images are also available from the elsce Docker Hub page.

Running

Replicator can be run in a number of ways; the recommended approach is as a Nomad service job using either the Docker driver or the exec driver. There are example Nomad job specification files available as a starting point.
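For illustration only (not an official example), a minimal service job along the following lines could run Replicator under the Docker driver. The elsce/replicator image reference, the assumption that the image's entrypoint is the replicator binary, and the count and resource values are all placeholders; prefer the maintained example job specification files for real deployments.

job "replicator" {
  datacenters = ["dc1"]
  type        = "service"

  group "replicator" {
    # Running two instances is optional; Consul sessions (described below)
    # ensure only one agent holds leadership at a time.
    count = 2

    task "replicator" {
      driver = "docker"

      config {
        # Placeholder image reference; adjust to the published Docker image.
        image = "elsce/replicator:latest"
        # Assumes the image entrypoint is the replicator binary.
        args  = ["agent"]
      }

      resources {
        # Placeholder sizing; tune for your environment.
        cpu    = 250 # 250 MHz
        memory = 128 # 128MB
      }
    }
  }
}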

It's recommended to review the agent configuration options so that Replicator is tuned to run best in your environment.

Replicator is fully capable of running as a distributed service, using Consul sessions to provide leadership locking and exclusion. State is also written by Replicator to the Consul KV store, allowing Replicator failures to be handled quickly and efficiently.

An example Nomad client configuration that can be used to enable autoscaling on the worker pool:

bind_addr = "0.0.0.0"
client {
  enabled = true
  meta {
    "replicator_cooldown"             = 400
    "replicator_enabled"              = true
    "replicator_node_fault_tolerance" = 1
    "replicator_notification_uid"     = "REP2"
    "replicator_provider"             = "aws"
    "replicator_region"               = "us-east-1"
    "replicator_retry_threshold"      = 3
    "replicator_scaling_threshold"    = 3
    "replicator_worker_pool"          = "container-node-public-prod"
  }
}

An example job which has autoscaling enabled:

job "example" {
  datacenters = ["dc1"]
  type        = "service"

  update {
    max_parallel = 1
    stagger      = "10s"
  }

  group "cache" {
    count = 3

    meta {
      "replicator_max"               = 10
      "replicator_cooldown"          = 50
      "replicator_enabled"           = true
      "replicator_min"               = 1
      "replicator_retry_threshold"   = 1
      "replicator_scalein_mem"       = 30
      "replicator_scalein_cpu"       = 30
      "replicator_scaleout_mem"      = 80
      "replicator_scaleout_cpu"      = 80
      "replicator_notification_uid"  = "REP1"
    }

    task "redis" {
      driver = "docker"
      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          mbits = 10
          port "db" {}
        }
      }

      service {
        name = "global-redis-check"
        tags = ["global", "cache"]
        port = "db"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

Permissions

Replicator requires access to the Consul and AWS APIs (AWS being the only currently supported cloud provider) in order to function correctly. The Consul ACL token is passed as a configuration parameter, and AWS API access should be granted using an EC2 instance IAM role. Vault support is planned for the near future, which will change the way in which permissions are managed and provide a much more secure method of delivering them.

Consul ACL Token Permissions

If the Consul cluster being used is running ACLs, the following ACL policy will allow Replicator the required access to perform all functions based on its default configuration:

key "" {
  policy = "read"
}
key "replicator/config" {
  policy = "write"
}
node "" {
  policy = "read"
}
node "" {
  policy = "write"
}
session "" {
  policy = "read"
}
session "" {
  policy = "write"
}

AWS IAM Permissions

Until Vault integration is added, the instance pool capable of running the Replicator daemon requires the following IAM permissions in order to perform worker pool scaling:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AuthorizeAutoScalingActions",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeScalingActivities",
                "autoscaling:DetachInstances",
                "autoscaling:UpdateAutoScalingGroup"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Sid": "AuthorizeEC2Actions",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeRegions",
                "ec2:TerminateInstances",
                "ec2:DescribeInstanceStatus"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

Commands

Replicator supports a number of CLI commands which allow for easy control and manipulation of the replicator binary. In-depth documentation about each command can be found on the Replicator commands wiki page.

Command: agent

The agent command is the main entry point into Replicator. A subset of the available replicator agent configuration can optionally be passed in via CLI arguments, and configuration parameters passed via CLI flags will always take precedence over parameters specified in configuration files.

Detailed information regarding the available CLI flags can be found in the Replicator agent command wiki page.

Command: failsafe

The failsafe command is used to toggle failsafe mode across the pool of Replicator agents. Failsafe mode prevents any Replicator agent from taking any scaling actions on the resource placed into failsafe mode.

Detailed information about failsafe mode operations and the available CLI options can be found in the Replicator failsafe command wiki page.

Command: init

The init command creates example job scaling and worker pool scaling meta documents in the current directory. These files provide a starting example for configuring both scaling functionalities.

Command: version

The version command displays build information about the running binary, including the release version.

Frequently Asked Questions

When does Replicator adjust the size of the worker pool?

Replicator will dynamically scale-in the worker pool when:

  • Resource utilization falls below the capacity required to run all current jobs while sustaining the configured node fault-tolerance. When calculating required capacity, Replicator includes scaling overhead required to increase the count of all running jobs by one.
  • Before removing a worker node, Replicator simulates the capacity thresholds that would result from removing it. If the new required capacity would be within 10% of the current utilization, Replicator declines to remove the node to prevent thrashing.

Replicator will dynamically scale-out the worker pool when:

  • Resource utilization exceeds or closely approaches the capacity required to run all current jobs while sustaining the configured node fault-tolerance. When calculating required capacity, Replicator includes scaling overhead required to increase the count of all running jobs by one.

When does Replicator perform scaling actions against running jobs?

Replicator will dynamically scale a job when:

  • A valid scaling policy for the job task-group is present within the job specification meta parameters and has the enabled flag set to true.
  • A job specification can consist of multiple groups, and each group can contain multiple tasks. Resource allocations and count are specified at the group level.
  • Replicator evaluates scaling thresholds against the resource requirements defined within a group task. If any task within a group is found to violate the scaling thresholds, the group count will be adjusted accordingly.

Contributing

Contributions to Replicator are very welcome! Please refer to our contribution guide for details about hacking on Replicator.

For questions, please check out the Elsevier Core Engineering/replicator room in Gitter.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].