blue-yonder / Mesos Threshold Oversubscription

Licence: apache-2.0
Simple, threshold-based oversubscription modules for Apache Mesos

Threshold-based Mesos Oversubscription

This repository contains two simple Mesos oversubscription modules:

  • ThresholdResourceEstimator: Informs the Mesos master about resources that can be oversubscribed on the associated agent.

    Similar to the fixed resource estimator provided by Mesos, it allows operators to configure a fixed amount of oversubscribable revocable resources per agent. However, once system utilization reaches a defined threshold, resources will be cut to the currently used amount, avoiding further scheduling of revocable tasks.

  • ThresholdQoSController: Informs the Mesos agent that particular corrective actions need to be made to prevent quality-of-service violations (i.e. system overload).

    Following the same principle as the ThresholdResourceEstimator, corrective actions are taken whenever the system utilization reaches a configurable threshold. The controller kills one revocable task per iteration and thus slowly corrects resource estimations that turned out to be over-optimistic.
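As a rough illustration of the estimator behaviour described above, the following Python sketch models a single estimation step. It is not the module's actual C++ code: the `Resources` type, the function name, and the use of `>=` for "reaches the threshold" are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for a revocable resource amount."""
    cpus: float
    mem_mb: float


def estimate_revocable(configured: Resources,
                       used_revocable: Resources,
                       load: float, load_threshold: float,
                       mem_used_mb: float, mem_threshold_mb: float) -> Resources:
    """Return the revocable resources to report for one estimator iteration.

    Below both thresholds the full configured amount is reported. Once either
    threshold is reached, the estimate is cut to what revocable tasks already
    use, so nothing additional can be scheduled.
    """
    threshold_reached = load >= load_threshold or mem_used_mb >= mem_threshold_mb
    return used_revocable if threshold_reached else configured


configured = Resources(cpus=16, mem_mb=96000)
in_use = Resources(cpus=6, mem_mb=20000)

# Calm system: the whole configured amount stays on offer.
print(estimate_revocable(configured, in_use, 20, 48, 150000, 200000))
# Load threshold reached: the estimate shrinks to the amount already in use.
print(estimate_revocable(configured, in_use, 50, 48, 150000, 200000))
```

The sketch only returns one of two values on purpose; it mirrors the stateless design of the modules rather than attempting any prediction of future usage.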

Is it any good?

The threshold-based approach enabled us to double the peak CPU and peak memory utilization in our Mesos/Aurora clusters. Your mileage may vary significantly, so please take this statement with a grain of salt.

Mechanics

The resource estimator and QoS controller work in unison to ensure that system utilization never approaches critical levels that could negatively affect non-revocable tasks.

[Figure: threshold mechanics]

Revocable resources are only offered while the system utilization remains below the estimation threshold. If utilization rises above it, no more revocable resources are offered. If it further surpasses the QoS threshold, the controller begins to kill revocable tasks until the utilization drops. If the utilization then drops below the QoS threshold but not below the estimation threshold, the freed resources are not re-offered, which prevents further overload of the host.
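The three utilization ranges described above can be sketched as a small classification function. This is an illustrative model only: the `Zone` names, the `classify` function, and the exact handling of values that equal a threshold are assumptions, not taken from the module source.

```python
from enum import Enum


class Zone(Enum):
    OFFER = "offer revocable resources"
    HOLD = "offer nothing new, keep running revocable tasks"
    KILL = "kill revocable tasks until utilization drops"


def classify(utilization: float,
             estimation_threshold: float,
             qos_threshold: float) -> Zone:
    """Map a utilization value onto the behaviour of estimator and controller."""
    if estimation_threshold > qos_threshold:
        # Assumption: the estimator is meant to react before the QoS controller.
        raise ValueError("estimation threshold should not exceed QoS threshold")
    if utilization >= qos_threshold:
        return Zone.KILL
    if utilization >= estimation_threshold:
        return Zone.HOLD
    return Zone.OFFER


# Using 5-minute load thresholds of 48 (estimator) and 60 (QoS controller):
for load in (30, 50, 70):
    print(load, classify(load, 48, 60).name)
```

The `HOLD` range is what keeps freed resources from being re-offered right after a kill: utilization has to fall all the way below the estimation threshold before new revocable offers go out.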

Installation

This project uses CMake. The build requires the Mesos development headers and a compatible version of GCC. The Vagrantfile in this repository creates a suitable build environment for Debian Jessie.

Build and installation follow the usual CMake triplet. You can use the CXX and CC environment variables to ensure that a compatible compiler is selected.

mkdir -p build
cd build
cmake ..
make
make test
make install

Configuration

The estimator and controller implementations assume the Mesos agent is using the following isolation mechanisms:

--isolation=cgroups/cpu,cgroups/mem  # enable cgroup isolation and accounting
--revocable_cpu_low_priority         # run revocable tasks with minimal CPU shares
--cgroups_enable_cfs                 # enforce upper limit of usable CPU shares
                                     # (optional but recommended)

To enable the oversubscription modules, launch the Mesos agent with the following flags. The given example assumes you have a host with 256000 MB RAM and 40 CPUs that you want to oversubscribe with at most 96000 MB RAM and 16 cores:

--resource_estimator="com_blue_yonder_ThresholdResourceEstimator"
--qos_controller="com_blue_yonder_ThresholdQoSController"
--oversubscribed_resources_interval=15secs
--qos_correction_interval_min=15secs

--modules='{
  "libraries": {
    "file": "/<path>/<to>/libthreshold_oversubscription.so",
    "modules": [
      {
        "name": "com_blue_yonder_ThresholdResourceEstimator",
        "parameters": [
          { "key": "resources",
            "value": "cpus:16;mem:96000"},
          { "key": "load_threshold_1min",
            "value": "64"},
          { "key": "load_threshold_5min",
            "value": "48" },
          { "key": "load_threshold_15min",
            "value": "32" },
          { "key": "mem_threshold",
            "value": "200000"}
        ]
      },
      {
        "name": "com_blue_yonder_ThresholdQoSController",
        "parameters": [
          { "key": "load_threshold_5min",
            "value": "60" },
          { "key": "load_threshold_15min",
            "value": "40" },
          { "key": "mem_threshold",
            "value": "230000"}
        ]
      }
    ]
  }
}'

Make sure to set the memory thresholds low enough that the operating system can maintain sufficiently large file buffers and caches. This also prevents the Linux OOM killer from being triggered, which could potentially kill a non-revocable task.
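Before rolling out a configuration, it can help to sanity-check the thresholds with a small script. The Python sketch below uses the values from the example above. The helper names and the two rules it checks (estimator thresholds below the matching QoS thresholds, memory thresholds below total host memory) are assumptions derived from the description in this README, not checks performed by the modules themselves.

```python
import json

# Values taken from the example configuration above.
TOTAL_MEM_MB = 256000
ESTIMATOR = {"load_threshold_1min": 64, "load_threshold_5min": 48,
             "load_threshold_15min": 32, "mem_threshold": 200000}
QOS = {"load_threshold_5min": 60, "load_threshold_15min": 40,
       "mem_threshold": 230000}


def check_thresholds(estimator, qos, total_mem_mb):
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    # Only thresholds configured for both modules can be compared.
    for key in sorted(set(estimator) & set(qos)):
        if estimator[key] >= qos[key]:
            problems.append(f"{key}: estimator threshold {estimator[key]} "
                            f"is not below QoS threshold {qos[key]}")
    for name, params in (("estimator", estimator), ("QoS controller", qos)):
        if params.get("mem_threshold", 0) >= total_mem_mb:
            problems.append(f"{name}: mem_threshold leaves no memory headroom")
    return problems


def to_parameters(params):
    """Convert a dict into the key/value list used inside --modules."""
    return [{"key": k, "value": str(v)} for k, v in params.items()]


print(check_thresholds(ESTIMATOR, QOS, TOTAL_MEM_MB))
print(json.dumps(to_parameters(QOS), indent=2))
```

How much headroom the operating system actually needs for buffers and caches depends on the workload, so the script deliberately checks only the ordering of the values and leaves the concrete margin to the operator.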

By default, a Mesos framework receives only non-revocable resources. An explicit opt-in is required to receive revocable resources as well. For example, for oversubscription with Apache Aurora the following scheduler flags are required:

-receive_revocable_resources  # opt-in for revocable resource offers
-enable_revocable_cpus        # schedule revocable jobs using revocable CPU resources
-enable_revocable_ram         # schedule revocable jobs using revocable RAM resources

Known Limitations

The main design goal of the threshold modules was a simple, stateless implementation. This comes with a few limitations:

  • Being based on coarse-grained load and memory thresholds, the oversubscription technique described here might not be suitable for aggressive oversubscription on systems with very latency-sensitive services. If you have such a requirement, either be very conservative or have a look at Intel/Mesosphere Serenity.

  • If you run your agents with --cgroups_enable_cfs, you need Linux kernel 3.8 or later. Otherwise the system load used by the estimator and QoS controller will not correctly account for throttled processes. Further details can be found in this LWN article.

  • When the CPU is overloaded, a random revocable task is killed rather than the most aggressive one.

We may feel compelled to address some of these limitations in the future. Pull requests are welcome as well :-)
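To make the last limitation concrete, the selection step of the QoS controller can be modelled roughly as follows. This Python sketch is a simplification under stated assumptions: `pick_task_to_kill` and its parameters are hypothetical names, and the real controller works on Mesos executor and resource usage data rather than plain task IDs.

```python
import random


def pick_task_to_kill(revocable_task_ids, overloaded, rng=random):
    """Return at most one task ID to kill in this iteration, or None."""
    if not overloaded or not revocable_task_ids:
        return None
    # Stateless and simple: no attempt is made to find the heaviest consumer.
    return rng.choice(list(revocable_task_ids))


tasks = ["batch-1", "batch-2", "batch-3"]
print(pick_task_to_kill(tasks, overloaded=False))  # None: nothing to correct
victim = pick_task_to_kill(tasks, overloaded=True, rng=random.Random(0))
print(victim in tasks)  # True: one arbitrary revocable task was selected
```

Picking the most aggressive task instead would require per-task usage statistics, which is the kind of state the current design avoids.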

License

This repository is licensed under the Apache License, Version 2.0.
