tmobile / monarch

Licence: Apache-2.0 License
App-level Chaos Engineering

Monarch

This is a series of tools for Chaos Toolkit (CTK) which can perform targeted experiments on applications deployed in Cloud Foundry.

Available Experiments

  • Block general network traffic
    • Block all incoming traffic to the application
    • Block all outgoing traffic from the application
  • Block service traffic
    • Auto-detection of bound services and support for manually specified non-bound services
    • Block all outgoing traffic from the application to one or more bound services
    • Block all incoming traffic to the application from one or more bound services
  • Manipulate all network traffic from an application (including to its services)
    • Latency
    • Packet loss
    • Packet duplication
    • Packet corruption
  • Impose bandwidth restrictions
    • Application download bandwidth shaping (using queuing)
    • Application upload bandwidth limiting (using policing)
  • Perform network speedtest from within hosting containers
  • Crash one or more random application instances
  • Kill/start monit processes on hosting diego-cells

Setup

Build Dockerfile

We recommend running Monarch with Docker, as we have had some issues with cross-platform support for the underlying CLIs.

With Docker up and running, run the following from the root of the git repository:

# FIRST Run
docker build -t monarch .
# Mount the config directory, then create the needed files from within the container.
docker run -it \
    --name monarch \
    -v C:\Users\<username>\Documents\monarch\config:/monarch/config \
    monarch

# Subsequent Runs
docker start -ai monarch

# Rebuild Image (You will lose information not in an attached volume)
docker container rm monarch
yes | docker image prune
# goto FIRST Run ;)

The config volume is optional. However, even if you have not written any configuration yet, you should mount the volumes to prevent data loss when you destroy the container during the inevitable upgrade process. If you plan to run tests, make sure to mount the testing config volume as well.

From within the Docker container, you may now use either the Python shell to interact with Monarch or chaostoolkit, which is installed automatically when the image is built. You will need to log in with cf-cli and bosh-cli before attempting to use Monarch.

Install Locally

To be used from your experiment, this package must first be installed in the Python environment where chaostoolkit already exists. This package requires at least Python 3.5 (3.6 if using the chaostoolkit interfaces directly), so substitute python3 or python3.5 for python as appropriate for your OS.
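
If you are unsure whether an interpreter meets these minimums, a quick check along the following lines may help. This is only a sketch: meets_requirement is a hypothetical helper, not part of Monarch, and the version thresholds are simply the ones stated above.

```python
import sys


def meets_requirement(version_info, use_chaostoolkit=False):
    """Check a (major, minor, ...) version against the minimums stated above."""
    # 3.6 when using the chaostoolkit interfaces directly, otherwise 3.5.
    minimum = (3, 6) if use_chaostoolkit else (3, 5)
    return tuple(version_info[:2]) >= minimum


print(meets_requirement(sys.version_info))
print(meets_requirement(sys.version_info, use_chaostoolkit=True))
```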

From within the source, run:

sudo python setup.py install

Or to install for just your user:

python setup.py install --user

Now you should be able to import the package.

import monarch
print(monarch.__version__)

Third-Party CLI Setup

To run the script, you need both the Cloud Foundry CLI and the BOSH CLI installed. You must be logged in to the Cloud Foundry CLI as a user with permission to access all apps that are to be targeted, and logged in to the BOSH CLI as an admin. This is because the script requires SSH access to the BOSH VMs to make its changes, which also means applications do not need SSH enabled.

Configuration File

Once the CLIs are ready, create (or modify the existing) configuration file. This file is only necessary for CLI use; for Chaos Toolkit, the same values are included within the experiment's configuration.

  • bosh: Information about the bosh cli and environment
    • cmd: The bosh-cli command.
    • env: The environment name for the cf deployment (-e env).
    • cf-dep: The cloud foundry deployment in the bosh environment.
    • cfdot-dc: The diego-cell to use for cfdot queries.
  • cf: Information about the cf cli and environment
    • cmd: The cf-cli command.
  • container-port-whitelist: List of node ports which should be ignored. These are the external ports on the diego-cells.
  • service-whitelist: List of service types which should be ignored. These must be the names displayed in the cf-cli marketplace.
  • quantum: The quantum to use when configuring qdisc perturbance. The recommended 6000 should work without issue.

Sample config.yml or cfg values for Chaos Toolkit.

bosh:
  cmd: bosh2 # bosh CLI to be used
  env: bosh-lite # environment alias name or address 
  cf-dep: cf # Bosh deployment name
  cfdot-dc: diego_cell/0
  credentials: # Optional; CLI will need to be logged in already if not present
    user: iamaperson
    pswd: ideallysomethingsecure
    cacert: |  # include as needed
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
cf:
  cmd: cf
  credentials: # Optional; CLI will need to be logged in already if not present
    user: iamaperson
    pswd: hopefullysomethingdifferent
    api: cf.example.com
    skip_ssl_validation: true # add as needed (false by default)
container-port-whitelist:
 - 22
 - 2222
host-port-whitelist: []
service-whitelist:
 - logger
quantum: 6000
#services:  # custom service definitions, not needed for bound services
#  - name: google
#    host: google.com
#    ports:
#     - ['tcp', 80]
#     - ['tcp', 443]
#     - ['icmp', 'all']
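
Before handing a configuration to Monarch, it can help to confirm that the keys described above are present. The following is a minimal sketch, not part of Monarch itself: REQUIRED_KEYS and missing_keys are hypothetical names, treating every documented key as required is an assumption, and the configuration is given as an already-parsed dictionary (for example, the result of loading config.yml with a YAML parser).

```python
# Hypothetical helper, not part of the monarch package.
# The keys mirror the ones documented above; treating them all as
# required is an assumption made for this sketch.
REQUIRED_KEYS = {
    "bosh": ["cmd", "env", "cf-dep", "cfdot-dc"],
    "cf": ["cmd"],
    "container-port-whitelist": [],
    "service-whitelist": [],
    "quantum": [],
}


def missing_keys(config):
    """Return a sorted list of dotted key paths that are absent from config."""
    missing = []
    for section, subkeys in REQUIRED_KEYS.items():
        if section not in config:
            missing.append(section)
            continue
        for key in subkeys:
            if key not in config[section]:
                missing.append(section + "." + key)
    return sorted(missing)


# Same values as the sample configuration, minus the optional credentials.
sample = {
    "bosh": {"cmd": "bosh2", "env": "bosh-lite", "cf-dep": "cf", "cfdot-dc": "diego_cell/0"},
    "cf": {"cmd": "cf"},
    "container-port-whitelist": [22, 2222],
    "host-port-whitelist": [],
    "service-whitelist": ["logger"],
    "quantum": 6000,
}

print(missing_keys(sample))  # []
```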

Usage

There are two ways you can call these scripts. The first is the Python Shell which will allow you to manually block services or applications and then unblock them at your leisure. The second is through the actions and probes which should be called by Chaos Toolkit.

Chaos Toolkit Interface

If you have not installed the monarch package, then make sure you run Chaos Toolkit from this directory (the root of this repository) using python -m chaostoolkit run exp.json or else the monarch module will not be found. Otherwise just use chaos run exp.json from any directory.

Currently, the Chaos Toolkit interface does not save information about what was targeted. This should be acceptable for the time being, as we have yet to observe Cloud Foundry moving app instances as a result of any of these actions. It is still a reason for caution: unblocking simply re-queries the app's location, so if an instance did move, the old rule on the diego-cell where the app no longer runs will not be removed. To verify manually that all rules have been removed, go through each diego-cell in the Cloud Foundry deployment and run iptables -L | grep DROP to see whether any rules are lingering. (This script should be the only source of DROP rules.)
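
To make that manual check repeatable, the output of iptables -L can be filtered in Python. This is a minimal sketch under stated assumptions: find_drop_rules is a hypothetical helper, not part of Monarch, and it only inspects text that you have already captured from a diego-cell (how you collect that output is up to you). The sample listing is illustrative, not real output from a deployment.

```python
def find_drop_rules(iptables_output):
    """Return the rule lines whose target is DROP, from `iptables -L` output."""
    rules = []
    for line in iptables_output.splitlines():
        fields = line.split()
        # Rule lines start with the target; this skips "Chain ..." lines
        # and the column headers.
        if fields and fields[0] == "DROP":
            rules.append(line.strip())
    return rules


# Illustrative listing only; the addresses and ports are made up.
sample_output = """\
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       tcp  --  anywhere             10.0.0.5             tcp dpt:61001
ACCEPT     all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
"""

lingering = find_drop_rules(sample_output)
print(len(lingering))
```

Unlike a plain grep for DROP, this only matches lines whose target column is DROP, so a chain header such as "Chain INPUT (policy DROP)" is not reported as a rule.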

The following is a sample Chaos Toolkit experiment file to block all traffic to the application.

{
  "version": "0.1.0",
  "title": "Blocking spring-music makes it unreachable.",
  "description": "This is a testing experiment to verify the script's block traffic function works.",
  "tags": ["cloudfoundry", "bosh", "springboot"],
  "configuration": {
    "TODO": "Some of this needs to be part of the application configuration since the user of this would not know what the cli commands are for instance.",
    "bosh": {
      "cmd": "bosh2",
      "env": "tt-stg02",
      "cf-dep": "cf-da0ba81cb255ad93a508",
      "cfdot-dc": "diego_cell/0"
    },
    "cf": {
      "cmd": "cf"
    },
    "container-port-whitelist": [22, 2222],
    "host-port-whitelist": [],
    "service-whitelist": ["T-Logger"],
    "quantum": 6000
  },
  "steady-state-hypothesis": {
    "title": "We can access the application and other neighboring applications (This should fail because we block all traffic)",
    "probes": [
      {
        "type": "probe",
        "name": "spring-music-responds",
        "tolerance": 200,
        "provider": {
          "type": "http",
          "url": "http://spring-music-interested-bonobo.apps.tt-stg02.cf.t-mobile.com/"
        }
      },
      {
        "type": "probe",
        "name": "spring-music2-responds",
        "tolerance": 200,
        "provider": {
          "type": "http",
          "url": "http://spring-music2-lean-sable.apps.tt-stg02.cf.t-mobile.com/"
        }
      }
    ]
  },
  "method": [
    {
      "type": "action",
      "name": "block-traffic",
      "provider": {
        "type": "python",
        "module": "monarch.pcf.actions",
        "func": "block_traffic",
        "arguments": {
          "org": "sys-tmo",
          "space": "test",
          "appname": "spring-music"
        }
      }
    }
  ],
  "rollbacks": [
    {
      "type": "action",
      "name": "unblock-traffic",
      "provider": {
        "type": "python",
        "module": "monarch.pcf.actions",
        "func": "unblock_traffic",
        "arguments": {
          "org": "sys-tmo",
          "space": "test",
          "appname": "spring-music"
        }
      }
    }
  ]
}
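
Because the method and rollback entries differ only in the function name, an experiment like the one above can be generated rather than written by hand. The following is a minimal sketch: monarch_action and build_block_experiment are hypothetical helper names, and only the module path monarch.pcf.actions and the functions block_traffic and unblock_traffic come from the sample above. The configuration and probes are left for you to fill in.

```python
import json


def monarch_action(name, func, org, space, appname):
    """Build one Chaos Toolkit action entry that calls monarch.pcf.actions.<func>."""
    return {
        "type": "action",
        "name": name,
        "provider": {
            "type": "python",
            "module": "monarch.pcf.actions",
            "func": func,
            "arguments": {"org": org, "space": space, "appname": appname},
        },
    }


def build_block_experiment(org, space, appname, configuration, probes):
    """Assemble an experiment that blocks traffic and unblocks it on rollback."""
    return {
        "version": "0.1.0",
        "title": "Blocking {} makes it unreachable.".format(appname),
        "description": "Generated block-traffic experiment.",
        "configuration": configuration,
        "steady-state-hypothesis": {
            "title": "The application responds",
            "probes": probes,
        },
        "method": [
            monarch_action("block-traffic", "block_traffic", org, space, appname)
        ],
        "rollbacks": [
            monarch_action("unblock-traffic", "unblock_traffic", org, space, appname)
        ],
    }


# Empty configuration and probes are placeholders for this sketch.
experiment = build_block_experiment(
    "sys-tmo", "test", "spring-music", configuration={}, probes=[]
)
print(json.dumps(experiment, indent=2))
```

Writing the printed JSON to a file gives you something to pass to chaos run in place of a hand-written exp.json.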

CLI Interface

For now, there is no CLI; use an interactive Python shell session instead. See below.

From Python Shell

Example session:

from monarch.pcf.config import Config
from monarch.pcf.app import App

Config().load_yaml('config/tt-stg02.yml')

app = App.discover('sys-tmo', 'ce-service-registry', 'spring-music')

app.block()
app.unblock()

app.crash_random_instance(2) # will require that you rediscover the app once CF brings a new container up
app = App.discover('sys-tmo', 'ce-service-registry', 'spring-music')

app.block_services('musicdb')
app.unblock_services()
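
An interrupted session can leave an application blocked. One way to guard against that is to wrap the block/unblock pair in a context manager. The following is a minimal sketch: blocked is a hypothetical helper, and FakeApp is a stand-in so the example runs without a Cloud Foundry deployment. With Monarch you would pass the object returned by App.discover, assuming its block() and unblock() methods behave as in the session above.

```python
from contextlib import contextmanager


@contextmanager
def blocked(app):
    """Block the app for the duration of the with-block, then always unblock."""
    app.block()
    try:
        yield app
    finally:
        # Runs even if the body of the with-block raises.
        app.unblock()


class FakeApp:
    """Stand-in that records calls instead of touching Cloud Foundry."""

    def __init__(self):
        self.calls = []

    def block(self):
        self.calls.append("block")

    def unblock(self):
        self.calls.append("unblock")


app = FakeApp()
try:
    with blocked(app):
        raise RuntimeError("probe failed")
except RuntimeError:
    pass

print(app.calls)  # ['block', 'unblock']
```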

Tests

Unit tests are written with pytest and can be run with ./setup.py test. Before running the tests, you will need to add tests/config/app_test.yml which is the same as the above configuration with the following appended:

# ...
testing:
  org: coolkids
  space: ce-testing
  appname: spring-music
  push-app: true
  db-market-name: p-mysql
  db-plan: 100mb
  db-instance-name: musicdb

You will also need to include the credential sections for the bosh and cf CLI configs. If push-app is true, the tests expect the org and space to already exist but deploy spring-music from scratch (meaning db-instance-name and appname should not already exist). Cleanup is performed after the tests are done, leaving the space in its original state.

These tests can be run from within docker using

License

Monarch is open-sourced under the terms of section 7 of the Apache 2.0 license and is released AS-IS WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND.
