ndelitski / Rancher Alarms

Will kick your ass if it finds an unhealthy service in a Rancher environment

rancher-alarms

Sends notifications when something goes wrong in Rancher

Features

  • Will kick your ass when a service goes down and send a message when it recovers
  • Various notification mechanisms
    • email
    • slack
    • please create an issue if you need more
  • Configure notification mechanisms globally or at a per-service level (supported only in the .json config setup for now)
  • Customize your notification messages

Quick start

Inside Rancher environment using rancher-compose CLI

rancher-alarms:
  image: ndelitski/rancher-alarms
  environment:
    ALARM_SLACK_WEBHOOK_URL: https://hooks.slack.com/services/:UUID
  labels:
    io.rancher.container.create_agent: 'true'
    io.rancher.container.agent.role: environment

How to create Slack Webhook URL

NOTE: Including the Rancher agent labels is crucial; otherwise you need to provide Rancher credentials manually with the RANCHER_* variables.

Outside Rancher environment using docker run

docker run \
    -d \
    -e RANCHER_ADDRESS=rancher.yourdomain.com \
    -e RANCHER_ACCESS_KEY=ACCESS-KEY \
    -e RANCHER_SECRET_KEY=SECRET-KEY \
    -e RANCHER_PROJECT_ID=1a8 \
    -e ALARM_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR_SLACK_UUID \
    --name rancher-alarms \
    ndelitski/rancher-alarms

How it works

On startup, rancher-alarms gets the list of services and instantiates a healthcheck monitor for each service that is in a running state. Services that are removed, purged, or in similar states are ignored.

The list of healthcheck monitors is refreshed every pollServicesInterval milliseconds. When a service is removed, it is no longer monitored.

When a service transitions to a degraded state, all targets will be invoked to process notification(s).
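
The threshold behaviour described above can be sketched as a small state machine. This is a minimal illustration, not the project's actual code: the HealthMonitor class, its record method, and the onChange callback are hypothetical names, and the sketch assumes that thresholds count consecutive polls that disagree with the current state.

```javascript
// Hypothetical sketch: names and behaviour are assumptions, not the project's code.
class HealthMonitor {
  constructor({ healthyThreshold = 2, unhealthyThreshold = 3, onChange = () => {} } = {}) {
    this.healthyThreshold = healthyThreshold;
    this.unhealthyThreshold = unhealthyThreshold;
    this.onChange = onChange; // called with (newState, previousState)
    this.state = 'healthy';
    this.streak = 0; // consecutive polls that disagree with the current state
  }

  // Feed one poll result; returns the (possibly updated) monitor state.
  record(isHealthy) {
    const agrees = isHealthy === (this.state === 'healthy');
    if (agrees) {
      this.streak = 0; // any agreeing poll resets the streak
      return this.state;
    }
    this.streak += 1;
    const needed =
      this.state === 'healthy' ? this.unhealthyThreshold : this.healthyThreshold;
    if (this.streak >= needed) {
      const previous = this.state;
      this.state = previous === 'healthy' ? 'degraded' : 'healthy';
      this.streak = 0;
      this.onChange(this.state, previous); // this is where targets would be notified
    }
    return this.state;
  }
}
```

With the default thresholds, three failed polls in a row flip the monitor to degraded, and two successful polls in a row flip it back, so a single flaky poll does not trigger a notification.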

docker-compose configuration

docker-compose for the Slack notification target

rancher-alarms:
  image: ndelitski/rancher-alarms
  environment:
    RANCHER_ADDRESS: your-rancher.com
    ALARM_SLACK_WEBHOOK_URL: https://hooks.slack.com/services/...

See the examples folder for more docker-compose examples

Configuration

Environment variables

Rancher settings

These can be omitted if you are running inside a Rancher environment (the service must be started as a Rancher agent, though)

  • RANCHER_ADDRESS
  • RANCHER_PROJECT_ID
  • RANCHER_ACCESS_KEY
  • RANCHER_SECRET_KEY
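
When running outside a Rancher environment, it can help to check these variables before starting the container. The snippet below is a small pre-flight sketch: the variable names come from the list above, while the missingRancherVars helper is hypothetical and not part of rancher-alarms.

```javascript
// Variable names are taken from the list above; the helper itself is hypothetical.
const RANCHER_VARS = [
  'RANCHER_ADDRESS',
  'RANCHER_PROJECT_ID',
  'RANCHER_ACCESS_KEY',
  'RANCHER_SECRET_KEY',
];

// Returns the names of Rancher settings that are unset or empty in `env`.
function missingRancherVars(env = process.env) {
  return RANCHER_VARS.filter((name) => !env[name]);
}
```

An empty result means all four settings are present; otherwise the returned names show which -e flags are still missing from the docker run command.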

Polling settings

  • ALARM_POLL_INTERVAL
  • ALARM_MONITOR_INTERVAL
  • ALARM_MONITOR_HEALTHY_THRESHOLD
  • ALARM_MONITOR_UNHEALTHY_THRESHOLD
  • ALARM_FILTER

Email target settings

  • ALARM_EMAIL_ADDRESSES
  • ALARM_EMAIL_USER
  • ALARM_EMAIL_PASS
  • ALARM_EMAIL_SSL
  • ALARM_EMAIL_SMTP_HOST
  • ALARM_EMAIL_SMTP_PORT
  • ALARM_EMAIL_FROM
  • ALARM_EMAIL_SUBJECT
  • ALARM_EMAIL_TEMPLATE
  • ALARM_EMAIL_TEMPLATE_FILE

Slack target settings

  • ALARM_SLACK_WEBHOOK_URL
  • ALARM_SLACK_CHANNEL
  • ALARM_SLACK_BOTNAME
  • ALARM_SLACK_TEMPLATE
  • ALARM_SLACK_TEMPLATE_FILE

See the docker-compose files in examples for configurations that use environment variables

Local json config

{
    "rancher": {
        "address": "rancher-host:port",
        "auth": {
            "accessKey": "<ACCESS_KEY>",
            "secretKey": "<KEEP_YOUR_SECRETS_SAFE>"
        },
        "projectId": "1a5"
    },
    "pollServicesInterval": 10000,
    "filter": [
        "app/*"
    ],
    "notifications": {
        "*": {
            "targets": {
                "email": {
                    "recipients": [
                        "[email protected]"
                    ]
                }
            },
            "healthcheck": {
                "pollInterval": 5000,
                "healthyThreshold": 2,
                "unhealthyThreshold": 3
            }
        },
        "frontend": {
            "targets": {
                "email": {
                    "recipients": [
                        "[email protected]"
                    ]
                }
            }
        }
    },
    "targets": {
        "email": {
            "smtp": {
                "from": "<Alarm Service> [email protected]",
                "auth": {
                    "user": "[email protected]",
                    "password": "Str0ngPa$$"
                },
                "host": "smtp.gmail.com",
                "secureConnection": true,
                "port": 465
            }
        },
        "slack": {
            "webhookUrl": "https://hooks.slack.com/services/YOUR_SLACK_UUID",
            "botName": "rancher-alarm",
            "channel": "#devops"
        }
    }
}

Config file sections

  • rancher: Rancher API settings. Required.
  • pollServicesInterval: interval, in milliseconds, between fetches of the service list. Required.
  • filter: whitelist of stack/service names in the environment, given as a list of strings. Each string is a regular expression, so a pattern such as frontend/* can be used to match all services in the frontend stack. Optional.
  • notifications: per-service notification settings. The wildcard key * applies to any service. Required.
    • healthcheck: monitoring state options. Optional; the defaults are:
    {
      pollInterval: 5000,
      healthyThreshold: 2,
      unhealthyThreshold: 3
    }
    
    • targets: which notification targets to use. Overrides the base target settings in the root targets section. Currently each target must be an object; if there is nothing to override from the base settings, use {} as the value. Optional.
  • targets: base settings for each notification target. Required.
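
How these sections combine for a single service can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: passesFilter and resolveNotification are hypothetical helpers, the filter is assumed to be tested unanchored against a "stack/service" string, a named entry is assumed to replace the wildcard entry rather than merge with it, and overrides are merged shallowly over the base target settings.

```javascript
// Hypothetical helpers illustrating the config sections above; not the project's code.
const DEFAULT_HEALTHCHECK = {
  pollInterval: 5000,
  healthyThreshold: 2,
  unhealthyThreshold: 3,
};

// True when "stack/service" matches at least one filter pattern (or no filter is set).
function passesFilter(config, stackName, serviceName) {
  const patterns = config.filter || [];
  if (patterns.length === 0) return true;
  const fullName = `${stackName}/${serviceName}`;
  return patterns.some((pattern) => new RegExp(pattern).test(fullName));
}

// Picks the entry for the service (falling back to "*") and lays its target
// overrides on top of the base settings from the root "targets" section.
function resolveNotification(config, serviceName) {
  const notifications = config.notifications || {};
  const baseTargets = config.targets || {};
  const entry = notifications[serviceName] || notifications['*'] || {};
  const targets = {};
  for (const [name, overrides] of Object.entries(entry.targets || {})) {
    targets[name] = { ...(baseTargets[name] || {}), ...overrides };
  }
  return {
    healthcheck: { ...DEFAULT_HEALTHCHECK, ...(entry.healthcheck || {}) },
    targets,
  };
}
```

Under these assumptions, a service entry with "slack": {} would receive the base Slack settings unchanged, which matches the note that {} is enough when there is nothing to override.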

Templates

List of template variables:

  • healthyState: HEALTHY or UNHEALTHY
  • state: service state as named in the Rancher API
  • prevMonitorState: previous rancher-alarms state name for the service
  • monitorState: current rancher-alarms state name for the service, e.g. always degraded for unhealthy
  • serviceName: name of the service in Rancher
  • serviceUrl: URL of the running service in the Rancher UI
  • stackUrl: URL of the stack in the Rancher UI
  • stackName: name of the stack in Rancher
  • environmentName: name of the environment in Rancher
  • environmentUrl: URL of the environment in the Rancher UI

Using variables in template string:

Hey buddy! Your service #{serviceName} has become #{healthyState}. Direct link to the service: #{serviceUrl}
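
To show how #{name} placeholders of this kind can be expanded, here is a minimal sketch. The renderTemplate function is hypothetical and not taken from rancher-alarms; leaving unknown placeholders untouched is an assumption made for this example.

```javascript
// Hypothetical renderer for the #{name} placeholder syntax shown above.
function renderTemplate(template, vars) {
  return template.replace(/#\{(\w+)\}/g, (match, name) =>
    // Unknown variables are left as-is so typos stay visible in the message.
    Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
  );
}
```

For example, calling renderTemplate('Service #{serviceName} is #{healthyState}', { serviceName: 'web', healthyState: 'UNHEALTHY' }) yields 'Service web is UNHEALTHY'.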

More detailed examples can be found in the examples folder

Roadmap

  • [] Simplify configuration.
  • [] Make more use of Rancher labels and metadata. Alternate configuration through Rancher labels/metadata (can be used in conjunction with the initial config).
  • [] Run in a Rancher environment as an agent with a new label agent: true. No need to specify keys anymore!
  • [] More notification mechanisms: AWS SNS, http, sms
  • [x] Support templating
  • [] Test coverage. Set up drone.io
  • [x] Notify when all services operate normally again after some of them were in a degraded state
  • [] Refactor code
  • [x] Shrink image size with Alpine Linux