xing / Kubernetes Oom Event Generator

Licence: apache-2.0
Generate a Kubernetes Event when a Pod's container has been OOMKilled

kubernetes-oom-event-generator

Generates a Kubernetes Event when a container starts and its status indicates that it was previously out-of-memory (OOM) killed.

Design

The controller watches the Kubernetes API for new Events and for changes to existing Events. Each time it receives a notification about an Event, it checks whether the Event refers to a "ContainerStarted" event, based on the Event's Reason and the Kind of the involved object. If so, and the Event constitutes an actual change (that is, it is not a no-op update delivered by the resync that runs every two minutes), the controller inspects the underlying Pod resource. If the LastTerminationState of the Pod refers to an OOM kill, the controller emits a Kubernetes Event with a level of Warning and a reason of PreviousContainerWasOOMKilled.
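
The decision flow described above can be sketched in a few lines of Go. This is a minimal illustration and not the project's actual code: the Event, ContainerStatus and Pod types are simplified, hypothetical stand-ins for the Kubernetes API objects, and the sketch assumes that a "ContainerStarted" event is one with Reason "Started" on an object of Kind "Pod", and that a no-op resync update can be recognized by an unchanged ResourceVersion.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a Kubernetes Event.
type Event struct {
	Reason          string // assumed to be "Started" for a container start
	InvolvedKind    string // Kind of the involved object, e.g. "Pod"
	ResourceVersion string // assumed to stay the same on a no-op resync
}

// ContainerStatus is a hypothetical, flattened view of a container's status.
type ContainerStatus struct {
	Name                  string
	LastTerminationReason string // empty if the container never terminated
}

// Pod is a hypothetical, reduced view of a Pod resource.
type Pod struct {
	Name              string
	ContainerStatuses []ContainerStatus
}

// isContainerStarted checks the Reason and the Kind of the involved object.
func isContainerStarted(e Event) bool {
	return e.Reason == "Started" && e.InvolvedKind == "Pod"
}

// isRealChange reports whether cur is a new Event (old == nil) or an update
// that actually changed something, as opposed to a periodic resync.
func isRealChange(old *Event, cur Event) bool {
	return old == nil || old.ResourceVersion != cur.ResourceVersion
}

// oomKilledContainers returns the names of the containers whose last
// termination was an OOM kill.
func oomKilledContainers(p Pod) []string {
	var names []string
	for _, cs := range p.ContainerStatuses {
		if cs.LastTerminationReason == "OOMKilled" {
			names = append(names, cs.Name)
		}
	}
	return names
}

// warningsFor returns one warning message per previously OOM-killed
// container, or nil if the Event does not qualify.
func warningsFor(old *Event, cur Event, p Pod) []string {
	if !isContainerStarted(cur) || !isRealChange(old, cur) {
		return nil
	}
	var out []string
	for _, name := range oomKilledContainers(p) {
		out = append(out, fmt.Sprintf(
			"Warning PreviousContainerWasOOMKilled: container %q in pod %q",
			name, p.Name))
	}
	return out
}

func main() {
	pod := Pod{
		Name: "example-pod",
		ContainerStatuses: []ContainerStatus{
			{Name: "app", LastTerminationReason: "OOMKilled"},
			{Name: "sidecar", LastTerminationReason: ""},
		},
	}
	started := Event{Reason: "Started", InvolvedKind: "Pod", ResourceVersion: "42"}

	// A newly added Event produces a warning for the OOM-killed container.
	for _, w := range warningsFor(nil, started, pod) {
		fmt.Println(w)
	}

	// A resync delivers the same Event again; nothing is emitted.
	fmt.Println("warnings on resync:", len(warningsFor(&started, started, pod)))
}
```

In a real controller the two arguments to warningsFor would come from the add and update callbacks of an Event watcher, and the messages would be recorded as Kubernetes Events rather than printed.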

Usage

Usage:
  kubernetes-oom-event-generator [OPTIONS]

Application Options:
  -v, --verbose= Show verbose debug information [$VERBOSE]
      --version  Print version information

Help Options:
  -h, --help     Show this help message

Run the pre-built image xingse/kubernetes-oom-event-generator locally, using the credentials from your local kubeconfig:

echo VERBOSE=2 >> .env
docker run --env-file=.env -v $HOME/.kube/config:/root/.kube/config xingse/kubernetes-oom-event-generator

Deployment

Example ClusterRole:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: xing:controller:kubernetes-oom-event-generator
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - pods/status
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - list
      - watch

Run this controller on Kubernetes with the following commands:

kubectl create serviceaccount kubernetes-oom-event-generator \
  --namespace=kube-system

kubectl create -f path/to/example-clusterrole.yml
# alternatively run: `cat | kubectl create -f -` and paste the above example, hit Ctrl+D afterwards.

kubectl create clusterrolebinding xing:controller:kubernetes-oom-event-generator \
  --clusterrole=xing:controller:kubernetes-oom-event-generator \
  --serviceaccount=kube-system:kubernetes-oom-event-generator

kubectl run kubernetes-oom-event-generator \
  --image=xingse/kubernetes-oom-event-generator \
  --env=VERBOSE=2 \
  --serviceaccount=kubernetes-oom-event-generator \
  --namespace=kube-system

Alerting on OOM killed pods

There are many ways to send alerts when an OOM kill occurs. Two of them are described here.

Forwarding OOM events to Graylog

Graylog is a popular log management solution, and it includes an alerting feature. See the Graylog docs for more details.

At XING we forward all Kubernetes cluster events to Graylog using our kubernetes-event-forwarder-gelf. This allows us to configure alerts whenever a PreviousContainerWasOOMKilled event generated by the kubernetes-oom-event-generator occurs.

Using kube-state-metrics and Prometheus alerts

When kube-state-metrics is deployed in the cluster and a Prometheus installation is scraping its metrics, you can alert on OOM-killed pods using the Prometheus Alertmanager.

Example alert:

alert: ComponentOutOfMemory
expr: sum_over_time(kube_pod_container_status_terminated_reason{reason="OOMKilled"}[5m]) > 0
for: 10s
labels:
  severity: warning
annotations:
  description: Critical Pod {{$labels.namespace}}/{{$labels.pod}} was OOMKilled.

The downside is that kube_pod_container_status_terminated_reason returns to 0 as soon as the container starts again, so an OOM kill can go unnoticed if the container restarts before the next scrape. See the introduction of kube_pod_container_status_last_terminated_reason for more details.

Developing

You will need a working Go installation (1.11+) and the make program. You will also need to clone the project to a place outside your normal Go code hierarchy (usually ~/go), as it uses the Go module system.

All build and install steps are managed in the central Makefile. make test will fetch external dependencies, compile the code and run the tests. If all goes well, hack along and submit a pull request. You might need to modify go.mod to specify the desired constraints on dependencies.

Make sure to run go mod tidy before you check in after changing dependencies in any way.

Releases

Releases are a two-step process, beginning with a manual step:

  • Create a release commit
  • Run make release, which will create an image, retrieve the version from the binary, create a git tag and push both your commit and the tag

The Travis CI run will then realize that the current tag refers to the current master commit and will tag the built docker image accordingly.
