
dlanza1 / ExDeMon

License: GPL-3.0
A general-purpose metrics monitor implemented with Apache Spark: Kafka source, Elastic sink, aggregated metrics, different analyses, notifications, actions, live configuration updates, missing-metric detection, and more.

Programming Languages

java
68154 projects - #9 most used programming language
shell
77523 projects
Dockerfile
14818 projects

Projects that are alternatives of or similar to ExDeMon

gochanges
**[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-36.84%)
Mutual labels:  monitor, monitoring-application, monitoring-tool
leek
Celery Tasks Monitoring Tool
Stars: ✭ 77 (+305.26%)
Mutual labels:  monitor, monitoring-tool
Spark Streaming Monitoring With Lightning
Plot live-stats as graph from ApacheSpark application using Lightning-viz
Stars: ✭ 15 (-21.05%)
Mutual labels:  spark-streaming, monitoring-tool
Hastic Server
Hastic data management server for analyzing patterns and anomalies from Grafana
Stars: ✭ 292 (+1436.84%)
Mutual labels:  monitor, monitoring-tool
assimilation-official
This is the official main repository for the Assimilation project
Stars: ✭ 47 (+147.37%)
Mutual labels:  monitoring-application, monitoring-tool
dawgmon
dawg the hallway monitor - monitor operating system changes and analyze introduced attack surface when installing software
Stars: ✭ 52 (+173.68%)
Mutual labels:  monitoring-application, monitoring-tool
Clearly
Clearly see and debug your celery cluster in real time!
Stars: ✭ 287 (+1410.53%)
Mutual labels:  monitor, monitoring-tool
Iglance
Free system monitor for OSX and macOS. See all system information at a glance in the menu bar.
Stars: ✭ 1,358 (+7047.37%)
Mutual labels:  monitor, monitoring-tool
Laravel Api Health
Monitor first and third-party services and get notified when something goes wrong!
Stars: ✭ 65 (+242.11%)
Mutual labels:  monitor, monitoring-tool
Owl
distributed monitoring system
Stars: ✭ 794 (+4078.95%)
Mutual labels:  monitor, monitoring-tool
tmo-live-graph
A simple React app that plots a live view of the T-Mobile Home Internet Nokia 5G Gateway signal stats, helpful for optimizing signal.
Stars: ✭ 15 (-21.05%)
Mutual labels:  monitor, monitoring-tool
Myperf4j
High-performance Java APM. Powered by ASM. Try it. Test it. If you feel it's better, use it.
Stars: ✭ 2,281 (+11905.26%)
Mutual labels:  monitor, monitoring-tool
Monitoror
Unified monitoring wallboard — Light, ergonomic and reliable monitoring for anything.
Stars: ✭ 3,400 (+17794.74%)
Mutual labels:  monitor, monitoring-tool
Container Monitor
A summary of container monitoring solutions
Stars: ✭ 107 (+463.16%)
Mutual labels:  monitor, monitoring-tool
Monitorfe
🍉 Front-end instrumentation and monitoring: reports front-end JS execution errors, third-party resource loading failures, and Ajax request errors
Stars: ✭ 190 (+900%)
Mutual labels:  monitor, monitoring-tool
Moreco
moreco is an end-to-end ecosystem that provides the most suitable architecture for small, medium, and large projects, supporting a project's evolution from small to medium to large scale. It covers everything from coding to monitoring to operations, with every feature implemented as a plugin and support for switching between plugins. Supports seamless upgrades across Spring Boot, Spring Cloud, and Axon.
Stars: ✭ 231 (+1115.79%)
Mutual labels:  monitor
Pagerbeauty
📟✨ PagerDuty on-call widget for monitoring dashboard. Datadog and Grafana compatible
Stars: ✭ 250 (+1215.79%)
Mutual labels:  monitor
Cryptotrader
A cryptocurrency trader for all famous exchanges
Stars: ✭ 228 (+1100%)
Mutual labels:  monitor
Wam
Web App Monitor
Stars: ✭ 216 (+1036.84%)
Mutual labels:  monitor
wowstat
A World of Warcraft realm status monitor
Stars: ✭ 20 (+5.26%)
Mutual labels:  monitor

ExDeMon: extract, define and monitor metrics

A general-purpose metrics monitor implemented with Apache Spark.

Metrics can come from several sources such as Kafka; results and actions can be sunk to Elastic or any other system; new metrics can be defined by combining other metrics; different analyses can be applied; notifications and actions can be triggered; the configuration can be updated without restarting; missing metrics can be detected; and more.

This tool was introduced at the Spark Summit 2017 conference; you can watch the talk here.

User's manual

An example of a monitored metric can be seen in the following image. Thresholds around the value are calculated, and statuses are generated when the analyzed value exceeds these limits. Actions can be triggered if certain statuses, such as error or warning, are maintained for some time.

Example of monitored metric

Key features

  • Stand-alone or distributed (scalable) execution on any platform that Spark supports.
  • Metric values can be float, string or boolean.
  • New metrics can be defined: mathematical operations can be applied, and their values can be computed by aggregating different incoming metrics.
  • Several monitors can be declared; each monitor can have a metric filter, a metric analysis and triggers.
  • Several metric sources can be declared. Incoming data can have different schemas, which can be configured to produce metrics.
  • One analysis results sink is shared by all monitors.
  • Several actuators, to be able to perform different actions.
  • Components: properties source, metrics source, analysis, analysis results sink, trigger and actuators. They can be easily replaced.
  • Some built-in components: Kafka source, different analyses, Elastic sink, e-mail, ...
  • Metrics can arrive at different frequencies.
  • Configuration can be updated while running. Configuration comes from an external source (Apache Zookeeper, HTTP request, database, files, ...).
  • Detection of missing metrics.

An image that describes some of the previous concepts and shows the data flow in the streaming job can be seen here.
Data flow

Define new metrics

The value of these defined metrics is computed from an equation.

This equation can have variables that represent incoming metrics, so values from several metrics can be aggregated in order to compute the value of the new metric.

Metrics can be grouped by an attribute (e.g. cluster) in order to apply the equation to a set of metrics.

Some possibilities of defined metrics could be:

  • Multiply all metrics by 10: value * 10
  • Compute the ratio read/write for all machines: (groupby: hostname) readbytes / writebytes
  • Temperature inside minus temperature outside: tempinside - tempoutside
  • Average of CPU usage of all machines per cluster
  • Total throughput of machines per cluster in production
  • Count log lines of the previous hour
  • Threshold for /tmp/ directory usage.
<defined-metric-id>.value = !shouldBeMonitored || (trim(dir) == "/tmp/") && (abs(used / capacity) > 0.8)

Debugging is easy; for the threshold equation above you would get:

# With errors
!(var(shouldBeMonitored)=true)=false || ((trim(var(dir)=" /tmp/  ")="/tmp/" == "/tmp/")=true && (abs((var(used)=900.0 / var(capacity)={Error: no value for the last 10 minutes})={Error: in arguments})={Error: in arguments} > 0.8)={Error: in arguments})={Error: in arguments})={Error: in arguments}

# With successful computation
!(var(shouldBeMonitored)=true)=false || ((trim(var(dir)=" /tmp/  ")="/tmp/" == "/tmp/")=true && (abs((var(used)=900.0 / var(capacity)=1000.0)=0.9)=0.9 > 0.8)=true)=true)=true
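As a sketch, the read/write ratio example above could be expressed as a defined metric grouped by hostname. The key names below are illustrative assumptions, not necessarily the exact ExDeMon syntax; consult the user's manual for the real configuration keys.

```properties
# Hypothetical configuration keys, shown for illustration only
metrics.define.ratio-read-write.value = readbytes / writebytes
metrics.define.ratio-read-write.metrics.groupby = HOSTNAME
# Each variable in the equation is bound to incoming metrics by a filter
metrics.define.ratio-read-write.variables.readbytes.filter.attribute.TYPE = bytes-read
metrics.define.ratio-read-write.variables.writebytes.filter.attribute.TYPE = bytes-written
```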

Monitors

Metrics are consumed from several sources and sent to all monitors.

Many monitors can be declared. Each monitor has a filter to determine to which metrics it should be applied. Filtered metrics are analyzed to determine the current status of the metric. Several triggers can be configured to raise actions.

Results from analysis can be sunk to external storages and actions are processed by actuators.
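Putting the pieces together, a monitor declaration could look roughly like the following sketch. The key names are assumptions chosen for illustration; see the user's manual for the exact syntax.

```properties
# Hypothetical monitor configuration (illustrative key names)
monitor.cpu-usage.filter.attribute.TYPE = cpu-usage
monitor.cpu-usage.analysis.type = fixed-threshold
monitor.cpu-usage.analysis.error.upperbound = 95
monitor.cpu-usage.analysis.warn.upperbound = 80
# Trigger an action if the metric stays in ERROR for 10 minutes
monitor.cpu-usage.triggers.high-cpu.type = constant
monitor.cpu-usage.triggers.high-cpu.statuses = ERROR
monitor.cpu-usage.triggers.high-cpu.period = 10m
```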

Components

Components are parts of the processing pipeline that can easily be replaced by other built-in components or by externally developed ones.

If you want to develop a component, take a look at the developers guide.

Properties source

This component is meant to consume configuration properties from an external source.

This source is periodically queried and the job will be updated with new configuration.
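For example, a properties source backed by Zookeeper might be declared as in the sketch below. The keys shown are illustrative assumptions, not the guaranteed ExDeMon syntax.

```properties
# Hypothetical properties source declaration (illustrative)
properties.source.type = zookeeper
properties.source.connection_string = zk-host:2181/exdemon
# How often the source is polled for configuration changes
properties.source.expire = 1m
```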

Metric source

This component is meant to consume metrics from a source and generate a stream of metrics.

Several sources can be declared for the job. All monitors consume from all sources.

Consumed metrics may come with different schemas; each schema can be declared so that the metrics are parsed correctly.

Built-in metric sources:

  • Kafka.
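A Kafka metric source could be declared roughly as follows. The key names and the schema mapping are assumptions for illustration only; the actual syntax is described in the user's manual.

```properties
# Hypothetical Kafka source declaration (illustrative)
metrics.source.kafka-metrics.type = kafka
metrics.source.kafka-metrics.consumer.bootstrap.servers = kafka-host:9092
metrics.source.kafka-metrics.topics = metrics
# Map fields of the incoming records to metric attributes, timestamp and value
metrics.source.kafka-metrics.parser.attributes = HOSTNAME TYPE
metrics.source.kafka-metrics.parser.timestamp.attribute = timestamp
metrics.source.kafka-metrics.parser.value.attributes = value
```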

Metric analysis

This component is meant to determine the status (error, warning, exception, ok) of each of the incoming metrics.

Each monitor configures its own analysis.

Built-in metric analysis:

  • Fixed threshold: error and warning thresholds are fixed values.
  • Recent activity: error and warning thresholds are computed using the average and variance of recent activity.
  • Percentile: error and warning thresholds are computed from percentiles of recent activity.
  • Seasonal: a season is configured (hour, day or week); using a learning coefficient, the average and variance are computed along the season, and these two values are used to calculate the error and warning thresholds.
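As an illustration, a seasonal analysis for a monitor might be configured along these lines. The key names and parameter names are assumptions, not the confirmed ExDeMon syntax.

```properties
# Hypothetical seasonal analysis configuration (illustrative)
monitor.temperature.analysis.type = seasonal
monitor.temperature.analysis.season = day
monitor.temperature.analysis.learning.ratio = 0.2
# Thresholds expressed as deviations from the learned seasonal average
monitor.temperature.analysis.error.ratio = 6
monitor.temperature.analysis.warn.ratio = 3
```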

Analysis results sink

Analyses produce results for each of the incoming metrics. These results can be sunk to an external storage in order to watch the metrics and their analysis results.

Only one analysis results sink is declared for the job. All monitors use this sink.

Built-in analysis results sink:

  • Elastic.
  • HTTP (POST).
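Declaring the shared Elastic sink might look like the following sketch; the index name and key names are illustrative assumptions.

```properties
# Hypothetical analysis results sink declaration (illustrative)
results.sink.type = elastic
results.sink.index = exdemon-analysis-results
results.sink.es.nodes = elastic-host
results.sink.es.port = 9200
```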

Trigger

A trigger determines when to raise an action based on analysis results.

Several triggers can be configured in a monitor.

Built-in triggers:

  • Statuses: raises an action as soon as a metric arrives with one of the configured statuses.
  • Constant status: raises an action if a metric has remained in the configured statuses for a certain period.
  • Percentage status: raises an action if a metric has been in the configured statuses for a percentage of a certain period.
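A percentage-status trigger, for instance, could be sketched as below. The key names are illustrative assumptions; check the user's manual for the real syntax.

```properties
# Hypothetical trigger configuration (illustrative)
monitor.disk-usage.triggers.mostly-warn.type = percentage
monitor.disk-usage.triggers.mostly-warn.statuses = WARNING ERROR
monitor.disk-usage.triggers.mostly-warn.period = 1h
# Raise the action if the metric spent more than 90% of the period in those statuses
monitor.disk-usage.triggers.mostly-warn.percentage = 90
```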

Actuators

Actions triggered by monitors are processed by one or several actuators. Actuators can save actions into an external storage, send e-mails, run jobs, etc.
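An e-mail actuator, for example, might be declared roughly like this. The keys, the SMTP host and the address are illustrative assumptions only.

```properties
# Hypothetical e-mail actuator declaration (illustrative)
actuators.email.type = email
actuators.email.session.mail.smtp.host = smtp.example.org
actuators.email.session.mail.smtp.port = 25
actuators.email.from = [email protected]
```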

Built-in actuators:
