Orion

Orion is a generalized, pluggable management and automation platform for stateful distributed systems. It provides a unified interface to one or more clusters for both human and machine operators, and it can efficiently handle thousands of nodes spread across tens of clusters.

Our intent is to use automation to handle commonly encountered operational issues and to build a library of learnings from the experience of large-scale environments like ours.

Problem

Orion aims to address the following problems:

  • Lack of a single console to manage large clusters of stateful distributed systems: Present open source tooling cannot manage tens of clusters and thousands of nodes simultaneously, so engineers must switch between multiple consoles and correlate information manually.

  • Conflicts between human and automated operations: Automation scripts mostly lack visibility into out-of-band human operations, and vice versa, because there is no single control surface for machine and human operations. This has proven to cause catastrophic failures. For example, an engineer manually restarts a process, but the automation system thinks the node is down and replaces it, causing stability issues in the cluster.

  • Missing generalized community learnings on operations: There is currently no generic unified interface and "store" for sharing community learnings about operating various systems, so companies end up rewriting remediations for the same problems, e.g. topic rebalancing in Kafka, automatic replacement of slow HDFS nodes, concurrent rolling upgrades, etc.

  • Missing ability for sensor fusion for automation: There is no way to fuse information from multiple sensors to find the root cause of an issue on a cluster and make the best decision for automated remediation, e.g. determining whether Kafka replica lag is due to a slow leader, a slow follower, increased client load, or faulty replica lag readings.
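
To make the sensor fusion idea concrete, here is a minimal sketch of combining several readings into one diagnosis for Kafka replica lag. Everything in it is hypothetical: the class, the field names, the rule order, and the 2.0 load threshold are illustrative assumptions, not Orion's API or its actual heuristics.

```java
// Hypothetical sketch: none of these types or thresholds come from Orion.
class ReplicaLagFusionSketch {

    enum Cause { FAULTY_READING, CLIENT_LOAD, SLOW_LEADER, SLOW_FOLLOWER, UNKNOWN }

    // One snapshot of readings gathered from several independent sensors.
    static final class Readings {
        final boolean lagConfirmedBySecondSource; // does another sensor also see the lag?
        final double clientLoadRatio;             // current client ingress / recent baseline
        final boolean leaderSlow;                 // e.g. high disk latency on the leader
        final boolean followerSlow;               // e.g. high disk latency on the follower

        Readings(boolean lagConfirmedBySecondSource, double clientLoadRatio,
                 boolean leaderSlow, boolean followerSlow) {
            this.lagConfirmedBySecondSource = lagConfirmedBySecondSource;
            this.clientLoadRatio = clientLoadRatio;
            this.leaderSlow = leaderSlow;
            this.followerSlow = followerSlow;
        }
    }

    // Checks the explanations in a fixed (assumed) order and returns the first that matches.
    static Cause classify(Readings r) {
        if (!r.lagConfirmedBySecondSource) {
            return Cause.FAULTY_READING;   // a single unconfirmed reading is not trusted
        }
        if (r.clientLoadRatio > 2.0) {     // arbitrary illustrative threshold
            return Cause.CLIENT_LOAD;
        }
        if (r.leaderSlow) {
            return Cause.SLOW_LEADER;
        }
        if (r.followerSlow) {
            return Cause.SLOW_FOLLOWER;
        }
        return Cause.UNKNOWN;
    }
}
```

The point of the sketch is that a remediation such as replacing a node would only be considered once cheaper explanations (a bad reading, a load spike) have been ruled out.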

Key Features

  • Unified Interface: Orion provides a unified interface for implementing human and automated actions, along with a barrier to prevent conflicts between them.

  • Library of Remediations: Orion comes with a predefined set of common Sensors, Operators and Actions, plus interfaces for defining new ones, so that users can build on top of existing learnings.

  • Scalable: Orion handles thousands of nodes and tens of clusters efficiently.

  • Pluggable: Orion was developed with the understanding that not all distributed systems behave the same way. Pluggability, extensibility and abstraction are at the very core of Orion.
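
As an illustration of the barrier idea, the sketch below shows one way a single control surface could stop human and automated actions from colliding on the same node. The class and method names are hypothetical and are not taken from Orion's codebase; the only real API used is `ConcurrentHashMap.putIfAbsent` from the Java standard library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-node barrier shared by human and automated actions.
class ActionBarrierSketch {

    enum Origin { HUMAN, AUTOMATION }

    // Node id -> origin of the action currently in flight on that node.
    private final Map<String, Origin> inFlight = new ConcurrentHashMap<>();

    // Returns true if the caller may proceed; false if another action already holds the node.
    boolean tryBegin(String nodeId, Origin origin) {
        // putIfAbsent is atomic, so two concurrent callers cannot both claim the node.
        return inFlight.putIfAbsent(nodeId, origin) == null;
    }

    // Releases the node once the action has completed or failed.
    void finish(String nodeId) {
        inFlight.remove(nodeId);
    }
}
```

With such a barrier, an engineer's manual restart would claim the node first, and an automated replacement of the same node would be refused until the restart is marked finished.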

Current State

Orion is actively under development and the API may change over time.

Orion currently supports management of the following systems:

Usage

Orion lets you implement user-defined Actions, which are made available both through the UI and to automated Operators. Operators let engineers program automated remediation of issues based on information from Sensors in a large environment.

Examples:

  • Safely Replace Nodes in a Cloud Environment
  • Concurrent Rolling Restart
  • Concurrent Rolling Upgrade
  • Execution of custom workflows like Kafka topic rebalancing
  • Maintain settings e.g. monitor and fix topic configurations in Kafka

Temporarily move topics off of particular brokers

To temporarily move all topics off a set of brokers, add a brokersetOverrides.json file to the cluster's config folder, alongside brokerset.json. The file should take the form:

{
  "startBrokerIn": 1,
  "endBrokerIn": 3,
  "startBrokerOut": 4,
  "endBrokerOut": 6
}

With this configuration, brokers 1-3 are replaced with brokers 4-6 in any brokerset that includes them. This makes it easier to move multiple topics off particular brokers for maintenance. Once finished, remove the file to let the original brokers reassert themselves.
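
The effect of such an override can be sketched as follows. This is an illustration only: the class is hypothetical, and the position-by-position mapping (the first "in" broker maps to the first "out" broker, and so on) is an assumption about how the two ranges line up, not a description of Orion's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: applies a brokersetOverrides.json-style range override to a brokerset.
class BrokersetOverrideSketch {
    private final int startBrokerIn;
    private final int endBrokerIn;
    private final int startBrokerOut;

    BrokersetOverrideSketch(int startBrokerIn, int endBrokerIn,
                            int startBrokerOut, int endBrokerOut) {
        // Assumption: both ranges must have the same length so they can be paired up.
        if (endBrokerIn - startBrokerIn != endBrokerOut - startBrokerOut) {
            throw new IllegalArgumentException("broker ranges must have the same length");
        }
        this.startBrokerIn = startBrokerIn;
        this.endBrokerIn = endBrokerIn;
        this.startBrokerOut = startBrokerOut;
    }

    // Returns a new list in which every broker in the "in" range is swapped for its counterpart.
    List<Integer> apply(List<Integer> brokerset) {
        List<Integer> result = new ArrayList<>();
        for (int id : brokerset) {
            boolean overridden = id >= startBrokerIn && id <= endBrokerIn;
            result.add(overridden ? startBrokerOut + (id - startBrokerIn) : id);
        }
        return result;
    }
}
```

For the file shown above, a brokerset of 1, 2, 3, 7 would become 4, 5, 6, 7 under this reading. The sketch does not deduplicate, so a brokerset that already contains one of the replacement brokers would list it twice.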

Architecture

Image of Orion's Architecture

Quick Start

Detailed quick install can be found here

Dr.Kafka to Orion Migration

If you were previously using Dr.Kafka, you can find instructions on migrating to Orion here

Maintainers

  • Ping-Min Lin
  • Ambud Sharma

Contributors

  • Vahid Hashemian
  • Jeff Xiang

License

Orion is distributed under the Apache License, Version 2.0.