
upgundecha / Howtheysre

Licence: cc0-1.0
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

How they SRE


Introduction

How They SRE is a curated knowledge repository of the SRE best practices, tools, techniques, and culture adopted by leading technology and tech-savvy organizations.

Many organizations regularly share their best practices, tools, and techniques, and offer insight into their engineering culture, on public platforms such as engineering blogs, conferences, and meetups. The content in this repository is curated from these sources.

Note to readers: This list includes some articles, posts, videos, tools, and techniques published before 2015. Use such material with caution, as more recent advances in technology and practice may offer better alternatives and perspectives.

Topics

  • Site Reliability Engineering
  • Hiring and Building SRE teams
  • SRE Culture
  • DevOps
  • Monitoring & Observability
  • Alerting
  • Incident Response & Post-Mortem
  • On-Call
  • Testing in Production
  • Chaos Engineering
  • Automation
  • Performance

Organizations

Achievers

Blog Posts

Airbnb

Blog Posts

Algolia

Blog Posts

Alibaba Cloud

Blog Posts

Asana

Blog Posts

ASOS

Blog Posts

Atlassian

Blog Posts

BackMarket

Blog Posts

Baidu

Videos

Basecamp

Blog Posts

Books

Bloomberg

Videos

Booking.com

Blog Posts

Videos

Capital One

Blog Posts

Major incidents & analysis reports

Videos

Coinbase

Blog Posts

DAZN

Blog Posts

DBS

Blog Posts

Videos

DeepSource

Blog Posts

Dream11

Blog Posts

Dropbox

Blog Posts

Videos

eBay

Blog Posts

Videos

Epic Games

Videos

Etsy

Blog Posts

Videos

Expedia

Blog Posts

Facebook

Blog Posts

Videos

Fastly

Videos

Getaround

Blog Posts

GitHub

Blog Posts

Major incidents & analysis reports

Videos

GitLab

Blog Posts

GoCardless

Blog Posts

Major incidents & analysis reports

GoDaddy

Blog Posts

Gojek

Blog Posts

Google

Blog Posts

Videos

Grab

Blog Posts

Grammarly

Blog Posts

Gusto

Blog Posts

Halodoc

Blog Posts

Heroku

Blog Posts

Indeed

Blog Posts

Videos

Khan Academy

Blog Posts

LinkedIn

Blog Posts

Videos

Tools

Loggi

Blog Posts

Macquarie

Blog Posts

Mattermost

Blog Posts

Meituan (美团)

Blog Posts

Mercari

Blog Posts

Microsoft

Videos

MIRO

Blog Posts

Monzo

Blog Posts

Videos

Tools

Netflix

Blog Posts

Major incidents & analysis reports

Videos

Podcasts

Tools

New Relic

Blog Posts

Nubank

Blog Posts

PayPal

Blog Posts

Videos

Picnic

Blog Posts

Pinterest

Blog Posts

Videos

Postman

Blog Posts

Red Hat

Blog Posts

Riot Games

Blog Posts

Salesforce

Blog Posts

Schibsted Media

Blog Posts

Scribd

Blog Posts

Shopify

Blog Posts

Videos

Sky Betting and Gaming

Blog Posts

Slack

Blog Posts

Videos

Slalom Build

Blog Posts

SoundCloud

Blog Posts

Spotify

Blog Posts

Videos

Squarespace

Blog Posts

Videos

Stack Overflow

Blog Posts

Videos

Strava

Blog Posts

Stripe

Blog Posts

Videos

Target

Blog Posts

Teads

Blog Posts

Tinder

Blog Posts

Tokopedia

Blog Posts

Trivago

Blog Posts

Twilio

Blog Posts

Twitter

Blog Posts

Uber

Blog Posts

Videos

upGrad

Blog Posts

VGW

Blog Posts

Videos

Wikimedia Foundation

Videos

Wix

Blog Posts

Yelp

Blog Posts

Videos

Zalando

Blog Posts

Zerodha

Blog Posts

Zomato

Blog Posts

SRECon Mix Playlist

Videos


Resources

Books

Events

Other Resources

Awesome Lists

SRE Resources from various organizations

Incidents & postmortems

Newsletters

Credits

Other How They... repos

Contribute

Contributions welcome! Read the contribution guidelines first.

License

CC0

To the extent possible under law, Unmesh Gundecha has waived all copyright and related or neighboring rights to this work.


If you use this anywhere, please credit @upgundecha on Twitter. If you like my work, check out my other projects on GitHub.
