Nagios Plugins450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+146.31%)
bigdata-funA complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-96.55%)
Dockerfiles50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+108.62%)
waspWASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amount of heterogeneous data and analyze them through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-95.32%)
Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-85.71%)
Repository个人学习知识库涉及到数据仓库建模、实时计算、大数据、Java、算法等。
Stars: ✭ 92 (-77.34%)
God Of Bigdata专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+1379.8%)
SchemerSchema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-76.11%)
Pdf编程电子书,电子书,编程书籍,包括C,C#,Docker,Elasticsearch,Git,Hadoop,HeadFirst,Java,Javascript,jvm,Kafka,Linux,Maven,MongoDB,MyBatis,MySQL,Netty,Nginx,Python,RabbitMQ,Redis,Scala,Solr,Spark,Spring,SpringBoot,SpringCloud,TCPIP,Tomcat,Zookeeper,人工智能,大数据类,并发编程,数据库类,数据挖掘,新面试题,架构设计,算法系列,计算机类,设计模式,软件测试,重构优化,等更多分类
Stars: ✭ 12,009 (+2857.88%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+304.43%)
CheckovPrevent cloud misconfigurations during build-time for Terraform, Cloudformation, Kubernetes, Serverless framework and other infrastructure-as-code-languages with Checkov by Bridgecrew.
Stars: ✭ 3,572 (+779.8%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-46.8%)
IcebergIceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (-3.2%)
Devops Bash Tools550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (-44.33%)
Bigdata PlaygroundA complete example of a big data application using : Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-56.4%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-63.05%)
Bigdata Interview🎯 🌟[大数据面试题]分享自己在网络上收集的大数据相关的面试题以及自己的答案总结.目前包含Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper框架的面试题知识总结
Stars: ✭ 857 (+111.08%)
Flink Learningflink learning blog. http://www.54tianzhisheng.cn/ 含 Flink 入门、概念、原理、实战、性能调优、源码解析等内容。涉及 Flink Connector、Metrics、Library、DataStream API、Table API & SQL 等内容的学习案例,还有 Flink 落地应用的大型项目案例(PVUV、日志存储、百亿数据实时去重、监控告警)分享。欢迎大家支持我的专栏《大数据实时计算引擎 Flink 实战与性能优化》
Stars: ✭ 11,378 (+2702.46%)
Spring Boot Quick🌿 基于springboot的快速学习示例,整合自己遇到的开源框架,如:rabbitmq(延迟队列)、Kafka、jpa、redies、oauth2、swagger、jsp、docker、spring-batch、异常处理、日志输出、多模块开发、多环境打包、缓存cache、爬虫、jwt、GraphQL、dubbo、zookeeper和Async等等📌
Stars: ✭ 1,819 (+348.03%)
ForemastSpinnaker Pipeline/Infrastructure Configuration and Templating Tool - Pipelines as Code.
Stars: ✭ 263 (-35.22%)
StoragetapperStorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (-42.86%)
Vscode Data PreviewData Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (-39.66%)
JanusgraphJanusGraph: an open-source, distributed graph database
Stars: ✭ 4,277 (+953.45%)
PorterKubernetes powered PaaS that runs in your own cloud.
Stars: ✭ 498 (+22.66%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-76.11%)
Dev SetupmacOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+1276.85%)
Cfn nagLinting tool for CloudFormation templates
Stars: ✭ 808 (+99.01%)
Cfn Create Or UpdateCreate or update CloudFormation stack also if no updates are to be performed.
Stars: ✭ 59 (-85.47%)
Gcs ToolsGCS support for avro-tools, parquet-tools and protobuf
Stars: ✭ 57 (-85.96%)
OpshellDevOps Toolkit for Every Cloud on Every Cloud
Stars: ✭ 19 (-95.32%)
Elasticsearch loaderA tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (-26.11%)
TerracognitaReads from existing Cloud Providers (reverse Terraform) and generates your infrastructure as code on Terraform configuration
Stars: ✭ 452 (+11.33%)
ElasticlusterCreate clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-26.6%)
Data Science Ipython NotebooksData science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+5330.54%)
My LinksKnowledge seeks no man
Stars: ✭ 311 (-23.4%)
Hadoop cookbookCookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-79.8%)
Awesome AwsA curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
Stars: ✭ 9,895 (+2337.19%)
Seldon ServerMachine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+253.45%)
Parquet4sRead and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (-69.21%)
darwinAvro Schema Evolution made easy
Stars: ✭ 26 (-93.6%)
PerunA command-line validation tool for AWS Cloud Formation that allows to conquer the cloud faster!
Stars: ✭ 82 (-79.8%)
DaFlowApache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (-94.09%)
aaocp一个对用户行为日志进行分析的大数据项目
Stars: ✭ 53 (-86.95%)
fastdata-clusterFast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-95.07%)
leaflet heatmap简单的可视化湖州通话数据 假设数据量很大,没法用浏览器直接绘制热力图,把绘制热力图这一步骤放到线下计算分析。使用Apache Spark并行计算数据之后,再使用Apache Spark绘制热力图,然后用leafletjs加载OpenStreetMap图层和热力图图层,以达到良好的交互效果。现在使用Apache Spark实现绘制,可能是Apache Spark不擅长这方面的计算或者是我没有设计好算法,并行计算的速度比不上单机计算。Apache Spark绘制热力图和计算代码在这 https://github.com/yuanzhaokang/ParallelizeHeatmap.git .
Stars: ✭ 13 (-96.8%)
kafka-compose🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-92.12%)
ChoetlETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (-8.37%)
yuzhouwanCode Library for My Blog
Stars: ✭ 39 (-90.39%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-72.66%)