DC/OS SDK is a collection of tools, libraries, and documentation for easy integration of technologies such as Kafka, Cassandra, HDFS, Spark, and TensorFlow with DC/OS.

Stars: ✭ 162 (+100%)

Mutual labels: kafka, hdfs

Devops Bash Tools

550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...

Stars: ✭ 226 (+179.01%)

Mutual labels: kafka, hadoop

docker-hadoop

Docker image for main Apache Hadoop components (YARN/HDFS)

Stars: ✭ 59 (-27.16%)

Mutual labels: hadoop, hdfs

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (+72.84%)

Mutual labels: kafka, connector

Kafka Connect Storage Cloud

Kafka Connect suite of connectors for Cloud storage (Amazon S3)

Stars: ✭ 153 (+88.89%)

Mutual labels: kafka, confluent

Ksql Udf Deep Learning Mqtt Iot

Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data

Stars: ✭ 219 (+170.37%)

Mutual labels: kafka, confluent

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (+118.52%)

Mutual labels: kafka, hadoop

skein

A tool and library for easily deploying applications on Apache YARN

Stars: ✭ 128 (+58.02%)

Mutual labels: hadoop, hdfs

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (-74.07%)

Mutual labels: hadoop, hdfs

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-77.78%)

Mutual labels: hadoop, hdfs

py-hdfs-mount

Mount HDFS with fuse, works with kerberos!

Stars: ✭ 13 (-83.95%)

Mutual labels: hadoop, hdfs

fluent-plugin-webhdfs

Hadoop WebHDFS output plugin for Fluentd

Stars: ✭ 57 (-29.63%)

Mutual labels: hadoop, hdfs

aaocp

A big data project for analyzing user behavior logs

Stars: ✭ 53 (-34.57%)

Mutual labels: hadoop, hdfs

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-75.31%)

Mutual labels: hadoop, hdfs

Flume Canal Source

Flume NG Canal source

Stars: ✭ 56 (-30.86%)

Mutual labels: kafka, hdfs

Gather Deployment

Gathers scalable TensorFlow and infrastructure deployments

Stars: ✭ 326 (+302.47%)

Mutual labels: kafka, hadoop

Wedatasphere

WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (+359.26%)

Mutual labels: kafka, hadoop

Kafka Connect Ui

Web tool for Kafka Connect

Stars: ✭ 388 (+379.01%)

Mutual labels: kafka, hdfs

Eel Sdk

Big Data Toolkit for the JVM

Stars: ✭ 140 (+72.84%)

Mutual labels: kafka, hadoop

Kafka Rest

Confluent REST Proxy for Kafka

Stars: ✭ 1,863 (+2200%)

Mutual labels: kafka, confluent

Confluent Kafka Dotnet

Confluent's Apache Kafka .NET client

Stars: ✭ 2,110 (+2504.94%)

Mutual labels: kafka, confluent

Kafka Connect Mongodb

**Unofficial / Community** Kafka Connect MongoDB Sink Connector - Find the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector

Stars: ✭ 137 (+69.14%)

Mutual labels: kafka, connector

Hivemq Mqtt Tensorflow Kafka Realtime Iot Machine Learning Training Inference

Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required

Stars: ✭ 204 (+151.85%)

Mutual labels: kafka, confluent

Recommendsys

Recommendation project (real-time and offline recommendation)

Stars: ✭ 198 (+144.44%)

Mutual labels: kafka, hadoop

Storagetapper

StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service

Stars: ✭ 232 (+186.42%)

Mutual labels: kafka, hdfs

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (+62.96%)

Mutual labels: kafka, hadoop

HDFS-Netdisc

A Hadoop-based distributed cloud storage system 🌴

Stars: ✭ 56 (-30.86%)

Mutual labels: hadoop, hdfs

bigdata-doc

Big data study notes, learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (-54.32%)

Mutual labels: hadoop, hdfs

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-76.54%)

Mutual labels: hadoop, hdfs

teraslice

Scalable data processing pipelines in JavaScript

Stars: ✭ 48 (-40.74%)

Mutual labels: hadoop, hdfs

ros hadoop

Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.

Stars: ✭ 92 (+13.58%)

Mutual labels: hadoop, hdfs

fsbrowser

Fast desktop client for Hadoop Distributed File System

Stars: ✭ 27 (-66.67%)

Mutual labels: hadoop, hdfs

Schema Registry

Confluent Schema Registry for Kafka

Stars: ✭ 1,647 (+1933.33%)

Mutual labels: kafka, confluent

Sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Stars: ✭ 513 (+533.33%)

Mutual labels: kafka, hdfs

Cp Ansible

Ansible playbooks for the Confluent Platform

Stars: ✭ 285 (+251.85%)

Mutual labels: kafka, confluent

Cp Demo

Confluent Platform Demo including Apache Kafka, ksqlDB, Control Center, Replicator, Confluent Schema Registry, Security

Stars: ✭ 278 (+243.21%)

Mutual labels: kafka, confluent

Divolte Collector

Stars: ✭ 264 (+225.93%)

Mutual labels: kafka, hdfs

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+401.23%)

Mutual labels: hadoop, hdfs

Kafka Connect Elasticsearch

Kafka Connect Elasticsearch connector

Stars: ✭ 550 (+579.01%)

Mutual labels: kafka, confluent

Stream Reactor

Streaming reference architecture for ETL with Kafka and Kafka-Connect. Find out more at http://lenses.io about how we provide a unified solution to manage your connectors, the most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.

Stars: ✭ 753 (+829.63%)

Mutual labels: kafka, connector

Bigdata

💎🔥 Big data study notes

Stars: ✭ 488 (+502.47%)

Mutual labels: hadoop, hdfs

Kafka Connect Jdbc

Kafka Connect connector for JDBC-compatible databases

Stars: ✭ 698 (+761.73%)

Mutual labels: kafka, confluent

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-82.72%)

Mutual labels: hadoop, hdfs

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (+919.75%)

Mutual labels: kafka, hadoop

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (+945.68%)

Mutual labels: kafka, hadoop

Cp Docker Images

[DEPRECATED] Docker images for Confluent Platform.

Stars: ✭ 975 (+1103.7%)

Mutual labels: kafka, confluent

Cdc Kafka Hadoop

MySQL to NoSQL real time dataflow

Stars: ✭ 13 (-83.95%)

Mutual labels: kafka, hadoop

Jsr203 Hadoop

A Java NIO file system provider for HDFS

Stars: ✭ 35 (-56.79%)

Mutual labels: hadoop, hdfs

Bigdata Notebook

Stars: ✭ 100 (+23.46%)

Mutual labels: kafka, hadoop

Schema Registry

A CLI and Go client for Kafka Schema Registry

Stars: ✭ 105 (+29.63%)

Mutual labels: kafka, confluent

leaflet heatmap

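A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. Apache Spark computes over the data in parallel and then renders the heatmap; leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark, but the parallel computation is slower than single-machine computation, perhaps because Apache Spark is not well suited to this kind of work or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .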

Stars: ✭ 13 (-83.95%)

Mutual labels: hadoop, hdfs

Bigdataguide

Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning

Stars: ✭ 817 (+908.64%)

Mutual labels: kafka, hadoop

1-60 of 701 similar projects
