ThingsboardOpen-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+2729.57%)
CamusMirror of Linkedin's Camus
Stars: ✭ 81 (-78.23%)
SpartaReal Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+37.9%)
ElasticlusterCreate clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-19.89%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-73.92%)
Spark Hbase ConnectorConnect Spark to HBase for reading and writing data with ease
Stars: ✭ 299 (-19.62%)
Kafka Connectequivalent to kafka-connect 🔧 for nodejs ✨🐢🚀✨
Stars: ✭ 102 (-72.58%)
Awesome Recommendation EngineThe purpose of this tiny project is to put things together with the know how that i learned from the course big data expert from formacionhadoop.com The idea is to show how to play with apache spark streaming, kafka,mongo, spark machine learning algorithms.
Stars: ✭ 47 (-87.37%)
Delta ArchitectureStreaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-88.44%)
Connection Pool Client💥 A simple multi-purpose connection pool client (Kafka & Hbase & Redis & RMDB & Socket & Http)
Stars: ✭ 40 (-89.25%)
WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (-10.75%)
AbrisAvro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-65.05%)
Azure Event Hubs SparkEnabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-62.37%)
MetorikkuA simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-2.96%)
DagsterAn orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+1001.88%)
Seldon ServerMachine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+285.75%)
Devops Bash Tools550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (-39.25%)
StoragetapperStorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (-37.63%)
Every Single Day I TldrA daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (-33.06%)
Data AcceleratorData Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-33.6%)
Goodreads etl pipelineAn end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+113.17%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+11.02%)
phoenixApache Phoenix / Hbase Spring Boot Microservices
Stars: ✭ 23 (-93.82%)
thainThain is a distributed flow schedule platform.
Stars: ✭ 81 (-78.23%)
Gather DeploymentGathers scalable tensorflow and infrastructure deployment
Stars: ✭ 326 (-12.37%)
smart-data-lakeSmart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (-78.76%)
hive to es同步Hive数据仓库数据到Elasticsearch的小工具
Stars: ✭ 21 (-94.35%)
zdh web大数据采集,抽取平台
Stars: ✭ 292 (-21.51%)
the-apache-ignite-bookAll code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (-82.53%)
BenthosFancy stream processing made operationally mundane
Stars: ✭ 3,705 (+895.97%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-89.52%)
TrinoOfficial repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+1131.45%)
disk基于hadoop+hbase+springboot实现分布式网盘系统
Stars: ✭ 53 (-85.75%)
waspWASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amount of heterogeneous data and analyze them through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-94.89%)
liquibase-impalaLiquibase extension to add Impala Database support
Stars: ✭ 23 (-93.82%)
hive-jdbc-driverAn alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (-91.67%)
hadoop-etl-udfsThe Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-95.43%)
darwinAvro Schema Evolution made easy
Stars: ✭ 26 (-93.01%)
Hbase RddSpark RDD to read, write and delete from HBase
Stars: ✭ 277 (-25.54%)
cobra-policytoolManage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-95.7%)
orionManagement and automation platform for Stateful Distributed Systems
Stars: ✭ 77 (-79.3%)
hadoopofficeHadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-84.95%)
cmuxA set of commands for managing CDH clusters using Cloudera Manager REST API.
Stars: ✭ 34 (-90.86%)
litemall-dw基于开源Litemall电商项目的大数据项目,包含前端埋点(openresty+lua)、后端埋点;数据仓库(五层)、实时计算和用户画像。大数据平台采用CDH6.3.2(已使用vagrant+ansible脚本化),同时也包含了Azkaban的workflow。
Stars: ✭ 36 (-90.32%)
fastdata-clusterFast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-94.62%)
TILToday I Learned
Stars: ✭ 43 (-88.44%)