A real-time interactive web app built on data pipelines that use streaming Twitter data, automated sentiment analysis, and MySQL and PostgreSQL databases (deployed on Heroku)

Stars: ✭ 127 (-95.09%)

Mutual labels: stream-processing

Griddb

GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.

Stars: ✭ 1,587 (-38.63%)

Mutual labels: bigdata

Awesome Streaming

A curated list of awesome streaming frameworks, applications, etc.

Stars: ✭ 1,879 (-27.34%)

Mutual labels: stream-processing

Splash

Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange

Stars: ✭ 105 (-95.94%)

Mutual labels: bigdata

Neo4j Streams

Neo4j Kafka Integrations, Docs =>

Stars: ✭ 126 (-95.13%)

Mutual labels: stream-processing

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+325.02%)

Mutual labels: bigdata

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (-48.26%)

Mutual labels: bigdata

Kafka Tutorials

Kafka Tutorials microsite

Stars: ✭ 144 (-94.43%)

Mutual labels: stream-processing

Awesome Single Cell

Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.

Stars: ✭ 1,937 (-25.1%)

Mutual labels: data-integration

Riko

A Python stream processing engine modeled after Yahoo! Pipes

Stars: ✭ 1,571 (-39.25%)

Mutual labels: stream-processing

Mnemonic

Apache Mnemonic - A non-volatile hybrid memory storage oriented library

Stars: ✭ 91 (-96.48%)

Mutual labels: bigdata

Ignite Book Code Samples

All code samples, scripts, and more in-depth examples for the book High Performance In-Memory Computing with Apache Ignite. Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.

Stars: ✭ 86 (-96.67%)

Mutual labels: bigdata

Genie

Distributed Big Data Orchestration Service

Stars: ✭ 1,544 (-40.29%)

Mutual labels: bigdata

Mlsql

The Programming Language Designed For Big Data and AI

Stars: ✭ 1,262 (-51.2%)

Mutual labels: bigdata

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (-33.45%)

Mutual labels: bigdata

Flink Learning

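Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization".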

Stars: ✭ 11,378 (+339.98%)

Mutual labels: stream-processing

Ecommercerecommendsystem

Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

Stars: ✭ 139 (-94.62%)

Mutual labels: bigdata

Gsf

Grid Solutions Framework

Stars: ✭ 106 (-95.9%)

Mutual labels: stream-processing

Fpart

Sort files and pack them into partitions

Stars: ✭ 127 (-95.09%)

Mutual labels: bigdata

Flink Notes

Flink study notes

Stars: ✭ 106 (-95.9%)

Mutual labels: bigdata

Mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

Stars: ✭ 15,338 (+493.12%)

Mutual labels: stream-processing

Leofs

The LeoFS Storage System

Stars: ✭ 1,439 (-44.35%)

Mutual labels: datalake

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (-95.13%)

Mutual labels: bigdata

Bigdata Notebook

Stars: ✭ 100 (-96.13%)

Mutual labels: bigdata

Wayeb

Wayeb is a Complex Event Processing and Forecasting (CEP/F) engine written in Scala.

Stars: ✭ 138 (-94.66%)

Mutual labels: stream-processing

Logisland

Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.

Stars: ✭ 97 (-96.25%)

Mutual labels: stream-processing

Pulsar Flink

Elastic data processing with Apache Pulsar and Apache Flink

Stars: ✭ 126 (-95.13%)

Mutual labels: stream-processing

Covid19 Market Waiting Times

A project to help people stand in line at the market as little as possible

Stars: ✭ 95 (-96.33%)

Mutual labels: bigdata

Avro

Apache Avro is a data serialization system.

Stars: ✭ 2,005 (-22.47%)

Mutual labels: bigdata

Biglasso

biglasso: Extending Lasso Model Fitting to Big Data in R

Stars: ✭ 87 (-96.64%)

Mutual labels: bigdata

Liteflow

liteflow is a distributed task flow scheduling system built on task versioning

Stars: ✭ 112 (-95.67%)

Mutual labels: bigdata

Bigdata File Viewer

A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (-96.67%)

Mutual labels: bigdata

Mara Pipelines

A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow

Stars: ✭ 1,841 (-28.81%)

Mutual labels: data-integration

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (-95.71%)

Mutual labels: bigdata

Athena Cli

Presto-like CLI tool for AWS Athena