Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources and sinks is available.

Stars: ✭ 97 (+438.89%)

Mutual labels: big-data, stream-processing

Storm Doc Zh

Chinese translation of the official Apache Storm documentation

Stars: ✭ 142 (+688.89%)

Mutual labels: big-data, storm

bigquery-kafka-connect

☁️ Node.js Kafka Connect connector for Google BigQuery

Stars: ✭ 17 (-5.56%)

Mutual labels: big-data

rastercube

rastercube is a python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)

Stars: ✭ 15 (-16.67%)

Mutual labels: big-data

airavata-php-gateway

Mirror of Apache Airavata PHP Gateway

Stars: ✭ 15 (-16.67%)

Mutual labels: big-data

CS Book

🔥 Latest computer science e-books. Offers downloads of the latest technical e-books. "All I want is to outcompete every one of you, or be outcompeted by you!"

Stars: ✭ 40 (+122.22%)

Mutual labels: big-data

Big-Data-Demo

A data visualization showcase project built on Vue, three.js, and echarts, with features such as 3D model import and interaction and 3D model annotation

Stars: ✭ 146 (+711.11%)

Mutual labels: big-data

kafka-shell

⚡ A supercharged, interactive Kafka shell built on top of the existing Kafka CLI tools.

Stars: ✭ 107 (+494.44%)

Mutual labels: stream-processing

xxhadoop

Data analysis using Hadoop, Spark, Storm, ElasticSearch, machine learning, etc. These are my daily notes, code, and demos. Don't fork, just star!

Stars: ✭ 37 (+105.56%)

Mutual labels: storm

spark-records

Bulletproof Apache Spark jobs with fast root cause analysis of failures.

Stars: ✭ 67 (+272.22%)

Mutual labels: big-data

ramen

A stream processing language and compiler for small-scale monitoring

Stars: ✭ 14 (-22.22%)

Mutual labels: stream-processing

go-rivers

Collection of stream processing / multiplexing / networking libs in Go

Stars: ✭ 35 (+94.44%)

Mutual labels: stream-processing

blockchain-etl-streaming

Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes

Stars: ✭ 57 (+216.67%)

Mutual labels: stream-processing

artml

ARTML: real-time learning

Stars: ✭ 20 (+11.11%)

Mutual labels: stream-processing

gretel-python-client

The Gretel Python Client allows you to interact with the Gretel REST API.

Stars: ✭ 28 (+55.56%)

Mutual labels: stream-processing

SGDLibrary

MATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20

Stars: ✭ 165 (+816.67%)

Mutual labels: big-data

RemoteShuffleService

Celeborn provides an elastic and high-performance service for shuffle and spilled data.

Stars: ✭ 262 (+1355.56%)

Mutual labels: big-data

siembol

An open-source, real-time Security Information & Event Management tool based on big data technologies, providing a scalable, advanced security analytics framework.

Stars: ✭ 153 (+750%)

Mutual labels: big-data

xcast

A High-Performance Data Science Toolkit for the Earth Sciences

Stars: ✭ 28 (+55.56%)

Mutual labels: big-data

csvplus

csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.

Stars: ✭ 67 (+272.22%)

Mutual labels: stream-processing

ByteSlice

"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)

Stars: ✭ 24 (+33.33%)

Mutual labels: big-data

azure-big-data-starter

A boilerplate project for Azure Big Data PaaS services

Stars: ✭ 13 (-27.78%)

Mutual labels: big-data

flink-connectors

Apache Flink connectors for Pravega.

Stars: ✭ 84 (+366.67%)

Mutual labels: stream-processing

beam-site

Apache Beam Site

Stars: ✭ 28 (+55.56%)

Mutual labels: big-data

mage

MAGE - Memgraph Advanced Graph Extensions 🔮

Stars: ✭ 89 (+394.44%)

Mutual labels: stream-processing

product-sp

An open source, cloud-native streaming data integration and analytics product optimized for agile digital businesses

Stars: ✭ 80 (+344.44%)

Mutual labels: stream-processing

arrow-datafusion

Apache Arrow DataFusion SQL Query Engine

Stars: ✭ 2,360 (+13011.11%)

Mutual labels: big-data

crawling-framework

Easily crawl news portals or blog sites using Storm Crawler.

Stars: ✭ 22 (+22.22%)

Mutual labels: storm

Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.

Stars: ✭ 47 (+161.11%)

Mutual labels: big-data

scarf

Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.

Stars: ✭ 54 (+200%)

Mutual labels: big-data

LoL-Match-Prediction

Win probability predictions for League of Legends matches using neural networks

Stars: ✭ 34 (+88.89%)

Mutual labels: big-data

bullet-storm

The Apache Storm implementation of the Bullet backend

Stars: ✭ 39 (+116.67%)

Mutual labels: storm

terraform-aws-kinesis-firehose

This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.

Stars: ✭ 25 (+38.89%)

Mutual labels: big-data

IoT-system-PLC-data-to-InfluxDB

This project aims to provide free software that fetches data from PLCs (Siemens S7-300/400/1200/1500) and stores it. The stack is completely open source. I used InfluxDB for data storage, so the application follows the Big Data paradigm.

Stars: ✭ 26 (+44.44%)

Mutual labels: big-data

insightedge

InsightEdge Core

Stars: ✭ 22 (+22.22%)

Mutual labels: big-data

ripple

Simple shared surface streaming application

Stars: ✭ 17 (-5.56%)

Mutual labels: stream-processing

spark-root

Apache Spark Data Source for ROOT File Format

Stars: ✭ 28 (+55.56%)

Mutual labels: big-data

cloudberry

Big Data Visualization

Stars: ✭ 89 (+394.44%)

Mutual labels: big-data

dxram

A distributed in-memory key-value storage for billions of small objects.

Stars: ✭ 25 (+38.89%)

Mutual labels: big-data

nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability

Stars: ✭ 8,196 (+45433.33%)

Mutual labels: big-data

incubator-liminal

Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.

Stars: ✭ 117 (+550%)

Mutual labels: big-data

img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Stars: ✭ 1,173 (+6416.67%)

Mutual labels: big-data

Real Time Social Media Mining

DevOps pipeline for Real Time Social/Web Mining

Stars: ✭ 22 (+22.22%)

Mutual labels: big-data

godsend

A simple and eloquent workflow for streaming messages to micro-services.

Stars: ✭ 15 (-16.67%)

Mutual labels: stream-processing

awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 11,093 (+61527.78%)

Mutual labels: stream-processing

stormnode

Node.js node client for storm.dev

Stars: ✭ 11 (-38.89%)

Mutual labels: storm

GDLibrary

Matlab library for gradient descent algorithms: Version 1.0.1

Stars: ✭ 50 (+177.78%)

Mutual labels: big-data

airavata-django-portal

Mirror of Apache Airavata Django Portal

Stars: ✭ 20 (+11.11%)

Mutual labels: big-data

kafka-workers

Kafka Workers is a client library which unifies records consuming from Kafka and processing them by user-defined WorkerTasks.

Stars: ✭ 30 (+66.67%)

Mutual labels: stream-processing

lcbo-api

A crawler and API server for Liquor Control Board of Ontario retail data

Stars: ✭ 152 (+744.44%)

Mutual labels: big-data

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Stars: ✭ 39 (+116.67%)

Mutual labels: big-data

1-60 of 523 similar projects