Top 58 spark-streaming open source projects

Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Example Spark
Spark, Spark Streaming and Spark SQL unit testing strategies
Spark Streaming With Kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL
Movie recommend
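A Spark-based movie recommendation system, comprising a web crawler project, a website, an admin back-end system, and the Spark recommendation system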
Example Spark Kafka
Apache Spark and Apache Kafka integration example
Kinesis Sql
Kinesis Connector for Structured Streaming
Spark Mllib Twitter Sentiment Analysis
🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Waterdrop
Production Ready Data Integration Product, documentation:
Spark States
Custom state store providers for Apache Spark
Pyspark Examples
Code examples on Apache Spark using python
Utils4s
A collection of test cases and related materials gathered while using Scala and Spark
Real Time Stream Processing Engine
This is an example of real-time stream processing using Spark Streaming, Kafka, and Elasticsearch.
Learning Spark
Learning Spark from scratch; big data learning
Project Fortis
Repository for all parts of the Fortis architecture
Spark Streaming Monitoring With Lightning
Plot live stats as a graph from an Apache Spark application using Lightning-viz
Wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform
Mobius
C# and F# language binding and extensions to Apache Spark
Bandar Log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Cdap
An open source framework for building data analytic applications.
Learningspark
Scala examples for learning to use Spark
Sylph
Stream computing platform for bigdata
Coolplayspark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
live twitter sentiment analysis
Live Twitter sentiment analysis using Python, Apache Spark Streaming, Kafka, NLTK, SocketIO
Spark-and-Kafka IoT-Data-Processing-and-Analytics
Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S. nationwide temperature from IoT sensors in real time
litemall-dw
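A big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking, a five-layer data warehouse, real-time computation, and user profiling. The big data platform uses CDH6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.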
NYC Taxi Pipeline
Design/Implement stream/batch architecture on NYC taxi data | #DE
AdRealTimeAnalysis
Sichuan University 拓思艾诺 real-time advertising traffic analysis project
spark-utils
Basic framework utilities to quickly start writing production ready Apache Spark applications
cassandra.realtime
Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink
wasp
WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
xxhadoop
Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes/code/demos. Don't fork, just star!
interview-refresh-java-bigdata
A one-stop repo to look up code snippets for core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.
Tweet-Analysis-With-Kafka-and-Spark
A real-time analytics dashboard to analyze the trending hashtags and @ mentions at any location using Kafka and Spark Streaming.
Spark ALS
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
Real-time-log-analysis-system
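🐧 A real-time log processing and analysis system based on Spark Streaming + Flume + Kafka + HBase (available as a console version and a Web UI visualization version built on Spring Boot, ECharts, etc.)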
ExDeMon
A general purpose metrics monitor implemented with Apache Spark. Kafka source, Elastic sink, aggregate metrics, different analysis, notifications, actions, live configuration update, missing metrics, ...