2813 open source projects that are alternatives to or similar to Sparta

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (-51.85%)

Mutual labels: kafka, spark, spark-streaming, streaming-data, streaming

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (-72.71%)

Mutual labels: kafka, spark, spark-streaming, streaming, real-time

Repository

Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (-82.07%)

Mutual labels: kafka, spark, olap, hdfs

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+235.48%)

Mutual labels: spark, analytics, spark-streaming, streaming

Streamline

StreamLine - Streaming Analytics

Stars: ✭ 151 (-70.57%)

Mutual labels: kafka, spark-streaming, streaming, real-time

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (-57.89%)

Mutual labels: kafka, spark, spark-streaming

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-19.49%)

Mutual labels: kafka, spark, analytics

Real Time Stream Processing Engine

This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.

Stars: ✭ 37 (-92.79%)

Mutual labels: kafka, spark, spark-streaming

Flink Learning

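Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus large-scale production case studies (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization").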

Stars: ✭ 11,378 (+2117.93%)

Mutual labels: kafka, spark, streaming

prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (-89.47%)

Mutual labels: workflow, spark, olap

Bigdata Interview

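🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.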

Stars: ✭ 857 (+67.06%)

Mutual labels: kafka, spark, hdfs

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (-76.61%)

Mutual labels: kafka, spark, spark-streaming

Kafka Streams In Action

Source code for the Kafka Streams in Action Book

Stars: ✭ 167 (-67.45%)

Mutual labels: kafka, streaming-data, streaming

Learning Spark

Learning Spark from scratch; big data learning

Stars: ✭ 37 (-92.79%)

Mutual labels: spark, spark-streaming, hdfs

Logisland

Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.

Stars: ✭ 97 (-81.09%)

Mutual labels: kafka, spark, analytics

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+2042.5%)

Mutual labels: kafka, spark, hdfs

Divolte Collector

Stars: ✭ 264 (-48.54%)

Mutual labels: kafka, analytics, hdfs

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-70.76%)

Mutual labels: spark, analytics, hdfs

Kafka Connect Hdfs

Kafka Connect HDFS connector

Stars: ✭ 400 (-22.03%)

Mutual labels: kafka, hdfs, streaming

Smart open

Utils for streaming large files (S3, HDFS, gzip, bz2...)

Stars: ✭ 2,306 (+349.51%)

Mutual labels: streaming-data, hdfs, streaming

Mobius

C# and F# language binding and extensions to Apache Spark

Stars: ✭ 929 (+81.09%)

Mutual labels: spark, spark-streaming, streaming

Hydra

A real-time data replication platform that "unbundles" the receiving, transforming, and transport of data streams.

Stars: ✭ 68 (-86.74%)

Mutual labels: kafka, streaming, real-time

Bigdata Notebook

Stars: ✭ 100 (-80.51%)

Mutual labels: kafka, spark, streaming

Spark Streaming With Kafka

Self-contained examples of Apache Spark streaming integrated with Apache Kafka.

Stars: ✭ 180 (-64.91%)

Mutual labels: kafka, spark, spark-streaming

God Of Bigdata

Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+1071.15%)

Mutual labels: kafka, spark, hdfs

Every Single Day I Tldr

A daily digest of the articles or videos I've found interesting, that I want to share with you.

Stars: ✭ 249 (-51.46%)

Mutual labels: kafka, spark

Stepfunctions2processing

Configuration with AWS step functions and lambdas which initiates processing from activity state

Stars: ✭ 90 (-82.46%)

Mutual labels: lambda, workflow

Iopipe Js Core

Observe and develop serverless apps with confidence on AWS Lambda with Tracing, Metrics, Profiling, Monitoring, and more.

Stars: ✭ 123 (-76.02%)

Mutual labels: lambda, analytics

Video Stream Analytics

Stars: ✭ 240 (-53.22%)

Mutual labels: kafka, spark

Whatsmars

Java ecosystem research (Spring Boot + Redis + Dubbo + RocketMQ + Elasticsearch) 🔥🔥🔥🔥🔥

Stars: ✭ 1,389 (+170.76%)

Mutual labels: lambda, kafka

Storagetapper

StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service

Stars: ✭ 232 (-54.78%)

Mutual labels: kafka, hdfs

Serverless Analytics

Track website visitors with Serverless Analytics using Kinesis, Lambda, and TypeScript.

Stars: ✭ 219 (-57.31%)

Mutual labels: lambda, analytics

Flogo

Project Flogo is an open source ecosystem of opinionated event-driven capabilities to simplify building efficient & modern serverless functions, microservices & edge apps.

Stars: ✭ 1,891 (+268.62%)

Mutual labels: lambda, streaming

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-96.3%)

Mutual labels: spark-streaming, hdfs

traffic

Massively real-time traffic streaming application

Stars: ✭ 25 (-95.13%)

Mutual labels: streaming, real-time

Tributary

Streaming reactive and dataflow graphs in Python

Stars: ✭ 231 (-54.97%)

Mutual labels: kafka, streaming

Spark On Lambda

Apache Spark on AWS Lambda

Stars: ✭ 137 (-73.29%)

Mutual labels: lambda, spark

matrixone

Hyperconverged cloud-edge native database

Stars: ✭ 1,057 (+106.04%)

Mutual labels: streaming, olap

wink-statistics

Fast & numerically stable statistical analysis

Stars: ✭ 36 (-92.98%)

Mutual labels: streaming, real-time

duckdb

DuckDB is an in-process SQL OLAP Database Management System

Stars: ✭ 4,707 (+817.54%)

Mutual labels: analytics, olap

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-96.1%)

Mutual labels: spark, hdfs

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-81.48%)

Mutual labels: spark, analytics

leaflet heatmap

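A simple visualization of Huzhou call data. Assuming the data volume is too large to render a heatmap directly in the browser, the heatmap rendering step is moved to offline computation. The data is processed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark, but either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .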

Stars: ✭ 13 (-97.47%)

Mutual labels: spark, hdfs

MStream

Anomaly Detection on Time-Evolving Streams in Real-time. Detecting intrusions (DoS and DDoS attacks), frauds, fake rating anomalies.

Stars: ✭ 68 (-86.74%)

Mutual labels: streaming, real-time

transform-hub

Flexible and efficient data processing engine and an evolution of the popular Scramjet Framework based on node.js. Our Transform Hub was designed specifically for data processing and has its own unique algorithms included.

Stars: ✭ 38 (-92.59%)

Mutual labels: streaming, real-time

TogetherStream

A social and synchronized streaming experience

Stars: ✭ 16 (-96.88%)

Mutual labels: streaming, real-time

beneath

Beneath is a serverless real-time data platform ⚡️

Stars: ✭ 65 (-87.33%)

Mutual labels: streaming, analytics

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-96.88%)

Mutual labels: spark, hdfs

Materialize

Materialize lets you ask questions of your live data, which it answers and then maintains for you as your data continue to change. The moment you need a refreshed answer, you can get it in milliseconds. Materialize is designed to help you interactively explore your streaming data, perform data warehousing analytics against live relational data, or just increase the freshness and reduce the load of your dashboard and monitoring tasks.

Stars: ✭ 3,341 (+551.27%)

Mutual labels: kafka, streaming

transit

Massively real-time city transit streaming application

Stars: ✭ 20 (-96.1%)

Mutual labels: real-time, streaming-data

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-97.27%)

Mutual labels: spark, hdfs

kafka-spark-streaming-zeppelin-docker

One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)