Azure Event Hubs SparkEnabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-43.32%)
Spark.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+596.76%)
SpartaReal Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+107.69%)
Bigdata PlaygroundA complete example of a big data application using : Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-28.34%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-60.73%)
Seldon ServerMachine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+480.97%)
MobiusC# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+276.11%)
StreamlineStreamLine - Streaming Analytics
Stars: ✭ 151 (-38.87%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+1073.68%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-12.55%)
Azure Event Hubs☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (-5.67%)
Spark StatesCustom state store providers for Apache Spark
Stars: ✭ 83 (-66.4%)
Kafka UiOpen-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (-6.88%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-39.27%)
leaflet heatmap简单的可视化湖州通话数据 假设数据量很大,没法用浏览器直接绘制热力图,把绘制热力图这一步骤放到线下计算分析。使用Apache Spark并行计算数据之后,再使用Apache Spark绘制热力图,然后用leafletjs加载OpenStreetMap图层和热力图图层,以达到良好的交互效果。现在使用Apache Spark实现绘制,可能是Apache Spark不擅长这方面的计算或者是我没有设计好算法,并行计算的速度比不上单机计算。Apache Spark绘制热力图和计算代码在这 https://github.com/yuanzhaokang/ParallelizeHeatmap.git .
Stars: ✭ 13 (-94.74%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+67.21%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-55.06%)
Coolplayspark酷玩 Spark: Spark 源代码解析、Spark 类库等
Stars: ✭ 3,318 (+1243.32%)
Kafka Streamsequivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (+148.18%)
Go StreamsA lightweight stream processing library for Go
Stars: ✭ 615 (+148.99%)
Bandar LogMonitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 19 (-92.31%)
Kafka Storm StarterCode examples that show to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+194.74%)
ThingsboardOpen-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+4161.54%)
SparkrdmaRDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-12.96%)
WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+34.41%)
Streamxkafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
Stars: ✭ 96 (-61.13%)
Flink Learningflink learning blog. http://www.54tianzhisheng.cn/ 含 Flink 入门、概念、原理、实战、性能调优、源码解析等内容。涉及 Flink Connector、Metrics、Library、DataStream API、Table API & SQL 等内容的学习案例,还有 Flink 落地应用的大型项目案例(PVUV、日志存储、百亿数据实时去重、监控告警)分享。欢迎大家支持我的专栏《大数据实时计算引擎 Flink 实战与性能优化》
Stars: ✭ 11,378 (+4506.48%)
FlogoProject Flogo is an open source ecosystem of opinionated event-driven capabilities to simplify building efficient & modern serverless functions, microservices & edge apps.
Stars: ✭ 1,891 (+665.59%)
Sparkling GraphSparklingGraph provides easy to use set of features that will give you ability to proces large scala graphs using Spark and GraphX.
Stars: ✭ 139 (-43.72%)
Eel SdkBig Data Toolkit for the JVM
Stars: ✭ 140 (-43.32%)
OryxOryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Stars: ✭ 1,785 (+622.67%)
SamsaraSamsara is a real-time analytics platform
Stars: ✭ 132 (-46.56%)
HydrographA visual ETL development and debugging tool for big data
Stars: ✭ 144 (-41.7%)
Technology Talk汇总java生态圈常用技术框架、开源中间件,系统架构、数据库、大公司架构案例、常用三方类库、项目管理、线上问题排查、个人成长、思考等知识
Stars: ✭ 12,136 (+4813.36%)
SupersafebankSample Event Sourcing implementation with .NET Core
Stars: ✭ 142 (-42.51%)
A Kafka StoryKafka ecosystem ... but step by step!
Stars: ✭ 148 (-40.08%)
Azkarra Streams🚀 Azkarra is a lightweight java framework to make it easy to develop, deploy and manage cloud-native streaming microservices based on Apache Kafka Streams.
Stars: ✭ 146 (-40.89%)
TributaryStreaming reactive and dataflow graphs in Python
Stars: ✭ 231 (-6.48%)
RedpandaRedpanda is the real-time engine for modern apps. Kafka API Compatible; 10x faster 🚀 See more at vectorized.io/redpanda
Stars: ✭ 3,114 (+1160.73%)
Spark.jlJulia binding for Apache Spark
Stars: ✭ 153 (-38.06%)
Whylogs JavaProfile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (-33.6%)
ParquetviewerSimple windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (-41.3%)
DatasciencevmTools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (-38.06%)