1595 open source projects that are alternatives to or similar to Pulsar Spark

Pulsar Flink
Elastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (+129.09%)
Awesome Kafka
A list about Apache Kafka
Stars: ✭ 397 (+621.82%)
Streaming Readings
Papers and readings related to streaming systems
Stars: ✭ 554 (+907.27%)
Mutual labels:  apache-spark, stream-processing, flink
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+650.91%)
Mutual labels:  data-science, spark, apache-spark
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-9.09%)
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+320%)
Spark Notebook
Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+5501.82%)
Mutual labels:  data-science, spark, apache-spark
Flink Learning
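Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and so on, along with case studies of large-scale production Flink projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" is welcome.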
Stars: ✭ 11,378 (+20587.27%)
Mutual labels:  spark, stream-processing, flink
Data Ingestion Platform
Stars: ✭ 39 (-29.09%)
Mutual labels:  spark, flink, batch-processing
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-63.64%)
Mutual labels:  spark, flink
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-58.18%)
Mutual labels:  spark, apache-spark
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+370.91%)
Mutual labels:  spark, apache-spark
Spark As Service Using Embedded Server
This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server
Stars: ✭ 46 (-16.36%)
Mutual labels:  spark, apache-spark
flink-connectors
Apache Flink connectors for Pravega.
Stars: ✭ 84 (+52.73%)
Mutual labels:  stream-processing, flink
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-1.82%)
Mutual labels:  spark, data-processing
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-29.09%)
Mutual labels:  spark, apache-spark
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-9.09%)
Mutual labels:  spark, apache-spark
Learningsparkv2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (+458.18%)
Mutual labels:  spark, apache-spark
Wirbelsturm
Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+503.64%)
Mutual labels:  spark, apache-spark
Spark Structured Streaming Book
The Internals of Spark Structured Streaming
Stars: ✭ 371 (+574.55%)
Mutual labels:  spark, apache-spark
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (+372.73%)
Mutual labels:  data-science, spark
Sparkmeasure
This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (+569.09%)
Mutual labels:  spark, apache-spark
Sparkle
Haskell on Apache Spark.
Stars: ✭ 419 (+661.82%)
Mutual labels:  spark, apache-spark
God Of Bigdata
Focused on big data learning and interviews; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+10823.64%)
Mutual labels:  spark, flink
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+10183.64%)
Mutual labels:  data-science, spark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+1050.91%)
Mutual labels:  data-science, spark
Spring Cloud Dataflow
Microservices-based streaming and batch data processing in Cloud Foundry and Kubernetes
Stars: ✭ 753 (+1269.09%)
open-stream-processing-benchmark
This repository contains the code base for the Open Stream Processing Benchmark.
Stars: ✭ 37 (-32.73%)
Mutual labels:  stream-processing, flink
SANSA-Stack
Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
Stars: ✭ 130 (+136.36%)
Mutual labels:  apache-spark, flink
proxima-platform
The Proxima platform.
Stars: ✭ 17 (-69.09%)
Mutual labels:  apache-spark, batch-processing
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+101.82%)
Mutual labels:  spark, apache-spark
leaflet heatmap
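A simple visualization of Huzhou phone call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is also used to render the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark; possibly because Apache Spark is not well suited to this kind of computation, or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .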
Stars: ✭ 13 (-76.36%)
Mutual labels:  spark, apache-spark
Spark Tda
SparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-18.18%)
Mutual labels:  spark, apache-spark
streamsx.kafka
Repository for integration with Apache Kafka
Stars: ✭ 13 (-76.36%)
Mutual labels:  apache-spark, stream-processing
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (+1401.82%)
Mutual labels:  spark, flink
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+405.45%)
Mutual labels:  spark, flink
Coolplayspark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
Stars: ✭ 3,318 (+5932.73%)
Mutual labels:  spark, apache-spark
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+7178.18%)
Mutual labels:  data-science, data-processing
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+1309.09%)
Mutual labels:  spark, apache-spark
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+1589.09%)
Mutual labels:  spark, apache-spark
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-52.73%)
Mutual labels:  data-science, spark
Featran
A Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (+663.64%)
Mutual labels:  spark, flink
FlinkExperiments
Experiments with Apache Flink.
Stars: ✭ 3 (-94.55%)
Mutual labels:  stream-processing, flink
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+1014.55%)
Mutual labels:  data-science, apache-spark
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+9923.64%)
Mutual labels:  spark, flink
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+1223.64%)
Mutual labels:  spark, apache-spark
Bigdataguide
Big data learning: learn big data from scratch, including study videos for each learning stage and interview materials
Stars: ✭ 817 (+1385.45%)
Mutual labels:  spark, flink
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+1341.82%)
Mutual labels:  spark, apache-spark
Spark Examples
Spark examples
Stars: ✭ 41 (-25.45%)
Mutual labels:  spark, apache-spark
Bdp Dataplatform
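Big data ecosystem solution data platform: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platforms, and data platform collection, storage, computation, development, and application building.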
Stars: ✭ 456 (+729.09%)
Mutual labels:  spark, flink
Bigdata Interview
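🎯 🌟 [Big data interview questions] A collection of big data interview questions I gathered online, along with my own answer summaries. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.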
Stars: ✭ 857 (+1458.18%)
Mutual labels:  spark, flink
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+1470.91%)
Mutual labels:  data-science, data-processing
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-74.55%)
Mutual labels:  spark, apache-spark
Real Time Stream Processing Engine
This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.
Stars: ✭ 37 (-32.73%)
Mutual labels:  spark, apache-spark
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (+1454.55%)
Spark Flamegraph
Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-45.45%)
Mutual labels:  spark, apache-spark
Apache Spark Internals
The Internals of Apache Spark
Stars: ✭ 1,045 (+1800%)
Mutual labels:  spark, apache-spark
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+5434.55%)
Mutual labels:  data-science, spark
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+39987.27%)
Mutual labels:  data-science, spark
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+1452.73%)
Mutual labels:  data-science, data-processing
1-60 of 1595 similar projects