
1595 open source projects that are alternatives to or similar to Pulsar Spark

Bats
A unified SQL engine for OLTP, OLAP, batch processing, and stream processing scenarios
Stars: ✭ 152 (+176.36%)
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (+50.91%)
Mutual labels:  spark, apache-spark
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+2072.73%)
Mutual labels:  spark, flink
Repository
Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+67.27%)
Mutual labels:  spark, flink
Bigdataguide
Big data learning: learn big data from scratch, including study videos for each stage of learning and interview materials
Stars: ✭ 817 (+1385.45%)
Mutual labels:  spark, flink
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+3136.36%)
Mutual labels:  spark, apache-spark
Spring Cloud Dataflow
Microservices-based streaming and batch data processing for Cloud Foundry and Kubernetes
Stars: ✭ 753 (+1269.09%)
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (+1401.82%)
Mutual labels:  spark, flink
Apache Spark Internals
The Internals of Apache Spark
Stars: ✭ 1,045 (+1800%)
Mutual labels:  spark, apache-spark
Real Time Stream Processing Engine
This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.
Stars: ✭ 37 (-32.73%)
Mutual labels:  spark, apache-spark
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (+90.91%)
Mutual labels:  spark, apache-spark
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+3029.09%)
Mutual labels:  spark, apache-spark
Hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+129.09%)
Mutual labels:  spark, flink
Quicksql
A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+3210.91%)
Mutual labels:  spark, flink
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+39987.27%)
Mutual labels:  data-science, spark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+200%)
Mutual labels:  spark, apache-spark
Big Whale
Scheduling of offline tasks and monitoring of real-time tasks for Spark, Flink, and others
Stars: ✭ 163 (+196.36%)
Mutual labels:  spark, flink
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+290.91%)
Mutual labels:  spark, apache-spark
Whylogs Java
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (+198.18%)
Mutual labels:  spark, apache-spark
Dpark
Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+4750.91%)
Mutual labels:  spark, stream-processing
Mastering Spark Sql Book
The Internals of Spark SQL
Stars: ✭ 234 (+325.45%)
Mutual labels:  spark, apache-spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (+9.09%)
Mutual labels:  data-science, spark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+172.73%)
Mutual labels:  spark, apache-spark
Awesome Bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 10,478 (+18950.91%)
Mutual labels:  data-science, stream-processing
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+2332.73%)
Mutual labels:  data-science, spark
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (+121.82%)
Mutual labels:  data-science, spark
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+43.64%)
Mutual labels:  data-science, spark
Datacompy
Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (+167.27%)
Mutual labels:  data-science, spark
Scalable Data Science
Scalable Data Science: course sets in big data using Apache Spark over Databricks, and their mathematical, statistical, and computational foundations using SageMath.
Stars: ✭ 142 (+158.18%)
Mutual labels:  data-science, apache-spark
Scalable Data Science Platform
Content for architecting a data science platform for products using Luigi, Spark & Flask.
Stars: ✭ 158 (+187.27%)
Mutual labels:  data-science, spark
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+1589.09%)
Mutual labels:  spark, apache-spark
proxima-platform
The Proxima platform.
Stars: ✭ 17 (-69.09%)
Mutual labels:  apache-spark, batch-processing
streamsx.kafka
Repository for integration with Apache Kafka
Stars: ✭ 13 (-76.36%)
Mutual labels:  apache-spark, stream-processing
flink-connectors
Apache Flink connectors for Pravega.
Stars: ✭ 84 (+52.73%)
Mutual labels:  stream-processing, flink
FlinkExperiments
Experiments with Apache Flink.
Stars: ✭ 3 (-94.55%)
Mutual labels:  stream-processing, flink
leaflet heatmap
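A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. Apache Spark computes the data in parallel and then renders the heatmap; leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark, but the parallel computation is slower than single-machine computation, possibly because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .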
Stars: ✭ 13 (-76.36%)
Mutual labels:  spark, apache-spark
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-29.09%)
Mutual labels:  spark, apache-spark
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+5434.55%)
Mutual labels:  data-science, spark
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+405.45%)
Mutual labels:  spark, flink
Wirbelsturm
Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+503.64%)
Mutual labels:  spark, apache-spark
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+7178.18%)
Mutual labels:  data-science, data-processing
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (+1454.55%)
Spark Structured Streaming Book
The Internals of Spark Structured Streaming
Stars: ✭ 371 (+574.55%)
Mutual labels:  spark, apache-spark
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+1452.73%)
Mutual labels:  data-science, data-processing
Bigdata Interview
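🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered online, along with my own answer summaries. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.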
Stars: ✭ 857 (+1458.18%)
Mutual labels:  spark, flink
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+9923.64%)
Mutual labels:  spark, flink
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-74.55%)
Mutual labels:  spark, apache-spark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+1050.91%)
Mutual labels:  data-science, spark
Bdp Dataplatform
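Big data ecosystem solution data platform: a big data solution built on big data, data platform, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computing, data platform development, and data platform application building.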
Stars: ✭ 456 (+729.09%)
Mutual labels:  spark, flink
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+1341.82%)
Mutual labels:  spark, apache-spark
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+1309.09%)
Mutual labels:  spark, apache-spark
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+312.73%)
Mutual labels:  data-science, spark
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery begins. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+10823.64%)
Mutual labels:  spark, flink
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-52.73%)
Mutual labels:  data-science, spark
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+1470.91%)
Mutual labels:  data-science, data-processing
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1714.55%)
Mutual labels:  data-science, spark
Cv Pretrained Model
A collection of computer vision pre-trained models.
Stars: ✭ 995 (+1709.09%)
Mutual labels:  data-science
Causalnex
A Python library that helps data scientists to infer causation rather than observing correlation.
Stars: ✭ 1,036 (+1783.64%)
Mutual labels:  data-science
Scala Plotly Client
Visualise your data from Scala using Plotly
Stars: ✭ 39 (-29.09%)
Mutual labels:  data-science
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (+1832.73%)
Mutual labels:  data-science