9362 open source projects that are alternatives to or similar to Spark With Python

Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+14598.67%)
Mutual labels:  spark, big-data, hadoop
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-6.67%)
Mutual labels:  spark, apache-spark, apache
Databases workshop
RCS Intro to Databases workshop materials
Stars: ✭ 25 (-83.33%)
Mutual labels:  jupyter-notebook, sql, database
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-62%)
Mutual labels:  spark, big-data, hadoop
Docker Superset
Repository for Docker Image of Apache-Superset. [Docker Image: https://hub.docker.com/r/abhioncbr/docker-superset]
Stars: ✭ 86 (-42.67%)
Mutual labels:  sql, analytics, apache
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-84%)
Mutual labels:  jupyter-notebook, big-data, pyspark
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+64%)
Mutual labels:  spark, analytics, big-data
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+44%)
Mutual labels:  spark, big-data, pyspark
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+307.33%)
Mutual labels:  dataframe, sql, spark
Duckdb
DuckDB is an in-process SQL OLAP Database Management System
Stars: ✭ 4,014 (+2576%)
Mutual labels:  sql, analytics, database
Calcite
Apache Calcite
Stars: ✭ 2,816 (+1777.33%)
Mutual labels:  sql, big-data, hadoop
Sciblog support
Support content for my blog
Stars: ✭ 694 (+362.67%)
Mutual labels:  jupyter-notebook, analytics, big-data
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+994.67%)
Mutual labels:  spark, big-data, hadoop
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1047.33%)
Mutual labels:  spark, analytics, apache-spark
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+1929.33%)
Mutual labels:  dataframe, spark, big-data
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+242%)
Mutual labels:  spark, analytics, hdfs
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+396.67%)
Mutual labels:  jupyter-notebook, spark, big-data
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-84.67%)
Mutual labels:  jupyter-notebook, spark, pyspark
Calcite Avatica
Mirror of Apache Calcite - Avatica
Stars: ✭ 130 (-13.33%)
Mutual labels:  sql, big-data, hadoop
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+2136.67%)
Mutual labels:  big-data, apache-spark, pyspark
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (-78.67%)
Mutual labels:  big-data, apache-spark, hadoop
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-86.67%)
Mutual labels:  spark, hadoop, hdfs
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-8.67%)
Mutual labels:  spark, big-data, apache-spark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-83.33%)
Mutual labels:  spark, hadoop, pyspark
hadoop-data-ingestion-tool
OLAP and ETL of Big Data
Stars: ✭ 17 (-88.67%)
Mutual labels:  big-data, hadoop, apache
Yandex Big Data Engineering
Stars: ✭ 17 (-88.67%)
Mutual labels:  jupyter-notebook, spark, hdfs
Interview Questions Collection
Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more
Stars: ✭ 21 (-86%)
Mutual labels:  spark, hadoop, database
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+532.67%)
Mutual labels:  spark, hadoop, distributed-computing
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+557.33%)
Mutual labels:  jupyter-notebook, spark, pyspark
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+2442%)
Mutual labels:  spark, big-data, hadoop
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+536%)
Mutual labels:  jupyter-notebook, spark, pyspark
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+583.33%)
Design Of Experiment Python
Design-of-experiment (DOE) generator for science, engineering, and statistics
Stars: ✭ 143 (-4.67%)
Mutual labels:  dataframe, jupyter-notebook, analytics
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (+166.67%)
Mutual labels:  spark, analytics, hadoop
Eventql
Distributed "massively parallel" SQL query engine
Stars: ✭ 1,121 (+647.33%)
Mutual labels:  sql, analytics, database
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+3905.33%)
Mutual labels:  spark, hadoop, hdfs
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+142%)
Mutual labels:  sql, spark, analytics
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (+636%)
Mutual labels:  sql, analytics, database
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-58%)
Mutual labels:  jupyter-notebook, spark, pyspark
Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+364%)
Mutual labels:  sql, spark, pyspark
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+3575.33%)
Mutual labels:  spark, big-data, database
Beeva Best Practices
Best Practices and Style Guides in BEEVA
Stars: ✭ 335 (+123.33%)
Mutual labels:  jupyter-notebook, analytics, big-data
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+519.33%)
Mutual labels:  dataframe, spark, apache-spark
Pyspark Examples
Code examples on Apache Spark using python
Stars: ✭ 58 (-61.33%)
Mutual labels:  jupyter-notebook, spark, apache
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-57.33%)
Mutual labels:  jupyter-notebook, spark, pyspark
Spark Website
Apache Spark Website
Stars: ✭ 75 (-50%)
Mutual labels:  sql, spark, big-data
Locustdb
Massively parallel, high performance analytics database that will rapidly devour all of your data.
Stars: ✭ 1,250 (+733.33%)
Mutual labels:  analytics, database
Snowflake Jdbc
Snowflake JDBC Driver
Stars: ✭ 83 (-44.67%)
Mutual labels:  sql, database
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-44%)
Mutual labels:  jupyter-notebook, spark
Graphjin
GraphJin - Build APIs in 5 minutes with GraphQL. An instant GraphQL to SQL compiler.
Stars: ✭ 1,264 (+742.67%)
Mutual labels:  sql, database
Evolutility Server Node
Model-driven REST or GraphQL backend for CRUD and more, written in Javascript, using Node.js, Express, and PostgreSQL.
Stars: ✭ 84 (-44%)
Mutual labels:  sql, database
Training Material
A collection of code examples as well as presentations for training purposes
Stars: ✭ 85 (-43.33%)
Mutual labels:  jupyter-notebook, sql
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-42.67%)
Mutual labels:  spark, apache-spark
Electrocrud
Database CRUD Application Built on Electron | MySQL, Postgres, SQLite
Stars: ✭ 1,267 (+744.67%)
Mutual labels:  sql, database
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (-42%)
Mutual labels:  spark, pyspark
React Native Firebase
🔥 A well-tested feature-rich modular Firebase implementation for React Native. Supports both iOS & Android platforms for all Firebase services.
Stars: ✭ 9,674 (+6349.33%)
Mutual labels:  analytics, database
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-41.33%)
Mutual labels:  jupyter-notebook, spark
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-40.67%)
Mutual labels:  jupyter-notebook, spark
Qtl
A friendly and lightweight C++ database library for MySQL, PostgreSQL, SQLite and ODBC.
Stars: ✭ 92 (-38.67%)
Mutual labels:  sql, database
Wifi
A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-38%)
Mutual labels:  hadoop, hdfs