All Projects → Eat_pyspark_in_10_days → Similar Projects or Alternatives

458 Open source projects that are alternatives of or similar to Eat_pyspark_in_10_days

Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+2399.14%)
Mutual labels:  spark, pyspark
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (-25%)
Mutual labels:  spark, pyspark
Cc Pyspark
Process Common Crawl data with Python and Spark
Stars: ✭ 147 (+26.72%)
Mutual labels:  spark, pyspark
Live log analyzer spark
Spark Application for analysis of Apache Access logs and detect anamolies! Along with Medium Article.
Stars: ✭ 14 (-87.93%)
Mutual labels:  spark, pyspark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-70.69%)
Mutual labels:  spark, pyspark
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (+43.1%)
Mutual labels:  spark, pyspark
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-6.9%)
Mutual labels:  spark, pyspark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+445.69%)
Mutual labels:  spark, pyspark
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+86.21%)
Mutual labels:  spark, pyspark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-4.31%)
Mutual labels:  spark, pyspark
Scriptis
Scriptis is for interactive data analysis with script development(SQL, Pyspark, HiveQL), task submission(Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+500%)
Mutual labels:  spark, pyspark
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+722.41%)
Mutual labels:  spark, pyspark
kafka-compose
🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-72.41%)
Mutual labels:  spark, pyspark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1053.45%)
Mutual labels:  spark, pyspark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+42.24%)
Mutual labels:  spark, pyspark
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-6.9%)
Mutual labels:  spark, pyspark
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+36.21%)
Mutual labels:  spark, pyspark
Pyspark Learning
Updated repository
Stars: ✭ 147 (+26.72%)
Mutual labels:  spark, pyspark
Linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+1902.59%)
Mutual labels:  spark, pyspark
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+2070.69%)
Mutual labels:  spark, pyspark
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+72.41%)
Mutual labels:  spark, pyspark
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2019.83%)
Mutual labels:  spark, pyspark
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-77.59%)
Mutual labels:  spark, pyspark
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-80.17%)
Mutual labels:  spark, pyspark
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-89.66%)
Mutual labels:  spark, pyspark
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-56.9%)
Mutual labels:  spark, pyspark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-45.69%)
Mutual labels:  spark, pyspark
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-78.45%)
Mutual labels:  spark, pyspark
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+250%)
Mutual labels:  spark, pyspark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+29.31%)
Mutual labels:  spark, pyspark
Relation extraction
Relation Extraction using Deep learning(CNN)
Stars: ✭ 96 (-17.24%)
Mutual labels:  spark, pyspark
Learningapachespark
LearningApacheSpark
Stars: ✭ 155 (+33.62%)
Mutual labels:  spark, pyspark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-78.45%)
Mutual labels:  spark, pyspark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+750%)
Mutual labels:  spark, pyspark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-44.83%)
Mutual labels:  spark, pyspark
Almond
A Scala kernel for Jupyter
Stars: ✭ 1,354 (+1067.24%)
Mutual labels:  spark
Java learning practice
java 进阶之路:面试高频算法、akka、多线程、NIO、Netty、SpringBoot、Spark&&Flink 等
Stars: ✭ 110 (-5.17%)
Mutual labels:  spark
Pyspark Stubs
Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (-15.52%)
Mutual labels:  pyspark
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-16.38%)
Mutual labels:  spark
Spring Shiro Spark
Spring-Shiro-Spark是Spring-Boot Hibernate Spark Spark-SQL Shiro iView VueJs... ...的集成尝试
Stars: ✭ 114 (-1.72%)
Mutual labels:  spark
Bigdataclass
Two-day workshop that covers how to use R to interact databases and Spark
Stars: ✭ 110 (-5.17%)
Mutual labels:  spark
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-16.38%)
Mutual labels:  spark
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-6.03%)
Mutual labels:  spark
Repository
个人学习知识库涉及到数据仓库建模、实时计算、大数据、Java、算法等。
Stars: ✭ 92 (-20.69%)
Mutual labels:  spark
Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+1287.93%)
Mutual labels:  spark
Spark Mllib Twitter Sentiment Analysis
🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Stars: ✭ 113 (-2.59%)
Mutual labels:  spark
Distributed Dataset
A distributed data processing framework in Haskell.
Stars: ✭ 108 (-6.9%)
Mutual labels:  spark
Spark Summit 2017 Sanfrancisco
spark summit 2017 SanFrancisco
Stars: ✭ 93 (-19.83%)
Mutual labels:  spark
Big Data
🔧 Use dplyr to analyze Big Data 🐘
Stars: ✭ 93 (-19.83%)
Mutual labels:  spark
Spark On Kubernetes Helm
Spark on Kubernetes infrastructure Helm charts repo
Stars: ✭ 92 (-20.69%)
Mutual labels:  spark
Xlearning Xdml
extremely distributed machine learning
Stars: ✭ 113 (-2.59%)
Mutual labels:  spark
Pyspark Tutorial
PySpark Code for Hands-on Learners
Stars: ✭ 91 (-21.55%)
Mutual labels:  pyspark
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bit coin price using Time series analysis and sentiment analysis of tweets on bitcoin
Stars: ✭ 91 (-21.55%)
Mutual labels:  pyspark
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-23.28%)
Mutual labels:  spark
Flink Learning
flink learning blog. http://www.54tianzhisheng.cn/ 含 Flink 入门、概念、原理、实战、性能调优、源码解析等内容。涉及 Flink Connector、Metrics、Library、DataStream API、Table API & SQL 等内容的学习案例,还有 Flink 落地应用的大型项目案例(PVUV、日志存储、百亿数据实时去重、监控告警)分享。欢迎大家支持我的专栏《大数据实时计算引擎 Flink 实战与性能优化》
Stars: ✭ 11,378 (+9708.62%)
Mutual labels:  spark
Ammonite Spark
Run spark calculations from Ammonite
Stars: ✭ 88 (-24.14%)
Mutual labels:  spark
Teddy
Spark Streaming监控平台,支持任务部署与告警、自启动
Stars: ✭ 120 (+3.45%)
Mutual labels:  spark
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+1305.17%)
Mutual labels:  spark
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-3.45%)
Mutual labels:  spark
Seldon Server
Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1137.07%)
Mutual labels:  spark
1-60 of 458 similar projects