456 open source projects that are alternatives to or similar to spark-extension

incubator-linkis
Linkis makes it easy to connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+9736%)
Mutual labels:  spark, pyspark
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-8%)
Mutual labels:  spark, pyspark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+2432%)
Mutual labels:  spark, pyspark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (+0%)
Mutual labels:  spark, pyspark
Learningapachespark
LearningApacheSpark
Stars: ✭ 155 (+520%)
Mutual labels:  spark, pyspark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+3844%)
Mutual labels:  spark, pyspark
kafka-compose
🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (+28%)
Mutual labels:  spark, pyspark
Linkis
Linkis makes it easy to connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+9192%)
Mutual labels:  spark, pyspark
Relation extraction
Relation Extraction using Deep Learning (CNN)
Stars: ✭ 96 (+284%)
Mutual labels:  spark, pyspark
Cc Pyspark
Process Common Crawl data with Python and Spark
Stars: ✭ 147 (+488%)
Mutual labels:  spark, pyspark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+560%)
Mutual labels:  spark, pyspark
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (+100%)
Mutual labels:  spark, pyspark
Eat pyspark in 10 days
pyspark 🍒🥭 is delicious, just eat it! 😋😋
Stars: ✭ 116 (+364%)
Mutual labels:  spark, pyspark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (+152%)
Mutual labels:  spark, pyspark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+5252%)
Mutual labels:  spark, pyspark
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies. Accompanied by a Medium article.
Stars: ✭ 14 (-44%)
Mutual labels:  spark, pyspark
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (+564%)
Mutual labels:  spark, pyspark
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+1524%)
Mutual labels:  spark, pyspark
Scriptis
Scriptis is for interactive data analysis, with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDFs, functions, resource management, and intelligent diagnosis.
Stars: ✭ 696 (+2684%)
Mutual labels:  spark, pyspark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (+156%)
Mutual labels:  spark, pyspark
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (+248%)
Mutual labels:  spark, pyspark
Pyspark Learning
Updated repository
Stars: ✭ 147 (+488%)
Mutual labels:  spark, pyspark
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (+332%)
Mutual labels:  spark, pyspark
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+9972%)
Mutual labels:  spark, pyspark
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+700%)
Mutual labels:  spark, pyspark
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (+332%)
Mutual labels:  spark, pyspark
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+764%)
Mutual labels:  spark, pyspark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+500%)
Mutual labels:  spark, pyspark
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (+4%)
Mutual labels:  spark, pyspark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (+36%)
Mutual labels:  spark, pyspark
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+3716%)
Mutual labels:  spark, pyspark
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-52%)
Mutual labels:  spark, pyspark
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+532%)
Mutual labels:  spark, pyspark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+11496%)
Mutual labels:  spark, pyspark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+344%)
Mutual labels:  spark, pyspark
spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (+1044%)
Mutual labels:  spark
sentry-spark
Apache Spark Sentry Integration
Stars: ✭ 14 (-44%)
Mutual labels:  spark
Azure-Databricks-NYC-Taxi-Workshop
An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset
Stars: ✭ 71 (+184%)
Mutual labels:  pyspark
tpch-spark
TPC-H queries in Apache Spark SQL using native DataFrames API
Stars: ✭ 63 (+152%)
Mutual labels:  spark
Python Master Courses
Life is short, I use Python
Stars: ✭ 61 (+144%)
Mutual labels:  spark
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (+40%)
Mutual labels:  spark
sparkar-volts
An extensive non-reactive Typescript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-40%)
Mutual labels:  spark
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (+264%)
Mutual labels:  spark
Spark-Ar
Resources for Spark AR
Stars: ✭ 43 (+72%)
Mutual labels:  spark
smolder
HL7 Apache Spark Datasource
Stars: ✭ 33 (+32%)
Mutual labels:  spark
experiments
Code examples for my blog posts
Stars: ✭ 21 (-16%)
Mutual labels:  spark
spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
Stars: ✭ 20 (-20%)
Mutual labels:  spark
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-20%)
Mutual labels:  spark
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+624%)
Mutual labels:  spark
spark-word2vec
A parallel implementation of word2vec based on Spark
Stars: ✭ 24 (-4%)
Mutual labels:  spark
spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (+104%)
Mutual labels:  spark
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (+140%)
Mutual labels:  spark
frovedis
Framework of vectorized and distributed data analytics
Stars: ✭ 59 (+136%)
Mutual labels:  spark
spark-kubernetes
spark on kubernetes
Stars: ✭ 80 (+220%)
Mutual labels:  spark
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+360%)
Mutual labels:  pyspark
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+36%)
Mutual labels:  pyspark
shamash
Autoscaling for Google Cloud Dataproc
Stars: ✭ 31 (+24%)
Mutual labels:  spark
dlsa
Distributed least squares approximation (dlsa) implemented with Apache Spark
Stars: ✭ 25 (+0%)
Mutual labels:  pyspark
machine-learning-course
Machine Learning Course @ Santa Clara University
Stars: ✭ 17 (-32%)
Mutual labels:  pyspark
Casper
A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (+80%)
Mutual labels:  spark