201 open source projects that are alternatives or similar to learn-by-examples

Pyspark Boilerplate
A boilerplate for writing PySpark Jobs
Stars: ✭ 318 (+278.57%)
Mutual labels:  apache-spark, pyspark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+96.43%)
Mutual labels:  apache-spark, pyspark
jupyterlab-sparkmonitor
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (-7.14%)
Mutual labels:  apache-spark, pyspark
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (-34.52%)
Mutual labels:  apache-spark, pyspark
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-66.67%)
Mutual labels:  apache-spark, pyspark
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (-64.29%)
Mutual labels:  apache-spark, pyspark
Pyspark Stubs
Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (+16.67%)
Mutual labels:  apache-spark, pyspark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+3351.19%)
Mutual labels:  apache-spark, pyspark
Awesome Spark
A curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+1163.1%)
Mutual labels:  apache-spark, pyspark
Sparkora
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Stars: ✭ 51 (-39.29%)
Mutual labels:  apache-spark, pyspark
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-53.57%)
Mutual labels:  apache-spark, pyspark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+78.57%)
Mutual labels:  apache-spark, pyspark
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+36.9%)
Mutual labels:  apache-spark, pyspark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-40.48%)
Mutual labels:  apache-spark, pyspark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+32.14%)
Mutual labels:  apache-spark, pyspark
Quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (+158.33%)
Mutual labels:  apache-spark, pyspark
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-72.62%)
Mutual labels:  apache-spark, pyspark
Spark-for-data-engineers
Apache Spark for data engineers
Stars: ✭ 22 (-73.81%)
Mutual labels:  apache-spark, pyspark
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+3894.05%)
Mutual labels:  apache-spark, pyspark
Spark Gotchas
Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks
Stars: ✭ 308 (+266.67%)
Mutual labels:  apache-spark, pyspark
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-83.33%)
Mutual labels:  apache-spark, pyspark
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+71.43%)
Mutual labels:  apache-spark
Awesome Ai Infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Stars: ✭ 223 (+165.48%)
Mutual labels:  apache-spark
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+66.67%)
Mutual labels:  apache-spark
Spark Tpc Ds Performance Test
Use the TPC-DS benchmark to test Spark SQL performance
Stars: ✭ 133 (+58.33%)
Mutual labels:  apache-spark
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+1508.33%)
Mutual labels:  pyspark
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1948.81%)
Mutual labels:  apache-spark
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+52.38%)
Mutual labels:  apache-spark
Oryx
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Stars: ✭ 1,785 (+2025%)
Mutual labels:  apache-spark
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+175%)
Mutual labels:  apache-spark
Scalable Data Science
Scalable Data Science: course sets in big data using Apache Spark on Databricks, with their mathematical, statistical, and computational foundations using SageMath.
Stars: ✭ 142 (+69.05%)
Mutual labels:  apache-spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (+63.1%)
Mutual labels:  apache-spark
Spark Workshop
Apache Spark™ and Scala Workshops
Stars: ✭ 224 (+166.67%)
Mutual labels:  apache-spark
fink-broker
Astronomy Broker based on Apache Spark
Stars: ✭ 18 (-78.57%)
Mutual labels:  apache-spark
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+155.95%)
Mutual labels:  apache-spark
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (+44.05%)
Mutual labels:  apache-spark
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+2019.05%)
Mutual labels:  apache-spark
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (+25%)
Mutual labels:  apache-spark
spark-connector
A connector for Apache Spark to access Exasol
Stars: ✭ 13 (-84.52%)
Mutual labels:  apache-spark
Learning Apache Spark
Notes on Apache Spark (pyspark)
Stars: ✭ 211 (+151.19%)
Mutual labels:  apache-spark
Docker Spark
Apache Spark docker image
Stars: ✭ 1,396 (+1561.9%)
Mutual labels:  apache-spark
Analytics Zoo
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
Stars: ✭ 2,448 (+2814.29%)
Mutual labels:  apache-spark
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (+2.38%)
Mutual labels:  apache-spark
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (-1.19%)
Mutual labels:  apache-spark
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (-30.95%)
Mutual labels:  pyspark
jgit-spark-connector
jgit-spark-connector is a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
Stars: ✭ 71 (-15.48%)
Mutual labels:  pyspark
workshop-spark
Code for Spark workshops with a Docker-based development environment
Stars: ✭ 27 (-67.86%)
Mutual labels:  pyspark
Sparktorch
Train and run Pytorch models on Apache Spark.
Stars: ✭ 195 (+132.14%)
Mutual labels:  apache-spark
Mlflow
Open source platform for the machine learning lifecycle
Stars: ✭ 10,898 (+12873.81%)
Mutual labels:  apache-spark
Awesome Pulsar
A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-32.14%)
Mutual labels:  apache-spark
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+110.71%)
Mutual labels:  apache-spark
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-34.52%)
Mutual labels:  apache-spark
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (-44.05%)
Mutual labels:  apache-spark
Sparkit Learn
PySpark + Scikit-learn = Sparkit-learn
Stars: ✭ 1,073 (+1177.38%)
Mutual labels:  apache-spark
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-40.48%)
Mutual labels:  apache-spark
Whylogs Java
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (+95.24%)
Mutual labels:  apache-spark
Spark Sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
Stars: ✭ 1,055 (+1155.95%)
Mutual labels:  apache-spark
Apache Spark Internals
The Internals of Apache Spark
Stars: ✭ 1,045 (+1144.05%)
Mutual labels:  apache-spark
spark-dgraph-connector
A connector for Apache Spark and PySpark to Dgraph databases.
Stars: ✭ 36 (-57.14%)
Mutual labels:  pyspark
Spark Atlas Connector
A Spark Atlas connector to track data lineage in Apache Atlas
Stars: ✭ 160 (+90.48%)
Mutual labels:  apache-spark
1-60 of 201 similar projects