200 open source projects that are alternatives to or similar to Quinn

Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-41.94%)
Mutual labels:  pyspark
Cloud Based Sql Engine Using Spark
Cloud-based SQL engine using Spark where data is accessible as a JDBC/ODBC data source via Spark ThriftServer.
Stars: ✭ 30 (-86.18%)
Mutual labels:  apache-spark
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (-27.19%)
Mutual labels:  pyspark
Datahacksummit 2017
Apache Zeppelin notebooks for Recommendation Engines using Keras and Machine Learning on Apache Spark
Stars: ✭ 30 (-86.18%)
Mutual labels:  apache-spark
Eat pyspark in 10 days
pyspark🍒🥭 is delicious, just eat it!😋😋
Stars: ✭ 116 (-46.54%)
Mutual labels:  pyspark
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-18.43%)
Mutual labels:  apache-spark
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-94.47%)
Mutual labels:  pyspark
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-50.23%)
Mutual labels:  pyspark
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-88.94%)
Mutual labels:  pyspark
Cluster Pack
A library on top of either pex or conda-pack to make your Python code easily available on a cluster
Stars: ✭ 23 (-89.4%)
Mutual labels:  pyspark
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-51.61%)
Mutual labels:  apache-spark
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+257.14%)
Mutual labels:  apache-spark
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-89.4%)
Mutual labels:  apache-spark
Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+220.74%)
Mutual labels:  pyspark
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-74.65%)
Mutual labels:  apache-spark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-88.48%)
Mutual labels:  pyspark
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+182.49%)
Mutual labels:  apache-spark
Cc Pyspark
Process Common Crawl data with Python and Spark
Stars: ✭ 147 (-32.26%)
Mutual labels:  pyspark
Streaming Readings
Papers and reading materials related to streaming systems
Stars: ✭ 554 (+155.3%)
Mutual labels:  apache-spark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+516.59%)
Mutual labels:  pyspark
Sparkle
Haskell on Apache Spark.
Stars: ✭ 419 (+93.09%)
Mutual labels:  apache-spark
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (-23.5%)
Mutual labels:  pyspark
Spark Syntax
This is a repo documenting the best practices in PySpark.
Stars: ✭ 412 (+89.86%)
Mutual labels:  pyspark
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-58.06%)
Mutual labels:  pyspark
Awesome Kafka
A list about Apache Kafka
Stars: ✭ 397 (+82.95%)
Mutual labels:  apache-spark
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (-33.18%)
Mutual labels:  apache-spark
Sparkmeasure
This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (+69.59%)
Mutual labels:  apache-spark
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-60.37%)
Mutual labels:  apache-spark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-84.33%)
Mutual labels:  pyspark
Learningsparkv2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (+41.47%)
Mutual labels:  apache-spark
Mlflow
Open source platform for the machine learning lifecycle
Stars: ✭ 10,898 (+4922.12%)
Mutual labels:  apache-spark
Analytics Zoo
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
Stars: ✭ 2,448 (+1028.11%)
Mutual labels:  apache-spark
Spark Atlas Connector
A Spark Atlas connector to track data lineage in Apache Atlas
Stars: ✭ 160 (-26.27%)
Mutual labels:  apache-spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-36.87%)
Mutual labels:  apache-spark
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-88.48%)
Mutual labels:  pyspark
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (-33.64%)
Mutual labels:  apache-spark
Spark Notebook
Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+1319.82%)
Mutual labels:  apache-spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-70.97%)
Mutual labels:  pyspark
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+27.19%)
Mutual labels:  apache-spark
Whylogs Java
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (-24.42%)
Mutual labels:  apache-spark
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+19.35%)
Mutual labels:  apache-spark
Awesome Pulsar
A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-73.73%)
Mutual labels:  apache-spark
HAL-9000
Automatically set up a productive development environment with Ansible on macOS
Stars: ✭ 72 (-66.82%)
Mutual labels:  apache-spark
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-35.48%)
Mutual labels:  apache-spark
Sparkit Learn
PySpark + Scikit-learn = Sparkit-learn
Stars: ✭ 1,073 (+394.47%)
Mutual labels:  apache-spark
spark-streaming-visualize
Simple demonstration of how to build a complex real time machine learning visualization tool.
Stars: ✭ 16 (-92.63%)
Mutual labels:  apache-spark
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-76.96%)
Mutual labels:  apache-spark
Spark Tpc Ds Performance Test
Use the TPC-DS benchmark to test Spark SQL performance
Stars: ✭ 133 (-38.71%)
Mutual labels:  apache-spark
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+1033.18%)
Mutual labels:  pyspark
Spark Sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
Stars: ✭ 1,055 (+386.18%)
Mutual labels:  apache-spark
leaflet heatmap
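A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. After computing the data in parallel with Apache Spark, Apache Spark is used again to draw the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to achieve good interactivity. The rendering is currently implemented with Apache Spark; either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .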
Stars: ✭ 13 (-94.01%)
Mutual labels:  apache-spark
kafka-compose
🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-85.25%)
Mutual labels:  pyspark
Apache Spark Internals
The Internals of Apache Spark
Stars: ✭ 1,045 (+381.57%)
Mutual labels:  apache-spark
Spark-and-Kafka IoT-Data-Processing-and-Analytics
Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S nationwide temperature from IoT sensors in real-time
Stars: ✭ 42 (-80.65%)
Mutual labels:  pyspark
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-82.03%)
Mutual labels:  apache-spark
Linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+970.51%)
Mutual labels:  pyspark
Repo 2019
BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (-38.71%)
Mutual labels:  pyspark
Spark As Service Using Embedded Server
This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server
Stars: ✭ 46 (-78.8%)
Mutual labels:  apache-spark
connected-component
Map Reduce Implementation of Connected Component on Apache Spark
Stars: ✭ 68 (-68.66%)
Mutual labels:  apache-spark
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-76.96%)
Mutual labels:  pyspark