1926 open source projects that are alternatives to or similar to Spark Notebook

Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-98.21%)
Mutual labels:  data-science, spark, apache-spark
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-86.6%)
Mutual labels:  data-science, spark, apache-spark
Spark Py Notebooks
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-56.57%)
Mutual labels:  data-science, spark, notebook
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-98.12%)
Mutual labels:  data-science, spark
Ipython Dashboard
A standalone, lightweight web server for building and sharing graphs created in IPython. Built for data scientists and data analysts. Aims to provide interactive visualization, collaborative dashboards, and real-time streaming graphs.
Stars: ✭ 664 (-78.45%)
Mutual labels:  data-science, notebook
Machine Learning From Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-98.64%)
Mutual labels:  data-science, notebook
D2l En
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge.
Stars: ✭ 11,837 (+284.19%)
Mutual labels:  data-science, notebook
Starcraft2 Replay Analysis
A Jupyter notebook that provides analysis for StarCraft 2 replays
Stars: ✭ 90 (-97.08%)
Mutual labels:  data-science, notebook
Scipy 2017 Cython Tutorial
Material for the SciPy 2017 Cython tutorial
Stars: ✭ 114 (-96.3%)
Mutual labels:  data-science, notebook
Cape Python
Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-95.94%)
Mutual labels:  data-science, spark
Machinelearningnotebooks
Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
Stars: ✭ 2,790 (-9.44%)
Mutual labels:  data-science, notebook
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+83.58%)
Mutual labels:  data-science, spark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (-79.45%)
Mutual labels:  data-science, spark
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (-67.61%)
Mutual labels:  data-science, spark
Lambdaschooldatascience
Completed assignments and coding challenges from the Lambda School Data Science program.
Stars: ✭ 22 (-99.29%)
Mutual labels:  data-science, notebook
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-97.44%)
Mutual labels:  data-science, spark
Jupytemplate
Templates for jupyter notebooks
Stars: ✭ 85 (-97.24%)
Mutual labels:  data-science, notebook
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-96.36%)
Mutual labels:  data-science, spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-98.05%)
Mutual labels:  data-science, spark
Scalable Data Science Platform
Content for architecting a data science platform for products using Luigi, Spark & Flask.
Stars: ✭ 158 (-94.87%)
Mutual labels:  data-science, spark
Bookstore
📚 Notebook storage and publishing workflows for the masses
Stars: ✭ 162 (-94.74%)
Mutual labels:  data-science, notebook
Data Science Notebook
📖 Every great idea and action has a humble beginning
Stars: ✭ 196 (-93.64%)
Mutual labels:  data-science, notebook
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (-92.5%)
Mutual labels:  data-science, apache-spark
Cjworkbench
The data journalism platform with built-in training
Stars: ✭ 244 (-92.08%)
Mutual labels:  data-science, notebook
Ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Stars: ✭ 15,107 (+390.33%)
Mutual labels:  data-science, notebook
leaflet heatmap
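A simple visualization of call data from Huzhou. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation. Apache Spark first processes the data in parallel and is then used to draw the heatmap; leafletjs loads the OpenStreetMap layer and the heatmap layer to provide good interactivity. With rendering currently done in Apache Spark, the parallel computation is slower than single-machine computation, either because Apache Spark is not well suited to this kind of work or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .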
Stars: ✭ 13 (-99.58%)
Mutual labels:  spark, apache-spark
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (-80.1%)
Mutual labels:  data-science, apache-spark
Data Science Your Way
Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (-82.8%)
Mutual labels:  data-science, notebook
Nteract
📘 The interactive computing suite for you! ✨
Stars: ✭ 5,713 (+85.43%)
Mutual labels:  data-science, notebook
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+615.61%)
Mutual labels:  data-science, spark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-68%)
Mutual labels:  data-science, spark
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-99.16%)
Mutual labels:  data-science, spark
percival
📝 Web-based, reactive Datalog notebooks for data analysis and visualization
Stars: ✭ 285 (-90.75%)
Mutual labels:  reactive, notebook
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-99.25%)
Mutual labels:  spark, apache-spark
Allstate capstone
Allstate Kaggle Competition ML Capstone Project
Stars: ✭ 72 (-97.66%)
Mutual labels:  data-science, notebook
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-97.89%)
Mutual labels:  data-science, spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-97.92%)
Mutual labels:  data-science, spark
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (-91.56%)
Mutual labels:  data-science, spark
Spark R Notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-96.46%)
Mutual labels:  data-science, notebook
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-96.49%)
Mutual labels:  data-science, spark
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-96.04%)
Mutual labels:  data-science, spark
D2l Vn
An interactive deep learning book with code, math, and discussions. Covers several popular frameworks (TensorFlow, Pytorch & MXNet) and is used at 175 universities.
Stars: ✭ 402 (-86.95%)
Mutual labels:  data-science, notebook
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-95.07%)
Mutual labels:  data-science, spark
Benchm Ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Stars: ✭ 1,835 (-40.44%)
Mutual labels:  data-science, spark
Handout
Turn Python scripts into handouts with Markdown and figures
Stars: ✭ 1,973 (-35.96%)
Mutual labels:  data-science, notebook
Datacompy
Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-95.23%)
Mutual labels:  data-science, spark
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-92.63%)
Mutual labels:  data-science, spark
Jupyterlab templates
Support for jupyter notebook templates in jupyterlab
Stars: ✭ 223 (-92.76%)
Mutual labels:  data-science, notebook
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (-1.2%)
Mutual labels:  data-science, spark
Scalable Data Science
Scalable Data Science: course sets on big data using Apache Spark on Databricks, and their mathematical, statistical and computational foundations using SageMath.
Stars: ✭ 142 (-95.39%)
Mutual labels:  data-science, apache-spark
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-98.73%)
Mutual labels:  spark, apache-spark
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-98.05%)
Mutual labels:  spark, notebook
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-96.4%)
Mutual labels:  spark, apache-spark
Pluto.jl
🎈 Simple reactive notebooks for Julia
Stars: ✭ 3,430 (+11.33%)
Mutual labels:  reactive, notebook
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-91.98%)
Mutual labels:  spark, apache-spark
Quantitative Notebooks
Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
Stars: ✭ 356 (-88.45%)
Mutual labels:  data-science, notebook
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (-95.85%)
Mutual labels:  data-science, apache-spark
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-97.05%)
Mutual labels:  reactive, data-science
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-91.59%)
Mutual labels:  spark, apache-spark
Polyaxon
Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (-3.73%)
Mutual labels:  data-science, notebook