
spark-notebook / Spark Notebook

Licence: apache-2.0
Interactive and Reactive Data Science using Scala and Spark.

Programming Languages

javascript
184084 projects - #8 most used programming language
scala
5932 projects
Jupyter Notebook
11667 projects
HTML
75241 projects
Less
1899 projects
CSS
56736 projects

Projects that are alternatives of or similar to Spark Notebook

Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-86.6%)
Mutual labels:  data-science, spark, apache-spark
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-98.21%)
Mutual labels:  data-science, spark, apache-spark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-56.57%)
Mutual labels:  data-science, spark, notebook
Ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Stars: ✭ 15,107 (+390.33%)
Mutual labels:  data-science, notebook
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (-92.5%)
Mutual labels:  data-science, apache-spark
Cjworkbench
The data journalism platform with built in training
Stars: ✭ 244 (-92.08%)
Mutual labels:  data-science, notebook
Machinelearningnotebooks
Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
Stars: ✭ 2,790 (-9.44%)
Mutual labels:  data-science, notebook
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-98.05%)
Mutual labels:  spark, notebook
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-97.05%)
Mutual labels:  reactive, data-science
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-98.73%)
Mutual labels:  spark, apache-spark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-96.4%)
Mutual labels:  spark, apache-spark
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-92.63%)
Mutual labels:  data-science, spark
Jupyterlab templates
Support for jupyter notebook templates in jupyterlab
Stars: ✭ 223 (-92.76%)
Mutual labels:  data-science, notebook
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (-1.2%)
Mutual labels:  data-science, spark
Data Science Notebook
📖 Every great idea and action has a humble beginning
Stars: ✭ 196 (-93.64%)
Mutual labels:  data-science, notebook
Pluto.jl
🎈 Simple reactive notebooks for Julia
Stars: ✭ 3,430 (+11.33%)
Mutual labels:  reactive, notebook
percival
📝 Web-based, reactive Datalog notebooks for data analysis and visualization
Stars: ✭ 285 (-90.75%)
Mutual labels:  reactive, notebook
spark-structured-streaming-examples
Spark structured streaming examples with using of version 3.0.0
Stars: ✭ 23 (-99.25%)
Mutual labels:  spark, apache-spark
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (-91.56%)
Mutual labels:  data-science, spark
Bookstore
📚 Notebook storage and publishing workflows for the masses
Stars: ✭ 162 (-94.74%)
Mutual labels:  data-science, notebook

Spark Notebook


The Spark Notebook is the open source notebook aimed at enterprise environments, providing Data Scientists and Data Engineers with an interactive web-based editor that can combine Scala code, SQL queries, Markup and JavaScript in a collaborative manner to explore, analyse and learn from massive data sets.


The Spark Notebook allows performing reproducible analysis with Scala, Apache Spark and the Big Data ecosystem.

Feature Highlights

Apache Spark

Apache Spark is available out of the box and is accessed through the predefined variables sparkContext or sc.
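
As a rough illustration of what working with sc looks like, the sketch below sums the even numbers from 1 to 100. It assumes Spark is on the classpath. Outside the notebook there is no predefined sc, so the sketch builds a local session first; the application name "sc-example" and the local[*] master are placeholders chosen for this example. Inside a notebook cell, only the last three statements before spark.stop() would be needed.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: running outside the notebook, so we create the context ourselves.
// Inside a Spark Notebook cell, skip this and use the provided `sc` directly.
val spark = SparkSession.builder()
  .master("local[*]")       // placeholder: run locally on all cores
  .appName("sc-example")    // placeholder application name
  .getOrCreate()
val sc = spark.sparkContext

// Distribute a small range, keep the even numbers, and sum them.
val numbers = sc.parallelize(1 to 100)
val evenSum = numbers.filter(_ % 2 == 0).sum()
println(evenSum)

spark.stop()
```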

Multiple Spark Context Support

One of the most useful features of the Spark Notebook is the isolation of running notebooks. Each started notebook spawns a new JVM with its own SparkSession instance. This gives maximal flexibility for:

  • dependencies without clashes
  • access to different clusters
  • separate tuning of each notebook
  • external scheduling (on the roadmap)
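
To make the benefit of one JVM per notebook concrete, here is a minimal sketch of how two notebooks could end up with different classpaths, memory settings and clusters. This is not the Spark Notebook's actual launcher: NotebookRuntime, jvmCommand, the jar paths, the cluster URL and the main class notebook.Kernel are all hypothetical names invented for illustration.

```scala
// Hypothetical description of what one notebook needs at runtime.
final case class NotebookRuntime(
  name: String,
  master: String,          // which cluster this notebook talks to
  jars: Seq[String],       // notebook-specific dependencies
  jvmOptions: Seq[String]  // notebook-specific tuning
)

// Builds the command line for a dedicated JVM.
// `notebook.Kernel` is a made-up main class used only in this sketch.
def jvmCommand(rt: NotebookRuntime): Seq[String] = {
  val classpath = rt.jars.mkString(java.io.File.pathSeparator)
  Seq("java") ++ rt.jvmOptions ++ Seq(
    "-cp", classpath,
    s"-Dspark.master=${rt.master}",
    s"-Dspark.app.name=${rt.name}",
    "notebook.Kernel"
  )
}

// Two notebooks with conflicting library versions and different clusters.
val etl = NotebookRuntime("etl", "local[2]", Seq("libs/json-1.0.jar"), Seq("-Xmx1g"))
val ml  = NotebookRuntime("ml", "spark://cluster-b:7077", Seq("libs/json-2.0.jar"), Seq("-Xmx8g"))

println(jvmCommand(etl).mkString(" "))
println(jvmCommand(ml).mkString(" "))
```

Because each command starts its own process, the two versions of the same library never share a classloader, and each notebook can be sized for its own workload.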

Metadata-driven configuration

Metadata-driven configuration lets each notebook set up its own SparkContext, which is how the flexibility of multiple contexts is achieved.
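
As a sketch of the idea, per-notebook settings can be thought of as a set of key/value pairs that are turned into a SparkConf for that notebook's context. The example assumes Spark is on the classpath; the map below stands in for values that would come from the notebook's metadata, and its contents are hypothetical. How the Spark Notebook itself reads and applies its metadata is not shown here.

```scala
import org.apache.spark.SparkConf

// Hypothetical per-notebook settings, standing in for values read from metadata.
val notebookSettings: Map[String, String] = Map(
  "spark.app.name"        -> "exploration-notebook",
  "spark.master"          -> "local[4]",
  "spark.executor.memory" -> "2g"
)

// loadDefaults = false keeps JVM system properties out of this configuration,
// so the result depends only on the settings above.
val conf = new SparkConf(false).setAll(notebookSettings)

println(conf.toDebugString)
```

A second notebook would simply carry a different map, for example pointing spark.master at another cluster.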

Scala

The Spark Notebook exclusively supports the Scala programming language, the Unpredicted Lingua Franca for Data Science, and extensively exploits the JVM ecosystem of libraries to drive a smooth evolution of data-driven software from exploration to production.

The Spark Notebook is available for *NIX and Windows systems in easy-to-use ZIP/TAR, Docker and DEB packages.

Reactive

All components in the Spark Notebook are dynamic and reactive.

The Spark Notebook comes with dynamic charts, and most (if not all) components can be listened to and can react to events. This is very helpful in many cases, for example:

  • data entering the system live at runtime
  • visual plots of events
  • multiple interconnected visual components

Dynamic and reactive components mean that you don't have to write the HTML, JS and server code just for basic use cases.
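
The listen-and-react pattern described above can be sketched in plain Scala. This is not the Spark Notebook's widget API: Observable and TextChart are hypothetical classes written for this example, showing two components that update from the same live stream of values.

```scala
import scala.collection.mutable.ListBuffer

// A tiny observable source: every subscriber is called on each published value.
final class Observable[A] {
  private val listeners = ListBuffer.empty[A => Unit]
  def subscribe(listener: A => Unit): Unit = listeners += listener
  def publish(value: A): Unit = listeners.foreach(_(value))
}

// A stand-in for a chart: it stores points and renders them as text bars.
final class TextChart(title: String) {
  private val points = ListBuffer.empty[Double]
  def add(point: Double): Unit = points += point
  def render: String = s"$title: " + points.map(p => "#" * p.toInt).mkString(" | ")
}

val source = new Observable[Double]
val chart = new TextChart("load")
var count = 0

// Two interconnected components listening to the same source.
source.subscribe(chart.add)
source.subscribe(_ => count += 1)

// Data entering the system at runtime.
Seq(1.0, 3.0, 2.0).foreach(source.publish)

println(chart.render) // prints "load: # | ### | ##"
println(count)        // prints 3
```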

Quick Start

Go to Quick Start for our 5-minute guide to get up and running with the Spark Notebook.

Come on over to Gitter to discuss things, get some help, or start contributing!

Learn more

Testimonials

Skymind - Deeplearning4j

Spark Notebook gives us a clean, useful way to mix code and prose when we demo and explain our tech to customers. The Spark ecosystem needed this.

Vinted.com

It allows our analysts and developers (15+ users) to run ad-hoc queries, to perform complex data analysis and data visualisations, prototype machine learning pipelines. In addition, we use it to power our BI dashboards.

Adopters

| Name | URL | Description |
| --- | --- | --- |
| Kensu | website | Lifting Data Science to the Enterprise level |
| Agile Lab | website | The only Italian Spark Certified systems integrator |
| CloudPhysics | website | Data-Driven Insights for Smarter IT |
| Aliyun | product | Spark runtime environment on ECS and management tool of Spark Cluster running on Aliyun ECS |
| EMBL European Bioinformatics Institute | website | EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry. |
| Metail | website | The best body shape and garment fit company in the world. To create and empower everyone’s online body identity. |
| kt NexR | website | kt NexR has been one of the leading Big Data companies in Korea since 2007. |
| Skymind | website | At Skymind, we’re tackling some of the most advanced problems in data analysis and machine intelligence. We offer state-of-the-art, flexible, scalable deep learning for industry. |
| Amino | website | A new way to get the facts about your health care choices. |
| Vinted | website | Online marketplace and a social network focused on young women’s lifestyle. |
| Vingle | website | Vingle is the community where you can meet someone like you. |
| 47 Degrees | website | 47 Degrees is a global consulting firm and certified Typesafe & Databricks Partner specializing in Scala & Spark. |
| Barclays | website | Barclays is a British multinational banking and financial services company headquartered in London. |
| Swisscom | website | Swisscom is the leading mobile service provider in Switzerland. |
| Knoldus | website | Knoldus is a global consulting firm and certified "Select" Lightbend & Databricks Partner specializing in the Scala & Spark ecosystem. |