All Projects → pixiedust → Pixiedust

pixiedust / Pixiedust

Licence: apache-2.0
Python Helper library for Jupyter Notebooks

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Pixiedust

H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+466.73%)
Mutual labels:  jupyter-notebook, data-science, spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-93.59%)
Mutual labels:  jupyter-notebook, data-science, spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-93.99%)
Mutual labels:  jupyter-notebook, data-science, spark
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-88.78%)
Mutual labels:  jupyter-notebook, data-science, spark
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-77.25%)
Mutual labels:  jupyter-notebook, data-science, spark
Scalable Data Science Platform
Content for architecting a data science platform for products using Luigi, Spark & Flask.
Stars: ✭ 158 (-84.17%)
Mutual labels:  jupyter-notebook, data-science, spark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+34.07%)
Mutual labels:  jupyter-notebook, data-science, spark
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-58.62%)
Mutual labels:  jupyter-notebook, data-science, spark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-1.2%)
Mutual labels:  jupyter-notebook, data-science, spark
Awesome Google Colab
Google Colaboratory Notebooks and Repositories (by @firmai)
Stars: ✭ 863 (-13.53%)
Mutual labels:  jupyter-notebook, data-science
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (-13.43%)
Mutual labels:  jupyter-notebook, data-science
Tedsds
Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-98.6%)
Mutual labels:  jupyter-notebook, spark
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-97.39%)
Mutual labels:  data-science, spark
Resources
PyMC3 educational resources
Stars: ✭ 930 (-6.81%)
Mutual labels:  jupyter-notebook, data-science
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+734.57%)
Mutual labels:  jupyter-notebook, data-science
Kubeflow Data Science On Steroids
The blog post about Kubeflow, including all materials
Stars: ✭ 25 (-97.49%)
Mutual labels:  jupyter-notebook, data-science
Intro Python
Python pour Statistique et Science des Données -- Syntaxe, Trafic de Données, Graphes, Programmation, Apprentissage
Stars: ✭ 21 (-97.9%)
Mutual labels:  jupyter-notebook, data-science
Mlnet Workshop
ML.NET Workshop to predict car sales prices
Stars: ✭ 29 (-97.09%)
Mutual labels:  jupyter-notebook, data-science
Python Introducing Pandas
Introduction to pandas Treehouse course
Stars: ✭ 24 (-97.6%)
Mutual labels:  jupyter-notebook, data-science
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-98%)
Mutual labels:  jupyter-notebook, data-science

PixieDust

PyPI version Build Status

PixieDust is a productivity tool for Python or Scala notebooks, which lets a developer encapsulate business logic into something easy for your customers to consume.

New Book now available: Thoughtful Data Science

This book published by Packt Publishing is the user and developer reference for using PixieDust

Pixiedust developer community

Wait! There is a developer community? Yes there is! If you already are a member, login. If you would like to contribute please join us.

Why you need it

Notebooks are a powerful tool for fast and flexible data analysis. But the learning curve is steep.

Python data science notebooks were first popularized in academia, and there are some formalities to work through before you can get to your analysis. For example, in a Python interactive notebook, a mundane task like creating a simple chart or saving data into a persistence repository requires mastery of complex code like this matplotlib snippet:

All this for a chart?
All this for a chart?

Once you do create a notebook that provides great data insights, it's hard to share with business users, who don't want to slog through all that dry, hard-to-read code, much less tweak it and collaborate.

PixieDust to the rescue.

What is PixieDust?

PixieDust is an open source helper library that works as an add-on to Jupyter notebooks to improve the user experience of working with data. It also fills a gap for users who have no access to configuration files when a notebook is hosted on the cloud.

Use in Python or Scala

PixieDust greatly simplifies working with Python display libraries like matplotlib, but works just as effectively in Scala notebooks too. You no longer have compromise your love of Scala to generate great charts. PixieDust lets you bring robust Python visualization options to your Scala notebooks. Installer and instructions to use Scala with PixieDust are coming soon...

Features

PixieDust's current capabilities include:

  • packageManager lets you install Spark packages inside a Python notebook. This is something that you can't do today on hosted Jupyter notebooks, which prevents developers from using a large number of spark package add-ons.

  • Visualizations. One single API called display() lets you visualize your Spark object in different ways: table, charts, maps, etc.... This module is designed to be extensible, providing an API that lets anyone easily contribute a new visualization plugin.

    This sample visualization plugin uses d3 to show the different flight routes for each airport:

    graph map

  • Embedded apps. Let nonprogrammers actively use notebooks. Transform a hard-to-read notebook into a polished graphic app for business users. Check out these preliminary sample apps:

    • An app can feature embedded forms and responses, flightpredict, which lets users enter flight details to see the likelihood of landing on-time.
    • Or present a sophisticated workflow, like our twitter demo, which delivers a real-time feed of tweets, trending hashtags, and aggregated sentiment charts with Watson Tone Analyzer.
  • Extensibility. Create your own visualizations or apps using the PixieDust extensibility APIs. If you know html and css, you can write and deliver amazing graphics without forcing notebook users to type one line of code. Use the shape of the data to control when PixieDust shows your visualization in a menu.

  • Export. Notebook users can download data to .csv, HTML, JSON, etc. locally on your laptop or into a variety of back-end data sources, like Cloudant, dashDB, GraphDB, etc...

    save as options

  • Scala Bridge. Use Scala directly in your Python notebook. Variables are automatically transfered from Python to Scala and vice-versa. Learn more.

    Or start in a Scala notebook. As mentioned, all these PixieDust features work not only in Python, but in Scala too. So if you prefer Scala, you'll soon be able to start there and use PixieDust to insert sophisticated Python graphic options within your Scala notebook. Instructions coming soon.

  • Spark progress monitor. Track the status of your Spark job. No more waiting in the dark. Notebook users can now see how a cell's code is running behind the scenes.

Watch this video to see PixieDust in action:

about PixieDust

Usage

You can use PixieDust locally or online within IBM's Watson Studio.

Use online

To use PixieDust online

Use locally

  • Pixiedust supports
  • Spark 1.6 or 2.0
  • Python 2.7 or 3.5

Sample notebooks

Wherever you prefer to work, try out the following sample notebooks:

Tutorials

Contribute

Note: PixieDust currently supports Spark DataFrames, Spark GraphFrames and Pandas DataFrames, with more to come. If you can't wait, write your own today and contribute it back.

Read how to contribute for details on our code of conduct and instructions for submitting pull requests to us.

Developer Guide

Dive into the PixieDust developer docs and learn how to build your own custom visualization or embedded app. You can also pitch in and contribute an enhancement to PixieDust's core features.

We can't wait to see what you build.

License

Apache License, Version 2.0.

For details and all the legalese, read LICENSE.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].