PAIR-code / Facets

Licence: apache-2.0
Visualizations for machine learning datasets

Introduction

The Facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive.

The visualizations are implemented as Polymer web components, backed by TypeScript code, and can be embedded into Jupyter notebooks or webpages.

Live demos of the visualizations can be found on the Facets project description page.

Facets Overview

Overview visualization of UCI census data

Overview gives a high-level view of one or more data sets. It produces a visual feature-by-feature statistical analysis, and can also be used to compare statistics across two or more data sets. The tool can process both numeric and string features, including multiple instances of a number or string per feature.

Overview can help uncover issues with datasets, including the following:

  • Unexpected feature values
  • Missing feature values for a large number of examples
  • Training/serving skew
  • Training/test/validation set skew

Key aspects of the visualization are outlier detection and distribution comparison across multiple datasets. Interesting values (such as a high proportion of missing data, or very different distributions of a feature across multiple datasets) are highlighted in red. Features can be sorted by values of interest such as the number of missing values or the skew between the different datasets.
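
The kind of per-feature comparison described above can be illustrated with a minimal pure-Python sketch. This is not the Facets Overview implementation: the helper names, the sample rows, and the choice of L-infinity distance as the comparison metric are all assumptions made for illustration.

```python
from collections import Counter


def missing_fraction(rows, feature):
    """Fraction of rows where `feature` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def value_distribution(rows, feature):
    """Normalized frequency of each non-missing value of `feature`."""
    counts = Counter(
        row[feature] for row in rows if row.get(feature) is not None
    )
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


def l_infinity_distance(dist_a, dist_b):
    """Largest absolute frequency difference over all observed values."""
    values = set(dist_a) | set(dist_b)
    return max(
        (abs(dist_a.get(v, 0.0) - dist_b.get(v, 0.0)) for v in values),
        default=0.0,
    )


# Hypothetical sample rows; real datasets would be far larger.
train = [
    {"workclass": "Private"},
    {"workclass": "Private"},
    {"workclass": None},
    {"workclass": "State-gov"},
]
test = [
    {"workclass": "State-gov"},
    {"workclass": "State-gov"},
    {"workclass": "Private"},
    {},
]

train_missing = missing_fraction(train, "workclass")
skew = l_infinity_distance(
    value_distribution(train, "workclass"),
    value_distribution(test, "workclass"),
)
```

A feature with a large `train_missing` or a large `skew` is the sort of value of interest that a tool like Overview would surface when sorting features.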

The Python code to generate the statistics for visualization can be installed with pip install facets-overview.

Details about Overview usage can be found in its README.

Facets Dive

Dive visualization of UCI census data

Dive is a tool for interactively exploring up to tens of thousands of multidimensional data points, allowing users to seamlessly switch between a high-level overview and low-level details. Each example is represented as a single item in the visualization, and the points can be positioned by faceting/bucketing in multiple dimensions by their feature values. Combining smooth animation and zooming with faceting and filtering, Dive makes it easy to spot patterns and outliers in complex data sets.
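
As a rough illustration of faceting in two dimensions, the sketch below groups records into grid cells keyed by a pair of feature values. It is only a conceptual model of the idea, not how Dive is implemented; the function name `facet` and the sample records are hypothetical.

```python
from collections import defaultdict


def facet(records, row_feature, column_feature):
    """Group records into cells keyed by (row value, column value)."""
    grid = defaultdict(list)
    for record in records:
        key = (record.get(row_feature), record.get(column_feature))
        grid[key].append(record)
    return dict(grid)


# Hypothetical records loosely modelled on census-style data.
records = [
    {"age_bucket": "18-30", "sex": "Female", "income": "<=50K"},
    {"age_bucket": "18-30", "sex": "Female", "income": ">50K"},
    {"age_bucket": "18-30", "sex": "Male", "income": "<=50K"},
    {"age_bucket": "31-50", "sex": "Male", "income": ">50K"},
]

cells = facet(records, "age_bucket", "sex")
cell_sizes = {key: len(items) for key, items in cells.items()}
```

Each key in `cells` corresponds to one bucket of the grid, and every record lands in exactly one bucket, which mirrors the idea of each example being a single item in the visualization.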

Details about Dive usage can be found in its README.

Setup

Usage in Google Colaboratory/Jupyter Notebooks

Usage of Facets in Google Colaboratory and Jupyter notebooks is demonstrated in this notebook. These notebooks work without the need to first download or install this repository.

Both Facets visualizations make use of HTML imports. So in order to use them, you must first load the appropriate polyfill, through <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>, as shown in the demo notebooks in this repo.
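
One way to assemble such markup from Python is sketched below. The polyfill URL comes from the text above; the `facets-dive` element name, its `data` property, and the helper name `build_dive_html` are assumptions for illustration and should be checked against the demo notebooks and the Dive README.

```python
import json

# Polyfill URL quoted in the text above.
POLYFILL_URL = (
    "https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/"
    "1.3.3/webcomponents-lite.js"
)


def build_dive_html(records, facets_html_href, height=600):
    """Return an HTML snippet that loads the polyfill before the import.

    Assumes a <facets-dive> element with a `data` property (not confirmed
    by this document).
    """
    # Escape "</" so string values cannot close the <script> tag early.
    data_json = json.dumps(records).replace("</", "<\\/")
    return (
        f'<script src="{POLYFILL_URL}"></script>\n'
        f'<link rel="import" href="{facets_html_href}">\n'
        f'<facets-dive id="elem" height="{height}"></facets-dive>\n'
        "<script>\n"
        f"  document.querySelector('#elem').data = {data_json};\n"
        "</script>"
    )


# Hypothetical records and import path; adjust to your own setup.
html = build_dive_html(
    [{"age": 39, "workclass": "State-gov"}],
    "/nbextensions/facets-dist/facets-jupyter.html",
)
```

In a notebook cell, the resulting string could then be rendered with `IPython.display.display(IPython.display.HTML(html))`.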

Note that for using Facets Overview in a Jupyter notebook, there are two considerations:

  1. In the notebook, you will need to change the path from which the Facets Overview Python code is loaded so that it is correct relative to where your notebook kernel runs.
  2. You must also have the Protocol Buffers Python runtime library installed: https://github.com/google/protobuf/tree/master/python. If you used pip or anaconda to install Jupyter, you can use the same tool to install the runtime library.

When visualizing a large amount of data in Dive in a Jupyter notebook, as is done in the Dive demo Jupyter notebook, you will need to start the notebook server with an increased IOPub data rate. This can be done with the command jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000.
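
To get a feel for whether a dataset is large enough to matter here, the serialized size can be estimated before sending it to the visualization. The sketch below is only a rough heuristic under stated assumptions: Jupyter enforces a rate over time rather than a single-message size, and the helper names and sample data are hypothetical.

```python
import json

# Limit value taken from the command above.
IOPUB_DATA_RATE_LIMIT = 10_000_000


def payload_size_bytes(records):
    """Size of the records once serialized to UTF-8 encoded JSON."""
    return len(json.dumps(records).encode("utf-8"))


def fits_within_limit(records, limit=IOPUB_DATA_RATE_LIMIT):
    """Rough check: is the serialized payload no larger than `limit`?"""
    return payload_size_bytes(records) <= limit


# Hypothetical sample data.
sample = [{"id": i, "label": "example"} for i in range(1000)]
size = payload_size_bytes(sample)
```

If `fits_within_limit` returns False even for the raised limit, sampling the data before visualizing it may be a simpler option than raising the limit further.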

Code Installation

git clone https://github.com/PAIR-code/facets
cd facets

Building the Visualizations

If you make code changes to the visualization and would like to rebuild them, follow these directions:

  1. Install bazel: https://bazel.build/
  2. Build the visualizations: bazel build facets:facets_jupyter (run from the facets top-level directory)

Using the rebuilt Visualizations in a Jupyter notebook

If you want to use the visualizations you built locally in a Jupyter notebook, follow these directions:

  1. Move the resulting vulcanized html file from the build step into the facets-dist directory: cp -f bazel-bin/facets/facets-jupyter.html facets-dist/
  2. Install the visualizations into Jupyter as an nbextension.
     • If Jupyter was installed with pip, run jupyter nbextension install facets-dist/ for a system-wide installation, or jupyter nbextension install facets-dist/ --user for a per-user installation (run from the facets top-level directory). You do not need to run any follow-up jupyter nbextension enable command for this extension.
     • Alternatively, you can manually install the nbextension by finding your Jupyter installation's share/jupyter/nbextensions folder and copying the facets-dist directory into it.
  3. In the notebook cell's HTML link tag that loads the built facets html, load from /nbextensions/facets-dist/facets-jupyter.html, which is the locally installed facets distribution from the previous step.
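
The manual installation option amounts to a directory copy, which could be scripted roughly as follows. This is a sketch using only the standard library (Python 3.8+ for `dirs_exist_ok`); the function name is hypothetical, and the demonstration uses a temporary directory with a placeholder file in place of a real Jupyter installation.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_nbextensions(facets_dist_dir, nbextensions_dir):
    """Copy the facets-dist directory into an nbextensions folder."""
    source = Path(facets_dist_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    target = Path(nbextensions_dir) / source.name
    # Creates missing parent folders and overwrites an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


# Demonstration against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp) / "facets-dist"
    dist.mkdir()
    (dist / "facets-jupyter.html").write_text("<!-- placeholder -->")
    nbextensions = Path(tmp) / "share" / "jupyter" / "nbextensions"
    installed = copy_into_nbextensions(dist, nbextensions)
    copied_ok = (installed / "facets-jupyter.html").is_file()
```

For a real installation, `nbextensions_dir` would be the share/jupyter/nbextensions folder of your Jupyter installation, which you need to locate yourself.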

Known Issues

  • The Facets visualizations currently work only in Chrome - Issue 9.

Disclaimer: This is not an official Google product.