
lux-org / Lux

Licence: apache-2.0
Python API for Intelligent Visual Data Discovery



A Python API for Intelligent Visual Discovery


Lux is a Python library that makes data science easier by automating aspects of the data exploration process. Lux facilitates faster experimentation with data, even when the user does not have a clear idea of what they are looking for. Visualizations are displayed via an interactive widget that allows users to quickly browse through large collections of visualizations directly within their Jupyter notebooks.

Here is a 1-min video introducing Lux, and slides from a more extended talk.

Try out Lux on your own in a live Jupyter Notebook here!

Getting Started

To start using Lux, simply add an extra import statement along with your Pandas import.

import lux
import pandas as pd

Then, Lux can be used as-is, without modifying any of your existing Pandas code. Here, we use Pandas's read_csv command to load in a dataset of colleges and their properties.

df = pd.read_csv("https://raw.githubusercontent.com/lux-org/lux-datasets/master/data/college.csv")
df

Basic recommendations in Lux

Voila! Here's a set of visualizations that you can now use to explore your dataset further!

Next-step recommendations based on user intent:

In addition to dataframe visualizations at every step in the exploration, you can specify in Lux the attributes and values you're interested in. Based on this intent, Lux guides users towards potential next steps in their exploration.

For example, we might be interested in the attributes AverageCost and SATAverage.

df.intent = ["AverageCost","SATAverage"]
df

Next-step Recommendations Based on User Context

The left-hand side of the widget shows the current visualization, generated from the intent the user specified. On the right, Lux generates three sets of recommendations, organized as separate tabs on the widget:

  • Enhance adds an additional attribute to the current selection, essentially highlighting how additional variables affect the relationship between AverageCost and SATAverage. We see that if we break down the relationship by FundingModel, there is a clear separation between public colleges (shown in red) and private colleges (in blue), with public colleges being cheaper to attend and having SAT averages below 1400.
  • Filter adds a filter to the current selection, while keeping the attributes (on the X and Y axes) fixed. These visualizations show how the relationship between AverageCost and SATAverage changes for different subsets of the data. For instance, we see that colleges that offer a Bachelor's degree as their highest degree show a roughly linear trend between the two variables.
  • Generalize removes an attribute to display a more general trend, showing the distributions of AverageCost and SATAverage on their own. From the AverageCost histogram, we see that many colleges have an average cost of around $20,000 per year, corresponding to the bulge we see in the scatterplot view.

See this page for more information on additional ways for specifying the intent.

Easy programmatic access and export of visualizations:

Now that we have found some interesting visualizations through Lux, we might be interested in digging into these visualizations a bit more or sharing them with others. We can save the visualizations generated in Lux as a static, shareable HTML file or programmatically access these visualizations further in Jupyter. Selected Vis objects can be translated into Altair or Vega-Lite code, so that they can be further edited.

Easily exportable visualization object

Learn more about how to save and export visualizations here.

Quick, on-demand visualizations with the help of automatic encoding:

We've seen how Vis objects are automatically generated as part of the recommendations. Users can also create their own Vis via the same syntax as specifying the intent. Lux is built on the philosophy that users should always be able to visualize anything they want, without having to think about what the visualization should look like. Lux automatically determines the mark and channel mappings based on a set of best practices. The visualizations are rendered via Altair into Vega-Lite specifications.

from lux.vis.Vis import Vis
Vis(["Region=New England","MedianEarnings"],df)

Specified Visualization
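
To make the idea of automatic encoding more concrete, here is a toy sketch of a rule-based mark chooser. It is not Lux's actual implementation: the function name choose_mark, the type labels, and both "bar" rules are assumptions made for illustration. Only the histogram and scatterplot cases mirror the charts described in this README.

```python
from typing import Sequence


def choose_mark(attribute_types: Sequence[str]) -> str:
    """Pick a chart type from the data types of the requested attributes.

    Hypothetical helper: a toy rule table, not Lux's real encoding logic.
    Each entry of attribute_types is "quantitative" or "nominal".
    """
    kinds = sorted(attribute_types)
    if kinds == ["quantitative"]:
        return "histogram"    # one numeric attribute: show its distribution
    if kinds == ["quantitative", "quantitative"]:
        return "scatterplot"  # two numeric attributes: show their relationship
    if kinds == ["nominal"]:
        return "bar"          # assumed rule: count of rows per category
    if kinds == ["nominal", "quantitative"]:
        return "bar"          # assumed rule: aggregate the number per category
    raise ValueError(f"no rule for attribute types: {list(attribute_types)}")


print(choose_mark(["quantitative"]))
print(choose_mark(["quantitative", "quantitative"]))
```

The point of the sketch is that the user names only the attributes; the chart type follows from their data types rather than from an explicit request.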

Powerful language for working with collections of visualizations:

Lux provides a powerful abstraction for working with collections of visualizations based on partially specified queries. Users can provide a list or a wildcard to iterate over combinations of filter or attribute values and quickly browse through large numbers of visualizations. The partial specification is inspired by existing work on visualization query languages, including ZQL and CompassQL.

For example, we might be interested in looking at how the AverageCost distribution differs across different Regions.

from lux.vis.VisList import VisList
VisList(["Region=?","AverageCost"],df)

Example Vis List
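
One way to picture what the "?" wildcard does is as an expansion into one concrete specification per distinct value. The sketch below is plain Python and does not use Lux; expand_intent is a hypothetical helper, and the region names other than "New England" are made up for the example.

```python
from itertools import product
from typing import Dict, List


def expand_intent(intent: List[str], values: Dict[str, List[str]]) -> List[List[str]]:
    """Expand "attr=?" clauses into every concrete combination.

    Hypothetical helper, not part of Lux. `values` maps a column name
    to the distinct values it takes in the data.
    """
    options = []
    for clause in intent:
        if clause.endswith("=?"):
            attr = clause[:-2]
            # One concrete filter per distinct value of the column.
            options.append([f"{attr}={v}" for v in values[attr]])
        else:
            # Plain attributes and concrete filters pass through unchanged.
            options.append([clause])
    # The Cartesian product covers intents with more than one wildcard.
    return [list(combo) for combo in product(*options)]


# Illustrative values only; a real dataset supplies its own.
regions = {"Region": ["New England", "Southeast", "Far West"]}
for spec in expand_intent(["Region=?", "AverageCost"], regions):
    print(spec)
```

Each expanded list has the same shape as the argument passed to Vis earlier, which is why a single wildcard can stand in for a whole collection of visualizations.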

To find out more about other features in Lux, see the complete documentation on ReadTheDocs.

Installation

To get started, Lux can be installed through PyPI.

pip install lux-api

If you use conda, you can install Lux via:

conda install -c conda-forge lux-api

Both the PyPI and conda installations include the Lux Jupyter widget frontend, lux-widget.

Setup in Jupyter Notebook, VSCode

To use Lux in Jupyter notebook or VSCode, activate the notebook extension:

jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

If the installation succeeds, you should see two "- Validating: OK" messages after executing the two lines above.

Setup in Jupyter Lab

To use Lux in Jupyter Lab, activate the lab extension:

jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install luxwidget

Lux is only compatible with Jupyter Lab version 2.2.9 and below. Support for the recent JupyterLab 3 will come soon. Note that JupyterLab and VSCode are supported only for lux-widget version >=0.1.2; if you have an earlier version, please upgrade to the latest version of lux-widget. Lux currently only works with the Chrome browser.
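
The version constraints above can be checked from Python before debugging the widget itself. This is a minimal sketch using only the standard library (Python 3.8+). It assumes the installed distributions are named "jupyterlab" and "lux-widget"; the helper names are hypothetical, and the version parsing is deliberately naive, ignoring suffixes such as "rc1".

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "2.2.9" into (2, 2, 9), keeping at most three numeric parts."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def jupyterlab_supported(version: str) -> bool:
    """Jupyter Lab must be version 2.2.9 or below."""
    return parse_version(version) <= (2, 2, 9)


def luxwidget_supported(version: str) -> bool:
    """lux-widget must be version 0.1.2 or above for JupyterLab and VSCode."""
    return parse_version(version) >= (0, 1, 2)


# Distribution names are assumptions; adjust them if your environment differs.
checks = [("jupyterlab", jupyterlab_supported), ("lux-widget", luxwidget_supported)]
for name, check in checks:
    found = installed_version(name)
    if found is None:
        print(f"{name}: not installed")
    else:
        print(f"{name} {found}: {'ok' if check(found) else 'unsupported'}")
```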

If you encounter issues with the installation, please refer to this page to troubleshoot the installation. Follow these instructions to set up Lux for development purposes.

Support and Resources

Lux is undergoing active development. If you are interested in using Lux, we would love to hear from you. Any feedback, suggestions, and contributions for improving Lux are welcome.

Other additional resources:
