AutoViML / Autoviz

Licence: apache-2.0
Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Projects that are alternatives of or similar to Autoviz

Tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Stars: ✭ 8,378 (+2602.58%)
Mutual labels:  scikit-learn, automl, xgboost, automated-machine-learning
Auto viml
Automatically Build Multiple ML Models with a Single Line of Code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.
Stars: ✭ 216 (-30.32%)
Mutual labels:  scikit-learn, automl, xgboost, automated-machine-learning
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+402.9%)
Mutual labels:  scikit-learn, automl, xgboost, automated-machine-learning
Mljar Supervised
Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀
Stars: ✭ 961 (+210%)
Mutual labels:  scikit-learn, automl, xgboost, automated-machine-learning
AutoTabular
Automatic machine learning for tabular data. ⚡🔥⚡
Stars: ✭ 51 (-83.55%)
Mutual labels:  scikit-learn, xgboost, automl
Machinejs
[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
Stars: ✭ 412 (+32.9%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
Featuretools
An open source python library for automated feature engineering
Stars: ✭ 5,891 (+1800.32%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
Auto Sklearn
Automated Machine Learning with scikit-learn
Stars: ✭ 5,916 (+1808.39%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+286.77%)
Mutual labels:  automl, xgboost, automated-machine-learning
Autogluon
AutoGluon: AutoML for Text, Image, and Tabular Data
Stars: ✭ 3,920 (+1164.52%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
Hyperactive
A hyperparameter optimization and data collection toolbox for convenient and fast prototyping of machine-learning models.
Stars: ✭ 182 (-41.29%)
Mutual labels:  scikit-learn, xgboost, automated-machine-learning
Lale
Library for Semi-Automated Data Science
Stars: ✭ 198 (-36.13%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
AutoPrognosis
Codebase for "AutoPrognosis: Automated Clinical Prognostic Modeling via Bayesian Optimization", ICML 2018.
Stars: ✭ 47 (-84.84%)
Mutual labels:  automl, automated-machine-learning
handson-ml
Jupyter notebooks containing the examples and exercises from the book "Hands-On Machine Learning".
Stars: ✭ 285 (-8.06%)
Mutual labels:  scikit-learn, xgboost
My Data Competition Experience
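A summary of my experience from multiple Top 5 finishes in machine learning and big data competitions. Packed with practical tips; help yourself.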
Stars: ✭ 271 (-12.58%)
Mutual labels:  automl, xgboost
simon-frontend
💹 SIMON is powerful, flexible, open-source and easy to use machine learning knowledge discovery platform 💻
Stars: ✭ 114 (-63.23%)
Mutual labels:  automl, automated-machine-learning
datascienv
datascienv is a package that helps you set up your environment in a single line of code with all dependencies. It also includes pyforest, which imports all required ML libraries in a single line.
Stars: ✭ 53 (-82.9%)
Mutual labels:  scikit-learn, xgboost
Quora question pairs NLP Kaggle
Quora Kaggle Competition : Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training
Stars: ✭ 17 (-94.52%)
Mutual labels:  scikit-learn, xgboost
featuretoolsOnSpark
A simplified version of featuretools for Spark
Stars: ✭ 24 (-92.26%)
Mutual labels:  automl, automated-machine-learning
mloperator
Machine Learning Operator & Controller for Kubernetes
Stars: ✭ 85 (-72.58%)
Mutual labels:  scikit-learn, xgboost

AutoViz

Automatically Visualize any dataset, any size with a single line of code.

AutoViz performs automatic visualization of any dataset with one line. Give any input file (CSV, txt or json) and AutoViz will visualize it.

Install

Prerequisites

Before installing AutoViz, it is best to create a new environment and install the required dependencies into it.

To install from PyPI:

conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # older conda versions: `source activate <your_env_name>` (Linux/macOS) or `activate <your_env_name>` (Windows)
pip install autoviz

To install from source:

cd <AutoViz_Destination>
git clone git@github.com:AutoViML/AutoViz.git
# or download and unzip https://github.com/AutoViML/AutoViz/archive/master.zip
conda create -n <your_env_name> python=3.7 anaconda
conda activate <your_env_name> # older conda versions: `source activate <your_env_name>` (Linux/macOS) or `activate <your_env_name>` (Windows)
cd AutoViz
pip install -r requirements.txt

Usage

Read this Medium article to know how to use AutoViz.

In the AutoViz directory, open a Jupyter Notebook and use these lines to import and instantiate the AutoViz class:

from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()

Load a dataset (any CSV or text file) into a pandas dataframe, or supply the path and filename of the file you want to visualize. If you are using a dataframe and have no file, set the filename argument to "" (empty string).

Call AutoViz using the filename (or dataframe) along with the separator and the name of the target variable in the input. AutoViz will do the rest. You will see charts and plots on your screen.

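filename = ""  # path to a CSV/text file, or "" when passing a dataframe via dfte
sep = ","  # column separator used in the file
dft = AV.AutoViz(
    filename,
    sep=sep,
    depVar="",  # name of the target variable; "" if there is none
    dfte=None,  # pandas dataframe to plot; used when filename is ""
    header=0,  # row number of the header row
    verbose=0,  # 0, 1 or 2 (see Notes below)
    lowess=False,
    chart_format="svg",
    max_rows_analyzed=150000,
    max_cols_analyzed=30,
)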

AV.AutoViz is the main plotting function in AV.
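
To plot a dataframe that is already in memory rather than a file, the call above can be adapted by leaving filename empty and passing the dataframe through dfte. The sketch below shows one way to wrap that. `dataframe_call_args` and `plot_dataframe` are hypothetical helper names, not part of AutoViz, and the keyword values simply mirror the example call above.

```python
def dataframe_call_args(df, target=""):
    """Build keyword arguments for AV.AutoViz when the data is a dataframe.

    Hypothetical helper (not part of AutoViz): the argument names and
    default values are copied from the example call above.
    """
    return {
        "filename": "",  # empty string: no file, the data comes from dfte
        "sep": ",",
        "depVar": target,  # "" means there is no target variable
        "dfte": df,
        "header": 0,
        "verbose": 0,
        "lowess": False,
        "chart_format": "svg",
        "max_rows_analyzed": 150000,
        "max_cols_analyzed": 30,
    }


def plot_dataframe(df, target=""):
    """Run AutoViz on an in-memory dataframe (requires autoviz installed)."""
    # Imported lazily so dataframe_call_args works without autoviz installed.
    from autoviz.AutoViz_Class import AutoViz_Class

    AV = AutoViz_Class()
    return AV.AutoViz(**dataframe_call_args(df, target))
```

With pandas available, a call such as `plot_dataframe(pd.read_csv("data.csv"), target="price")` would follow this pattern; the file name and column name here are placeholders.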

Notes:

  • AutoViz will visualize a file of any size using a statistically valid sample.
  • A comma is assumed as the default separator in the file, but you can change it with sep.
  • The first row is assumed to be the header, but you can change it with header.
  • verbose option:
    • if 0, displays charts in your notebook with minimal information
    • if 1, displays charts in your notebook and prints extra information
    • if 2, does not display any charts; it simply saves them on your local machine under the AutoViz_Plots directory
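
If you are unsure which separator a file uses, one option is to guess it before calling AutoViz. The sketch below uses `csv.Sniffer` from the Python standard library; it is not something AutoViz does for you, and `guess_separator` and `guess_separator_of_file` are hypothetical helper names. The UTF-8 encoding and the candidate delimiters are assumptions.

```python
import csv


def guess_separator(sample, candidates=",;\t|"):
    """Guess the column separator from a sample of a delimited text file.

    Falls back to a comma (the default separator assumed by AutoViz)
    when csv.Sniffer cannot decide.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return ","


def guess_separator_of_file(path, sample_chars=4096):
    """Read the start of a file and guess its separator."""
    # Assumes the file is UTF-8 text; adjust the encoding if yours differs.
    with open(path, newline="", encoding="utf-8") as handle:
        return guess_separator(handle.read(sample_chars))
```

The result can then be passed as the sep argument, for example `sep=guess_separator_of_file(filename)`.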

API

Arguments

  • filename - path and name of the file to load. Give an empty string ("") if the data is not in a file and you want to use a dataframe; in that case, pass the dataframe through dfte. Otherwise, fill in the file name and leave dfte unset (None in the example above). Only one of the two is needed to load the data set.
  • sep - the separator used in the file. It can be a comma, semicolon, tab, or any other value that separates the columns in your file.
  • depVar - the target variable in your dataset. Leave it as an empty string if your data has no target variable.
  • dfte - the input pandas dataframe, if you want to plot charts from a dataframe rather than a file. In that case, leave filename as an empty string.
  • header - the row number of the header row in your file. If it is the first row, this must be 0.
  • verbose - accepts three values: 0, 1 or 2. With 0, you get all charts but limited information. With 1, you get all charts and more information. With 2, no charts are displayed; they are generated quietly and saved under the AutoViz_Plots directory, which is created in your current directory. Delete this folder periodically, otherwise charts will accumulate there if you use verbose=2 often.
  • lowess - shows regression lines for each pair of continuous variables against the target variable. It works well for small datasets; don't use it for large datasets (over 100,000 rows).
  • chart_format - SVG, PNG or JPG. Charts are generated and saved in this format when you use verbose=2, which is useful for generating charts to use later.
  • max_rows_analyzed - limits the maximum number of rows used to display charts. If you have a very large dataset with millions of rows, use this option to limit the time it takes to generate charts. AutoViz will take a statistically valid sample.
  • max_cols_analyzed - limits the number of continuous variables that can be analyzed.
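
Because verbose=2 keeps adding files to AutoViz_Plots, a small housekeeping helper can be handy. The following is a minimal sketch using only the standard library. `saved_charts` and `clear_saved_charts` are hypothetical names, and it assumes the saved files carry a lowercase extension matching chart_format; the layout inside AutoViz_Plots is not specified here, so the search covers subfolders as well.

```python
import shutil
from pathlib import Path


def saved_charts(plot_dir="AutoViz_Plots", chart_format="svg"):
    """Return saved chart files of one format, searching subfolders too."""
    root = Path(plot_dir)
    if not root.is_dir():
        return []
    # Assumption: files are saved with a lowercase extension such as ".svg".
    return sorted(root.rglob(f"*.{chart_format.lower()}"))


def clear_saved_charts(plot_dir="AutoViz_Plots"):
    """Delete the plot directory and everything in it.

    Returns True if a directory was removed, False if none existed.
    """
    root = Path(plot_dir)
    if root.is_dir():
        shutil.rmtree(root)
        return True
    return False
```

Calling `saved_charts()` after a verbose=2 run lists what was written, and `clear_saved_charts()` removes the folder once the charts have been copied elsewhere.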

Maintainers

Contributing

See the contributing file!

PRs accepted.

License

Apache License, Version 2.0

DISCLAIMER

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.