ahmedbesbes / playground

Licence: MIT license
A Streamlit application to play with machine learning models directly from the browser

Programming Languages

python
139335 projects - #7 most used programming language
CSS
56736 projects
shell
77523 projects

Projects that are alternatives of or similar to playground

Traingenerator
🧙 A web app to generate template code for machine learning
Stars: ✭ 948 (+1875%)
Mutual labels:  scikit-learn, sklearn, webapp
img ai app boilerplate
An image classification app boilerplate to serve your deep learning models asap!
Stars: ✭ 27 (-43.75%)
Mutual labels:  webapp, heroku-deployment, streamlit
sklearn-audio-classification
An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, and cross-validation with a variety of ML techniques and MLP
Stars: ✭ 31 (-35.42%)
Mutual labels:  scikit-learn, sklearn, machine-learning-tutorials
Ailearning
AiLearning: Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP)
Stars: ✭ 32,316 (+67225%)
Mutual labels:  scikit-learn, sklearn
five-minute-midas
Predicting Profitable Day Trading Positions using Decision Tree Classifiers. scikit-learn | Flask | SQLite3 | pandas | MLflow | Heroku | Streamlit
Stars: ✭ 41 (-14.58%)
Mutual labels:  scikit-learn, streamlit
Machinelearningstocks
Using python and scikit-learn to make stock predictions
Stars: ✭ 897 (+1768.75%)
Mutual labels:  scikit-learn, sklearn
Sklearn Evaluation
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.
Stars: ✭ 294 (+512.5%)
Mutual labels:  scikit-learn, sklearn
Ml code
A repository for recording the machine learning code
Stars: ✭ 75 (+56.25%)
Mutual labels:  scikit-learn, sklearn
Sklearn Porter
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
Stars: ✭ 1,014 (+2012.5%)
Mutual labels:  scikit-learn, sklearn
Facial Expression Recognition Svm
Training SVM classifier to recognize people expressions (emotions) on Fer2013 dataset
Stars: ✭ 110 (+129.17%)
Mutual labels:  scikit-learn, sklearn
Igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Stars: ✭ 2,956 (+6058.33%)
Mutual labels:  scikit-learn, sklearn
Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+1250%)
Mutual labels:  scikit-learn, sklearn
Hungabunga
HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict!
Stars: ✭ 614 (+1179.17%)
Mutual labels:  scikit-learn, sklearn
Profanity Check
A fast, robust Python library to check for offensive language in strings.
Stars: ✭ 354 (+637.5%)
Mutual labels:  scikit-learn, sklearn
Qlik Py Tools
Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Stars: ✭ 135 (+181.25%)
Mutual labels:  scikit-learn, sklearn
imbalanced-ensemble
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible.
Stars: ✭ 199 (+314.58%)
Mutual labels:  scikit-learn, sklearn
Mlatimperial2017
Materials for the course of machine learning at Imperial College organized by Yandex SDA
Stars: ✭ 71 (+47.92%)
Mutual labels:  scikit-learn, sklearn
KMeans elbow
Code for determining optimal number of clusters for K-means algorithm using the 'elbow criterion'
Stars: ✭ 35 (-27.08%)
Mutual labels:  scikit-learn, sklearn
skippa
SciKIt-learn Pipeline in PAndas
Stars: ✭ 33 (-31.25%)
Mutual labels:  scikit-learn, sklearn
python3-docker-devenv
Docker Start Guide with Python Development Environment
Stars: ✭ 13 (-72.92%)
Mutual labels:  scikit-learn, sklearn

Check my video tutorial to learn how to build this app

End to end tutorial to Build and Deploy a Streamlit Application on Heroku

Playground 🧪

Playground is a Streamlit application that allows you to tinker with machine learning models from your browser.

So if you're a data science practitioner you should definitely try it out 😉

This app is inspired by the great TensorFlow Playground. The only difference is that it addresses classical machine learning models.

Demo

Right here

How does it work?

  1. 🗂️ You pick and configure a dataset from a pre-defined list. You can set:
    • the number of samples
    • the noise on train and test data
  2. ⚙️ You select a model and set its hyper-parameters. You can pick a model from: logistic regression, decision tree, random forest, gradient boosting, neural network, Naive Bayes, KNN and SVM
  3. 📉 The app automatically displays the following results:
    • the decision boundary of the model on train and test data
    • the performance metrics (Accuracy and F1 score) on train and test data
    • the time it took the model to train
    • a generated Python script to reproduce the model based on the dataset definition and the model hyper-parameters
  4. For each model, playground provides a link to the official documentation as well as a list of tips.
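
The steps above can be sketched in a few lines of scikit-learn. This is only an illustration of the workflow, not the script the app generates: the choice of `make_moons`, the sample count, the noise levels and the `results` dictionary are all assumptions made for the example.

```python
import time

from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

# Step 1: build a 2-feature dataset (sample count and noise are illustrative values).
X_train, y_train = make_moons(n_samples=200, noise=0.2, random_state=0)
X_test, y_test = make_moons(n_samples=200, noise=0.2, random_state=1)

# Step 2: pick a model and set its hyper-parameters.
model = LogisticRegression(C=1.0)

# Step 3: train, time the fit, and compute accuracy and F1 on both splits.
start = time.perf_counter()
model.fit(X_train, y_train)
duration = time.perf_counter() - start

results = {}
for split, (X, y) in {"train": (X_train, y_train), "test": (X_test, y_test)}.items():
    y_pred = model.predict(X)
    results[split] = {
        "accuracy": accuracy_score(y, y_pred),
        "f1": f1_score(y, y_pred),
    }

print(f"training time: {duration:.4f} s")
for split, metrics in results.items():
    print(split, metrics)
```

Swapping `LogisticRegression` for any other scikit-learn classifier leaves the rest of the sketch unchanged, which is roughly what selecting another model in the sidebar amounts to.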

Bonus point: the app also provides the ability to perform feature engineering by adding polynomial features. This proves to be helpful for linear models such as logistic regressions on non-linear problems.

What can you learn from playground?

If you're new to machine learning, playing with this app will probably (and hopefully 😄) get you familiar with basic notions and help you build your first intuitions. It won't replace textbooks: it's only meant to complement your knowledge. Take it as it is.

1. Decision boundaries will (partially) tell you how models behave

You'll get a better sense of how each model works by inspecting its decision boundary. For educational purposes, playground only processes datasets that have 2 features (but the same results can be obtained on multi-dimensional datasets after dimensionality reduction).

You'll see for example that a logistic regression separates the data by a line (or a hyperplane in the general case)

whereas a decision tree, which classifies the data based on successive conditionals on the values of the features, has a decision boundary composed of horizontal and vertical lines.

Interestingly, a random forest, which is a bagged ensemble of decision trees, has a decision boundary that looks similar to the decision tree's, only smoother: this is the result of the voting mechanism a random forest uses.
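
A decision boundary like the ones described above can be computed by predicting the class of every point on a regular grid. The sketch below is an assumption about how one might do it with NumPy and scikit-learn, not the app's own plotting code; the grid resolution, the margins and the hyper-parameter values are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# Regular 200 x 200 grid covering the data with a small margin.
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),
)
grid = np.c_[xx.ravel(), yy.ravel()]

models = {
    "logistic regression": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0),
}

boundaries = {}
for name, model in models.items():
    model.fit(X, y)
    # Predicted class of every grid point, reshaped back to the grid layout.
    boundaries[name] = model.predict(grid).reshape(xx.shape)

# scikit-learn's forest averages the class probabilities of its trees,
# so its probability surface takes many intermediate values.
forest_proba = models["random forest"].predict_proba(grid)[:, 1].reshape(xx.shape)
print("distinct forest probabilities:", len(np.unique(forest_proba)))
```

Each array in `boundaries` can be passed, together with `xx` and `yy`, to a filled-contour plot to draw the regions. The graded values in `forest_proba` are what make the forest's boundary look smoother than a single tree's.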

2. You'll get a sense of the speed of each model

Given the same dataset, you can compare the training speed of each model and get a feeling for which is faster. In the previous plots, the logistic regression and the decision tree took 0.004 and 0.001 seconds to train respectively, whereas the random forest took 0.154 seconds.

Try a neural network with 4 stacked layers of 100 neurons each: it takes 0.253 seconds.
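
To reproduce this kind of comparison outside the app, one option is to time each `fit` call with `time.perf_counter`. The following is a rough sketch under assumed settings (dataset size, noise and hyper-parameters are not taken from the app), and the absolute numbers will differ from the ones quoted above depending on the machine.

```python
import time

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

models = {
    "logistic regression": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # 4 stacked hidden layers of 100 neurons each.
    "neural network (4 x 100)": MLPClassifier(
        hidden_layer_sizes=(100, 100, 100, 100), random_state=0
    ),
}

timings = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X, y)
    timings[name] = time.perf_counter() - start

# Print the models from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:<26} {seconds:.4f} s")
```

A single timed run is noisy; averaging over several runs gives a fairer picture.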

3. Feature engineering can help

Using a logistic regression on the moons dataset won't get you a good score, because the two classes are not linearly separable.

However, increasing the dimensionality by adding polynomial features can help: try increasing the polynomial degree to 3 when using a logistic regression and notice how the decision boundary radically changes.
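
The effect can be checked with a small scikit-learn pipeline. This is a sketch under assumptions, not the app's implementation: the `StandardScaler` step and the `max_iter` value are additions made here to help the solver converge, and the dataset settings are arbitrary.

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X_train, y_train = make_moons(n_samples=500, noise=0.2, random_state=0)
X_test, y_test = make_moons(n_samples=500, noise=0.2, random_state=1)

scores = {}
for degree in (1, 3):
    model = make_pipeline(
        # Expand (x1, x2) into all monomials up to the given degree.
        PolynomialFeatures(degree=degree, include_bias=False),
        # Assumption of this sketch: rescale the expanded features before fitting.
        StandardScaler(),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    scores[degree] = model.score(X_test, y_test)

for degree, accuracy in scores.items():
    print(f"degree {degree}: test accuracy = {accuracy:.3f}")
```

The model stays linear in the expanded features, but its boundary in the original two dimensions becomes a curve, which is why the degree-3 pipeline can follow the shape of the moons.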

4. Some models are more robust than others to noise

You can experiment by setting a higher noise on the test data, thus making it drift from the train distribution. Some models such as Gradient Boosting are more stable than others against this problem.
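
One way to run this experiment in code is to train on a low-noise dataset and score the fitted models on test sets generated with increasing noise. The sketch below assumes `make_moons` and two arbitrary models; it makes no claim about which one comes out ahead, since that depends on the dataset and the hyper-parameters.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Train once on a low-noise dataset.
X_train, y_train = make_moons(n_samples=500, noise=0.1, random_state=0)

models = {
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
for model in models.values():
    model.fit(X_train, y_train)

# Score on test sets whose noise drifts away from the training noise.
scores = {}
for test_noise in (0.1, 0.3, 0.5):
    X_test, y_test = make_moons(n_samples=500, noise=test_noise, random_state=1)
    scores[test_noise] = {
        name: model.score(X_test, y_test) for name, model in models.items()
    }

for test_noise, by_model in scores.items():
    print(f"test noise {test_noise}: {by_model}")
```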

5. Try out different combinations of hyper-parameters

A great way to learn and validate your intuitions is to experiment, and that's what this app is for: it'll allow you to tinker with a bunch of hyper-parameters (tree depth, number of estimators, number of layers, etc.) and immediately see the results on the decision boundaries, the metrics and the execution time.
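
The same kind of experiment can be scripted as a simple sweep. As a minimal sketch, assuming a decision tree on a noisy `make_moons` dataset (the depth values and the `sweep` list are illustrative), the loop below records train and test accuracy for each tree depth:

```python
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X_train, y_train = make_moons(n_samples=500, noise=0.3, random_state=0)
X_test, y_test = make_moons(n_samples=500, noise=0.3, random_state=1)

# (max_depth, train accuracy, test accuracy) for each setting; None means unlimited depth.
sweep = []
for max_depth in (1, 2, 4, 8, None):
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X_train, y_train)
    sweep.append((max_depth, tree.score(X_train, y_train), tree.score(X_test, y_test)))

for max_depth, train_acc, test_acc in sweep:
    print(f"max_depth={str(max_depth):>4}  train={train_acc:.3f}  test={test_acc:.3f}")
```

Comparing the train and test columns as the depth grows is a quick way to see where a model starts to overfit.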

Go and give it a try. I hope you'll learn something from it!

Run the app locally

Make sure you have pip installed with Python 3.

  • install pipenv
pip install pipenv
  • go inside the folder and install the dependencies
pipenv install
  • run the app
streamlit run app.py

Structure of the code

  • app.py: the main script to start the app
  • utils/
    • ui.py: UI functions to display the different components of the app
    • functions.py: for data processing, training the model and building the plotly graphs
  • models/: where each model's hyper-parameter selector is defined

Contributions are welcome!

Feel free to open a pull request or an issue if you're thinking of a feature you'd like to see in the app.

Off the top of my head, I can think of:

  • Adding other non-linear datasets
  • Adding more models
  • Implementing sophisticated feature engineering (sinusoidal features for instance)
  • Implementing a custom dataset reader with dimensionality reduction
  • Adding feature importance plots

But if you've got other ideas, I will be happy to discuss them with you.
