sberbank-ai-lab / Lightautoml

Licence: other

LAMA - automatic model creation framework

Programming Languages

python

Labels

pytorch nlp data-science classification pipeline kaggle automl regression feature-engineering automated-machine-learning gradient-boosting stacking

Projects that are alternatives of or similar to Lightautoml

Mlbox

MLBox is a powerful Automated Machine Learning python library.

Stars: ✭ 1,199 (+511.73%)

Mutual labels: kaggle, data-science, classification, pipeline, automl, automated-machine-learning, stacking, regression

Automlpipeline.jl

A package that makes it trivial to create and evaluate machine learning pipeline architectures.

Stars: ✭ 223 (+13.78%)

Mutual labels: data-science, classification, pipeline, automl, stacking

Auto ml

[UNMAINTAINED] Automated machine learning for analytics & production

Stars: ✭ 1,559 (+695.41%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning, gradient-boosting

Mlj.jl

A Julia machine learning framework

Stars: ✭ 982 (+401.02%)

Mutual labels: data-science, classification, pipeline, stacking, regression

Tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Stars: ✭ 8,378 (+4174.49%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning, gradient-boosting

Remixautoml

R package for automation of machine learning, forecasting, feature engineering, model evaluation, model interpretation, data generation, and recommenders.

Stars: ✭ 159 (-18.88%)

Mutual labels: classification, feature-engineering, automated-machine-learning, regression

Mlr

Machine Learning in R

Stars: ✭ 1,542 (+686.73%)

Mutual labels: data-science, classification, stacking, regression

Machinejs

[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml

Stars: ✭ 412 (+110.2%)

Mutual labels: kaggle, data-science, automl, automated-machine-learning

Featuretools

An open source python library for automated feature engineering

Stars: ✭ 5,891 (+2905.61%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning

Mljar Supervised

Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀

Stars: ✭ 961 (+390.31%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning

Autodl

Automated Deep Learning without ANY human intervention. 1'st Solution for AutoDL [email protected]

Stars: ✭ 854 (+335.71%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning

Kaggle Competitions

There are plenty of courses and tutorials that can help you learn machine learning from scratch but here in GitHub, I want to solve some Kaggle competitions as a comprehensive workflow with python packages. After reading, you can use this workflow to solve other real problems and use it as a template.

Stars: ✭ 86 (-56.12%)

Mutual labels: kaggle, data-science, classification, feature-engineering

Nni

An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Stars: ✭ 10,698 (+5358.16%)

Mutual labels: data-science, automl, feature-engineering, automated-machine-learning

Machine Learning Workflow With Python

This is a comprehensive ML techniques with python: Define the Problem- Specify Inputs & Outputs- Data Collection- Exploratory data analysis -Data Preprocessing- Model Design- Training- Evaluation

Stars: ✭ 157 (-19.9%)

Mutual labels: kaggle, feature-engineering, gradient-boosting

Auptimizer

An automatic ML model optimization tool.

Stars: ✭ 166 (-15.31%)

Mutual labels: data-science, automl, automated-machine-learning

Automl alex

State-of-the art Automated Machine Learning python library for Tabular Data

Stars: ✭ 132 (-32.65%)

Mutual labels: data-science, automl, stacking

Interactive machine learning

IPython widgets, interactive plots, interactive machine learning

Stars: ✭ 140 (-28.57%)

Mutual labels: data-science, classification, regression

Evalml

EvalML is an AutoML library written in python.

Stars: ✭ 145 (-26.02%)

Mutual labels: data-science, automl, feature-engineering

Data Science Toolkit

Collection of stats, modeling, and data science tools in Python and R.

Stars: ✭ 169 (-13.78%)

Mutual labels: data-science, classification, regression

Neuroflow

Artificial Neural Networks for Scala

Stars: ✭ 105 (-46.43%)

Mutual labels: data-science, classification, regression

LightAutoML - automatic model creation framework

LightAutoML, a project from the Sberbank AI Lab AutoML group, is a framework for automatic creation of classification and regression models.

Currently available tasks:

binary classification
multiclass classification
regression

Currently we work with datasets in which each row is an object with its own features and target. Support for multitable datasets and sequences is under construction :)

Note: for automatic creation of interpretable models we use the AutoWoE library, which is also developed by our group.

Authors: Ryzhkov Alexander, Vakhrushev Anton, Simakov Dmitry, Bunakov Vasilii, Damdinov Rinchin, Shvets Pavel, Kirilin Alexander

LightAutoML video guides:

LightAutoML webinar for Sberloga community (Ryzhkov Alexander, Simakov Dmitry)
LightAutoML framework general overview, benchmarks and advantages for business (Ryzhkov Alexander)
LightAutoML practical guide - ML pipeline presets overview (Simakov Dmitry)

See the Documentation of LightAutoML.

Installation

Installation via pip from PyPI

To install the LightAutoML framework on your machine:

pip install -U lightautoml

Installation from sources with virtual environment creation

To create a dedicated virtual environment for LightAutoML, install the python3-venv system package and run the following command, which creates the lama_venv virtual environment with LightAutoML inside:

bash build_package.sh

To check this variant of installation and run all the demo scripts, use the command below:

bash test_package.sh

Docs generation

To generate documentation for the LightAutoML framework, use the command below (it relies on the virtual environment created in the installation-from-sources step):

bash build_docs.sh

The built official documentation for LightAutoML is available here.

Usage examples

Several tutorials show how to work with LightAutoML:

Tutorial_1. Create your own pipeline.ipynb - shows how to create your own pipeline from specified blocks: pipelines for feature generation and feature selection, ML algorithms, hyperparameter optimization, etc.
Tutorial_2. AutoML pipeline preset.ipynb - shows how to use LightAutoML presets (both standalone and time-utilization variants) for solving ML tasks on tabular data. With presets you can solve binary classification, multiclass classification and regression tasks by changing the first argument of Task.
Tutorial_3. Multiclass task.ipynb - shows how to build an ML pipeline for a multiclass task by hand.
Tutorial_4. SQL data source for pipeline preset.ipynb - shows how to use LightAutoML presets (both standalone and time-utilization variants) for solving ML tasks on tabular data from an SQL database instead of CSV.
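As a rough illustration of the preset workflow described in Tutorial_2, the sketch below wraps a tabular preset in a helper. The helper name run_tabular_preset, its parameters, and the TASK_NAMES mapping are hypothetical and do not come from the tutorials; the import paths and the fit_predict / predict calls follow the LightAutoML API as we understand it and may differ between versions, so check them against the documentation.

```python
# Task names passed as the first argument of Task for the three supported problems.
TASK_NAMES = {
    "binary classification": "binary",
    "multiclass classification": "multiclass",
    "regression": "reg",
}


def run_tabular_preset(train_df, test_df, target_col,
                       problem="binary classification", timeout=300):
    """Fit a tabular preset and return (out-of-fold, test) predictions.

    train_df and test_df are pandas DataFrames. The function name and its
    parameters are illustrative assumptions, not part of LightAutoML.
    """
    # Imported lazily so this file can be loaded without LightAutoML installed.
    from lightautoml.automl.presets.tabular_presets import TabularAutoML
    from lightautoml.tasks import Task

    # Changing the first argument of Task switches the problem type.
    task = Task(TASK_NAMES[problem])
    automl = TabularAutoML(task=task, timeout=timeout)  # timeout in seconds

    # roles tells the preset which column holds the target.
    oof_pred = automl.fit_predict(train_df, roles={"target": target_col})
    test_pred = automl.predict(test_df)
    return oof_pred.data, test_pred.data
```

Calling run_tabular_preset(train_df, test_df, "TARGET", problem="regression") would, under these assumptions, train a regression preset on a hypothetical TARGET column.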

Each tutorial includes a step that enables the Profiler and finishes with a Profiler run, which collects the distribution of call times for each function and presents it as an interactive HTML report: the top of the report shows the total run time, followed by an interactive tree of calls with the percentage of total time spent in each subtree.

Important 1: in production there is no need to use the profiler (it increases run time and memory consumption), so please do not turn it on. It is off by default.

Important 2: to look at this report after the run, comment out the last line of the demo, which deletes the report.
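To give a feel for what such a report measures, here is a minimal sketch that uses Python's standard cProfile and pstats modules as a stand-in. It is not the LightAutoML Profiler and produces a plain-text table rather than an interactive HTML tree; the functions slow_sum, pipeline and profile_call are hypothetical names for illustration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Toy workload standing in for a model-fitting step."""
    return sum(i * i for i in range(n))


def pipeline():
    """Toy 'pipeline' that calls the workload twice."""
    return slow_sum(50_000) + slow_sum(100_000)


def profile_call(func):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so callers appear above the functions they call.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_call(pipeline)
print(report)
```

As with the LightAutoML Profiler, collecting these timings adds overhead, which is why profiling is best kept out of production runs.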

Kaggle kernel examples of LightAutoML usage:

For more examples, the tests folder contains different LightAutoML usage scenarios:

demo0.py - building an ML pipeline from blocks, then fitting and predicting with the pipeline itself.
demo1.py - creating several ML pipelines (using an importance-based cutoff feature selector) to build 2-level stacking with the AutoML class.
demo2.py - creating several ML pipelines (using an iterative feature selection algorithm) to build 2-level stacking with the AutoML class.
demo3.py - creating several ML pipelines (using a combination of cutoff and iterative feature selection algorithms) to build 2-level stacking with the AutoML class.
demo4.py - creating classification and regression tasks for AutoML with loss and evaluation metric setup.
demo5.py - 2-level stacking with the AutoML class, using different algorithms on the first level, including LGBM, Linear and LinearL1.
demo6.py - AutoML with nested CV.
demo7.py - AutoML preset usage for tabular datasets (a predefined AutoML pipeline structure and a simple interface, with no need to build from blocks).
demo8.py - creating pipelines from blocks to build AutoML for a multiclass classification task.
demo9.py - AutoML time-utilization preset usage for tabular datasets (a predefined AutoML pipeline structure and a simple interface, with no need to build from blocks).
demo10.py - creating pipelines from blocks (including CatBoost) to build AutoML for a multiclass classification task.
demo11.py - AutoML NLP preset usage for tabular datasets with text columns.
demo12.py - AutoML tabular preset usage with a custom validation scheme and multiprocessed inference.
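Several of the demos above build 2-level stacking. The following pure-Python sketch illustrates the general idea only and is not LightAutoML's implementation: first-level models produce out-of-fold predictions, and a second-level model is fitted on those predictions. The toy models (a mean predictor and a one-feature line) stand in for real algorithms such as LGBM or Linear, and all function names here are hypothetical.

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous validation folds."""
    bounds = [n * i // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def fit_mean(xs, ys):
    """First-level model 1: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """First-level model 2: one-feature least-squares line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)


def oof_predictions(fit, xs, ys, k):
    """Predict each row with a model that never saw it during training."""
    oof = [0.0] * len(xs)
    for val_idx in kfold_indices(len(xs), k):
        held_out = set(val_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        for i in val_idx:
            oof[i] = model(xs[i])
    return oof


def fit_blend(p1, p2, ys):
    """Second level: weight w minimising squared error of w*p1 + (1-w)*p2."""
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return num / den if den else 0.5


# Toy data: an exactly linear target, so the line model should dominate.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

oof_mean = oof_predictions(fit_mean, xs, ys, k=5)
oof_line = oof_predictions(fit_line, xs, ys, k=5)
w_mean = fit_blend(oof_mean, oof_line, ys)  # weight given to the mean model
print(f"weight on mean model: {w_mean:.3f}, on line model: {1 - w_mean:.3f}")
```

Fitting the second level on out-of-fold predictions, rather than on predictions for rows the first-level models were trained on, keeps the blend weights from rewarding overfitted first-level models.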

Contributing to LightAutoML

If you are interested in contributing to LightAutoML, please read the Contributing Guide to get started.

Questions / Issues / Suggestions

Write a message to us:

Alexander Ryzhkov (email: [email protected], telegram: @RyzhkovAlex)
Anton Vakhrushev (email: [email protected])

Stars: ✭ 196