
pfnet-research / autogbt-alt

License: MIT
An experimental Python package that reimplements AutoGBT using LightGBM and Optuna.

Programming Languages

Python, Shell

Projects that are alternatives to or similar to autogbt-alt

Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (+157.89%)
Mutual labels:  kaggle, automl, gradient-boosting
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+1477.63%)
Mutual labels:  kaggle, lightgbm, automl
Lightgbm
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Stars: ✭ 13,293 (+17390.79%)
Mutual labels:  kaggle, lightgbm, gradient-boosting
Apartment-Interest-Prediction
Predict people's interest in renting specific NYC apartments. The challenge combines structured data, geolocation, time data, free text and images.
Stars: ✭ 17 (-77.63%)
Mutual labels:  kaggle, lightgbm, gradient-boosting
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+1951.32%)
Mutual labels:  lightgbm, automl, gradient-boosting
Kaggler
Code for Kaggle Data Science Competitions
Stars: ✭ 614 (+707.89%)
Mutual labels:  kaggle, automl
Benchmarks
Comparison tools
Stars: ✭ 139 (+82.89%)
Mutual labels:  kaggle, lightgbm
Machine Learning Workflow With Python
A comprehensive walkthrough of ML techniques with Python: define the problem, specify inputs & outputs, data collection, exploratory data analysis, data preprocessing, model design, training, and evaluation.
Stars: ✭ 157 (+106.58%)
Mutual labels:  kaggle, gradient-boosting
Open Solution Home Credit
Open solution to the Home Credit Default Risk challenge 🏡
Stars: ✭ 397 (+422.37%)
Mutual labels:  kaggle, lightgbm
Kaggle Competition Favorita
5th place solution for Kaggle competition Favorita Grocery Sales Forecasting
Stars: ✭ 169 (+122.37%)
Mutual labels:  kaggle, lightgbm
Chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and AdaBoost, with categorical feature support, for Python
Stars: ✭ 176 (+131.58%)
Mutual labels:  kaggle, gradient-boosting
Hungabunga
HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict!
Stars: ✭ 614 (+707.89%)
Mutual labels:  kaggle, automl
fast retraining
Show how to perform fast retraining with LightGBM in different business cases
Stars: ✭ 56 (-26.32%)
Mutual labels:  kaggle, lightgbm
Machinejs
[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
Stars: ✭ 412 (+442.11%)
Mutual labels:  kaggle, automl
decision-trees-for-ml
Building Decision Trees From Scratch In Python
Stars: ✭ 61 (-19.74%)
Mutual labels:  lightgbm, gradient-boosting
kaggle-recruit-restaurant
🏆 Kaggle 8th place solution
Stars: ✭ 102 (+34.21%)
Mutual labels:  kaggle, lightgbm
MSDS696-Masters-Final-Project
Earthquake Prediction Challenge with LightGBM and XGBoost
Stars: ✭ 58 (-23.68%)
Mutual labels:  kaggle, lightgbm
HumanOrRobot
a solution for competition of kaggle `Human or Robot`
Stars: ✭ 16 (-78.95%)
Mutual labels:  kaggle, lightgbm
Open Solution Mapping Challenge
Open solution to the Mapping Challenge 🌎
Stars: ✭ 291 (+282.89%)
Mutual labels:  kaggle, lightgbm
docker-kaggle-ko
A Docker image dedicated to machine learning/deep learning (PyTorch, TensorFlow). It adds Korean fonts, Korean NLP packages (konlpy), morphological analyzers, timezone configuration, and more.
Stars: ✭ 46 (-39.47%)
Mutual labels:  kaggle, lightgbm

About

This is an experimental Python package that reimplements AutoGBT using LightGBM and Optuna. AutoGBT is an automatically tuned machine learning classifier that won first prize at the NeurIPS'18 AutoML Challenge. AutoGBT has the following features:

  • Automatic Hyperparameter Tuning: the hyperparameters of LightGBM are automatically optimized,
  • Automatic Feature Engineering: simple feature engineering is applied for categorical and datetime features, and
  • Automatic Sampling: data rows are subsampled to handle imbalanced and large datasets.

This implementation differs from the original AutoGBT in three ways:

  1. it uses Optuna instead of Hyperopt for the hyperparameter tuning of LightGBM,
  2. it optimizes the k-fold cross-validation AUC score (see the sketch below), and
  3. it provides a simplified scikit-learn-like API.
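
As a rough illustration of points 1 and 2, here is a minimal, self-contained sketch of tuning LightGBM with Optuna against a k-fold cross-validation AUC objective. This is not the package's actual internals; the parameter names and search ranges below are illustrative assumptions.

import lightgbm as lgb
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    # Illustrative search space; the real search space lives inside autogbt.
    params = {
        'num_leaves': trial.suggest_int('num_leaves', 16, 128),
        'learning_rate': trial.suggest_float('learning_rate', 1e-3, 0.3, log=True),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
    }
    model = lgb.LGBMClassifier(**params)
    # The objective is the mean k-fold cross-validation AUC.
    return cross_val_score(model, X, y, cv=5, scoring='roc_auc').mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)
print('best CV AUC: %.3f' % study.best_value)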

Installation

$ pip install git+https://github.com/pfnet-research/autogbt-alt.git

or

$ pip install git+ssh://[email protected]/pfnet-research/autogbt-alt.git

Usage

Basic Usage: LightGBM with Automatic Hyperparameter Tuning

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from autogbt import AutoGBTClassifier

# Load a small binary-classification dataset and hold out 10% for validation.
X, y = load_breast_cancer(return_X_y=True)
train_X, valid_X, train_y, valid_y = train_test_split(X, y, test_size=0.1)

# Fit with automatic hyperparameter tuning; best_score is the tuned
# k-fold cross-validation AUC.
model = AutoGBTClassifier()
model.fit(train_X, train_y)
print('valid AUC: %.3f' % (roc_auc_score(valid_y, model.predict(valid_X))))
print('CV AUC: %.3f' % (model.best_score))

Feature Engineering

from autogbt import Preprocessor

# Continues the example above: applies the package's simple feature
# engineering for categorical and datetime features. train_frac and
# test_frac presumably control row-sampling fractions during preprocessing.
preprocessor = Preprocessor(train_frac=0.5, test_frac=0.5)
train_X, valid_X, train_y = preprocessor.transform(train_X, valid_X, train_y)
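
The transformed data can then be fed to the classifier as usual. A short continuation, reusing the variable names from the snippets above:

model = AutoGBTClassifier()
model.fit(train_X, train_y)
print('valid AUC: %.3f' % roc_auc_score(valid_y, model.predict(valid_X)))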

Training with Sampling

from autogbt import TrainDataSampler

# Subsample rows during training and tuning to cope with large or
# imbalanced datasets. train_X and train_y follow from the snippets
# above; test_X is any held-out feature matrix.
sampler = TrainDataSampler(train_frac=0.5, valid_frac=0.5)
model = AutoGBTClassifier(sampler=sampler)
model.fit(train_X, train_y)
model.predict(test_X)
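
The sampler's internals are not documented here, but the train_frac/valid_frac parameters suggest plain random row subsampling. A self-contained sketch of that idea (illustrative only, not the actual TrainDataSampler implementation):

import numpy as np

def subsample_rows(X, y, frac, seed=0):
    # Keep a random fraction of rows, e.g. to shrink a large training set.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=int(len(X) * frac), replace=False)
    return X[idx], y[idx]

small_X, small_y = subsample_rows(train_X, train_y, frac=0.5)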

Experimental Evaluation

Please see the benchmark directory for details.

Comparison against Vanilla XGBoost and LightGBM

The default values are used for all hyperparameters of AutoGBT, XGBoost and LightGBM.
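
For context, a vanilla baseline of this kind can be sketched as follows, here using the toy dataset from the Usage section rather than the benchmark datasets (the actual benchmark scripts are in the benchmark directory):

from lightgbm import LGBMClassifier
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

# Default hyperparameters, scored by k-fold cross-validation AUC.
for name, model in [('LightGBM', LGBMClassifier()), ('XGBoost', XGBClassifier())]:
    auc = cross_val_score(model, X, y, cv=5, scoring='roc_auc').mean()
    print('%s CV AUC: %.3f' % (name, auc))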

Airline Dataset

model      duration [s]         CV AUC
AutoGBT    6515.254±340.231     0.900±0.001
XGBoost    78.561±7.265         0.872±0.000
LightGBM   34.000±2.285         0.891±0.000

Amazon Challenge

model      duration [s]      CV AUC
AutoGBT    359.834±29.188    0.832±0.002
XGBoost    2.558±0.661       0.749±0.002
LightGBM   1.789±0.165       0.834±0.002

Avazu CTR Prediction

model      duration [s]         CV AUC
AutoGBT    20322.601±676.702    0.744±0.000
XGBoost    OoM                  OoM
LightGBM   OoM                  OoM

(OoM: the run failed with an out-of-memory error.)

Bank Marketing Data Set

model      duration [s]     CV AUC
AutoGBT    372.090±32.857   0.925±0.001
XGBoost    2.683±0.204      0.912±0.001
LightGBM   2.406±0.236      0.927±0.001

Parameter Comparison

Performance was measured for various train_frac and n_trials settings; see the benchmark directory for the full results. A sketch of such a sweep is shown below.
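
A minimal sketch of the sweep, assuming AutoGBTClassifier accepts an n_trials argument (as the benchmark parameters suggest) and reusing train_X and train_y from the Usage section:

from autogbt import AutoGBTClassifier, TrainDataSampler

# Grid over the sampling fraction and the number of tuning trials.
for train_frac in (0.25, 0.5, 1.0):
    for n_trials in (10, 20, 40):
        sampler = TrainDataSampler(train_frac=train_frac, valid_frac=0.5)
        model = AutoGBTClassifier(n_trials=n_trials, sampler=sampler)
        model.fit(train_X, train_y)
        print('train_frac=%.2f n_trials=%d CV AUC=%.3f'
              % (train_frac, n_trials, model.best_score))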

Testing

$ ./test.sh

Reference

Jobin Wilson, Amit Kumar Meher, Bivin Vinodkumar Bindu, Manoj Sharma, Vishakha Pareek, Santanu Chaudhury, and Brejesh Lall. AutoGBT: Automatically Optimized Gradient Boosting Trees for Classifying Large Volume High Cardinality Data Streams under Concept-Drift. 2018. https://github.com/flytxtds/AutoGBT
