InfiniteBoost

Code for the paper
InfiniteBoost: building infinite ensembles with gradient descent (arXiv:1706.01109).
A. Rogozhnikov, T. Likhomanenko

Description

InfiniteBoost is an approach to building ensembles that combines the best sides of random forests and gradient boosting.

Trees in the ensemble compensate for mistakes made by previous trees (as in gradient boosting), but thanks to a modified scheme of weighting tree contributions, the ensemble converges to a limit and thus avoids overfitting (just as a random forest does).
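The idea above can be sketched for squared loss: each new tree fits the residuals of the current ensemble (as in gradient boosting), but the ensemble prediction is a fixed capacity times the average of all trees, so adding trees refines an average instead of growing an unbounded sum. This is a toy illustration under assumed simplifications (squared loss, fixed capacity, no subsampling, no capacity search), not the repository's implementation; `infiniteboost_sketch` and `capacity` are names invented here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def infiniteboost_sketch(X, y, n_trees=100, capacity=10.0, max_depth=3):
    """Toy InfiniteBoost-style loop for squared loss (illustrative only).

    Each tree fits the residual (negative gradient) of the current
    ensemble, as in gradient boosting, but the prediction is
    capacity * average(tree predictions), so it converges to a limit
    as n_trees grows, like a random forest.
    """
    F = np.zeros(len(y))          # current ensemble prediction
    sum_pred = np.zeros(len(y))   # running sum of tree predictions
    trees = []
    for k in range(1, n_trees + 1):
        residual = y - F          # negative gradient of 0.5 * (y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=k)
        tree.fit(X, residual)
        trees.append(tree)
        sum_pred += tree.predict(X)
        F = capacity * sum_pred / k   # averaged, capacity-scaled contributions
    return trees, F
```

For perfect trees, the fixed point of this update is F = capacity / (1 + capacity) * y, so the ensemble stabilizes as more trees are added rather than drifting; the paper additionally searches for the capacity automatically.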

Left: InfiniteBoost with automated capacity search vs. gradient boosting with different learning rates (shrinkages); right: random forest vs. InfiniteBoost with small capacities.

More comparison plots are available in the research notebooks and in the research/plots directory.

Reproducing research

Research is performed in Jupyter notebooks (if you're not familiar with them, read about why Jupyter notebooks are awesome).

You can use the docker image arogozhnikov/pmle:0.01 from Docker Hub. The Dockerfile is stored in this repository (Ubuntu 16 + basic sklearn stuff).

To run the environment (sudo is needed on Linux):

sudo docker run -it --rm -v /YourMountedDirectory:/notebooks -p 8890:8890 arogozhnikov/pmle:0.01

(and open localhost:8890 in your browser).

InfiniteBoost package

A minimalistic, self-written implementation of trees, as used in the experiments against boosting.

This implementation is based on the trees from the scikit-learn package and was used for the comparisons with random forest.

The code is written in Python 2 (expected to work with Python 3, but not tested), with some performance-critical functions in Fortran, so you need gfortran and OpenMP installed before installing the package (or simply use the docker image).

pip install numpy
pip install .
# testing (optional)
cd tests && nosetests .

You can use the tree implementation from this package in your own experiments; in that case, please cite the InfiniteBoost paper.
