InfiniteBoost

Code for the paper
InfiniteBoost: building infinite ensembles with gradient descent (arXiv:1706.01109).
A. Rogozhnikov, T. Likhomanenko

Description

InfiniteBoost is an approach to building ensembles that combines the best sides of random forests and gradient boosting.

Trees in the ensemble compensate for mistakes made by previous trees (as in gradient boosting), but thanks to a modified scheme of weighting tree contributions, the ensemble converges to a limit and thus avoids overfitting (just as a random forest does).
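The idea above can be sketched for squared loss: each new tree fits the residuals of the current ensemble (as in gradient boosting), but the ensemble prediction is a fixed capacity times the average of all trees, so adding trees refines an average instead of growing an unbounded sum. This is a toy illustration under assumed simplifications (squared loss, fixed capacity, no subsampling, no capacity search), not the repository's implementation; `infiniteboost_sketch` and `capacity` are names invented here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def infiniteboost_sketch(X, y, n_trees=100, capacity=10.0, max_depth=3):
    """Toy InfiniteBoost-style loop for squared loss (illustrative only).

    Each tree fits the residual (negative gradient) of the current
    ensemble, as in gradient boosting, but the prediction is
    capacity * average(tree predictions), so it converges to a limit
    as n_trees grows, like a random forest.
    """
    F = np.zeros(len(y))          # current ensemble prediction
    sum_pred = np.zeros(len(y))   # running sum of tree predictions
    trees = []
    for k in range(1, n_trees + 1):
        residual = y - F          # negative gradient of 0.5 * (y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=k)
        tree.fit(X, residual)
        trees.append(tree)
        sum_pred += tree.predict(X)
        F = capacity * sum_pred / k   # averaged, capacity-scaled contributions
    return trees, F
```

For perfect trees, the fixed point of this update is F = capacity / (1 + capacity) * y, so the ensemble stabilizes as more trees are added rather than drifting; the paper additionally searches for the capacity automatically.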

Left: InfiniteBoost with automated capacity search vs. gradient boosting with different learning rates (shrinkages); right: random forest vs. InfiniteBoost with small capacities.

More comparison plots are available in the research notebooks and in the research/plots directory.

Reproducing research

Research is performed in Jupyter notebooks (if you're not familiar with them, read about why Jupyter notebooks are awesome).

You can use the docker image arogozhnikov/pmle:0.01 from Docker Hub. The Dockerfile is stored in this repository (Ubuntu 16 + basic sklearn stuff).

To run the environment (sudo is needed on Linux):

sudo docker run -it --rm -v /YourMountedDirectory:/notebooks -p 8890:8890 arogozhnikov/pmle:0.01

(and open localhost:8890 in your browser).

InfiniteBoost package

A minimalistic, self-written implementation of trees, as used in the experiments against boosting.

This implementation is based on the trees from the scikit-learn package and was used for the comparisons with random forest.

The code is written in Python 2 (expected to work with Python 3, but not tested), with some performance-critical functions in Fortran, so you need gfortran and OpenMP installed before installing the package (or simply use the docker image).

pip install numpy
pip install .
# testing (optional)
cd tests && nosetests .

You can use the tree implementation from this package in your own experiments; in that case, please cite the InfiniteBoost paper.
