cerlymarco / linear-tree

Licence: MIT License
A Python library to build Model Trees with Linear Models at the leaves.

linear-tree

linear-tree also provides implementations of LinearForest and LinearBoost, inspired by the works listed in the References section.

Overview

Linear Trees combine the learning ability of Decision Trees with the predictive and explanatory power of Linear Models. As in other tree-based algorithms, the data are split according to simple decision rules. The quality of each split is evaluated in terms of gain, by fitting Linear Models in the resulting nodes. As a result, the models in the leaves are linear rather than the constant approximations used in classical Decision Trees.
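The split evaluation described above can be illustrated with a small NumPy-only sketch. It is not the library's implementation: the helper names `linear_sse` and `best_linear_split`, the squared-error loss, and the midpoint thresholds are all assumptions made for illustration.

```python
import numpy as np

def linear_sse(X, y):
    """Sum of squared errors of an ordinary least squares fit with intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def best_linear_split(X, y, feature, min_samples_leaf=5):
    """Return (threshold, gain) of the best split on one feature.

    Gain = loss of one linear model on the node minus the summed loss
    of two linear models fitted separately on the left and right children.
    """
    parent_loss = linear_sse(X, y)
    best_thr, best_gain = None, 0.0
    values = np.unique(X[:, feature])
    # Candidate thresholds: midpoints between consecutive distinct values.
    for thr in (values[:-1] + values[1:]) / 2:
        left = X[:, feature] <= thr
        n_left = int(left.sum())
        if n_left < min_samples_leaf or len(y) - n_left < min_samples_leaf:
            continue
        child_loss = linear_sse(X[left], y[left]) + linear_sse(X[~left], y[~left])
        gain = parent_loss - child_loss
        if gain > best_gain:
            best_thr, best_gain = float(thr), gain
    return best_thr, best_gain

# Toy data (hypothetical): the slope changes sign at x = 0.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.abs(X[:, 0]) + rng.normal(scale=0.01, size=200)

threshold, gain = best_linear_split(X, y, feature=0)
print(f"best threshold: {threshold:.3f}, gain: {gain:.3f}")
```

On this toy data a single line fits poorly, while two lines on either side of a threshold near zero fit almost perfectly, so the chosen split should land close to the kink.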

Linear Forests generalize the well-known Random Forests by combining them with Linear Models. The key idea is to use the strength of Linear Models to improve the nonparametric learning ability of tree-based algorithms. First, a Linear Model is fitted on the whole dataset; then a Random Forest is trained on the same dataset, using the residuals of the previous step as the target. The final predictions are the sum of the raw linear predictions and the residuals modeled by the Random Forest.
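The two steps above can be sketched directly with scikit-learn, without using linear-tree itself. This is a minimal illustration under stated assumptions: the synthetic data, the `predict` helper, and the hyperparameters are made up for the example and do not reflect the library's internals.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Hypothetical data: a linear trend plus a nonlinear component.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = 3.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + rng.normal(scale=0.05, size=300)

# Step 1: fit a linear model on the whole dataset.
linear = LinearRegression().fit(X, y)
residuals = y - linear.predict(X)

# Step 2: fit a random forest on the same features, with the residuals as target.
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, residuals)

def predict(X_new):
    """Final prediction: linear prediction plus the residual predicted by the forest."""
    return linear.predict(X_new) + forest.predict(X_new)

# Compare against the linear model alone on fresh, noise-free data.
X_test = rng.uniform(-2, 2, size=(200, 2))
y_test = 3.0 * X_test[:, 0] + np.sin(3.0 * X_test[:, 1])
mse_linear = np.mean((y_test - linear.predict(X_test)) ** 2)
mse_combined = np.mean((y_test - predict(X_test)) ** 2)
print(f"linear only: {mse_linear:.4f}, linear + forest: {mse_combined:.4f}")
```

The linear model captures the global trend, and the forest only has to learn what the linear fit leaves unexplained.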

Linear Boosting is a two-stage learning process. First, a linear model is trained on the initial dataset to obtain predictions. Second, the residuals of the previous step are modeled with a decision tree using all the available features. The tree identifies the path leading to the highest error (i.e. the worst leaf). That leaf is used to generate a new binary feature, which is added to the inputs of the first stage. The iterations continue until a stopping criterion is met.
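The loop above might be sketched as follows, using only scikit-learn. Several choices here are assumptions of the sketch rather than documented behaviour of the library: the worst leaf is taken to be the one with the largest absolute mean residual, the tree is always fitted on the original features, the stopping criterion is a fixed number of iterations, and the helper names `linear_boost_fit` and `linear_boost_transform` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

def linear_boost_fit(X, y, n_iter=5, max_depth=3, min_samples_leaf=10):
    """Iteratively add one binary 'worst leaf' feature per round."""
    X_aug = X.copy()
    trees, leaves = [], []
    for _ in range(n_iter):
        # Stage 1: linear model on the current (augmented) features.
        linear = LinearRegression().fit(X_aug, y)
        residuals = y - linear.predict(X_aug)
        # Stage 2: decision tree on the residuals, using the original features.
        tree = DecisionTreeRegressor(max_depth=max_depth,
                                     min_samples_leaf=min_samples_leaf,
                                     random_state=0).fit(X, residuals)
        leaf_ids = tree.apply(X)  # leaf index reached by each sample
        unique = np.unique(leaf_ids)
        means = np.array([residuals[leaf_ids == leaf].mean() for leaf in unique])
        worst = unique[np.argmax(np.abs(means))]  # assumed definition of "worst leaf"
        # New binary feature: 1 if the sample falls into the worst leaf.
        X_aug = np.column_stack([X_aug, (leaf_ids == worst).astype(float)])
        trees.append(tree)
        leaves.append(worst)
    # Final linear model on all generated features.
    return LinearRegression().fit(X_aug, y), trees, leaves

def linear_boost_transform(X, trees, leaves):
    """Rebuild the augmented feature matrix for new data."""
    cols = [(tree.apply(X) == leaf).astype(float) for tree, leaf in zip(trees, leaves)]
    return np.column_stack([X] + cols)

# Hypothetical data: linear trend plus a step that a plain linear model cannot capture.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = 2.0 * X[:, 0] + 5.0 * (X[:, 1] > 0.5) + rng.normal(scale=0.1, size=400)

base = LinearRegression().fit(X, y)
mse_base = np.mean((y - base.predict(X)) ** 2)

linear, trees, leaves = linear_boost_fit(X, y, n_iter=5)
X_boost = linear_boost_transform(X, trees, leaves)
mse_boost = np.mean((y - linear.predict(X_boost)) ** 2)
print(f"training MSE, linear: {mse_base:.4f}, boosted: {mse_boost:.4f}")
```

Because every generated column is a plain indicator feature, the final model stays linear and its coefficients remain directly inspectable.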

linear-tree is designed to integrate fully with scikit-learn. LinearTreeRegressor and LinearTreeClassifier are provided as scikit-learn BaseEstimator classes that build a decision tree using linear estimators. LinearForestRegressor and LinearForestClassifier use the RandomForest from sklearn to model residuals. LinearBoostRegressor and LinearBoostClassifier are also available as TransformerMixin, so they can be used in any pipeline for automated feature engineering. All the models available in sklearn.linear_model can be used as base learners.

Installation

pip install --upgrade linear-tree

The module depends on NumPy, SciPy and Scikit-Learn (>=0.23.0). Python 3.6 or above is supported.

Usage

Linear Tree Regression

from sklearn.linear_model import LinearRegression
from lineartree import LinearTreeRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearTreeRegressor(base_estimator=LinearRegression())
regr.fit(X, y)

Linear Tree Classification

from sklearn.linear_model import RidgeClassifier
from lineartree import LinearTreeClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearTreeClassifier(base_estimator=RidgeClassifier())
clf.fit(X, y)

Linear Forest Regression

from sklearn.linear_model import LinearRegression
from lineartree import LinearForestRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearForestRegressor(base_estimator=LinearRegression())
regr.fit(X, y)

Linear Forest Classification

from sklearn.linear_model import LinearRegression
from lineartree import LinearForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearForestClassifier(base_estimator=LinearRegression())
clf.fit(X, y)

Linear Boosting Regression

from sklearn.linear_model import LinearRegression
from lineartree import LinearBoostRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearBoostRegressor(base_estimator=LinearRegression())
regr.fit(X, y)

Linear Boosting Classification

from sklearn.linear_model import RidgeClassifier
from lineartree import LinearBoostClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearBoostClassifier(base_estimator=RidgeClassifier())
clf.fit(X, y)

More examples are available in the notebooks folder.

Check the API Reference to see the parameter configurations and the available methods.

Examples

Show the linear tree learning path:

[Figure: plot tree]

Linear Tree Regressor at work:

[Figure: linear tree regressor]

Linear Tree Classifier at work:

[Figure: linear tree classifier]

Extract and examine coefficients at the leaves:

[Figure: leaf coefficients]

Impact of the features automatically generated with Linear Boosting:

[Figure: linear boost importances]

Comparing predictions of Linear Forest and Random Forest:

[Figure: linear forest predictions]

References

  • Regression-Enhanced Random Forests. Haozhe Zhang, Dan Nettleton, Zhengyuan Zhu.
  • Explainable boosted linear regression for time series forecasting. Igor Ilic, Berk Gorgulu, Mucahit Cevik, Mustafa Gokce Baydogan.