
google / yggdrasil-decision-forests

License: Apache-2.0
A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models.

Programming Languages

C++
36643 projects - #6 most used programming language
Starlark
911 projects

Projects that are alternatives to or similar to yggdrasil-decision-forests

Chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and AdaBoost w/ categorical feature support for Python
Stars: ✭ 176 (+12.82%)
Mutual labels:  random-forest, cart, decision-trees, gradient-boosting
Awesome Decision Tree Papers
A collection of research papers on decision, classification and regression trees with implementations.
Stars: ✭ 1,908 (+1123.08%)
Mutual labels:  random-forest, cart, gradient-boosting
decision-trees-for-ml
Building Decision Trees From Scratch In Python
Stars: ✭ 61 (-60.9%)
Mutual labels:  random-forest, cart, gradient-boosting
cheapml
Machine Learning algorithms coded from scratch
Stars: ✭ 17 (-89.1%)
Mutual labels:  random-forest, gradient-boosting
supervised-machine-learning
This repo contains regression and classification projects. Examples: development of predictive models for comments on social media websites; building classifiers to predict outcomes in sports competitions; churn analysis; prediction of clicks on online ads; analysis of the opioids crisis and an analysis of retail store expansion strategies using…
Stars: ✭ 34 (-78.21%)
Mutual labels:  random-forest, decision-trees
R-stats-machine-learning
Misc Statistics and Machine Learning codes in R
Stars: ✭ 33 (-78.85%)
Mutual labels:  random-forest, decision-trees
interpretable-ml
Techniques & resources for training interpretable ML models, explaining ML models, and debugging ML models.
Stars: ✭ 17 (-89.1%)
Mutual labels:  decision-trees, interpretability
hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
Stars: ✭ 110 (-29.49%)
Mutual labels:  ml, interpretability
rfvis
A tool for visualizing the structure and performance of Random Forests 🌳
Stars: ✭ 20 (-87.18%)
Mutual labels:  random-forest, decision-trees
aws-machine-learning-university-dte
Machine Learning University: Decision Trees and Ensemble Methods
Stars: ✭ 119 (-23.72%)
Mutual labels:  random-forest, decision-trees
handson-ml
Jupyter notebooks containing the examples and exercises from the book "Hands-On Machine Learning".
Stars: ✭ 285 (+82.69%)
Mutual labels:  random-forest, gradient-boosting
Bike-Sharing-Demand-Kaggle
Top 5th percentile solution to the Kaggle knowledge problem - Bike Sharing Demand
Stars: ✭ 33 (-78.85%)
Mutual labels:  random-forest, decision-trees
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+1920.51%)
Mutual labels:  random-forest, decision-trees
stackgbm
🌳 Stacked Gradient Boosting Machines
Stars: ✭ 24 (-84.62%)
Mutual labels:  decision-trees, gradient-boosting
Infiniteboost
InfiniteBoost: building infinite ensembles with gradient descent
Stars: ✭ 180 (+15.38%)
Mutual labels:  random-forest, gradient-boosting
goscore
Go Scoring API for PMML
Stars: ✭ 85 (-45.51%)
Mutual labels:  random-forest, decision-trees
ProtoTree
ProtoTrees: Neural Prototype Trees for Interpretable Fine-grained Image Recognition, published at CVPR2021
Stars: ✭ 47 (-69.87%)
Mutual labels:  decision-trees, interpretability
Machine Learning Models
Decision Trees, Random Forest, Dynamic Time Warping, Naive Bayes, KNN, Linear Regression, Logistic Regression, Mixture Of Gaussian, Neural Network, PCA, SVD, Gaussian Naive Bayes, Fitting Data to Gaussian, K-Means
Stars: ✭ 160 (+2.56%)
Mutual labels:  random-forest, decision-trees
Machine Learning Is All You Need
🔥🌟《Machine Learning 格物志》: ML + DL + RL basic code and notes with sklearn, PyTorch, TensorFlow, Keras and, most importantly, from scratch! 💪 This repository is ALL you need!
Stars: ✭ 173 (+10.9%)
Mutual labels:  random-forest, decision-trees
deep-explanation-penalization
Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" https://arxiv.org/abs/1909.13584
Stars: ✭ 110 (-29.49%)
Mutual labels:  ml, interpretability

Yggdrasil Decision Forests (YDF) is a collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models. The library is developed in C++ and is available in C++, as a CLI (command-line interface, i.e., shell commands), and in TensorFlow under the name TensorFlow Decision Forests (TF-DF).

Developing models in TF-DF and productionizing them (possibly including re-training) in C++ with YDF allows for both flexible, fast development and efficient, safe serving.
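
As an illustration, the following is a minimal sketch of this serving path in C++ with the YDF serving API (model loading, engine compilation, inference). The model path "my_model" and the numerical feature "age" are placeholders, and the exact function signatures should be checked against the YDF version in use:

// Load a model trained with YDF or exported by TF-DF.
std::unique_ptr<model::AbstractModel> model;
QCHECK_OK(model::LoadModel("my_model", &model));
// Compile the model into an engine optimized for fast inference.
auto engine = model->BuildFastEngine().value();
// Allocate a batch of examples and set their feature values.
// "age" is a placeholder for a numerical feature of the model.
auto examples = engine->AllocateExamples(/*num_examples=*/1);
examples->FillMissing(engine->features());
const auto age = engine->features().GetNumericalFeatureId("age").value();
examples->SetNumerical(/*example_idx=*/0, age, 35.f, engine->features());
// Run inference; "predictions" holds one value per example
// (and per output dimension).
std::vector<float> predictions;
engine->Predict(*examples, /*num_examples=*/1, &predictions);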

Usage example

Train, evaluate and benchmark the speed of a model in a few shell lines with the CLI interface:

# Training configuration
echo 'label:"my_label" learner:"RANDOM_FOREST" ' > config.pbtxt
# Scan the dataset
infer_dataspec --dataset="csv:train.csv" --output="spec.pbtxt"
# Train a model
train --dataset="csv:train.csv" --dataspec="spec.pbtxt" --config="config.pbtxt" --output="my_model"
# Evaluate the model
evaluate --dataset="csv:test.csv" --model="my_model" > evaluation.txt
# Benchmark the speed of the model
benchmark_inference --dataset="csv:test.csv" --model="my_model" > benchmark.txt

(see the examples/beginner.sh for more details)

or use the C++ interface:

auto dataset_path = "csv:train.csv";
// Training configuration
TrainingConfig train_config;
train_config.set_learner("RANDOM_FOREST");
train_config.set_task(Task::CLASSIFICATION);
train_config.set_label("my_label");
// Scan the dataset
DataSpecification spec;
CreateDataSpec(dataset_path, false, {}, &spec);
// Train a model
std::unique_ptr<AbstractLearner> learner;
GetLearner(train_config, &learner);
auto model = learner->Train(dataset_path, spec);
// Export the model
SaveModel("my_model", model.get());

(see the examples/beginner.cc for more details)

or use the Keras/Python interface of TensorFlow Decision Forests:

import tensorflow_decision_forests as tfdf
import pandas as pd
# Load the dataset in a Pandas dataframe.
train_df = pd.read_csv("project/train.csv")
# Convert the dataset into a TensorFlow dataset.
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="my_label")
# Train the model
model = tfdf.keras.RandomForestModel()
model.fit(train_ds)
# Export a SavedModel.
model.save("project/model")

(see TensorFlow Decision Forests for more details)
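
The C++ API can also evaluate a model, mirroring the evaluate step of the CLI example. A minimal sketch, reusing spec and model from the C++ example above and assuming a csv:test.csv test dataset; the names follow examples/beginner.cc and should be checked against the version in use:

// Load the test dataset in memory.
dataset::VerticalDataset test_dataset;
QCHECK_OK(LoadVerticalDataset("csv:test.csv", spec, &test_dataset));
// Evaluate the model (accuracy, AUC, confusion matrix, etc.).
utils::RandomEngine rnd;
metric::proto::EvaluationOptions options;
options.set_task(Task::CLASSIFICATION);
const auto evaluation = model->Evaluate(test_dataset, options, &rnd);
// Export the evaluation to a human-readable text report.
std::string report;
metric::AppendTextReport(evaluation, &report);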

Documentation & Resources

The user manual, the developer manual, examples, and additional documentation are available in the project repository.

Installation from pre-compiled binaries

Download one of the build releases, and then run examples/beginner.{sh,bat}.

Installation from Source

On Linux, install Bazel and run:

git clone https://github.com/google/yggdrasil-decision-forests.git
cd yggdrasil-decision-forests
bazel build //yggdrasil_decision_forests/cli:all --config=linux_cpp17 --config=linux_avx2

# Then, run the example:
examples/beginner.sh

See the installation page for more details, troubleshooting and alternative installation solutions.

Yggdrasil was successfully compiled and run on:

  • Linux Debian 5
  • Windows 10
  • macOS 10
  • Raspberry Pi 4 Rev 2

Inference of Yggdrasil models is also available on:

  • [Experimental; No support] Arduino Uno R3 (see project)

Note: Tell us if you were able to compile and run Yggdrasil on any other architecture :).

Long-term support commitments

Inference and serving

  • The serving code is isolated from the rest of the framework (i.e., training, evaluation) and has minimal dependencies.
  • Changes to serving-related code are guaranteed to be backward compatible.
  • Model inference is deterministic: the same example is guaranteed to yield the same prediction.
  • Learners and models are extensively tested, including integration testing on real datasets. No execution path in the serving code crashes as a result of an error; instead, in case of failure (e.g., a malformed input example), the inference code returns a util::Status (see the sketch after this list).
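
For example, a failure to load or run a model surfaces as a status rather than a crash. A minimal sketch, assuming a "my_model" model directory; model::LoadModel returns the status object mentioned above:

absl::Status LoadModelForServing(std::unique_ptr<model::AbstractModel>* model) {
  // Returns a non-OK status (instead of crashing) if, e.g., the model
  // files are missing or corrupted.
  const absl::Status status = model::LoadModel("my_model", model);
  if (!status.ok()) {
    LOG(ERROR) << "Cannot load the model: " << status.message();
  }
  return status;
}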

Training

  • The semantics of hyper-parameters are never modified.
  • The default values of hyper-parameters are never modified.
  • The default value of a newly introduced hyper-parameter is always set such that the hyper-parameter is effectively disabled (hyper-parameters can also be pinned explicitly; see the sketch after this list).
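
For example, a training pipeline that must not depend on default values can pin the learner-specific hyper-parameters explicitly in the training configuration. A minimal sketch for the Random Forest learner; the extension and field names follow the YDF proto definitions and should be verified against the version in use:

TrainingConfig train_config;
train_config.set_learner("RANDOM_FOREST");
train_config.set_task(Task::CLASSIFICATION);
train_config.set_label("my_label");
// Pin learner-specific hyper-parameters explicitly instead of
// relying on their (stable) defaults.
auto* rf_config = train_config.MutableExtension(
    model::random_forest::proto::random_forest_config);
rf_config->set_num_trees(500);
rf_config->set_winner_take_all_inference(true);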

Quality Assurance

The following mechanisms will be put in place to ensure the quality of the library:

  • Peer-reviewing.
  • Unit testing.
  • Training benchmarks with ranges of acceptable evaluation metrics.
  • Sanitizers.

Contributing

Contributions to TensorFlow Decision Forests and Yggdrasil Decision Forests are welcome. If you want to contribute, make sure to review the user manual, developer manual and contribution guidelines.

Credits

TensorFlow Decision Forests was developed by:

  • Mathieu Guillame-Bert (gbm AT google DOT com)
  • Jan Pfeifer (janpf AT google DOT com)
  • Sebastian Bruch (sebastian AT bruch DOT io)
  • Arvind Srinivasan (arvnd AT google DOT com)

License

Apache License 2.0

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].