yandex / Rep

Licence: other
Machine Learning toolbox for Humans

Projects that are alternatives to or similar to Rep

Kalman
Some Python Implementations of the Kalman Filter
Stars: ✭ 619 (-2.37%)
Mutual labels:  jupyter-notebook
Tensorflow Workshop
This repo contains materials for use in a TensorFlow workshop.
Stars: ✭ 628 (-0.95%)
Mutual labels:  jupyter-notebook
Bokeh Notebooks
Interactive Web Plotting with Bokeh in IPython notebook
Stars: ✭ 629 (-0.79%)
Mutual labels:  jupyter-notebook
Mina
Mina is a new cryptocurrency with a constant size blockchain, improving scaling while maintaining decentralization and security.
Stars: ✭ 617 (-2.68%)
Mutual labels:  jupyter-notebook
David Silver Reinforcement Learning
Notes for the Reinforcement Learning course by David Silver along with implementation of various algorithms.
Stars: ✭ 623 (-1.74%)
Mutual labels:  jupyter-notebook
Mxnet Notebooks
Notebooks for MXNet
Stars: ✭ 629 (-0.79%)
Mutual labels:  jupyter-notebook
Deeplearning Assignment
Deep learning notes
Stars: ✭ 619 (-2.37%)
Mutual labels:  jupyter-notebook
Sklearn Deap
Use evolutionary algorithms instead of gridsearch in scikit-learn
Stars: ✭ 633 (-0.16%)
Mutual labels:  jupyter-notebook
Kmcuda
Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA
Stars: ✭ 627 (-1.1%)
Mutual labels:  jupyter-notebook
Falcon
Brushing and linking for big data
Stars: ✭ 627 (-1.1%)
Mutual labels:  jupyter-notebook
Tutorials
A series of machine learning tutorials for Torch7
Stars: ✭ 621 (-2.05%)
Mutual labels:  jupyter-notebook
Cvnd exercises
Exercise notebooks for CVND.
Stars: ✭ 622 (-1.89%)
Mutual labels:  jupyter-notebook
Fastai2
Temporary home for fastai v2 while it's being developed
Stars: ✭ 630 (-0.63%)
Mutual labels:  jupyter-notebook
Ebookmlcb
Ebook "Machine Learning cơ bản" (Basic Machine Learning)
Stars: ✭ 619 (-2.37%)
Mutual labels:  jupyter-notebook
Ai Fundamentals
Code samples for AI fundamentals
Stars: ✭ 631 (-0.47%)
Mutual labels:  jupyter-notebook
Ml notes
Formula derivations of machine learning algorithms, with numpy implementations
Stars: ✭ 618 (-2.52%)
Mutual labels:  jupyter-notebook
Anchor
Code for "High-Precision Model-Agnostic Explanations" paper
Stars: ✭ 629 (-0.79%)
Mutual labels:  jupyter-notebook
Ml course
EPFL Machine Learning Course, Fall 2019
Stars: ✭ 634 (+0%)
Mutual labels:  jupyter-notebook
Mining The Social Web 3rd Edition
The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)
Stars: ✭ 633 (-0.16%)
Mutual labels:  jupyter-notebook
Toolkitten
A toolkit for #1millionwomentotech community.
Stars: ✭ 630 (-0.63%)
Mutual labels:  jupyter-notebook

Reproducible Experiment Platform (REP)

Join the chat at https://gitter.im/yandex/rep

REP is an IPython-based environment for conducting data-driven research in a consistent and reproducible way.

Main features:

  • unified Python wrappers for different ML libraries (the wrappers follow an extended scikit-learn interface)
    • Sklearn
    • TMVA
    • XGBoost
    • uBoost
    • Theanets
    • Pybrain
    • Neurolab
    • MatrixNet service (available to CERN)
  • parallel training of classifiers on a cluster
  • classification/regression reports with plots
  • support for interactive plots
  • smart grid-search algorithms with parallel execution
  • research versioning using git
  • pluggable quality metrics for classification
  • meta-algorithm design (aka 'rep-lego')

REP does not try to replace scikit-learn; it extends it and provides a better user experience.

Howto examples

To get started, look at the notebooks in /howto/.

Notebooks can be viewed (but not executed) online at nbviewer.
There are basic introductory notebooks (about Python and IPython) and more advanced ones (about REP itself).

The example code is written in Python 2, but the library is compatible with both Python 2 and Python 3.

Installation with Docker

We provide a Docker image with REP and all its dependencies. This is the recommended way to install REP, especially if you're not experienced with Python.

Installation with bare hands

If you would rather install REP and all of its dependencies on your machine yourself, follow these manuals: installing manually and running manually.

License

Apache 2.0. The library is open source.

Minimal examples

REP wrappers are sklearn-compatible:

from rep.estimators import XGBoostClassifier, SklearnClassifier, TheanetsClassifier
# trainX, trainY and testX are assumed to be already-prepared features and labels
clf = XGBoostClassifier(n_estimators=300, eta=0.1).fit(trainX, trainY)
probabilities = clf.predict_proba(testX)

A favorite trick among Kagglers is to run bagging over complex algorithms. This is how it is done in REP:

from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=XGBoostClassifier(), n_estimators=10)
# wrap the sklearn ensemble in a REP wrapper
clf = SklearnClassifier(clf)
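
For readers unfamiliar with bagging, the idea can be sketched in plain Python, independent of REP and scikit-learn. This is only an illustration under simplifying assumptions (1-D features, binary labels); `OneNearestNeighbour`, `bagging_fit` and `bagging_predict_proba` are hypothetical names invented for this sketch and are not part of any library.

```python
import random


class OneNearestNeighbour:
    """Toy 1-D base model: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.points = list(zip(X, y))
        return self

    def predict(self, X):
        return [min(self.points, key=lambda p: abs(p[0] - x))[1] for x in X]


def bagging_fit(make_model, X, y, n_estimators=10, seed=0):
    """Train n_estimators models, each on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        # bootstrap: draw n indices with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(make_model().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict_proba(models, X):
    """Estimate P(class 1) as the fraction of models voting for class 1."""
    votes = [m.predict(X) for m in models]
    return [sum(column) / len(models) for column in zip(*votes)]


X = [0.0, 0.2, 0.4, 0.6, 1.4, 1.6, 1.8, 2.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(OneNearestNeighbour, X, y, n_estimators=10)
proba = bagging_predict_proba(models, [0.1, 1.9])
```

Each base model sees a slightly different resample of the data, and averaging their votes reduces the variance of the combined prediction. This is the role `BaggingClassifier` plays in the REP snippet.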

Another useful trick is to use folding instead of splitting data into train/test. This is especially useful when you're using some kind of complex stacking:

from rep.metaml import FoldingClassifier
clf = FoldingClassifier(TheanetsClassifier(), n_folds=3)
probabilities = clf.fit(X, y).predict_proba(X)

In the example above, all data are split into 3 folds, and each fold is predicted by a classifier that was trained on the other 2 folds.
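
The folding scheme itself can be sketched in plain Python. The following is a minimal illustration of the technique and not REP's implementation; `make_folds`, `MeanThresholdClassifier` and `out_of_fold_predict` are hypothetical names introduced here, and the toy classifier assumes 1-D features with binary labels.

```python
import random


def make_folds(n_samples, n_folds, seed=0):
    """Randomly assign every sample index to exactly one of n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # fold k takes every n_folds-th index of the shuffled order
    return [indices[k::n_folds] for k in range(n_folds)]


class MeanThresholdClassifier:
    """Toy 1-D classifier: class 1 if x is above the midpoint of the class means."""

    def fit(self, X, y):
        zeros = [x for x, t in zip(X, y) if t == 0]
        ones = [x for x, t in zip(X, y) if t == 1]
        self.threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold else 0 for x in X]


def out_of_fold_predict(make_model, X, y, n_folds=3, seed=0):
    """Predict each fold with a model trained on all the other folds."""
    folds = make_folds(len(X), n_folds, seed)
    predictions = [None] * len(X)
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        for i, p in zip(test_idx, model.predict([X[i] for i in test_idx])):
            predictions[i] = p
    return predictions


X = [0.1, 0.3, 0.2, 0.4, 0.5, 0.0, 2.1, 2.3, 1.9, 2.2, 2.5, 2.0]
y = [0] * 6 + [1] * 6
preds = out_of_fold_predict(MeanThresholdClassifier, X, y, n_folds=3)
```

Every prediction comes from a model that never saw the corresponding sample during training, which is why out-of-fold predictions are safe to feed into a second-level (stacked) model.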

REP classifiers also provide reports:

report = clf.test_on(testX, testY)
report.roc().plot() # plot ROC curve
from rep.report.metrics import RocAuc
# learning curves are useful when training GBDT!
report.learning_curve(RocAuc(), steps=10)  
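
As background on the ROC AUC metric used in the learning curve above, here is a small plain-Python sketch of what the number measures: the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counted as one half. The function `roc_auc` below is a hypothetical helper written for this illustration; it is not REP's `RocAuc` class and makes no claim about how REP computes the metric.

```python
def roc_auc(labels, scores):
    """ROC AUC from its pairwise definition (quadratic time, for illustration)."""
    positives = [s for s, t in zip(scores, labels) if t == 1]
    negatives = [s for s, t in zip(scores, labels) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    # count positive/negative pairs ranked correctly; a tie counts as half
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

A value of 1.0 means every positive outranks every negative, while 0.5 corresponds to scores that carry no ranking information.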

You can read about other REP tools (such as smart distributed grid search, folding and factory) in the documentation and the howto examples.
