udellgroup / oboe

License: BSD-3-Clause
An AutoML pipeline selection system to quickly select a promising pipeline for a new dataset.

The Oboe systems

This bundle of libraries, Oboe and TensorOboe, provides automated machine learning (AutoML) systems that use collaborative filtering to find good models for supervised learning tasks within a user-specified time limit. Further hyperparameter tuning can be performed afterwards.

The name comes from the musical instrument: in an orchestra, the oboe plays an initial note that the other instruments use to tune to the right frequency before the performance begins. Our Oboe systems play a similar role in AutoML: we use meta-learning to select a promising set of models or to build an ensemble for a new dataset. Users can either use the selected models directly or further fine-tune their hyperparameters.
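The collaborative-filtering idea can be illustrated with a small numpy sketch. It is a simplification under stated assumptions, not the Oboe implementation: the error matrix is synthetic and exactly low rank, the function name predict_errors is hypothetical, and the models to evaluate on the new dataset are picked arbitrarily, whereas Oboe chooses them by experiment design.

```python
import numpy as np

def predict_errors(error_matrix, observed_idx, observed_errors, rank=2):
    """Estimate a new dataset's error on every model from a few observed ones."""
    # Truncated SVD of the meta-training error matrix (datasets x models).
    _, s, vt = np.linalg.svd(error_matrix, full_matrices=False)
    model_factors = np.diag(s[:rank]) @ vt[:rank]      # shape (rank, n_models)
    # Least-squares fit of the new dataset's latent vector on the observed models.
    a = model_factors[:, observed_idx].T               # shape (n_observed, rank)
    latent, *_ = np.linalg.lstsq(a, observed_errors, rcond=None)
    # Predicted error of the new dataset on every model.
    return latent @ model_factors

rng = np.random.default_rng(0)
# Synthetic rank-2 data: 20 meta-training datasets plus 1 new dataset, 6 models.
full = rng.random((21, 2)) @ rng.random((2, 6))
error_matrix, new_row = full[:20], full[20]

observed_idx = [0, 3, 5]  # models actually run on the new dataset (arbitrary here)
estimate = predict_errors(error_matrix, observed_idx, new_row[observed_idx])
best_model = int(np.argmin(estimate))
print("predicted best model index:", best_model)
```

Because the synthetic matrix is exactly rank 2, three observed entries are enough to recover the whole row; real error matrices are only approximately low rank, so the estimates would be approximate.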

On a new dataset:

  • Oboe searches for promising estimators (supervised learners) by matrix factorization and classical experiment design. It requires a pre-processed dataset: one-hot encode categorical features, then standardize all features to have zero mean and unit variance. For a complete description, refer to our paper OBOE: Collaborative Filtering for AutoML Model Selection at KDD 2019.

  • TensorOboe searches for promising pipelines, which here are directed graphs of learning components, including imputation, encoding, standardization, dimensionality reduction and estimation. It can therefore accept a raw dataset, possibly with missing entries, mixed feature types, uncentered features, etc. For a complete description, refer to our paper AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space at KDD 2020.
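The pre-processing that Oboe expects can be sketched as follows. This is one possible way to do it with pandas and scikit-learn, not a step prescribed by the library; the toy data frame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy raw data with one numeric and one categorical feature (hypothetical).
raw = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.2, 5.9],
    "color": ["red", "blue", "red", "green"],
})

# Step 1: one-hot encode the categorical column; cast to float for scaling.
encoded = pd.get_dummies(raw, columns=["color"]).astype(float)

# Step 2: standardize every feature to zero mean and unit variance.
x = StandardScaler().fit_transform(encoded)

print(list(encoded.columns))
print(x.mean(axis=0), x.std(axis=0))
```

In practice the scaler should be fitted on the training split only and then applied to the test split, so that test statistics do not leak into training.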

This bundle of systems is still under development and subject to change. For any questions, please submit an issue. The authors will respond as soon as possible.

Installation

The easiest way is to install using pip:

pip install oboe

Alternatively, if you want to customize the source code, you can install in editable mode: first git clone this repository, then run

pip install -e .

in the cloned directory. Note that this downloads some large files (about 100 MB in total) used to warm-start TensorOboe fitting, which saves minutes of setup time at the cost of disk space and network data usage.

It is recommended to install within an isolated environment (a conda virtual environment, for example) to avoid conflicting dependency versions.

Dependencies with verified versions

The Oboe systems work on Python 3.7 or later. The following libraries are required. The versions listed have been verified to work; older versions may work but are not guaranteed to.

  • numpy (1.16.4)
  • scipy (1.4.1)
  • pandas (0.24.2)
  • scikit-learn (0.22.1)
  • tensorly (0.6.0)
  • OpenML (0.9.0)
  • mkl (>=1.0.0)
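To compare an environment against the list above, a small check along these lines may help. It is a sketch, not part of Oboe: the function name compare_versions is hypothetical, the pip distribution names (for example openml for OpenML) are assumed, and importlib.metadata requires Python 3.8 or later.

```python
from importlib import metadata  # standard library, Python 3.8+

# Versions copied from the list above; distribution names are assumed.
VERIFIED = {
    "numpy": "1.16.4",
    "scipy": "1.4.1",
    "pandas": "0.24.2",
    "scikit-learn": "0.22.1",
    "tensorly": "0.6.0",
    "openml": "0.9.0",
    "mkl": ">=1.0.0",
}

def compare_versions(verified=VERIFIED):
    """Return {package: (installed_version_or_None, verified_version)}."""
    report = {}
    for name, expected in verified.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, expected)
    return report

for name, (installed, expected) in compare_versions().items():
    print("{:<14} installed: {!s:<10} verified: {}".format(name, installed, expected))
```

The script only reports versions side by side; it does not decide whether a different version is compatible.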

Examples

For more detailed examples, please refer to the Jupyter notebooks in the example folder. A basic classification example using Oboe:

method = 'Oboe'  # 'Oboe' or 'TensorOboe'
problem_type = 'classification'

from oboe import AutoLearner, error  # This may take around 15 seconds at first run.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris dataset and hold out 20% of it for testing.
data = load_iris()
x = np.array(data['data'])
y = np.array(data['target'])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# Select and fit models within the given runtime limit, then predict on the test set.
m = AutoLearner(p_type=problem_type, runtime_limit=30, method=method, verbose=False)
m.fit(x_train, y_train)
y_predicted = m.predict(x_test)

print("prediction error (balanced error rate): {}".format(error(y_test, y_predicted, problem_type)))
print("selected models: {}".format(m.get_models()))
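The example above reports a balanced error rate. For readers unfamiliar with the metric, here is a minimal numpy sketch under the usual definition, the mean of the per-class error rates. That oboe's error function computes exactly this is an assumption, and balanced_error_rate is a hypothetical helper, not an oboe function.

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """Mean of per-class error rates (1 minus the mean per-class recall)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class_error = []
    for label in np.unique(y_true):
        mask = y_true == label
        # Fraction of samples of this class that were misclassified.
        per_class_error.append(np.mean(y_pred[mask] != label))
    return float(np.mean(per_class_error))

y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0, 2]
ber = balanced_error_rate(y_true, y_pred)
print("balanced error rate:", ber)
```

Unlike the plain error rate, this metric weights every class equally, so a classifier cannot score well by ignoring a rare class.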

Warm-start meta-training

The large_files folder includes some large numpy arrays that are intermediate results of previous meta-training. This folder is not included in the pip installation, and the files within it can be manually downloaded from this GitHub repository.

By default, TensorOboe skips the step of imputing missing entries in the error tensor and directly uses the pre-imputed error tensor. To impute the error tensor yourself, use the original non-imputed error tensor at large_files/error_tensor_f16_compressed.npz: set the original_error_tensor_dir argument to the path of this .npz file and set mode to 'initialize' when creating the AutoLearner instance: m = AutoLearner(..., method='TensorOboe', mode='initialize', path_to_imputed_error_tensor=<path_to_this_npy_file>).
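Before pointing TensorOboe at a downloaded .npz file, it can be useful to check what the archive contains. The sketch below uses only numpy; describe_npz is a hypothetical helper, and the file it inspects is a small stand-in written to a temporary directory, since the array names and shapes inside the real error tensor file are not documented here.

```python
import os
import tempfile
import numpy as np

def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive."""
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in archive: a small float16 tensor whose entries are all missing (NaN).
    stand_in = os.path.join(tmp, "stand_in_tensor.npz")
    np.savez_compressed(stand_in, tensor=np.full((3, 4, 2), np.nan, dtype=np.float16))
    summary = describe_npz(stand_in)

print(summary)
```

To inspect the real file, call describe_npz with the path of the downloaded archive instead of the stand-in.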

References

[1] Chengrun Yang, Yuji Akimoto, Dae Won Kim, Madeleine Udell. OBOE: Collaborative Filtering for AutoML Model Selection. KDD 2019.

[2] Chengrun Yang, Jicong Fan, Ziyang Wu, Madeleine Udell. AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space. KDD 2020.
