
fidelity / mabwiser

License: Apache-2.0
[IJAIT 2021] MABWiser: Contextual Multi-Armed Bandits Library

Programming Languages

python

Projects that are alternatives of or similar to mabwiser

Agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Stars: ✭ 2,135 (+1679.17%)
Mutual labels:  multi-armed-bandits, contextual-bandits
contextual
Contextual Bandits in R - simulation and evaluation of Multi-Armed Bandit Policies
Stars: ✭ 72 (-40%)
Mutual labels:  multi-armed-bandits, contextual-bandits
onn
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
Stars: ✭ 139 (+15.83%)
Mutual labels:  contextual-bandits
rlberry
An easy-to-use reinforcement learning library for research and education.
Stars: ✭ 124 (+3.33%)
Mutual labels:  multi-armed-bandits
Vowpal Wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
Stars: ✭ 7,815 (+6412.5%)
Mutual labels:  contextual-bandits
MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
Stars: ✭ 15 (-87.5%)
Mutual labels:  contextual-bandits
sinkhorn-policy-gradient.pytorch
Code accompanying the paper "Learning Permutations with Sinkhorn Policy Gradient"
Stars: ✭ 36 (-70%)
Mutual labels:  contextual-bandits


MABWiser: Parallelizable Contextual Multi-Armed Bandits

MABWiser (IJAIT 2021, ICTAI 2019) is a research library written in Python for rapid prototyping of multi-armed bandit algorithms. It supports context-free, parametric, and non-parametric contextual bandit models and provides built-in parallelization for both training and testing components.

The library also provides a simulation utility for comparing different policies and performing hyper-parameter tuning. MABWiser follows a scikit-learn style public interface, adheres to PEP-8 standards, and is tested heavily.

For Bandit-based Recommenders, see also our Mab2Rec library built on top of MABWiser.

MABWiser is developed by the Artificial Intelligence Center of Excellence at Fidelity Investments. Documentation is available at fidelity.github.io/mabwiser.

Quick Start

# An example that shows how to use the UCB1 learning policy
# to choose between two arms based on their expected rewards.

# Import MABWiser Library
from mabwiser.mab import MAB, LearningPolicy

# Data
arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

# Model 
mab = MAB(arms, LearningPolicy.UCB1(alpha=1.25))

# Train
mab.fit(decisions, rewards)

# Test: predict() returns the arm with the highest UCB1 score
mab.predict()
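To see why UCB1 picks the arm it does here, the scores can be reproduced by hand. The sketch below uses one common UCB1 formulation, mean reward plus alpha * sqrt(2 ln N / n); it is an illustration of the technique, not MABWiser's internal implementation:

```python
import math
from collections import defaultdict

# Same toy data as the Quick Start example
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

# Aggregate per-arm reward totals and pull counts
totals, counts = defaultdict(float), defaultdict(int)
for arm, reward in zip(decisions, rewards):
    totals[arm] += reward
    counts[arm] += 1

# One common UCB1 formulation: mean + alpha * sqrt(2 * ln(N) / n),
# where N is the total number of decisions and n the arm's pull count
alpha, n_total = 1.25, len(decisions)
scores = {arm: totals[arm] / counts[arm]
               + alpha * math.sqrt(2 * math.log(n_total) / counts[arm])
          for arm in totals}

best = max(scores, key=scores.get)
print(best)  # 'Arm2': its high mean (25) plus a large bonus for a single pull
```

Arm2 wins both on its observed mean and on its exploration bonus, since it has been pulled only once.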

Available Bandit Policies

Available Learning Policies:

  • Epsilon Greedy [1, 2]
  • LinGreedy [1, 2]
  • LinTS [3]. See [11] for a formal treatment of reproducibility in LinTS
  • LinUCB [4]
  • Popularity [2]
  • Random [2]
  • Softmax [2]
  • Thompson Sampling (TS) [5]
  • Upper Confidence Bound (UCB1) [2]

Available Neighborhood Policies:

  • Clusters [6]
  • K-Nearest [7, 8]
  • LSH Nearest [9]
  • Radius [7, 8]
  • TreeBandit [10]
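As a rough intuition for how a neighborhood policy works: given a new context, a Radius-style policy restricts the bandit to historical observations whose contexts lie within a fixed distance, then applies the learning policy to that subset. The following is an illustrative pure-Python sketch with a hypothetical toy history and a greedy learning policy, not MABWiser's implementation:

```python
import math

# Hypothetical toy history of (context, decision, reward) triples
history = [([0.0, 0.0], 'Arm1', 1.0),
           ([0.1, 0.0], 'Arm1', 0.0),
           ([0.0, 0.2], 'Arm2', 1.0),
           ([5.0, 5.0], 'Arm2', 0.0)]

def radius_neighbors(context, radius):
    """Historical rows whose context lies within `radius` of `context`."""
    return [(d, r) for c, d, r in history
            if math.dist(c, context) <= radius]

def predict(context, radius=1.0):
    """Greedy choice over the mean reward within the neighborhood."""
    totals, counts = {}, {}
    for arm, reward in radius_neighbors(context, radius):
        totals[arm] = totals.get(arm, 0.0) + reward
        counts[arm] = counts.get(arm, 0) + 1
    return max(counts, key=lambda a: totals[a] / counts[a])

print(predict([0.0, 0.1]))
```

For the context [0.0, 0.1], only the three nearby observations fall inside the radius, and Arm2 has the higher mean reward within that neighborhood. Swapping the greedy choice for UCB1 or Thompson Sampling on the same subset gives the general pattern these policies follow.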

Installation

MABWiser requires Python 3.6+ and can be installed from PyPI with pip install mabwiser, or by building from source as described in the installation instructions.

Support

Please submit bug reports and feature requests as Issues.

Citation

If you use MABWiser in a publication, please cite it as:

    @article{DBLP:journals/ijait/StrongKK21,
      author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
      title     = {{MABWiser:} Parallelizable Contextual Multi-armed Bandits},
      journal   = {Int. J. Artif. Intell. Tools},
      volume    = {30},
      number    = {4},
      pages     = {2150021:1--2150021:19},
      year      = {2021},
      url       = {https://doi.org/10.1142/S0218213021500214},
      doi       = {10.1142/S0218213021500214},
    }

    @inproceedings{DBLP:conf/ictai/StrongKK19,
      author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
      title     = {MABWiser: {A} Parallelizable Contextual Multi-Armed Bandit Library for Python},
      booktitle = {31st {IEEE} International Conference on Tools with Artificial Intelligence, {ICTAI} 2019, Portland, OR, USA, November 4-6, 2019},
      pages     = {909--914},
      publisher = {{IEEE}},
      year      = {2019},
      url       = {https://doi.org/10.1109/ICTAI.2019.00129},
      doi       = {10.1109/ICTAI.2019.00129},
    }

License

MABWiser is licensed under the Apache License 2.0.

References

  1. John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits
  2. Volodymyr Kuleshov and Doina Precup. Algorithms for multi-armed bandit problems
  3. Shipra Agrawal and Navin Goyal. Thompson sampling for contextual bandits with linear payoffs
  4. Wei Chu, Lihong Li, Lev Reyzin, and Robert Schapire. Contextual bandits with linear payoff functions
  5. Ian Osband, Daniel Russo, and Benjamin Van Roy. More efficient reinforcement learning via posterior sampling
  6. Trong T. Nguyen and Hady W. Lauw. Dynamic clustering of contextual multi-armed bandits
  7. Melody Y. Guan and Heinrich Jiang. Nonparametric stochastic contextual bandits
  8. Philippe Rigollet and Assaf Zeevi. Nonparametric bandits with covariates
  9. Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, and Santosh Vempala. Locality-preserving hashing in multidimensional spaces
  10. Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, and Marek Petrik. A practical method for solving contextual bandit problems using decision trees
  11. Doruk Kilitcioglu and Serdar Kadioglu. Non-deterministic behavior of Thompson sampling with linear payoffs and how to avoid it
