
jxbz / madam

Licence: other
Pytorch and Jax code for the Madam optimiser.

Programming Languages

Jupyter Notebook: 11667 projects
Python: 139335 projects (#7 most used programming language)
Shell: 77523 projects

Projects that are alternatives of or similar to madam

soar-php
SQL optimizer and rewriter (assists with SQL tuning).
Stars: ✭ 140 (+204.35%)
Mutual labels:  optimizer
adamwr
Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neural Networks" https://arxiv.org/abs/1506.01186 for PyTorch framework
Stars: ✭ 130 (+182.61%)
Mutual labels:  optimizer
AdaBound-tensorflow
An optimizer that trains as fast as Adam and as good as SGD in Tensorflow
Stars: ✭ 44 (-4.35%)
Mutual labels:  optimizer
ShinRL
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Deep RL Workshop 2021)
Stars: ✭ 30 (-34.78%)
Mutual labels:  jax
portfolio-optimizer
A library for portfolio optimization algorithms with python interface.
Stars: ✭ 19 (-58.7%)
Mutual labels:  optimizer
Post-Tweaks
A post-installation batch script for Windows
Stars: ✭ 136 (+195.65%)
Mutual labels:  optimizer
uvadlc notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2022/Spring 2022
Stars: ✭ 901 (+1858.7%)
Mutual labels:  jax
fedpa
Federated posterior averaging implemented in JAX
Stars: ✭ 38 (-17.39%)
Mutual labels:  jax
jaxfg
Factor graphs and nonlinear optimization for JAX
Stars: ✭ 124 (+169.57%)
Mutual labels:  jax
ToyDB
A ToyDB (for beginner) based on MIT 6.830 and CMU 15445
Stars: ✭ 25 (-45.65%)
Mutual labels:  optimizer
efficientnet-jax
EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax
Stars: ✭ 114 (+147.83%)
Mutual labels:  jax
dm pix
PIX is an image processing library in JAX, for JAX.
Stars: ✭ 271 (+489.13%)
Mutual labels:  jax
ada-hessian
Easy-to-use AdaHessian optimizer (PyTorch)
Stars: ✭ 59 (+28.26%)
Mutual labels:  optimizer
Cleaner
The only storage saving app that actually works! :D
Stars: ✭ 27 (-41.3%)
Mutual labels:  optimizer
lookahead tensorflow
Lookahead optimizer ("Lookahead Optimizer: k steps forward, 1 step back") for tensorflow
Stars: ✭ 25 (-45.65%)
Mutual labels:  optimizer
GPJax
A didactic Gaussian process package for researchers in Jax.
Stars: ✭ 159 (+245.65%)
Mutual labels:  jax
wax-ml
A Python library for machine-learning and feedback loops on streaming data
Stars: ✭ 36 (-21.74%)
Mutual labels:  jax
postcss-clean
PostCss plugin to minify your CSS with clean-css
Stars: ✭ 41 (-10.87%)
Mutual labels:  optimizer
robustness-vit
Contains code for the paper "Vision Transformers are Robust Learners" (AAAI 2022).
Stars: ✭ 78 (+69.57%)
Mutual labels:  jax
chef-transformer
Chef Transformer 🍲.
Stars: ✭ 29 (-36.96%)
Mutual labels:  jax

Madam optimiser

Jeremy Bernstein   ·   Jiawei Zhao   ·   Markus Meister
Ming-Yu Liu   ·   Anima Anandkumar   ·   Yisong Yue

Getting started

from madam import Madam
optimizer = Madam(net.parameters(), lr=0.01, p_scale=3.0, g_bound=10.0)
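
For context, here is a minimal training-step sketch, assuming Madam follows the standard torch.optim interface suggested by the constructor above; the model, loss function, and data below are placeholders for illustration only.

import torch
from madam import Madam

net = torch.nn.Linear(10, 1)                      # stand-in model
loss_fn = torch.nn.MSELoss()
data = [(torch.randn(8, 10), torch.randn(8, 1))]  # stand-in mini-batches
optimizer = Madam(net.parameters(), lr=0.01, p_scale=3.0, g_bound=10.0)

for x, y in data:
    optimizer.zero_grad()          # clear gradients from the previous step
    loss_fn(net(x), y).backward()  # compute gradients
    optimizer.step()               # apply the multiplicative Madam update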

To understand what the different hyperparameters do, note that the typical Madam update to a parameter w is:

w --> w exp(± lr).

The largest possible Madam update to a parameter is:

w --> w exp(± g_bound x lr).

And finally the parameters are clipped to lie within the range ± init_scale x p_scale.

An initial learning rate of lr = 0.01 is the recommended default. The algorithm converges to a solution that "jitters" around the optimum, at which point the learning rate should be decayed. We didn't experiment much with g_bound, but g_bound = 10 was a good default. p_scale controls the size of the optimisation domain, and it was worth tuning it over the values [1.0, 2.0, 3.0].
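
To put numbers on these bounds, here is a back-of-the-envelope sketch using the defaults above; init_scale stands for a layer's initialisation scale, and the value used here is purely illustrative.

import math

lr, g_bound, p_scale = 0.01, 10.0, 3.0
init_scale = 0.1  # hypothetical initialisation scale for a given layer

print(math.exp(lr))            # typical per-step factor: ~1.010, i.e. a ~1% change
print(math.exp(g_bound * lr))  # largest possible per-step factor: ~1.105
print(init_scale * p_scale)    # parameters are clipped to [-0.3, 0.3]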

About this repository

This repository was built by Jeremy Bernstein and Jiawei Zhao to accompany the following paper:

Learning compositional functions via multiplicative weight updates.

We're putting this code here so that you can test out our optimisation algorithm in your own applications, and also so that you can attempt to reproduce the experiments in our paper.

If something isn't clear or isn't working, let us know in the Issues section or contact [email protected].

Repository structure

.
├── pytorch/                # Pytorch code to reproduce experiments in the paper.
├── jax/                    # A Jax demo notebook.
├── LICENSE                 # The license on our algorithm.
└── README.md               # The very page you're reading now.

Acknowledgements

Citation

If you find Madam useful, feel free to cite the paper:

@inproceedings{madam,
  title={Learning compositional functions via multiplicative weight updates},
  author={Jeremy Bernstein and Jiawei Zhao and Markus Meister and Ming-Yu Liu and Anima Anandkumar and Yisong Yue},
  booktitle={Neural Information Processing Systems},
  year={2020}
}

License

We are making our algorithm available under a CC BY-NC-SA 4.0 license. The other code we have used is subject to its own license restrictions, as indicated in the subfolders.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].