GLambard / Adamw_keras
AdamW optimizer for Keras
Stars: ✭ 106
Projects that are alternatives of or similar to Adamw keras
Heimer
Heimer is a simple cross-platform mind map, diagram, and note-taking tool written in Qt.
Stars: ✭ 380 (+258.49%)
Mutual labels: optimizer
simplu3D
A library to generate buildings from local urban regulations.
Stars: ✭ 18 (-83.02%)
Mutual labels: optimizer
rethinking-bnn-optimization
Implementation for the paper "Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization"
Stars: ✭ 62 (-41.51%)
Mutual labels: optimizer
Gpx Simplify Optimizer
Free Tracks Optimizer Online Service
Stars: ✭ 61 (-42.45%)
Mutual labels: optimizer
Avmf
🔩 Framework and Java implementation of the Alternating Variable Method
Stars: ✭ 16 (-84.91%)
Mutual labels: optimizer
Adamp
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights (ICLR 2021)
Stars: ✭ 306 (+188.68%)
Mutual labels: optimizer
Vtil Core
Virtual-machine Translation Intermediate Language
Stars: ✭ 738 (+596.23%)
Mutual labels: optimizer
Stacer
Linux System Optimizer and Monitoring - https://oguzhaninan.github.io/Stacer-Web
Stars: ✭ 7,405 (+6885.85%)
Mutual labels: optimizer
sam.pytorch
A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization
Stars: ✭ 96 (-9.43%)
Mutual labels: optimizer
Viz torch optim
Videos of deep learning optimizers moving on 3D problem-landscapes
Stars: ✭ 86 (-18.87%)
Mutual labels: optimizer
Jhc Components
JHC Haskell compiler split into reusable components
Stars: ✭ 55 (-48.11%)
Mutual labels: optimizer
Marsnake
System Optimizer and Monitoring, Security Auditing, Vulnerability scanner for Linux, macOS, and UNIX-based systems
Stars: ✭ 16 (-84.91%)
Mutual labels: optimizer
Fixing Weight Decay Regularization in Adam - For Keras ⚡️ 😃
Implementation of the AdamW optimizer (Ilya Loshchilov, Frank Hutter) for Keras.
Tested with
- python 3.6
- Keras 2.1.6
- tensorflow(-gpu) 1.8.0
Usage
In addition to the usual Keras setup for building neural networks (see the Keras documentation for details), import and instantiate the optimizer:
from AdamW import AdamW
adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0., weight_decay=0.025, batch_size=1, samples_per_epoch=1, epochs=1)
From there, nothing changes compared to the usual use of an optimizer in Keras once the model architecture is defined:
model = Sequential()
<definition of the model_architecture>
model.compile(loss="mse", optimizer=adamw, metrics=[metrics.mse], ...)
Note that the batch size (batch_size), the number of training samples per epoch (samples_per_epoch), and the number of epochs (epochs) are required to normalize the weight decay (paper, Section 4).
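The normalization described in Section 4 of the paper scales the raw weight-decay coefficient by the square root of the batch size divided by the total number of update steps per run. A minimal plain-Python sketch (the function name `normalized_weight_decay` is illustrative, not part of this repo's API):

```python
import math

def normalized_weight_decay(weight_decay, batch_size, samples_per_epoch, epochs):
    """Normalized weight decay, Loshchilov & Hutter, Section 4:
    lambda = lambda_norm * sqrt(b / (B * T)),
    where b = batch size, B = training samples per epoch, T = epochs."""
    return weight_decay * math.sqrt(batch_size / (samples_per_epoch * epochs))
```

With the defaults shown above (batch_size=1, samples_per_epoch=1, epochs=1) the scaling factor is 1, so weight_decay is applied unchanged; larger training runs shrink the effective decay accordingly.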
Done
- Decoupled weight decay added to the parameter updates
- Normalized weight decay added
To be done (eventually - help is welcome)
- Cosine annealing
- Warm restarts
Source
Adam: A Method for Stochastic Optimization, D. P. Kingma, J. Lei Ba
Fixing Weight Decay Regularization in Adam, I. Loshchilov, F. Hutter