
Janus-Shiau / lookahead_tensorflow

Licence: other
Lookahead optimizer ("Lookahead Optimizer: k steps forward, 1 step back") for tensorflow

Programming Languages

python

Projects that are alternatives to or similar to lookahead_tensorflow

Radam
On the Variance of the Adaptive Learning Rate and Beyond
Stars: ✭ 2,442 (+9668%)
Mutual labels:  optimizer, adam-optimizer
CS231n
PyTorch/Tensorflow solutions for Stanford's CS231n: "CNNs for Visual Recognition"
Stars: ✭ 47 (+88%)
Mutual labels:  adam-optimizer, sgd-optimizer
artificial-neural-variability-for-deep-learning
The PyTorch Implementation of Variable Optimizers/ Neural Variable Risk Minimization proposed in our Neural Computation paper: Artificial Neural Variability for Deep Learning: On overfitting, Noise Memorization, and Catastrophic Forgetting.
Stars: ✭ 34 (+36%)
Mutual labels:  optimizer
adamwr
Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neural Networks" https://arxiv.org/abs/1506.01186 for PyTorch framework
Stars: ✭ 130 (+420%)
Mutual labels:  optimizer
Hypergradient variants
Improved Hypergradient optimizers, providing better generalization and faster convergence.
Stars: ✭ 15 (-40%)
Mutual labels:  adam-optimizer
XTR-Toolbox
🛠 Versatile tool to optimize Windows
Stars: ✭ 138 (+452%)
Mutual labels:  optimizer
Cleaner
The only storage saving app that actually works! :D
Stars: ✭ 27 (+8%)
Mutual labels:  optimizer
neth-proxy
Stratum <-> Stratum Proxy and optimizer for ethminer
Stars: ✭ 35 (+40%)
Mutual labels:  optimizer
ToyDB
A ToyDB (for beginner) based on MIT 6.830 and CMU 15445
Stars: ✭ 25 (+0%)
Mutual labels:  optimizer
keras-gradient-accumulation
Gradient accumulation for Keras
Stars: ✭ 35 (+40%)
Mutual labels:  optimizer
portfolio-optimizer
A library for portfolio optimization algorithms with python interface.
Stars: ✭ 19 (-24%)
Mutual labels:  optimizer
Optimizers-for-Tensorflow
Adam, NAdam and AAdam optimizers
Stars: ✭ 20 (-20%)
Mutual labels:  optimizer
prediction gan
PyTorch Impl. of Prediction Optimizer (to stabilize GAN training)
Stars: ✭ 31 (+24%)
Mutual labels:  optimizer
ML-MCU
Code for IoT Journal paper title 'ML-MCU: A Framework to Train ML Classifiers on MCU-based IoT Edge Devices'
Stars: ✭ 28 (+12%)
Mutual labels:  sgd-optimizer
horoscope
horoscope is an optimizer inspector for DBMS.
Stars: ✭ 34 (+36%)
Mutual labels:  optimizer
Post-Tweaks
A post-installation batch script for Windows
Stars: ✭ 136 (+444%)
Mutual labels:  optimizer
AshBF
Over-engineered Brainfuck optimizing compiler and interpreter
Stars: ✭ 14 (-44%)
Mutual labels:  optimizer
LAMB Optimizer TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
Stars: ✭ 119 (+376%)
Mutual labels:  optimizer
soar-php
SQL optimizer and rewriter. - SQL 优化、重写器(辅助 SQL 调优)。
Stars: ✭ 140 (+460%)
Mutual labels:  optimizer
AdaBound-tensorflow
An optimizer that trains as fast as Adam and as good as SGD in Tensorflow
Stars: ✭ 44 (+76%)
Mutual labels:  optimizer

lookahead_tensorflow

Lookahead optimizer ("Lookahead Optimizer: k steps forward, 1 step back") for tensorflow

Environment

This code was implemented and tested with TensorFlow 1.11.0 and 1.13.0.
No special operators are used, so it should also work with other versions of TensorFlow.
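As a quick sanity check before running anything, you can confirm that the installed TensorFlow version matches the 1.x line the code was tested with (this snippet is only illustrative, not part of the repository):

import tensorflow as tf

# The code was tested against the TensorFlow 1.x line (1.11.0 and 1.13.0).
print(tf.__version__)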

Usage

Instead of wrapping an optimizer directly, the lookahead strategy is kept independent.
This makes it more flexible to decide which variables should be optimized with lookahead.

  1. Create the class after all variables have been initialized, and pass BaseLookAhead all trainable variables.
import tensorflow as tf
from lookahead_opt import BaseLookAhead

"""
Build your model here
Please also include any optimizer you need.
"""

model_vars = [v for v in tf.trainable_variables()]
# Note: .run() requires a default session (e.g. tf.InteractiveSession()).
tf.global_variables_initializer().run()

lookahead = BaseLookAhead(model_vars, k=5, alpha=0.5)

Arguments are defined as follows:

model_vars: the variables to apply lookahead to. [list]
k: the number of steps the fast weights go forward before a slow-weight update. [int]
alpha: the learning rate for merging the fast weights back into the slow weights. [float]
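For reference, the update rule these arguments control (from the Lookahead paper) can be sketched in plain NumPy; lookahead_cycle and fast_step below are hypothetical names, with fast_step standing in for one update of whatever inner optimizer you use:

import numpy as np

def lookahead_cycle(fast, slow, k, alpha, fast_step):
    """One outer cycle: k fast-weight steps forward, then 1 slow-weight step back."""
    for _ in range(k):
        fast = fast_step(fast)               # inner optimizer (e.g. Adam) update
    slow = slow + alpha * (fast - slow)      # interpolate slow weights toward fast weights
    fast = np.copy(slow)                     # reset fast weights to the new slow weights
    return fast, slow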

  2. Add the assign operations to the training op, or run them directly in the session.
# Add to train_op
train_op += lookahead.get_ops()

# Or just run the Session
with tf.Session() as sess:
  _ = sess.run(lookahead.get_ops())
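Putting both steps together, a minimal end-to-end sketch could look like the following; the toy model, the Adam learning rate, and the feed values are placeholder choices of mine, while the BaseLookAhead call follows the usage above:

import tensorflow as tf
from lookahead_opt import BaseLookAhead

# Toy model: fit y = 2x with a single weight.
x = tf.placeholder(tf.float32, shape=[None])
y = tf.placeholder(tf.float32, shape=[None])
w = tf.get_variable("w", initializer=1.0)
loss = tf.reduce_mean(tf.square(w * x - y))

# Inner (fast) optimizer; lookahead is added separately below.
train_op = [tf.train.AdamOptimizer(1e-2).minimize(loss)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    lookahead = BaseLookAhead(tf.trainable_variables(), k=5, alpha=0.5)
    train_op += lookahead.get_ops()
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [1., 2., 3.], y: [2., 4., 6.]})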

Implementation Details

Inject Lookahead into the model and save specific variables

The lookahead logic is wrapped in the default variable_scope "lookahead". After BaseLookAhead is called with specific variables, those variables are injected into the lookahead mechanism.
Note that the lookahead class is completely separate from the optimizer, so remember to add an optimizer when building the training graph.
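Since the extra state lives under the "lookahead" variable scope, one way to see what was created is to list the global variables in that scope after constructing BaseLookAhead (the scope name comes from the paragraph above; the snippet itself is just an inspection sketch):

import tensorflow as tf

# List the slow-weight copies and counter created under the "lookahead" scope.
for v in tf.global_variables(scope="lookahead"):
    print(v.name, v.shape)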

Example template graph with lookahead

The BaseLookAhead creates duplicate tf.Variables to store the slow weights, and a counter is created automatically to implement "k steps forward, 1 step back".

[Figure: example template graph with lookahead]
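As a rough illustration of how such a mechanism can be built in TensorFlow 1.x (this is a sketch of the idea, not the repository's actual implementation; the function and variable names here are hypothetical):

import tensorflow as tf

def make_lookahead_ops(variables, k=5, alpha=0.5):
    """Illustrative lookahead ops: every k calls, pull the fast weights back to the slow weights."""
    with tf.variable_scope("lookahead"):
        counter = tf.get_variable("counter", initializer=0, trainable=False)
        step = tf.assign_add(counter, 1)      # counts fast-weight updates
        sync = tf.equal(step % k, 0)          # True on every k-th step
        ops = []
        for var in variables:
            # Slow copy, initialized from the current (fast) weight.
            slow = tf.get_variable(var.op.name.replace("/", "_") + "_slow",
                                   initializer=var.initialized_value(),
                                   trainable=False)
            def step_back(var=var, slow=slow):
                new_slow = tf.assign(slow, slow + alpha * (var - slow))  # slow += alpha * (fast - slow)
                return tf.assign(var, new_slow)                          # fast <- slow ("1 step back")
            ops.append(tf.cond(sync, step_back, lambda var=var: tf.identity(var)))
    return ops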

Experimental results

I conducted experiments on a many-to-many recursive task with a stacked weight-dropped LSTM, as proposed in "Regularizing and Optimizing LSTM Language Models".
With Adam plus lookahead, the training loss is higher than for the model without lookahead, but the validation loss is slightly better.

Contact & Copyright

Code by Jia-Yau Shiau ([email protected]).
