
mpyrozhok / adamwr

License: MIT
Implements the AdamW optimizer from https://arxiv.org/abs/1711.05101, a cosine learning rate scheduler, and the "Cyclical Learning Rates for Training Neural Networks" policies (https://arxiv.org/abs/1506.01186) for the PyTorch framework

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to adamwr

scheduler-component
A Web Component wrapper for FullCalendar library that uses Polymer version 2.0 and ES6.
Stars: ✭ 24 (-81.54%)
Mutual labels:  scheduler
Cleaner
The only storage saving app that actually works! :D
Stars: ✭ 27 (-79.23%)
Mutual labels:  optimizer
yii2-fullcalendar-scheduler
Yii 2 component for easy fullcalendar scheduler integration
Stars: ✭ 24 (-81.54%)
Mutual labels:  scheduler
synchly
Automate database backups with customizable recurring schedules.
Stars: ✭ 27 (-79.23%)
Mutual labels:  scheduler
AnnA Anki neuronal Appendix
Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity
Stars: ✭ 39 (-70%)
Mutual labels:  scheduler
sched
⏳ a high performance reliable task scheduling package in Go.
Stars: ✭ 46 (-64.62%)
Mutual labels:  scheduler
ld-scheduler
Schedule Launch Darkly flags on or off
Stars: ✭ 14 (-89.23%)
Mutual labels:  scheduler
portfolio-optimizer
A library for portfolio optimization algorithms with python interface.
Stars: ✭ 19 (-85.38%)
Mutual labels:  optimizer
soar-php
SQL optimizer and rewriter (assists with SQL tuning).
Stars: ✭ 140 (+7.69%)
Mutual labels:  optimizer
EAGO.jl
A development environment for robust and global optimization
Stars: ✭ 106 (-18.46%)
Mutual labels:  optimizer
celery-beatx
Modern fail-safety scheduler for Celery
Stars: ✭ 46 (-64.62%)
Mutual labels:  scheduler
RxSchedulerSuppress
RxSchedulerSuppress is a tool for suppressing repeated RxJava scheduling within the same thread pool
Stars: ✭ 30 (-76.92%)
Mutual labels:  scheduler
natsu-clr
il2cpp transpiler and runtime compatible with .Net Core
Stars: ✭ 76 (-41.54%)
Mutual labels:  clr
angular-gantt-schedule-timeline-calendar-example
Angular gantt-schedule-timeline-calendar usage example
Stars: ✭ 15 (-88.46%)
Mutual labels:  scheduler
cronnit.com
A free tool for scheduling posts to Reddit.
Stars: ✭ 3 (-97.69%)
Mutual labels:  scheduler
nnCron
Advanced and very powerful scheduler, scripting tool and automation manager
Stars: ✭ 60 (-53.85%)
Mutual labels:  scheduler
chronus
Chronus is a distributed scheduler rewritten by the 360 DigiTech technology team, based on Alibaba's open source project TBSchedule.
Stars: ✭ 174 (+33.85%)
Mutual labels:  scheduler
clr-loader
Loader for different .NET runtimes
Stars: ✭ 16 (-87.69%)
Mutual labels:  clr
joobq
JoobQ is a fast, efficient, asynchronous, and reliable job queue and job scheduling library. Jobs are submitted to a job queue, where they reside until they can be scheduled to run in a computing environment.
Stars: ✭ 26 (-80%)
Mutual labels:  scheduler
rhythm
Time-based job scheduler for Apache Mesos
Stars: ✭ 30 (-76.92%)
Mutual labels:  scheduler

AdamW optimizer and cosine learning rate annealing with restarts

This repository contains an implementation of the AdamW optimization algorithm and the cosine learning rate scheduler described in "Decoupled Weight Decay Regularization". The AdamW implementation is straightforward and does not differ much from the existing Adam implementation for PyTorch, except that it separates weight decay from the batch gradient calculation. The cosine annealing scheduler with restarts allows the model to converge to a (possibly) different local minimum on every restart, and it normalizes the weight decay hyperparameter according to the length of the restart period. Unlike the schedulers in the standard PyTorch scheduler suite, this scheduler adjusts the optimizer's learning rate not on every epoch but on every batch update, as described in the paper.
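For intuition, the decoupling means the weight-decay term shrinks the weights directly instead of being folded into the gradient that feeds the adaptive moment estimates. The sketch below is a simplified, hypothetical illustration of that idea, not the repository's actual code; the function name adamw_update and its arguments are placeholders:

    import torch

    def adamw_update(p, grad, exp_avg, exp_avg_sq, step, lr=1e-3,
                     betas=(0.9, 0.999), eps=1e-8, weight_decay=1e-5):
        """One simplified AdamW parameter update (hypothetical sketch)."""
        beta1, beta2 = betas
        exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)                # biased first moment
        exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)   # biased second moment
        denom = (exp_avg_sq / (1 - beta2 ** step)).sqrt().add_(eps)    # bias-corrected denominator
        p.add_((exp_avg / (1 - beta1 ** step)) / denom, alpha=-lr)     # adaptive gradient step
        p.mul_(1 - lr * weight_decay)                                  # decoupled weight decay, applied to the weights directly

    # Toy usage: a single update on a random parameter tensor.
    p = torch.randn(10)
    state = dict(exp_avg=torch.zeros(10), exp_avg_sq=torch.zeros(10))
    adamw_update(p, torch.randn(10), state["exp_avg"], state["exp_avg_sq"], step=1)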

Cyclical Learning Rates

Besides "cosine" and "arccosine" policies (arccosine has steeper profile at the limiting points), there are "triangular", triangular2 and exp_range, which implement policies proposed in "Cyclical Learning Rates for Training Neural Networks". The ratio of increasing and decreasing phases for triangular policy could be adjusted with triangular_step parameter. Minimum allowed lr is adjusted by min_lr parameter.

  • The triangular schedule is enabled by passing policy="triangular".
  • The triangular2 schedule halves the maximum lr on each restart cycle and is enabled by passing policy="triangular2", or by combining policy="triangular" with eta_on_restart_cb=ReduceMaxLROnRestart(ratio=0.5); the ratio parameter sets the factor by which lr is scaled on each restart (both options are sketched after this list).
  • The exp_range schedule is enabled by passing policy="exp_range". It exponentially scales the maximum lr with the iteration count; the base of the exponentiation is set by the gamma parameter.
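For instance, reusing the optimizer, batch_size and epoch_size from the full example further below, the triangular2 behaviour could be configured in either of these two equivalent ways (a sketch; treating min_lr as a constructor keyword is an assumption based on the description above):

    # Option 1: the built-in triangular2 policy
    scheduler = CyclicLRWithRestarts(optimizer, batch_size, epoch_size,
                                     restart_period=5, t_mult=1.2,
                                     policy="triangular2", min_lr=1e-5)

    # Option 2: triangular policy plus a restart callback that halves the max lr
    scheduler = CyclicLRWithRestarts(optimizer, batch_size, epoch_size,
                                     restart_period=5, t_mult=1.2,
                                     policy="triangular",
                                     eta_on_restart_cb=ReduceMaxLROnRestart(ratio=0.5))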

These schedules can be combined with shrinking or expanding restart periods and with weight decay normalization, and they can be used with AdamW as well as other PyTorch optimizers, as sketched below.
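As one illustration of such a combination, the sketch below pairs the exp_range policy with an expanding restart period via t_mult; the specific values are arbitrary, and passing gamma as a constructor keyword is an assumption based on the description above:

    # exp_range policy with expanding restart periods: each cycle is 1.5x longer
    # than the previous one, and the maximum lr decays exponentially with the
    # iteration count via gamma.
    scheduler = CyclicLRWithRestarts(optimizer, batch_size, epoch_size,
                                     restart_period=5, t_mult=1.5,
                                     policy="exp_range", gamma=0.9995)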

Example:

    batch_size = 32
    epoch_size = 1024
    model = resnet()
    optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-5)
    scheduler = CyclicLRWithRestarts(optimizer, batch_size, epoch_size,
                                     restart_period=5, t_mult=1.2, policy="cosine")
    for epoch in range(100):
        scheduler.step()                 # advance the epoch-level restart schedule
        for batch in train_loader:       # iterate over training batches
            ...                          # forward pass computing `loss`
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.batch_step()       # adjust the lr after every batch update
        validate(...)