
davda54 / ada-hessian

License: MIT
Easy-to-use AdaHessian optimizer (PyTorch)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to ada-hessian

Pytorch A2c Ppo Acktr Gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Stars: ✭ 2,632 (+4361.02%)
Mutual labels:  hessian, second-order
Radam
On the Variance of the Adaptive Learning Rate and Beyond
Stars: ✭ 2,442 (+4038.98%)
Mutual labels:  optimizer, adam
Optimizers-for-Tensorflow
Adam, NAdam and AAdam optimizers
Stars: ✭ 20 (-66.1%)
Mutual labels:  optimizer, adam
Adahessian
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Stars: ✭ 114 (+93.22%)
Mutual labels:  optimizer, hessian
AshBF
Over-engineered Brainfuck optimizing compiler and interpreter
Stars: ✭ 14 (-76.27%)
Mutual labels:  optimizer
artificial-neural-variability-for-deep-learning
The PyTorch implementation of the Variable Optimizers / Neural Variable Risk Minimization proposed in our Neural Computation paper: Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting.
Stars: ✭ 34 (-42.37%)
Mutual labels:  optimizer
neth-proxy
Stratum <-> Stratum Proxy and optimizer for ethminer
Stars: ✭ 35 (-40.68%)
Mutual labels:  optimizer
Draftfast
A tool to automate and optimize DraftKings and FanDuel lineup construction.
Stars: ✭ 192 (+225.42%)
Mutual labels:  optimizer
adamwr
Implements the AdamW optimizer (https://arxiv.org/abs/1711.05101), a cosine learning rate scheduler, and "Cyclical Learning Rates for Training Neural Networks" (https://arxiv.org/abs/1506.01186) for the PyTorch framework
Stars: ✭ 130 (+120.34%)
Mutual labels:  optimizer
Cleaner
The only storage saving app that actually works! :D
Stars: ✭ 27 (-54.24%)
Mutual labels:  optimizer
LAMB Optimizer TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
Stars: ✭ 119 (+101.69%)
Mutual labels:  optimizer
horoscope
horoscope is an optimizer inspector for DBMS.
Stars: ✭ 34 (-42.37%)
Mutual labels:  optimizer
keras-gradient-accumulation
Gradient accumulation for Keras
Stars: ✭ 35 (-40.68%)
Mutual labels:  optimizer
EAGO.jl
A development environment for robust and global optimization
Stars: ✭ 106 (+79.66%)
Mutual labels:  optimizer
keras gradient noise
Add gradient noise to any Keras optimizer
Stars: ✭ 36 (-38.98%)
Mutual labels:  optimizer
soar-php
SQL optimizer and rewriter (assists with SQL tuning).
Stars: ✭ 140 (+137.29%)
Mutual labels:  optimizer
prediction gan
PyTorch Impl. of Prediction Optimizer (to stabilize GAN training)
Stars: ✭ 31 (-47.46%)
Mutual labels:  optimizer
XTR-Toolbox
🛠 Versatile tool to optimize Windows
Stars: ✭ 138 (+133.9%)
Mutual labels:  optimizer
hesaff-pytorch
PyTorch implementation of Hessian-Affine local feature detector
Stars: ✭ 21 (-64.41%)
Mutual labels:  hessian
Post-Tweaks
A post-installation batch script for Windows
Stars: ✭ 136 (+130.51%)
Mutual labels:  optimizer

AdaHessian 🚀

An unofficial implementation of the AdaHessian optimizer, designed as a drop-in replacement for any PyTorch optimizer – you only need to pass create_graph=True to the backward() call and everything else should work 🥳

Our version supports multiple param_groups, distributed training, delayed Hessian updates, and a more precise approximation of the Hessian trace.

Usage

from ada_hessian import AdaHessian
...
model = YourModel()
optimizer = AdaHessian(model.parameters())
...
for inputs, targets in data:
    optimizer.zero_grad()
    loss = loss_function(model(inputs), targets)
    loss.backward(create_graph=True)  # this is the important line! 🧐
    optimizer.step()
...
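
The Hessian trace approximation mentioned above (controlled by the n_samples argument documented below) is based on Hutchinson-style estimation: the diagonal of the Hessian is approximated by averaging z * (Hz) over random Rademacher vectors z. The sketch below illustrates only this idea with made-up names; it is not the library's internal code.

import torch

def hutchinson_diag_estimate(params, grads, n_samples=1):
    # Illustrative sketch: diag(H) ≈ average over n_samples of z * (H z),
    # with z drawn from a Rademacher distribution (+1/-1 entries).
    estimates = [torch.zeros_like(p) for p in params]
    for _ in range(n_samples):
        zs = [torch.randint(0, 2, p.shape, device=p.device, dtype=p.dtype) * 2 - 1
              for p in params]
        # Hessian-vector products via a second backward pass; this is why
        # loss.backward(create_graph=True) is required in the training loop.
        hzs = torch.autograd.grad(grads, params, grad_outputs=zs, retain_graph=True)
        for est, z, hz in zip(estimates, zs, hzs):
            est.add_(z * hz / n_samples)
    return estimates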

Documentation

AdaHessian.__init__

Argument – Description
params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (default: 0.1)
betas ((float, float), optional) – coefficients used for computing running averages of the gradient and the squared Hessian trace (default: (0.9, 0.999))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0.0)
hessian_power (float, optional) – exponent of the Hessian trace (default: 1.0)
update_each (int, optional) – compute the Hessian trace approximation only after this number of steps, to save time (default: 1)
n_samples (int, optional) – how many times to sample z for the approximation of the Hessian trace (default: 1)
average_conv_kernel (bool, optional) – average out the Hessian traces of convolutional kernels as in the original paper (default: False)
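
For illustration, a constructor call using the arguments documented above could look like the following; the specific values are arbitrary.

optimizer = AdaHessian(
    model.parameters(),
    lr=0.1,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
    hessian_power=1.0,
    update_each=4,             # re-estimate the Hessian trace only every 4th step
    n_samples=2,               # average two z samples for a more precise estimate
    average_conv_kernel=True,  # average traces over convolutional kernels
)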

AdaHessian.step

Performs a single optimization step.

Argument – Description
closure (callable, optional) – a closure that reevaluates the model and returns the loss (default: None)
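
As a sketch, assuming AdaHessian follows the standard PyTorch closure convention, a step with a closure could look like the code below (reusing the names from the usage example above). Note that the closure itself must call backward(create_graph=True) so the Hessian trace can still be estimated.

def closure():
    optimizer.zero_grad()
    loss = loss_function(model(inputs), targets)
    loss.backward(create_graph=True)  # still required inside the closure
    return loss

loss = optimizer.step(closure)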