
nicklashansen / neural-net-optimization

License: MIT
PyTorch implementations of recent optimization algorithms for deep learning.

Programming Languages

Python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to neural-net-optimization

Sporco
Sparse Optimisation Research Code
Stars: ✭ 164 (+177.97%)
Mutual labels:  optimization-algorithms
pallas-solver
Global optimization algorithms written in C++
Stars: ✭ 43 (-27.12%)
Mutual labels:  optimization-algorithms
Harris-Hawks-Optimization-Algorithm-and-Applications
Source codes for HHO paper: Harris hawks optimization: Algorithm and applications: https://www.sciencedirect.com/science/article/pii/S0167739X18313530. In this paper, a novel population-based, nature-inspired optimization paradigm is proposed, which is called Harris Hawks Optimizer (HHO).
Stars: ✭ 31 (-47.46%)
Mutual labels:  optimization-algorithms
Python Mip
Collection of Python tools for the modeling and solution of Mixed-Integer Linear programs
Stars: ✭ 202 (+242.37%)
Mutual labels:  optimization-algorithms
Argmin
Mathematical optimization in pure Rust
Stars: ✭ 234 (+296.61%)
Mutual labels:  optimization-algorithms
sdaopt
Simulated Dual Annealing for python and benchmarks
Stars: ✭ 15 (-74.58%)
Mutual labels:  optimization-algorithms
Nmflibrary
MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1
Stars: ✭ 153 (+159.32%)
Mutual labels:  optimization-algorithms
AuxiLearn
Official implementation of Auxiliary Learning by Implicit Differentiation [ICLR 2021]
Stars: ✭ 71 (+20.34%)
Mutual labels:  optimization-algorithms
Aleph star
Reinforcement learning with A* and a deep heuristic
Stars: ✭ 235 (+298.31%)
Mutual labels:  optimization-algorithms
psopy
A SciPy compatible super fast Python implementation for Particle Swarm Optimization.
Stars: ✭ 33 (-44.07%)
Mutual labels:  optimization-algorithms
Relion
Image-processing software for cryo-electron microscopy
Stars: ✭ 219 (+271.19%)
Mutual labels:  optimization-algorithms
Abagail
The library contains a number of interconnected Java packages that implement machine learning and artificial intelligence algorithms. These are artificial intelligence algorithms implemented for the kind of people that like to implement algorithms themselves.
Stars: ✭ 225 (+281.36%)
Mutual labels:  optimization-algorithms
cspy
A collection of algorithms for the (Resource) Constrained Shortest Path problem in Python / C++ / C#
Stars: ✭ 64 (+8.47%)
Mutual labels:  optimization-algorithms
Optimizer Visualization
Visualize Tensorflow's optimizers.
Stars: ✭ 178 (+201.69%)
Mutual labels:  optimization-algorithms
MIRT.jl
MIRT: Michigan Image Reconstruction Toolbox (Julia version)
Stars: ✭ 80 (+35.59%)
Mutual labels:  optimization-algorithms
Bads
Bayesian Adaptive Direct Search (BADS) optimization algorithm for model fitting in MATLAB
Stars: ✭ 159 (+169.49%)
Mutual labels:  optimization-algorithms
pybnb
A parallel branch-and-bound engine for Python. (https://pybnb.readthedocs.io/)
Stars: ✭ 53 (-10.17%)
Mutual labels:  optimization-algorithms
Nature-Inspired-Algorithms
Sample Code Collection of Nature-Inspired Computational Methods
Stars: ✭ 22 (-62.71%)
Mutual labels:  optimization-algorithms
optaplanner-quickstarts
OptaPlanner quick starts for AI optimization: many use cases shown in many different technologies.
Stars: ✭ 226 (+283.05%)
Mutual labels:  optimization-algorithms
paradiseo
An evolutionary computation framework to (automatically) build fast parallel stochastic optimization solvers
Stars: ✭ 73 (+23.73%)
Mutual labels:  optimization-algorithms

Optimization for Deep Learning

This repository contains PyTorch implementations of popular/recent optimization algorithms for deep learning, including SGD, SGD w/ momentum, SGD w/ Nesterov momentum, SGDW, RMSprop, Adam, Nadam, Adam w/ L2 regularization, AdamW, RAdam, RAdamW, Gradient Noise, Gradient Dropout, Learning Rate Dropout and Lookahead.
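
As a point of reference, here is a minimal sketch of the simplest of these methods, SGD with momentum and an optional Nesterov correction, written directly against parameter tensors. This is an illustration following the standard PyTorch formulation of the update, not the repository's implementation:

import torch

def sgd_momentum_step(params, velocities, lr=0.01, momentum=0.9, nesterov=False):
    # One manual update step: v <- mu*v + g, then
    # p <- p - lr*(g + mu*v) for Nesterov, or p <- p - lr*v for classical momentum.
    with torch.no_grad():
        for p, v in zip(params, velocities):
            if p.grad is None:
                continue
            v.mul_(momentum).add_(p.grad)
            d_p = p.grad + momentum * v if nesterov else v
            p.add_(d_p, alpha=-lr)

# Tiny usage example on a quadratic loss.
w = torch.randn(3, requires_grad=True)
vel = [torch.zeros_like(w)]
loss = (w ** 2).sum()
loss.backward()
sgd_momentum_step([w], vel, lr=0.1, momentum=0.9, nesterov=True)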

All extensions have been implemented so that they allow for mix-and-match optimization; for example, you can train a neural net using RAdamW combined with Nesterov momentum, Gradient Noise, Learning Rate Dropout and Lookahead.
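
As a rough illustration of how such stacking can work (the repository's own classes and constructor arguments may differ, so treat the code below as a hypothetical sketch rather than this project's API), here is a minimal Lookahead wrapper placed on top of a standard torch.optim.AdamW base optimizer:

import torch
import torch.nn as nn

class Lookahead:
    # Minimal Lookahead wrapper around any torch.optim optimizer (illustrative only):
    # keep "slow" weights and pull them toward the fast weights every k steps.
    def __init__(self, base_optimizer, k=5, alpha=0.5):
        self.base = base_optimizer
        self.k, self.alpha = k, alpha
        self.steps = 0
        self.slow = [[p.detach().clone() for p in g["params"]]
                     for g in base_optimizer.param_groups]

    def zero_grad(self):
        self.base.zero_grad()

    def step(self):
        self.base.step()              # fast (inner) optimizer update
        self.steps += 1
        if self.steps % self.k == 0:  # every k steps: slow <- slow + alpha*(fast - slow)
            for group, slow_group in zip(self.base.param_groups, self.slow):
                for p, q in zip(group["params"], slow_group):
                    q.add_(p.detach() - q, alpha=self.alpha)
                    p.data.copy_(q)   # reset fast weights to the slow weights

# Example: Lookahead stacked on AdamW (decoupled weight decay).
model = nn.Linear(10, 2)
opt = Lookahead(torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2))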


Related papers

Material in this repository has been developed as part of a special course/study and a reading group. This is the list of papers that we have discussed and/or implemented:

An Overview of Gradient Descent Optimization Algorithms

Optimization Methods for Large-Scale Machine Learning

On the importance of initialization and momentum in deep learning

Aggregated Momentum: Stability Through Passive Damping

ADADELTA: An Adaptive Learning Rate Method

RMSprop

Adam: A Method for Stochastic Optimization

On the Convergence of Adam and Beyond

Decoupled Weight Decay Regularization

On the Variance of the Adaptive Learning Rate and Beyond

Incorporating Nesterov Momentum Into Adam

Adaptive Gradient Methods with Dynamic Bound of Learning Rate

On the Convergence of AdaBound and its Connection to SGD

Lookahead Optimizer: k steps forward, 1 step back

The Marginal Value of Adaptive Gradient Methods in Machine Learning

Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks

Curriculum Learning in Deep Neural Networks

HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

Adding Gradient Noise Improves Learning for Very Deep Networks

Learning Rate Dropout


How to run

You can run the experiments and algorithms by calling e.g.

python main.py -num_epochs 30 -dataset cifar -num_train 50000 -num_val 2048 -lr_schedule True

with arguments as specified in the main.py file. The algorithms can be run on two datasets, MNIST and CIFAR-10. For MNIST, a small MLP is used as a proof of concept, whereas an 808,458-parameter CNN is used for CIFAR-10. You may optionally decrease the size of the dataset and/or the number of epochs to reduce the computational cost, but the arguments given above were used to produce the results shown here.
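
If you want to run the same configuration on both datasets back to back, a small launcher along the following lines should work (a sketch only; the "mnist" value for -dataset and the reuse of these exact arguments for MNIST are assumptions, so check the accepted values in main.py):

import subprocess

# Launch the same configuration on both supported datasets.
for dataset in ["mnist", "cifar"]:
    subprocess.run([
        "python", "main.py",
        "-num_epochs", "30",
        "-dataset", dataset,
        "-num_train", "50000",
        "-num_val", "2048",
        "-lr_schedule", "True",
    ], check=True)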


Results

Below you will find our main results. As with all optimization problems, the performance of a particular algorithm depends heavily on the details of the problem as well as on the hyper-parameters. While we have made no attempt at fine-tuning the hyper-parameters of individual optimization methods, we have kept as many hyper-parameters as possible constant to allow for a fairer comparison. Wherever possible, the default hyper-parameters proposed by the original authors have been used.

When faced with a real application, you should always try out a number of different algorithms and hyper-parameters to figure out what works best for your particular problem.
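
As a toy illustration of such a comparison, the sketch below trains the same small regression model with a few standard torch.optim optimizers (not this repository's implementations) and reports the final training loss; the hyper-parameters are arbitrary defaults, not tuned values:

import torch
import torch.nn as nn

def train(opt_name, make_opt, steps=200):
    # Train the same model/data with a given optimizer factory and report final loss.
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = make_opt(model.parameters())
    x, y = torch.randn(512, 20), torch.randn(512, 1)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    print(f"{opt_name:10s} final loss: {loss.item():.4f}")

candidates = {
    "SGD": lambda p: torch.optim.SGD(p, lr=0.1, momentum=0.9, nesterov=True),
    "RMSprop": lambda p: torch.optim.RMSprop(p, lr=1e-3),
    "Adam": lambda p: torch.optim.Adam(p, lr=1e-3),
    "AdamW": lambda p: torch.optim.AdamW(p, lr=1e-3, weight_decay=1e-2),
}
for name, make_opt in candidates.items():
    train(name, make_opt)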

Figures: cifar_sgd, cifar_rmsprop_adam, cifar_adam_weight_decay, cifar_adam, cifar_lrd, cifar_gradnoise, cifar_lookahead (result plots on CIFAR-10 for the corresponding groups of optimizers).
