harsh306 / awesome-nn-optimization

License: CC-BY-4.0
Awesome list for Neural Network Optimization methods.

Content

Popular Optimization algorithms

Normalization Methods

  • BatchNorm [Link]
  • Weight Norm [Link]
  • Spectral Norm [Link]
  • Cosine Normalization [Link]
  • L2 Regularization versus Batch and Weight Normalization Link
  • WHY GRADIENT CLIPPING ACCELERATES TRAINING: A THEORETICAL JUSTIFICATION FOR ADAPTIVITY Link

On Convexity and Generalization of Neural Networks

  • Convex Neural Networks [Link]
  • Breaking the Curse of Dimensionality with Convex Neural Networks [Link]
  • UNDERSTANDING DEEP LEARNING REQUIRES RETHINKING GENERALIZATION [Link]
  • Optimal Control Via Neural Networks: A Convex Approach. [Link]
  • Input Convex Neural Networks [Link]
  • A New Concept of Convex based Multiple Neural Networks Structure. [Link]
  • SGD Converges to Global Minimum in Deep Learning via Star-convex Path [Link]
  • A Convergence Theory for Deep Learning via Over-Parameterization Link

Continuation Methods and Curriculum Learning

  • Curriculum Learning [Link]
  • SOLVING RUBIK’S CUBE WITH A ROBOT HAND Link
  • Noisy Activation Function [Link]
  • Mollifying Networks [Link]
  • Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks Link Talk
  • Automated Curriculum Learning for Neural Networks Link
  • On The Power of Curriculum Learning in Training Deep Networks Link
  • On-line Adaptative Curriculum Learning for GANs Link
  • Parameter Continuation with Secant Approximation for Deep Neural Networks and Step-up GAN Link
  • HashNet: Deep Learning to Hash by Continuation. [Link]
  • Learning Combinations of Activation Functions. [Link]
  • Learning and development in neural networks: The importance of starting small (1993) Link
  • Flexible shaping: How learning in small steps helps Link
  • Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning Link
  • RETHINKING CURRICULUM LEARNING WITH INCREMENTAL LABELS AND ADAPTIVE COMPENSATION Link
  • Parameter Continuation Methods for the Optimization of Deep Neural Networks Link
  • Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection [Link](https://www.aclweb.org/anthology/W18-6314.pdf)
  • Reinforcement Learning based Curriculum Optimization for Neural Machine Translation Link
  • EVOLUTIONARY POPULATION CURRICULUM FOR SCALING MULTI-AGENT REINFORCEMENT LEARNING Link
  • ENTROPY-SGD: BIASING GRADIENT DESCENT INTO WIDE VALLEYS Link
  • NEIGHBOURHOOD DISTILLATION: ON THE BENEFITS OF NON END-TO-END DISTILLATION Link
  • LEARNING TO EXECUTE Link
  • Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing Link
  • Data Parameters: A New Family of Parameters for Learning a Differentiable Curriculum Link
  • Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search Link
  • Continuation Methods and Curriculum Learning for Learning to Rank Link

On Loss Surfaces and Generalization of Deep Neural Networks

  • Exact solutions to the nonlinear dynamics of learning in deep linear neural networks Link
  • QUALITATIVELY CHARACTERIZING NEURAL NETWORK OPTIMIZATION PROBLEMS [Link]
  • The Loss Surfaces of Multilayer Networks [Link]
  • Visualizing the Loss Landscape of Neural Nets [Link]
  • The Loss Surface Of Deep Linear Networks Viewed Through The Algebraic Geometry Lens [Link]
  • How regularization affects the critical points in linear networks. [Link]
  • Local minima in training of neural networks [Link]
  • Necessary and Sufficient Geometries for Gradient Methods Link
  • Fine-grained Optimization of Deep Neural Networks Link
  • SCORE-BASED GENERATIVE MODELING THROUGH STOCHASTIC DIFFERENTIAL EQUATIONS Link

Dynamics, Bifurcations and the Difficulty of Training RNNs

  • Deep Equilibrium Models Link
  • Bifurcations of Recurrent Neural Networks in Gradient Descent Learning [Link]
  • On the difficulty of training recurrent neural networks [Link]
  • Understanding and Controlling Memory in Recurrent Neural Networks [Link]
  • Dynamics and Bifurcation of Neural Networks [Link]
  • Context Aware Machine Learning [Link]
  • The trade-off between long-term memory and smoothness for recurrent networks [Link]
  • Dynamical complexity and computation in recurrent neural networks beyond their fixed point [Link]
  • Bifurcations in discrete-time neural networks: controlling complex network behaviour with inputs [Link]
  • Interpreting Recurrent Neural Networks Behaviour via Excitable Network Attractors [Link]
  • Bifurcation analysis of a neural network model Link
  • A Differentiable Physics Engine for Deep Learning in Robotics Link
  • Deep learning for universal linear embeddings of nonlinear dynamics Link
  • Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations Link
  • Analysis of gradient descent learning algorithms for multilayer feedforward neural networks Link
  • A dynamical model for the analysis and acceleration of learning in feedforward networks Link
  • A bio-inspired bistable recurrent cell allows for long-lasting memory Link
  • Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation [Link](https://www.frontiersin.org/articles/10.3389/fncom.2017.00024/full)

Poor Local Minima? and Sharp Minima

  • Adding One Neuron Can Eliminate All Bad Local Minima Link
  • Deep Learning without Poor Local Minima Link
  • Elimination of All Bad Local Minima in Deep Learning Link
  • How to escape saddle points efficiently. Link
  • Depth with Nonlinearity Creates No Bad Local Minima in ResNets Link
  • Sharp Minima Can Generalize For Deep Nets Link
  • Asymmetric Valleys: Beyond Sharp and Flat Local Minima Link
  • A Reparameterization-Invariant Flatness Measure for Deep Neural Networks Link
  • A Simple Weight Decay Can Improve Generalization Link
  • Finding Critical and Gradient-Flat Points of Deep Neural Network Loss Functions Link
  • The Loss Surface Of Deep Linear Networks Viewed Through The Algebraic Geometry Lens Link
  • Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization Link
  • Flatness is a False Friend Link
  • Are Saddles Good Enough for Deep Learning? Link

Initialization of Neural Networks

  • Deep learning course notes Link
  • On the importance of initialization and momentum in deep learning Link
  • The Break-Even Point on Optimization Trajectories of Deep Neural Networks Link
  • THE EARLY PHASE OF NEURAL NETWORK TRAINING Link
  • One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers Link
  • PCA-Initialized Deep Neural Networks Applied To Document Image Analysis Link
  • Understanding the difficulty of training deep feedforward neural networks Link
  • Unitary Evolution of RNNs Link

Batch Size Optimization

  • ON LARGE-BATCH TRAINING FOR DEEP LEARNING: GENERALIZATION GAP AND SHARP MINIMA Link
  • Revisiting Small Batch Training for Deep Neural Networks Link
  • LARGE BATCH TRAINING OF CONVOLUTIONAL NETWORKS Link
  • Large Batch Optimization for Deep Learning: Training BERT in 76 minutes Link
  • DON’T DECAY THE LEARNING RATE, INCREASE THE BATCH SIZE Link

Degeneracy of Neural Networks

  • Exact solutions to the nonlinear dynamics of learning in deep linear neural networks Link
  • Avoiding pathologies in very deep networks Link
  • Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice Link
  • SKIP CONNECTIONS ELIMINATE SINGULARITIES Link
  • How degenerate is the parametrization of neural networks with the ReLU activation function? Link
  • Theory of Deep Learning III: explaining the non-overfitting puzzle Link
  • Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks Link
  • Understanding Deep Learning: Expected Spanning Dimension and Controlling the Flexibility of Neural Networks Link
  • The Loss Surface Of Deep Linear Networks Viewed Through The Algebraic Geometry Lens Link
  • PYHESSIAN: Neural Networks Through the Lens of the Hessian Link

Convergence Analysis in Deep Learning

  • A CONVERGENCE ANALYSIS OF GRADIENT DESCENT FOR DEEP LINEAR NEURAL NETWORKS Link
  • A Convergence Theory for Deep Learning via Over-Parameterization Link
  • Convergence Analysis of Homotopy-SGD for Non-Convex Optimization Link

Multi-Task Learning with Curricula

  • Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning. Link
  • Learning a Multitask Curriculum for Neural Machine Translation. Link
  • Self-paced Curriculum Learning. Link
  • Curriculum Learning of Multiple Tasks. Link

Constrained Optimization for Deep Learning

  • A Primal-Dual Formulation for Deep Learning with Constraints Link

Reinforcement Learning and Curriculum

  • Object-Oriented Curriculum Generation for Reinforcement Learning Link
  • Teacher-Student Curriculum Learning Link

Tutorials, Surveys and Blogs

  • Curriculum Learning: A Survey Link
  • A Comprehensive Survey on Curriculum Learning Link
  • https://www.offconvex.org/
  • An overview of gradient descent optimization algorithms [Link]
  • Review of second-order optimization techniques in artificial neural networks backpropagation Link
  • Linear Algebra and data Link
  • Why Momentum Really Works [Blog]
  • Optimization [Book]
  • Optimization for deep learning: theory and algorithms Link
  • Generalization Error in Deep Learning Link
  • Automatic Differentiation in Machine Learning: a Survey Link
  • Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey Link
  • Automatic Curriculum Learning For Deep RL: A Short Survey Link
  • The Generalization Mystery: Sharp vs Flat Minima Link

Contributing

If you've found any informative resources that you think belong here, be sure to submit a pull request or create an issue!

If you find this list helpful, a coffee donation is appreciated :)