
ajbrock / Freezeout

Accelerate Neural Net Training by Progressively Freezing Layers

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Freezeout

Hyperdensenet
This repository contains the code of HyperDenseNet, a hyper-densely connected CNN to segment medical images in multi-modal image scenarios.
Stars: ✭ 124 (-36.73%)
Mutual labels:  neural-networks, densenet
Skin Lesions Classification DCNNs
Transfer Learning with DCNNs (DenseNet, Inception V3, Inception-ResNet V2, VGG16) for skin lesions classification
Stars: ✭ 47 (-76.02%)
Mutual labels:  densenet, vgg16
Chainer Cifar10
Various CNN models for CIFAR10 with Chainer
Stars: ✭ 134 (-31.63%)
Mutual labels:  neural-networks, densenet
Meme Generator
MemeGen is a web application that generates a meme from a user-supplied image in one click.
Stars: ✭ 57 (-70.92%)
Mutual labels:  memes, neural-networks
Deep Learning With Python
Deep learning codes and projects using Python
Stars: ✭ 195 (-0.51%)
Mutual labels:  neural-networks, vgg16
Sota Cv
A repository of state-of-the-art deep learning methods in computer vision
Stars: ✭ 176 (-10.2%)
Mutual labels:  neural-networks
Neurolib
Easy whole-brain modeling for computational neuroscientists 🧠💻👩🏿‍🔬
Stars: ✭ 188 (-4.08%)
Mutual labels:  neural-networks
Attentionn
All about attention in neural networks. Soft attention, attention maps, local and global attention and multi-head attention.
Stars: ✭ 175 (-10.71%)
Mutual labels:  neural-networks
Awesome Deep Learning Music
List of articles related to deep learning applied to music
Stars: ✭ 2,195 (+1019.9%)
Mutual labels:  neural-networks
Vectorai
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Stars: ✭ 195 (-0.51%)
Mutual labels:  neural-networks
Neural Localization
Train an RL agent to localize actively (PyTorch)
Stars: ✭ 193 (-1.53%)
Mutual labels:  neural-networks
Source Separation Wavenet
A neural network for end-to-end music source separation
Stars: ✭ 185 (-5.61%)
Mutual labels:  neural-networks
Bmtk
Brain Modeling Toolkit
Stars: ✭ 177 (-9.69%)
Mutual labels:  neural-networks
Mem absa
Aspect Based Sentiment Analysis using End-to-End Memory Networks
Stars: ✭ 189 (-3.57%)
Mutual labels:  neural-networks
Vip
Video Platform for Action Recognition and Object Detection in Pytorch
Stars: ✭ 175 (-10.71%)
Mutual labels:  neural-networks
Coursera Deep Learning Specialization
Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv) Convolutional Neural Networks; (v) Sequence Models
Stars: ✭ 188 (-4.08%)
Mutual labels:  neural-networks
Qml
Introductions to key concepts in quantum machine learning, as well as tutorials and implementations from cutting-edge QML research.
Stars: ✭ 174 (-11.22%)
Mutual labels:  neural-networks
Pywarm
A cleaner way to build neural networks for PyTorch.
Stars: ✭ 184 (-6.12%)
Mutual labels:  neural-networks
Automatic Speech Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Stars: ✭ 192 (-2.04%)
Mutual labels:  neural-networks
Neural Cryptography Tensorflow
Neural Networks that invent their own encryption 🔑
Stars: ✭ 181 (-7.65%)
Mutual labels:  neural-networks

FreezeOut

A simple technique to accelerate neural net training by progressively freezing layers.

[Figure: LRCURVE]

This repository contains code for the extended abstract "FreezeOut: Accelerate Training by Progressively Freezing Layers."

FreezeOut directly accelerates training by annealing layer-wise learning rates to zero on a set schedule, and excluding layers from the backward pass once their learning rate bottoms out.
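A minimal PyTorch sketch of that mechanism, with hypothetical names (the actual implementation lives in train.py): each layer carries a freeze point t_i and its own cosine-annealed learning rate, and once training progress t reaches t_i, the layer's parameters stop requiring gradients, so autograd leaves them out of the backward pass.

    import math

    # Hypothetical sketch: 'layers' pairs each module with its optimizer
    # param group, its initial learning rate lr_0, and its freeze point
    # t_i, where t and t_i are fractions of total training completed.
    def freezeout_update(layers, t):
        for module, group, lr_0, t_i in layers:
            if t < t_i:
                # Layer-wise cosine annealing toward zero at t = t_i
                group['lr'] = 0.5 * lr_0 * (1 + math.cos(math.pi * t / t_i))
            elif not getattr(module, 'frozen', False):
                module.frozen = True
                group['lr'] = 0.0
                for p in module.parameters():
                    # No gradient requested: autograd now excludes this
                    # layer from the backward pass
                    p.requires_grad = False

Because layers freeze bottom-up, the backward pass gets shorter as training proceeds rather than merely skipping isolated weight updates.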

I had this idea while replying to a reddit comment at 4AM. I threw it in an experiment, and it just worked out of the box (with linear scaling and t_0=0.5), so I went on a 96-hour SCIENCE binge, and now, here we are.

[Figure: DESIGNCURVE]

The exact speedup you get depends on how much error you can tolerate: higher speedups appear to come at the cost of increased error, but speedups below 20% should stay within a 3% relative error envelope, and speedups around 10% seem to incur no error cost with the Scaled Cubic and Unscaled Linear strategies.

Installation

To run this script, you will need PyTorch and a CUDA-capable GPU. If you wish to run it on CPU, just remove all the .cuda() calls.
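Alternatively, on recent PyTorch versions a device-agnostic pattern avoids editing the code at all; a minimal sketch (the nn.Linear is just a stand-in, not this repo's model):

    import torch
    import torch.nn as nn

    # Use the GPU when one is available, otherwise fall back to the CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = nn.Linear(10, 10).to(device)
    x = torch.randn(4, 10, device=device)
    y = model(x)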

Running

To run with default parameters, simply call

python train.py

By default, this will download CIFAR-100, split it into train, valid, and test sets, then train a k=12, L=76 DenseNet-BC using SGD with Nesterov momentum.

This script supports command-line arguments for a variety of parameters, with the FreezeOut-specific parameters being:

  • how_scale selects which annealing strategy to use, among linear, squared, and cubic. Cubic by default.
  • scale_lr determines whether to scale initial learning rates based on t_i. True by default.
  • t_0 is a float between 0 and 1 that decides how far into training to freeze the first layer. 0.8 (pre-cubed) by default.
  • const_time is an experimental setting that increases the number of epochs based on the estimated speedup, in order to match total training time against a non-FreezeOut baseline. I have not validated whether this is worthwhile.

You can also set the names of the weights file and the metrics log, which model to use, how many epochs to train for, etc.
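For example, a hypothetical invocation combining the flags above (spellings assumed from the parameter names; check the argparse definitions in train.py for the exact interface):

    python train.py --how_scale squared --t_0 0.7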

If you want to calculate an estimated speedup for a given strategy and t_0 value, use the calc_speedup() function in utils.py.
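I don't reproduce calc_speedup() here, but a back-of-the-envelope version of such an estimate might look like the following sketch, under stated assumptions: every layer costs about the same, the backward pass is roughly half of a layer's cost, and freeze points ramp linearly from t_0 to 1 before any squaring or cubing (the real function in utils.py may differ):

    # Hypothetical re-derivation of the speedup estimate
    def estimate_speedup(num_layers, t_0, how_scale='cubic'):
        power = {'linear': 1, 'squared': 2, 'cubic': 3}[how_scale]
        saved = 0.0
        for i in range(num_layers):
            # Freeze points ramp linearly from t_0 (first layer) to 1.0
            # (last layer), then are raised to the strategy's power
            t_i = (t_0 + (1.0 - t_0) * i / (num_layers - 1)) ** power
            saved += 1.0 - t_i  # fraction of training this layer sits out
        # Assume the backward pass is half of each layer's cost
        return 0.5 * saved / num_layers

    print(estimate_speedup(76, 0.8))  # e.g. the default L=76 config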

Notes

If you know how to implement this in a static-graph framework (specifically TensorFlow or Caffe2), shoot me an email! It's really easy to do with dynamic graphs, but I believe it to be possible with some simple conditionals in a static graph.
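As a rough illustration of those conditionals, here is a hypothetical TF1-style sketch (maybe_freeze and the frozen flag are my names, not an existing API):

    import tensorflow as tf

    def maybe_freeze(layer_out, frozen):
        # 'frozen' is a scalar bool tensor fed per step from the schedule.
        # stop_gradient treats the output as a constant, so no gradients
        # reach this layer's weights or anything upstream of it; that is
        # consistent with FreezeOut because layers freeze bottom-up, so
        # everything upstream of a frozen layer is already frozen too.
        return tf.cond(frozen,
                       lambda: tf.stop_gradient(layer_out),
                       lambda: layer_out)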

There's (at least) one typo in the paper where it defines the learning rate schedule: there should be a 1/2 in front of the alpha.
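Assuming the intended schedule is the standard cosine anneal (the same form as in the sketch further up), the corrected equation would read:

    \alpha_i(t) = \frac{1}{2}\,\alpha_i(0)\left(1 + \cos\frac{\pi t}{t_i}\right)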
